Machine-based human action recognition has become very popular in the last decade. Automatic unattended surveillance systems, interactive video games, machine learning and robotics are only a few of the areas that involve human action recognition. This thesis examines the capability of a well-known transform, the Trace transform, for the task of human action recognition and proposes two new feature extraction methods based on it.

The first method extracts Trace transforms from binarized silhouettes representing different stages of a single action period. A final history template, composed from the above transforms, represents the whole sequence and retains much of the valuable spatiotemporal information contained in a human action. The second method uses the Trace transform to construct a set of invariant features that represent the action sequence and can cope with the variations that commonly appear in video capture. This method exploits the intrinsic properties of the Trace transform to produce noise-robust features that are invariant to translation, rotation and scaling, and are effective, simple and fast to compute.

As a follow-up to the above developments, a new technique was developed that extends the latter method to the 3D domain, creating for the first time in the literature a 3D form of the Trace transform, named the 3D Cylindrical Trace transform. Combined with spatiotemporal interest points (STIPs), it was applied to extract robust features from videos, both for human action recognition and for human fall detection. Classification experiments on five popular and demanding datasets, using SVMs with an RBF kernel, produced impressive results that indicate the potential of the proposed techniques. Finally, to highlight the challenges of the action recognition field, a new dataset named THETIS was created and proposed.
The experimental evaluation of THETIS showed it to be a very challenging dataset, one that has already attracted researchers' interest.
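As an illustration of the core operation underlying the methods above, the Trace transform evaluates a functional along every line tracing the image, parameterized by angle and offset; choosing the ordinary sum as the functional recovers the classical Radon transform. The sketch below is a minimal, assumed implementation (the function name, angle sampling and choice of functional are illustrative, not the exact pipeline used in the thesis): it rotates the image for each angle and applies the functional along each horizontal scan line.

```python
import numpy as np
from scipy.ndimage import rotate

def trace_transform(img, functional=np.sum, n_angles=180):
    """Sketch of a Trace transform: for each rotation angle, apply a
    functional along every horizontal scan line of the rotated image.
    With functional=np.sum this reduces to a Radon-like transform;
    other functionals (e.g. np.max, np.median) yield different traces."""
    rows = img.shape[0]
    out = np.zeros((n_angles, rows))
    angles = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    for i, angle in enumerate(angles):
        # reshape=False keeps a fixed sampling grid across angles;
        # bilinear interpolation (order=1) is enough for a binary silhouette
        rot = rotate(img, angle, reshape=False, order=1)
        out[i] = functional(rot, axis=1)  # one value per scan line
    return out

# Usage on a synthetic binary silhouette (hypothetical example data)
silhouette = np.zeros((32, 32))
silhouette[12:20, 12:20] = 1.0
tt = trace_transform(silhouette, np.sum, n_angles=8)
```

The resulting 2D array (angle × line offset) is the representation from which further functionals can be applied to distill translation-, rotation- and scale-invariant scalar features, in the spirit of the second method described above.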