Deep Learning Solutions for Continuous Action Recognition Using Fusion of Inertial and Video Sensing and For Far Field Video Surveillance






This dissertation addresses deep learning solutions for two applications. The first application involves continuous human action recognition through the simultaneous use of inertial and video sensing. The objective is to achieve more robust continuous action recognition than is possible with a single sensing modality by using a video camera and a wearable inertial sensor at the same time. A deep learning solution is developed that differs from the action recognition approaches reported in the literature in two ways: (i) The detection and recognition of actions are carried out on continuous action streams rather than on segmented actions, which is the assumption normally made in existing action recognition approaches. (ii) It provides the first attempt at using video and inertial sensing simultaneously to achieve continuous action recognition. As part of this effort, a Continuous Multimodal Human Action Dataset (named C-MHAD) is collected and made publicly available. The second application involves detecting persons and the load they carry in far field video surveillance data. The objective is to detect persons and to classify the load they carry from video data captured several miles away via high-power lens video cameras. A deep learning solution is developed to cope with the following two major challenges: (i) Far field video data suffer from noise caused by wind, heat haze, and the camera being out of focus, which blurs the persons appearing in video images. (ii) The available dataset is small and lacks frame-level labels. The results obtained indicate the effectiveness of the developed deep learning solutions.



Machine learning, Video surveillance, Human activity recognition