Combined static and motion features for deep-networks-based activity recognition in videos

Date

2019

Journal Title

IEEE Transactions on Circuits and Systems for Video Technology

Publisher

IEEE

Abstract

Activity recognition in videos, in a deep-learning setting or otherwise, uses both static and pre-computed motion components. How to combine the two components while keeping the computational burden on the deep network low remains uninvestigated. Moreover, it is not clear how much each individual component contributes, or how to control its contribution. In this work, we use a combination of CNN-generated static features and motion features in the form of motion tubes. We propose three schemas for combining the static and motion components: based on a variance ratio, principal components, and Cholesky decomposition. The Cholesky-decomposition-based method allows the contributions to be controlled explicitly. The ratio given by a variance analysis of the static and motion features matches well with the experimentally optimal ratio used in the Cholesky-decomposition-based method. The resulting activity recognition system is better than or on par with the existing state of the art when tested on three popular datasets. The findings also enable us to characterize a dataset with respect to its richness in motion information.
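As a rough illustration of the Cholesky-based fusion idea (not the authors' implementation, which is detailed in the paper), the sketch below assumes the standard construction: the lower-triangular Cholesky factor of the 2x2 correlation matrix [[1, r], [r, 1]] mixes two feature streams so that the parameter r controls how strongly the fused vector follows the static stream. The function name `cholesky_fuse`, the feature dimensionality, and the random inputs are all hypothetical.

```python
import numpy as np

def cholesky_fuse(static_feat, motion_feat, r):
    """Blend two equal-length feature vectors with a controllable
    mixing parameter r in (0, 1), via the Cholesky factor of the
    2x2 correlation matrix [[1, r], [r, 1]].

    The factor is L = [[1, 0], [r, sqrt(1 - r**2)]]; its second row
    yields the fused vector r * static + sqrt(1 - r**2) * motion.
    """
    # Lower-triangular Cholesky factor of the correlation matrix.
    L = np.linalg.cholesky(np.array([[1.0, r], [r, 1.0]]))
    stacked = np.vstack([static_feat, motion_feat])  # shape (2, d)
    # Row 1 of L @ stacked mixes both streams; r sets the weighting.
    return (L @ stacked)[1]

# Hypothetical 4096-dim per-frame feature vectors.
rng = np.random.default_rng(0)
static = rng.standard_normal(4096)
motion = rng.standard_normal(4096)
fused = cholesky_fuse(static, motion, r=0.7)  # r biases the mix toward static
```

Under these assumptions, sweeping r from 0 to 1 moves the fusion from motion-only to static-only, which is consistent with the abstract's claim that this schema makes the individual contributions controllable.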

Keywords

Activity recognition, Fusing features, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM)

Citation

Ramasinghe, S., Rajasegaran, J., Jayasundara, V., Ranasinghe, K., Rodrigo, R., & Pasqual, A. A. (2019). Combined static and motion features for deep-networks-based activity recognition in videos. IEEE Transactions on Circuits and Systems for Video Technology, 29(9), 2693–2707. https://doi.org/10.1109/TCSVT.2017.2760858