Video classification using compacted dataset based on selected keyframe

Reza Fuad Rachmadi, Keiichi Uchimura, Gou Koutaki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)


Shared human actions across videos are one of the biggest problems for video classification systems. For example, a long-jump sports video shares a running action with running sports videos. In this paper, we present a video classification system that combines a keyframe extractor with a convolutional neural network (CNN) classifier. Visual attention modeling is used to build the keyframe extractor, and the top-k frames with the highest saliency values are chosen for the classification process. Using only the top-k keyframes with the highest saliency values reduces the shared actions in the video and makes it easier for the classifier to classify the video using only spatial features. Keyframes extracted by a video summarization method were used for the training process, which proved very efficient in our system and sped up training. As a result, our system is effective: the average accuracy increases compared with the same system without the keyframe extractor, and our proposed method also outperforms a system that uses the video summarization method as its keyframe extractor by around 3%.
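The keyframe selection step described in the abstract (keeping the top-k frames with the highest saliency values) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-frame saliency scores are assumed to come from a visual attention model and are represented here as a precomputed array.

```python
import numpy as np

def select_keyframes(frames, saliency_scores, k):
    """Return the k frames with the highest saliency values.

    frames: sequence of video frames (any objects, one per frame)
    saliency_scores: one saliency value per frame, e.g. from a
        visual attention model (assumed precomputed here)
    k: number of keyframes to keep for classification
    """
    scores = np.asarray(saliency_scores)
    # Indices of the k largest saliency values
    top_idx = np.argsort(scores)[-k:]
    # Restore temporal order before returning the keyframes
    top_idx.sort()
    return [frames[i] for i in top_idx]
```

In a full pipeline, only these selected frames would be passed to the CNN classifier, so the classifier sees the most salient (and presumably most class-discriminative) frames rather than shared actions such as running.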

Original language: English
Title of host publication: Proceedings of the 2016 IEEE Region 10 Conference, TENCON 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 6
ISBN (Electronic): 9781509025961
Publication status: Published - 8 Feb 2017
Event: 2016 IEEE Region 10 Conference, TENCON 2016 - Singapore, Singapore
Duration: 22 Nov 2016 - 25 Nov 2016

Publication series

Name: IEEE Region 10 Annual International Conference, Proceedings/TENCON
ISSN (Print): 2159-3442
ISSN (Electronic): 2159-3450


Conference: 2016 IEEE Region 10 Conference, TENCON 2016


