International Journal of Electrical and Computer Engineering

Vol 12, No 3: June 2022

Dang Thi Phuc (Industrial University of Ho Chi Minh City)
Tran Quang Trieu (Industrial University of Ho Chi Minh City)
Nguyen Van Tinh (Industrial University of Ho Chi Minh City)
Dau Sy Hieu (Viet Nam National University HCMC)

Publish Date
01 Jun 2022

With the development of today's society, demand for applications that use digital cameras grows year by year. However, analyzing large amounts of video data is one of the most challenging issues. In addition to storing the data captured by the camera, intelligent systems are required to analyze the data quickly so that important situations can be addressed. In this paper, we use deep learning techniques to build models that automatically describe movements in video. To solve the problem, we use three deep learning models: a sequence-to-sequence model based on a recurrent neural network, a sequence-to-sequence model with attention, and a transformer model. We evaluate the effectiveness of the approaches based on the results of the three models. To train these models, we use the Microsoft Research Video Description Corpus (MSVD) dataset, which includes 1,970 videos and 85,550 captions translated into Vietnamese. To ensure that the content is described in Vietnamese, we also combine these models with a natural language processing (NLP) model for Vietnamese.

Journal Info

International Journal of Electrical and Computer Engineering

Abbrev

IJECE

Publisher

Institute of Advanced Engineering and Science

Subject

Computer Science & IT; Electrical & Electronics Engineering

Description

International Journal of Electrical and Computer Engineering (IJECE; ISSN: 2088-8708; a Scopus-indexed journal; SNIP: 1.001; SJR: 0.296; CiteScore: 0.99; SJR and CiteScore Q2 in both Electrical & Electronics Engineering and Computer Science) is the official publication of the Institute of ...

Video captioning in Vietnamese using deep learning
