Garuda - Garba Rujukan Digital

International Journal Software Engineering and Computer Science (IJSECS)

Vol. 5 No. 2 (2025): AUGUST 2025

Karthika, V. (Unknown)
Siva Ganesh, A. (Unknown)

Publish Date
01 Aug 2025

This research develops a system that enhances Over-The-Top (OTT) services on Smart TVs using speech recognition, video analysis, and natural language processing. The system uses TransNetV2 for AI-based scene boundary detection, Porcupine for hotword detection, and Automatic Speech Recognition (ASR) engines including Vosk, Whisper, and DeepSpeech for real-time speech-to-text conversion. A Natural Language Processing (NLP) layer built on BERT and spaCy interprets user intent and temporal commands from spoken instructions. Video content is processed with FFmpeg and OpenCV for frame manipulation and visualization, while YOLO and ResNet provide content classification and scene understanding. The platform combines a Flutter front end for cross-platform deployment on smart devices with a Python Flask backend that integrates the modules. Testing shows that the system can execute real-time, hands-free media control while providing an intuitive and accessible user experience for contemporary OTT applications.
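
To make the idea of interpreting "temporal commands" concrete, the following is a minimal sketch of how an ASR transcript might be mapped to a player action. It is not the authors' implementation: the abstract says the system uses BERT and spaCy for this step, whereas this sketch swaps in plain regular expressions from the Python standard library so that it runs without any models. The command vocabulary, the `MediaCommand` type, and the `parse_command` function are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical unit table; the abstract does not list the supported phrasings.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}
_DURATION = r"(\d+)\s*(second|minute|hour)s?\b"


@dataclass
class MediaCommand:
    """Illustrative result type: an action plus an optional offset in seconds."""
    action: str                      # "play", "pause", "seek_relative", "seek_absolute"
    seconds: Optional[int] = None


def parse_command(transcript: str) -> Optional[MediaCommand]:
    """Map a transcribed utterance to a MediaCommand, or None if unrecognized.

    Rule-based stand-in for the NLP stage (assumption: digits, not number
    words, appear in the transcript).
    """
    text = transcript.lower().strip()

    # Relative seek, e.g. "skip forward 30 seconds" or "rewind 2 minutes".
    m = re.search(r"\b(forward|ahead|back|backward|rewind)\b.*?" + _DURATION, text)
    if m:
        direction, amount, unit = m.groups()
        sign = 1 if direction in ("forward", "ahead") else -1
        return MediaCommand("seek_relative", sign * int(amount) * _UNIT_SECONDS[unit])

    # Absolute seek, e.g. "jump to 1 minute 30 seconds".
    m = re.search(r"\b(?:go|jump|skip) to\b(.*)", text)
    if m:
        parts = re.findall(_DURATION, m.group(1))
        if parts:
            total = sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)
            return MediaCommand("seek_absolute", total)

    # Simple transport commands.
    if re.search(r"\b(pause|stop)\b", text):
        return MediaCommand("pause")
    if re.search(r"\b(play|resume)\b", text):
        return MediaCommand("play")
    return None
```

In a pipeline like the one described, a function of this kind would sit between the speech-to-text stage and the player: the hotword detector wakes the system, the ASR engine produces a transcript, and the parsed command is forwarded to the playback layer. A learned model would be expected to handle paraphrases and spoken number words that these fixed patterns miss.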

Journal Info

Abbrev

ijsecs

Publisher

Lembaga Komunitas Informasi Teknologi Aceh

Subject

Computer Science & IT

Description

IJSECS is committed to bridging the theory and practice of information technology and computer science. From innovative ideas to specific algorithms and full system implementations, IJSECS publishes original, peer-reviewed, high-quality articles in the areas of information technology and computer ...

Hands-Free Video Player: Enhancing Accessibility with Voice-Controlled Navigation