In the era of globalization, rapid advances in information and communication technology have changed many aspects of life, including the way people communicate. Cross-linguistic communication remains one of the main challenges, particularly for languages such as Japanese, whose writing system differs from the Latin alphabet. This study develops a web-based automatic voice translation application that recognizes speech, translates it automatically, and generates human-like speech from the translation result. The application combines three main technologies: Speech to Text, Machine Translation, and Text to Speech. Speech to Text and Text to Speech are implemented with the Web Speech API, while Machine Translation is implemented with the Google Translate API. The Web Speech API uses a Recurrent Neural Network (RNN) and the Google Translate API uses a Transformer, both of which are deep learning methods. The application is designed to facilitate cross-lingual communication without the need to type, translate manually, or speak the foreign language directly.
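
The three-stage flow described above (Speech to Text, then Machine Translation, then Text to Speech) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the application's actual code: the names `Stages`, `translateSpeech`, `demoStages`, `demoDictionary`, and `spoken` are hypothetical, and the three stages are injected as plain functions so the sketch runs without a browser. In a real page, `recognize` and `speak` would presumably wrap the browser's Web Speech API and `translate` would call the translation service over HTTP.

```typescript
// Hypothetical stage signatures (assumptions for illustration, not a real API).
type Recognize = (sourceLang: string) => Promise<string>;
type Translate = (text: string, sourceLang: string, targetLang: string) => Promise<string>;
type Speak = (text: string, targetLang: string) => Promise<void>;

interface Stages {
  recognize: Recognize; // Speech to Text
  translate: Translate; // Machine Translation
  speak: Speak;         // Text to Speech
}

interface TranslationResult {
  transcript: string;
  translation: string;
}

// Runs the three stages in order: recognize, translate, speak.
async function translateSpeech(
  stages: Stages,
  sourceLang: string,
  targetLang: string
): Promise<TranslationResult> {
  const transcript = (await stages.recognize(sourceLang)).trim();
  if (transcript === "") {
    throw new Error("no speech recognized");
  }
  const translation = await stages.translate(transcript, sourceLang, targetLang);
  await stages.speak(translation, targetLang);
  return { transcript, translation };
}

// Stub stages standing in for the real services (assumed data, for demonstration only).
const spoken: string[] = [];
const demoDictionary: Record<string, string> = { "こんにちは": "hello" };

const demoStages: Stages = {
  // Pretends the microphone heard a fixed Japanese phrase.
  recognize: async (_sourceLang) => "こんにちは",
  // Looks the phrase up in a tiny dictionary; unknown text passes through unchanged.
  translate: async (text, _sourceLang, _targetLang) => demoDictionary[text] ?? text,
  // Records what would have been spoken instead of producing audio.
  speak: async (text, targetLang) => {
    spoken.push(`${targetLang}:${text}`);
  },
};
```

Calling `translateSpeech(demoStages, "ja-JP", "en-US")` resolves with the transcript and its translation after the stub "speaks" the result. Keeping the stages behind small function types is one possible design choice: it lets each service be swapped or tested independently of the browser.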
Copyrights © 2025