Advancing Voice Anti-Spoofing Systems: Self-Supervised Learning and Indonesian Dataset Integration for Enhanced Generalization
Authors: Prihasto, Bima; Nur Farid, Mifta; Al Khairy, Rafid
Journal: Brilliance: Research of Artificial Intelligence, Vol. 4 No. 2 (2024), Article Research November 2024
Publisher: Yayasan Cita Cendekiawan Al Khwarizmi

DOI: 10.47709/brilliance.v4i2.5182

Abstract

This study examines how self-supervised learning and a novel Indonesian-language dataset improve voice anti-spoofing systems. The proposed model reaches a lower Equal Error Rate (EER) during training, indicating that it learns effectively from diverse audio samples, and an analysis of the weighted cross-entropy loss highlights its robustness in minimizing training errors. Compared with baseline models that use English data, the proposed approach achieves a significantly lower EER due to the incorporation of language-specific data. The distinctive phonetic features of Indonesian languages provide valuable training material and strengthen the system's defence against spoofing attacks. Because the dataset includes diverse speech samples, it improves generalization across dialects and recording conditions, which is vital for real-world applications where recording variability affects performance. The experiments used a balanced dataset of high-quality genuine and spoofed utterances from male and female speakers, split into training, development, and testing sets, and were run on a high-performance computing setup. The proposed model achieved an EER of 0.33, compared with 7.65 for the traditional sinc-layer model and 0.82 for the wav2vec 2.0 model with English data. Overall, this research advances anti-spoofing solutions and underscores the need for diverse datasets and advanced learning approaches to improve automatic speaker verification systems in practical applications. Incorporating the Indonesian dataset helps address linguistic diversity challenges in biometric security and paves the way for future advancements in this area.
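
For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch of how an EER and a class-weighted cross-entropy are commonly computed. It is not the authors' code: the function names, the example scores, and the class weights are hypothetical, and the sketch assumes that a higher score means "more likely genuine" and that the weighted loss is normalized by the sum of the applied weights.

```python
import math


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a trial is accepted as genuine when score >= threshold.
    Returns the mean of the false acceptance and false rejection rates
    at the threshold where the two are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False acceptance rate: spoofed trials accepted as genuine.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: genuine trials rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def weighted_cross_entropy(p_genuine, labels, w_genuine, w_spoof):
    """Class-weighted binary cross-entropy.

    p_genuine: predicted probability of the genuine class per trial.
    labels: 1 for genuine, 0 for spoofed.
    w_genuine, w_spoof: hypothetical class weights (not from the paper).
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(p_genuine, labels):
        w = w_genuine if y == 1 else w_spoof
        p_true = p if y == 1 else 1.0 - p
        total += -w * math.log(p_true)
        weight_sum += w
    return total / weight_sum


# Hypothetical scores for four genuine and four spoofed trials.
genuine = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(genuine, spoof))  # 0.25
```

The function returns the EER as a fraction between 0 and 1; multiply by 100 to express it as a percentage. Class weighting is typically used when one class is rarer than the other, so that errors on the rarer class contribute more to the loss.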