Multi Kelas Speaker Recognition Menggunakan Deep Learning dengan CN-Celeb Dataset [Multi-Class Speaker Recognition Using Deep Learning with the CN-Celeb Dataset] Martulandi, Adipta; Zahra, Amalia
Building of Informatics, Technology and Science (BITS) Vol 4 No 3 (2022): December 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i3.2467

Abstract

Speaker recognition has been widely applied in everyday products such as Apple's Siri, Microsoft's Cortana, and Google's voice assistant. One problem in building a speaker recognition system concerns the dataset used for modeling: most such datasets do not represent real-world conditions, so the resulting models perform suboptimally when deployed. This study develops a speaker recognition model using deep learning (LSTM) with the CN-Celeb dataset. CN-Celeb is taken directly from real-world recordings and therefore contains a lot of noise, so it is expected to better represent real-world conditions. The model uses two stacked LSTM layers for the multi-class speaker recognition task. In addition, hyperparameters are tuned with a grid search to obtain the best model configuration. The results show that the LSTM model achieves an EER of 10.13%, better than the 15.52% of the reference baseline paper. Compared with other studies that also use the CN-Celeb dataset but with different models (x-vectors, PLDA, TDNN, and transformers), the LSTM model gives promising results.
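The equal error rate (EER) quoted in this abstract is the operating point where the false acceptance rate and the false rejection rate coincide. A minimal sketch of how such a value could be approximated from trial scores follows; the function name and the toy scores are illustrative assumptions and are not taken from the paper.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: impostor trials scored at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up for illustration only).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine, impostor))
```

A lower EER means fewer errors at the balanced operating point, which is why 10.13% is reported as an improvement over 15.52%.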
Recommendation mobile antivirus for Android smartphones based on malware detection Saputra, Hendra; Zahra, Amalia; Faldi, Faldi; Fadzlul Rahman, Ferry; Harits, Sayekti; Joko Pranoto, Wawan; Rahman, Fathur
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 3: September 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i3.pp3559-3566

Abstract

The proliferation of smartphone malware attacks, driven by a lack of vigilance in app selection, raises serious concerns. Built-in smartphone security features are often insufficient to protect devices from these threats. Although numerous articles recommend top-tier antivirus solutions, they often lack reliable data sources, which raises suspicions about undisclosed promotional motives. This research establishes a ranking of antivirus efficacy to provide recommendations for Android smartphone users. The methodology compares malware detection and labeling outcomes between the antivirus programs available in VirusTotal and the labeling system employed by the Euphony application. The comparative results are categorized into three groups: antivirus solutions that identify specific malware types, those that detect malware without categorizing it, and those that fail to detect malware effectively. The five leading antivirus solutions, ranked from the highest to the lowest score, are Ikarus, Fortinet, ESET-NOD32, Avast-Mobile, and SymantecMobileInsight. Based on the assessment conducted in this study, these solutions are recommended as the top antivirus choices for users selecting protection for their Android smartphones.
Multimodal music emotion recognition in Indonesian songs based on CNN-LSTM, XLNet transformers Sams, Andrew Steven; Zahra, Amalia
Bulletin of Electrical Engineering and Informatics Vol 12, No 1: February 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v12i1.4231

Abstract

Music carries emotional information and allows the listener to feel the emotions it contains. This study proposes a multimodal music emotion recognition (MER) system using Indonesian song and lyrics data. In the proposed system, the audio data is represented by the mel spectrogram feature, and the lyrics features are extracted with the XLNet tokenizer. A convolutional neural network with long short-term memory (CNN-LSTM) performs the audio classification task, while an XLNet transformer performs the lyrics classification task. The outputs of the two classification tasks are probability weights and actual predictions over positive, neutral, and negative emotions, which are then combined using the stacking ensemble method. The combined output is used to train an artificial neural network (ANN) model to obtain the best probability weight output. The multimodal system achieves its best performance with an accuracy of 80.56%. The results show that the multimodal method of recognizing musical emotions performs better than the single-modal method. In addition, hyperparameter tuning can affect the performance of multimodal systems.
Classifying possible hate speech from text with deep learning and ensemble on embedding method Caprisiano, Ebenhaiser Jonathan; Ramadhansyah, Muhammad Hafizh; Zahra, Amalia
Bulletin of Electrical Engineering and Informatics Vol 13, No 3: June 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i3.6041

Abstract

Hate speech can be defined as the use of language to express hatred towards another party. Twitter is one of the most widely used social media platforms. In addition to posting their own content, users can give feedback on other posts through comments, and some users intentionally or unintentionally leave negative comments. Even though regulations prohibit hate speech, such comments persist. This study builds a classifier of possible hate speech in Twitter messages using a deep learning approach with a long short-term memory (LSTM) model. An ensemble of term frequency times inverse document frequency (TF-IDF) and global vector (GloVe) embeddings achieves 86% accuracy, better than the stand-alone word to vector (Word2Vec) method, which achieves only 80%. These results indicate that an ensemble of embedding methods can improve the accuracy of a deep learning system compared with a single stand-alone method.
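For readers unfamiliar with the "term frequency times inverse document frequency" weighting named in this abstract, a minimal sketch is given below. It uses one common variant (relative term frequency and a natural-log IDF without smoothing); the paper's exact variant is not stated here, and the function name and toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    tf  = term count / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (made up for illustration only).
docs = [["hate", "speech"], ["free", "speech"]]
print(tf_idf(docs))
```

A term that appears in every document, such as "speech" in the toy corpus, receives a weight of zero under this variant, while rarer terms are weighted up.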
Speech emotion recognition with optimized multi-feature stack using deep convolutional neural networks Fadhil, Muhammad Farhan; Zahra, Amalia
Bulletin of Electrical Engineering and Informatics Vol 13, No 6: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i6.6044

Abstract

Emotion plays a significant role in human communication and can influence how the context of a message is perceived by others. Speech emotion recognition (SER) is an intriguing field of study because current human-computer interaction (HCI) technologies, such as virtual assistants, rarely consider the emotion carried in human speech. One of the most widely used approaches to SER is to extract speech features such as mel-frequency cepstral coefficients (MFCC), mel-spectrogram, spectral contrast, tonnetz, and chromagram from the signal and to use a one-dimensional (1D) convolutional neural network (CNN) as the classifier. This study shows the impact of combining an optimized multi-feature stack with an optimized 1D deep CNN model. The proposed model achieves an accuracy of 90.10% in classifying 8 different emotions on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset.
Enhancing speech emotion recognition with deep learning using multi-feature stacking and data augmentation Al Mukarram, Khasyi; Mukhlas, M. Anang; Zahra, Amalia
Bulletin of Electrical Engineering and Informatics Vol 13, No 3: June 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i3.6049

Abstract

This study evaluates the effectiveness of data augmentation on 1D convolutional neural network (CNN) and transformer models for speech emotion recognition (SER) on the Ryerson audio-visual database of emotional speech and song (RAVDESS) dataset. The results show that data augmentation has a positive impact on improving emotion classification accuracy. Techniques such as noising, pitching, stretching, shifting, and speeding are applied to increase data variation and overcome class imbalance. The 1D CNN model with data augmentation achieved 94.5% accuracy, while the transformer model with data augmentation performed even better at 97.5%. This research is expected to contribute better insights for the development of accurate emotion recognition methods by using data augmentation with these models to improve classification accuracy on the RAVDESS dataset. Further research can explore larger and more diverse datasets and alternative model approaches.
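Two of the augmentation techniques listed in this abstract, noising and shifting, can be sketched on a raw waveform as follows. This is an illustrative sketch only: the function names, the Gaussian noise model, the noise level, and the circular form of the shift are all assumptions, since the abstract does not specify how the techniques were implemented.

```python
import random


def add_noise(signal, noise_level=0.005, seed=0):
    """Return a copy of the waveform with Gaussian noise added to each sample."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    return [x + rng.gauss(0.0, noise_level) for x in signal]


def shift(signal, steps):
    """Circularly shift the waveform to the right by `steps` samples."""
    if not signal:
        return []
    steps %= len(signal)
    if steps == 0:
        return list(signal)
    # Samples pushed off the end wrap around to the start.
    return signal[-steps:] + signal[:-steps]


# Toy waveform (made up for illustration only).
wave = [0.0, 0.1, 0.2, 0.3]
augmented = [add_noise(wave), shift(wave, 1)]
print(augmented)
```

Each augmented copy keeps the label of the original clip, which is how such techniques increase data variation without new recordings.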
Multimodal speech emotion recognition optimization using genetic algorithm Michael, Stefanus; Zahra, Amalia
Bulletin of Electrical Engineering and Informatics Vol 13, No 5: October 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i5.7409

Abstract

Speech emotion recognition (SER) is a technology that can detect emotions in speech. Various methods have been used in developing SER, such as convolutional neural networks (CNNs), long short-term memory (LSTM), and multilayer perceptron. However, sometimes in addition to model selection, other techniques are still needed to improve SER performance, namely optimization methods. This paper compares manual hyperparameter tuning using grid search (GS) and hyperparameter tuning using genetic algorithm (GA) on the LSTM model to prove the performance increase in the multimodal SER model after optimization. The accuracy, precision, recall, and F1 score improvement obtained by hyperparameter tuning using GA (HTGA) is 2.83%, 0.02, 0.05, and 0.04, respectively. Thus, HTGA obtains better results than the baseline hyperparameter tuning method using a GS.
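To make the idea of hyperparameter tuning with a genetic algorithm concrete, a small sketch follows. Everything in it is hypothetical: the search space, the toy fitness function (a stand-in for training a model and measuring validation accuracy), and the specific operators (truncation selection, uniform crossover, random-reset mutation) are assumptions and do not reproduce the paper's configuration.

```python
import random

# Hypothetical search space; not the hyperparameters used in the paper.
SPACE = {
    "units": [32, 64, 128, 256],
    "dropout": [0.1, 0.2, 0.3, 0.5],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def toy_fitness(cfg):
    # Stand-in for validation accuracy; a real run would train and evaluate a model.
    return (-abs(cfg["units"] - 128) / 256
            - abs(cfg["dropout"] - 0.2)
            - abs(cfg["learning_rate"] - 1e-3) * 10)


def random_config(rng):
    return {name: rng.choice(values) for name, values in SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene is taken from one of the two parents.
    return {name: rng.choice([a[name], b[name]]) for name in SPACE}


def mutate(cfg, rng, rate=0.2):
    # With probability `rate`, replace a gene with a random value from its range.
    return {name: (rng.choice(SPACE[name]) if rng.random() < rate else value)
            for name, value in cfg.items()}


def genetic_search(fitness, generations=10, pop_size=8, seed=0):
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # the fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


print(genetic_search(toy_fitness))
```

Unlike a grid search, which evaluates every combination in the space, this loop evaluates only a sampled population per generation and concentrates later evaluations around configurations that scored well.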
Perbandingan Efisiensi Proses CI/CD Multi-Lingkungan melalui Implementasi Paralel dan Berurutan: Efficiency Comparison of Multi-Environment CI/CD Processes Through Parallel and Sequential Implementations Setyoko, Andreas Dimas; Zahra, Amalia
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 4 No. 3 (2024): MALCOM July 2024
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v4i3.1334

Abstract

This study addresses application development problems at PT. Astra International Tbk. by using an automated Continuous Integration/Continuous Deployment (CI/CD) system. Astra currently compiles and distributes applications manually, a process that takes a long time and often suffers from configuration errors, especially because each application has several environments. The proposed solution is to implement CI/CD to automate compilation and distribution for every application environment. CI/CD is a DevOps practice used to make software development more organized. By using CI/CD, development teams benefit from faster application compilation and distribution. This study compares a sequential CI/CD implementation with a parallel one. The results show that sequential CI/CD reduces the required time by 33% relative to the manual process, while parallel CI/CD reduces it by 79%.
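The difference between sequential and parallel multi-environment pipelines can be illustrated with a small sketch. The environment names and the `build_and_deploy` function are hypothetical placeholders (a short sleep stands in for a real compile-and-distribute job), and the sketch does not describe the pipeline tooling used in the study.

```python
import time
from concurrent.futures import ThreadPoolExecutor

ENVIRONMENTS = ["dev", "staging", "production"]  # hypothetical environment names


def build_and_deploy(env, duration=0.05):
    # Stand-in for a real compile-and-distribute job for one environment.
    time.sleep(duration)
    return f"{env}: deployed"


def run_sequential(envs):
    # Each environment waits for the previous one to finish.
    return [build_and_deploy(env) for env in envs]


def run_parallel(envs):
    # All environments are processed at the same time, one worker each.
    with ThreadPoolExecutor(max_workers=len(envs)) as pool:
        # map() returns results in the same order as the input list.
        return list(pool.map(build_and_deploy, envs))


for runner in (run_sequential, run_parallel):
    start = time.perf_counter()
    results = runner(ENVIRONMENTS)
    print(runner.__name__, results, f"{time.perf_counter() - start:.2f}s")
```

When the jobs are independent, the sequential run takes roughly the sum of the job durations while the parallel run takes roughly the longest single job, which is the effect behind the larger time reduction reported for the parallel implementation.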
Bitcoin Price Prediction Model Development Using Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) Cahyadi, Jonathan; Zahra, Amalia
Journal La Multiapp Vol. 5 No. 2 (2024): Journal La Multiapp
Publisher : Newinera Publisher

DOI: 10.37899/journallamultiapp.v5i2.1070

Abstract

Cryptocurrency is a virtual currency that can be used as a financial or economic standard, as a foreign currency reserve, and as a means of payment in some countries. Its value rises and falls constantly and is not easy to predict by reasoning alone. This is a problem for investors, who also lack knowledge about the direction of cryptocurrency price movements. In addition, investors have no system that predicts the price of Bitcoin, which can lead them to make poor trading decisions and incur losses. To reduce this risk, a system is needed that can predict Bitcoin prices using a data mining technique, namely forecasting; the algorithms used are CNN and LSTM. The data used is the Bitcoin closing price from January 1, 2017, to April 26, 2023, divided into 80% training data and 20% testing data. The predictions are evaluated using MAPE: the CNN algorithm obtains a MAPE of 0.037 (3.7%), while the LSTM algorithm obtains 0.065 (6.5%). Both values fall below 10%, a range in which forecasting ability is considered very good, so the models can be used as a reference for predicting Bitcoin prices over the next few periods.
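The evaluation described in this abstract rests on two simple steps: a chronological 80/20 split and the mean absolute percentage error (MAPE). A minimal sketch of both follows; the function names and the toy prices are illustrative assumptions and are not data from the paper.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series without shuffling: earliest values train, latest values test."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.037 corresponds to 3.7%)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Toy closing prices and forecasts (made up for illustration only).
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 110.0, 108.0, 111.0, 115.0, 120.0]
train, test = chronological_split(prices)
forecast = [112.0, 123.0]  # hypothetical model output for the test period
print(len(train), len(test), round(mape(test, forecast), 4))
```

Keeping the split chronological matters for forecasting, because shuffling would let the model train on prices that come after the ones it is tested on.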
Enhancing detection of zero-day phishing email attacks in the Indonesian language using deep learning algorithms Roesmiatun Purnamadewi, Yasinta; Zahra, Amalia
Bulletin of Electrical Engineering and Informatics Vol 14, No 1: February 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v14i1.8759

Abstract

Email phishing is a manipulative technique aimed at compromising information security and user privacy. To overcome the limitations of traditional detection methods, such as blacklists, this research proposes a phishing detection model that leverages natural language processing (NLP) and deep learning technologies to analyze Indonesian email headers. The primary objective is to more efficiently detect zero-day phishing attacks by focusing on the unique linguistic and cultural context of the Indonesian language. This enables the development of models capable of recognizing phishing attack patterns that differ from those in other language contexts. Four models are tested, combining Indonesian bidirectional encoder representation of transformers (IndoBERT) and FastText feature extraction techniques with convolutional neural network (CNN) and long short-term memory (LSTM) deep learning algorithms. The results indicate that the combination of FastText and CNN achieved the highest performance in accuracy, precision, and F1-score metrics, each at 98.4375%. Meanwhile, the FastText model with LSTM showed the best performance in recall, with a score of 98.9583%. The research suggests exploring deeper into email content or integrating analysis between headers and email content in future studies to further improve accuracy and effectiveness in phishing email detection.
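The metrics reported in this abstract (precision, recall, and F1-score) can be computed from binary predictions as sketched below. The function name and the toy labels are assumptions for illustration; label 1 is taken to mean "phishing" and label 0 "legitimate".

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels (made up for illustration only).
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
print(binary_metrics(y_true, y_pred))
```

Recall is the fraction of actual phishing emails that are caught, so a model with the highest recall misses the fewest attacks even if another model has better precision.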