Contact Name
Huzain
Contact Email
huzain.azis@umi.ac.id
Phone
+628114484875
Journal Mail Official
ijodas.journal@gmail.com
Editorial Address
Jln. Paccerakkang, Kel. Berua, Kec. Biringkanaya, Kota Makassar, Propinsi Sulawesi Selatan, 90241
Location
INDONESIA
Indonesian Journal of Data and Science
Published by yocto brain
ISSN: - | EISSN: 2715-9930 | DOI: -
Core Subject: Science, Education
IJODAS provides an online venue for publishing scientific articles from research in the fields of Data Science, Data Mining, Data Communication, Data Security, and Data Representation.
Articles 159 Documents
Zero-Shot Sentiment Analysis Of DeepSeek AI App Reviews Using DeepSeek-R1
Pamungkas, Restu sri; Erfina, Adhitia; Warman, Cecep
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.303

Abstract

This study aims to evaluate the effectiveness of the Zero-Shot Learning (ZSL) approach using the DeepSeek-R1-Distill-Qwen-1.5B model in performing sentiment classification on Indonesian-language reviews of the DeepSeek AI application from the Google Play Store. A total of 2,000 unlabeled user reviews were collected and processed through instructional prompts to guide the model in classifying sentiments into three categories: positive, negative, and neutral. The model operates without fine-tuning and relies entirely on Zero-Shot Learning using Indonesian-language prompts. Of the 2,000 reviews, 1,348 were successfully classified with valid sentiment labels. Of these, 1,131 reviews (83.9%) were labeled as positive, 211 reviews (15.7%) as negative, and only 6 reviews (0.4%) as neutral. Evaluation results indicated an overall accuracy of 77.67%. The F1-Score for the positive class reached 86.66%, while the negative and neutral classes scored 33.56% and 16.66%, respectively, highlighting the performance disparity between dominant and underrepresented sentiment categories. These findings demonstrate that the DeepSeek-R1 model has strong potential in detecting positive sentiment in Indonesian without requiring additional training. However, its performance on negative and neutral sentiments remains limited, revealing the challenge of handling low-resource and imbalanced data in Zero-Shot settings. Future research should explore improved prompt engineering or multilingual adaptation to address the current limitations and enhance classification consistency across all sentiment categories.
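The gap between overall accuracy and the per-class F1-Scores reported above is easier to see with a small worked example. The following is a minimal sketch, not the authors' evaluation code: the function names and the toy labels are hypothetical, and it assumes F1 is computed one-vs-rest for each class.

```python
def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, computed one-vs-rest from two parallel label lists."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: one negative review is misread as positive.
y_true = ["positive", "positive", "negative", "neutral"]
y_pred = ["positive", "positive", "positive", "neutral"]
scores = per_class_f1(y_true, y_pred, ["positive", "negative", "neutral"])
```

In this toy case accuracy stays at 0.75 while the F1 for the negative class drops to 0, which illustrates how a dominant class can mask weak minority-class performance.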
Sarcasm and Irony Detection in Lazada App Reviews Using IndoBERT
Putri, Nabila; Erfina, Adhitia; Warman, Cecep
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.307

Abstract

Digital technology has reshaped consumer behavior, particularly in e-commerce, where Google Play Store reviews provide rich feedback but often include sarcasm and irony that conventional sentiment models misread. This study proposes an Indonesian sarcasm–irony detection model using IndoBERT, a transformer pre-trained on Indonesian corpora. A dataset of 1,998 Lazada app reviews was collected via web scraping and preprocessed through text cleaning, tokenization, and stopword removal with the Sastrawi library. IndoBERT was fine-tuned to classify reviews into three classes: sarcasm, irony, and literal. Performance was assessed using accuracy, precision, recall, F1-score, and a confusion matrix. The model achieved 96.40% accuracy, with F1-scores of 0.9725 (sarcasm), 0.9675 (irony), and 0.9267 (literal). Word cloud visualizations revealed distinct lexical patterns across classes, supporting IndoBERT’s ability to capture contextual cues behind implicit sentiment. The findings indicate IndoBERT is effective for advanced opinion mining in Indonesian e-commerce, with potential applications in customer feedback monitoring, surfacing hidden complaints, and improving recommendation systems beyond surface polarity. Limitations include reliance on a single platform (Google Play) and text-only input, without modeling non-textual signals such as emojis or punctuation intensity. Future work should test cross-platform generalization, incorporate non-textual cues, and apply data augmentation to reduce class imbalance, particularly for the less frequent literal class, to improve robustness for real-world deployment.
Application Of K-Means Clustering Algorithm to Identify the Best-Selling Digital Printing Services
Fatahali Ramadhan, Ana; Saepudin, Sudin; Irawan, Carti; Mupaat
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.316

Abstract

The digital printing industry in Indonesia is experiencing rapid growth thanks to the increasing demand from companies for printing services such as banners, stickers, brochures, and business cards. CV. Copy Paste is one of the companies operating in the digital printing industry that fulfills various printing orders every month. However, the company has difficulty identifying the most popular printing services, which makes it difficult to develop a targeted promotional strategy. In view of this problem, the aim of this study is to group digital printing services according to their popularity using the K-Means Clustering method. This study uses a quantitative approach, collecting sales data from the last 12 months, covering 160 types of services. The steps taken include preliminary data processing, namely attribute selection, data cleaning, and data transformation so that it can be effectively processed using the K-Means algorithm, implemented in the Python programming language. The test results show that digital printing services can be divided into three clusters: 115 less popular services (C1), 31 fairly popular services (C2), and 14 very popular services (C3). The results of this study provide information that can be used as a basis for strategic decisions regarding promotion and service management. In this way, the K-Means Clustering algorithm has proven effective in helping companies group products in a more objective and measurable way based on historical data.  
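To illustrate how K-Means can split services into low, medium, and high popularity groups, here is a minimal sketch on a single feature. It is an assumption-laden toy, not the study's implementation: the function name, the deterministic initialization, and the order counts are all hypothetical, and the abstract does not state which features or library were used.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Lloyd's algorithm on scalar values (k >= 2); returns (centroids, assignments)."""
    ordered = sorted(values)
    # Deterministic start (an assumption): spread centroids over the sorted data.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[c]
            )
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, assignments


# Hypothetical yearly order counts for nine printing services.
orders = [3, 5, 4, 6, 40, 45, 50, 200, 210]
centroids, assignments = kmeans_1d(orders, k=3)
```

With these toy values the clusters settle at mean order counts of 4.5, 45.0, and 205.0, which mirrors the less popular, fairly popular, and very popular grouping described in the abstract.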
Indonesian Cross-Platform Sentiment Analysis: DANN Transfer from General Applications to TradingView
Zulkifli, Muh. Rifqi; Purnawansyah; Darwis, Herdianti
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.318

Abstract

Introduction: Cross-platform sentiment analysis for Indonesian language presents significant challenges when adapting models from general applications to specialized domains. Domain Adversarial Neural Networks (DANN) offer promising solutions for transfer learning, yet their effectiveness for Indonesian language remains largely unexplored, particularly under extreme class imbalance conditions common in trading platforms. Methods: This study investigates DANN effectiveness for transferring sentiment analysis knowledge from four strategically selected source domains to TradingView trading platform. The research utilizes 5,990 Indonesian reviews after preprocessing from an initial 6,000 samples, with source domains showing 66.5% positive sentiment while target domain exhibits 85.1% positive sentiment, creating an 18.7% distribution gap. Four experimental approaches were compared with statistical validation across multiple random initializations: Source-Only training, Multi-Domain training, Limited Target training, and DANN implementation. Results: DANN demonstrates stable cross-platform adaptation, achieving 87.77% ± 0.97% accuracy with consistent performance across initializations, outperforming Source-Only baseline (87.10% ± 0.84%) and Multi-Domain approach (86.98% ± 0.64%). While Limited Target baseline achieves higher accuracy (88.10% ± 2.23%), its high variance poses deployment risks. A-distance analysis reveals substantial domain gaps (193.00 ± 1.06), with DANN's adversarial training achieving modest domain separation reduction (72.90% ± 8.81% domain discrimination accuracy). Conclusions: This research contributes the first systematic evaluation of DANN for Indonesian cross-platform sentiment analysis, demonstrating that deployment consistency outweighs peak accuracy for production environments. The findings provide practical value for Indonesian fintech startups requiring robust sentiment analysis with limited labeled data. 
Future work should explore multi-target adaptation and optimization strategies for diverse Indonesian business domains.
Smart Waste Bin Prototype for University Waste Management
Fathrurahman, Fauzy; Dolly Indra; Tasrif Hasanuddin; Herdianti Darwis; Tanaka Kazuaki
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.324

Abstract

Background: Waste mismanagement remains a critical issue in Indonesian campuses, where ineffective segregation and collection practices contribute to environmental pollution. Smart technologies offer opportunities to improve waste handling efficiency and monitoring in university environments. Methods: This study developed a smart waste bin prototype that integrates Internet of Things (IoT) sensors, machine learning–based image classification (MobileNetV2 with TensorFlow Lite), GPS tracking, and LoRa communication. The system was designed to classify three types of waste—plastic bottles, snack packaging, and cans—while enabling fill-level monitoring, automated sorting, and real-time location reporting. Results: Experimental results showed strong classification accuracy for plastic bottles (100%), but lower performance for snack packaging (53–80%) and cans (40–67%), especially in low-light conditions or with darker materials. The overall real-time testing accuracy reached 45.1%. LoRa communication provided long-range connectivity but was affected by electromagnetic interference, while GPS tracking was reliable in open areas but inconsistent indoors. Conclusions: The prototype demonstrates the feasibility of integrating AI and IoT for scalable campus waste management. Despite environmental and hardware limitations, it offers a modular framework that can be refined with improved lighting, EMI shielding, and enhanced datasets. This research contributes a practical model for smart campus initiatives and supports the adoption of sustainable waste management practices in higher education environments.
Comparative Analysis of Speech-to-Text APIs for Supporting Communication of the Deaf Community
Handayani, Anik Nur; Hariyono, Hariyono; Nasih, Ahmad Munjin; Rochmawati, Rochmawati; Hitipeuw, Imanuel; Ar Rosyid, Harits; Ardiansah, Jevri Tri; Praja, Rafli Indar; Nurdiansyah, Ahmad; Azizah, Desi Fatkhi
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.327

Abstract

Hearing impairment can profoundly affect a person's mental and emotional state, hinder communication, and delay direct access to information for those who rely on interpreters. Advances in assistive technology, especially speech recognition systems that convert spoken language into written text (speech-to-text), can help address these barriers. However, implementation is challenging because accuracy varies across speech-to-text Application Programming Interfaces (APIs), so an appropriate deep learning model must be chosen. This study analyzes and compares the performance of three speech-to-text API services (Deepgram API, Google API, and Whisper AI) based on Word Error Rate (WER) and Words Per Minute (WPM) to determine the most suitable API for a web-based real-time transcription system built with JavaScript and Glitch.com. The three services were tested by measuring their error rates and transcription speeds, then evaluated for how low the error rate was and how high the transcription speed was. On average, Whisper AI had a WER of 0% across all word categories, but its speed was lower than that of the other two APIs. Deepgram API offered the best balance between accuracy and speed, with an average WER of 13.78% and 67 WPM. Google API performed stably, but its WER was slightly higher than Deepgram API's. In conclusion, Deepgram API was deemed the most suitable for live transcription because it produces fast transcriptions with a comparatively low error rate, which can significantly increase the accessibility of information for the deaf community.
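The two metrics used in this comparison can be sketched as follows. This is a minimal illustration under stated assumptions: WER is taken as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and WPM as transcribed words per minute of elapsed time. The abstract does not spell out its exact definitions, and the function names and sample sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def words_per_minute(transcript, elapsed_seconds):
    """Number of transcribed words scaled to a one-minute window."""
    return len(transcript.split()) * 60 / elapsed_seconds


# Hypothetical example: the transcript drops one of four reference words.
wer = word_error_rate("saya pergi ke pasar", "saya pergi pasar")
wpm = words_per_minute("saya pergi pasar hari ini", 5.0)
```

Here one deleted word out of four reference words gives a WER of 0.25, and five words in five seconds correspond to 60 WPM.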
A Hybrid Convolutional Neural Network and Bidirectional LSTM Architecture for Multi-Sector Export Forecasting: A Macroeconomic Time Series Analysis of Indonesia
Anggreani, Desi; Nurmisba, Nurmisba; Abd Rahman, Aedah
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.330

Abstract

Accurately predicting export values is key for a country in formulating its economic plans. Unfortunately, export data often exhibits complex time series patterns that are difficult to predict, characterized by non-linearity, high volatility, and complex temporal dependencies. This study offers a solution by testing a combined deep learning model, specifically a fusion of Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM), to address the challenges of export time series forecasting. This study uses this approach to forecast Indonesia's monthly export time series data from 2016 to 2023, covering sectors including oil and gas, non-oil and gas, agriculture, industry, mining, and others. The core idea is to leverage the CNN's ability to identify hidden features within time series patterns, while the BiLSTM is tasked with understanding the temporal flow of data from both directions to capture the inherent long-term temporal dependencies within economic time series data. As a result, this combined model proved to be far superior to the standard BiLSTM model in handling the complexity of export time series. In the Non-Oil and Gas sector, the proposed model achieved a high level of accuracy with an MSE value of 3,330,239.74, an RMSE of 1,824.89, and an average prediction error (MAPE) of only 8.17%, representing a significant improvement of 69% over the baseline BiLSTM model. Similar success was also found in all other sectors, indicating that this hybrid approach is highly promising for complex economic time series analysis.
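The three error metrics quoted above are related in a simple way, which the sketch below makes concrete. It uses the standard textbook definitions of MSE, RMSE, and MAPE. The function name and the toy series are hypothetical, and the code is not taken from the study.

```python
import math


def forecast_errors(actual, predicted):
    """Return (MSE, RMSE, MAPE in percent) for two equal-length series."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE by definition
    # MAPE assumes no actual value is zero.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return mse, rmse, mape


# Hypothetical two-month toy series.
mse, rmse, mape = forecast_errors([100, 200], [110, 180])

# Consistency check on the figures reported in the abstract.
reported_rmse = math.sqrt(3_330_239.74)
```

The reported values are internally consistent: the square root of the reported MSE rounds to the reported RMSE of 1,824.89.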
Integrating Clustering Models and RCA to Identify Emerging Textile Export Destinations for Indonesia
Muhammad Glenn Yunifer; Samidi
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.337

Abstract

This research investigates the strategic identification of new export destinations for Indonesian textile products by integrating international market segmentation and product competitiveness analysis. The study employs clustering techniques (K-Means, K-Medoids, and Hierarchical) validated through Silhouette and Davies-Bouldin indices to classify 149 countries based on trade indicators (import growth, trade balance, global market share), economic indicators (population, purchasing power parity, industrial proportion to GDP), and trade barrier indicators (logistics performance index, geographic distance, free trade agreements). In addition, the Revealed Comparative Advantage (RCA) framework is applied to evaluate Indonesia’s product-level competitiveness in the global textile market. The results reveal that export opportunities are concentrated in 20 countries across Europe, Asia, Africa, the Caribbean, and Melanesia, characterized by positive import growth, significant trade deficits, large market capacities, and relatively low trade barriers. Moreover, Indonesia demonstrates high comparative advantages in artificial and synthetic fibers, wigs, and leather footwear, while apparel products such as suits, shirts, knitwear, and brassieres represent moderately competitive but globally demanded items. The study concludes that Indonesia’s export strategy should balance high purchasing power markets and emerging economies with high import dependency.
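As a rough illustration of the RCA step, the sketch below computes the index in its common Balassa form: a country's export share for a product divided by the world's export share for the same product. Whether the study uses exactly this formulation is an assumption, and the function name and trade figures are hypothetical.

```python
def revealed_comparative_advantage(
    country_product_exports,
    country_total_exports,
    world_product_exports,
    world_total_exports,
):
    """Balassa index: values above 1 suggest a comparative advantage."""
    country_share = country_product_exports / country_total_exports
    world_share = world_product_exports / world_total_exports
    return country_share / world_share


# Hypothetical figures: the product is 10% of the country's exports
# but only 5% of world exports.
rca = revealed_comparative_advantage(10.0, 100.0, 50.0, 1000.0)
```

In this toy case the index is 2, meaning the product is twice as prominent in the country's export basket as in world trade.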
Sentiment Analysis of Public Opinion on Pi Network on Reddit Using FinBERT
Wiguna, Sindy Indira; Erfina, Adhitia; Warman, Cecep
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.342

Abstract

The rapid growth of blockchain technology has led to the emergence of new cryptocurrencies, including Pi Network, which emphasizes accessibility through mobile-based mining. This study asks whether FinBERT, a financial domain-specific transformer model, can effectively classify public sentiment in informal Reddit discussions related to Pi Network. FinBERT was first evaluated on a labeled financial sentiment dataset to assess its performance in a structured financial context before being applied to Reddit data. Model performance was measured using accuracy, precision, recall, and F1-score. After validation, the model was used to analyze 1,020 Reddit comments discussing Pi Network. Text preprocessing included cleaning, case folding, tokenization, stopword removal, stemming, and sequence standardization. The evaluation results show that FinBERT achieved an accuracy of 85.98% on the financial validation dataset, with strong precision and recall across sentiment classes. When applied to Reddit comments, neutral sentiment was the most dominant, followed by positive and negative sentiments. Pi Network was selected as the case study because, unlike more established cryptocurrencies, it is still in an early stage of development and relies heavily on community participation, making public opinion particularly important for understanding its adoption and credibility.
Public Response on X to the Revocation of Indonesia’s 3-Kg LPG Retail Ban: A Support Vector Machine Study
Wahyuni, Ni Nyoman Asti Sri; Sudipa, I Gede Iwan; Sastaparamitha, Ni Nyoman Ayu J.; Willdahlia, Ayu Gede; Aristamy, I Gusti Ayu Agung Mas
Indonesian Journal of Data and Science Vol. 6 No. 3 (2025): Indonesian Journal of Data and Science
Publisher : yocto brain

DOI: 10.56705/ijodas.v6i3.349

Abstract

This study examines public responses on X to the 3-Kg LPG retail ban implemented on February 1, 2025, and revoked on February 4, 2025, which caused widespread shortages, long queues, and limited access, particularly for citizens living far from official distribution points. A total of 2,524 Indonesian-language tweets were collected via crawling and systematically processed through text cleaning, tokenization, normalization, stopwords removal, and stemming, followed by automatic labeling using the Indonesian Sentiment (InSet) Lexicon. After removing 229 neutral tweets, 1,405 tweets (61.2%) were classified as negative and 890 tweets (38.8%) as positive, with the study focusing on these two sentiment classes. Text features were extracted using TF-IDF, and classification was conducted using a linear-kernel Support Vector Machine (C = 0.1) with an 80:20 train-test split. The model achieved an overall accuracy of 84%, with precision, recall, and F1-score of 0.82, 0.94, and 0.88 for the negative class, and 0.87, 0.68, and 0.76 for the positive class. Results indicate that negative sentiment was dominated by criticism related to LPG shortages and insufficient policy communication, while positive sentiment reflected user relief over restored supply and hopes for fairer distribution in the future. These findings suggest that revoking the ban did not fully restore public perception, highlighting the necessity for more effective policy dissemination and stricter monitoring of 3-Kg LPG distribution. The study also emphasizes the importance of leveraging social media, particularly X, as a real-time source for monitoring public opinion and evaluating the effectiveness of energy distribution policies in Indonesia.
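The TF-IDF feature extraction step mentioned above can be sketched in a few lines. This is a plain textbook variant (term frequency times the logarithm of inverse document frequency), offered as an assumption rather than the study's exact weighting. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers differ. The function name and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical pre-tokenized tweets after cleaning and stemming.
tweets = [["gas", "langka"], ["gas", "aman"]]
weights = tf_idf(tweets)
```

A term that occurs in every document, such as "gas" here, receives a weight of 0, while terms specific to one tweet keep a positive weight, so the classifier's input emphasizes the distinguishing words.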