Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota Adm. Jakarta Pusat,
DKI Jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN : -     EISSN : 27236471     DOI : doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as data descriptors that describe a linked dataset.
Articles: 53 Documents in issue "Vol 5, No 4: DECEMBER 2024"
A Framework for Diabetes Detection Using Machine Learning and Data Preprocessing Abu-Shareha, Ahmad Adel; Qutaishat, Haneen; Al-Khayat, Asma
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.363

Abstract

People with diabetes are at an increased risk of developing other complications, such as heart disease and nerve damage. Therefore, diabetes prediction is crucial to reduce the severe consequences of this disease. This study proposed a comprehensive framework for diabetes prediction to maximize the information from available diabetes datasets, which include historical records, laboratory tests, and demographic data. The proposed framework implements a data imputation technique for filling in missing values and adopts feature selection methods to remove less important features for better diabetes classification. An oversampling technique and a parameter tuning approach were used to increase the samples and fine-tune the parameters for training the machine learning algorithms. Various machine learning algorithms, including Neural Networks, Logistic Regression, Support Vector Machines, and Random Forest, were used for the prediction. These algorithms were evaluated using both train-test split and cross-validation techniques. The experiments were conducted on the Pima Indian Diabetes dataset using various evaluation metrics, including accuracy, precision, recall, and F-measure. The results showed that the Random Forest algorithm, particularly when fine-tuned with Grid Search Cross Validation, outperformed other algorithms, achieving an impressive accuracy of 0.99. This demonstrates the robustness and effectiveness of the proposed framework, which outperformed the accuracy of state-of-the-art approaches.
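The four evaluation metrics named in this abstract (accuracy, precision, recall, and F-measure) can all be derived from the confusion counts of a binary classifier. The following is a minimal sketch using only the Python standard library; the function name `binary_metrics` and the label vectors are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels: 1 = diabetic, 0 = non-diabetic.
example_true = [1, 1, 1, 0, 0, 0, 0, 1]
example_pred = [1, 1, 0, 0, 0, 1, 0, 1]
example_scores = binary_metrics(example_true, example_pred)
```

In this toy case there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics come out to 0.75.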
Factors Influencing User Satisfaction with Mobile Applications for Promoting Thai Community Products Pislae-ngam, Kattakamon; Inmor, Sureerut; Pukrongta, Nisit
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.383

Abstract

This study investigates key factors affecting user satisfaction with mobile applications like Shopee, Lazada, and TikTok Shop, focusing on promoting community products in Pathum Thani Province, Thailand. As mobile applications gain significance for marketing local goods, the research aims to explore how various features influence satisfaction and trust among users. The study collected data from 400 local entrepreneurs between January and March 2024, all experienced in using mobile apps to sell products. A confirmatory factor analysis (CFA) was conducted to examine five critical factors: requirements, accessibility, accuracy, security, and trust. The findings indicate that accuracy (β = 0.75) and accessibility (β = 0.71) significantly impact user satisfaction, emphasizing the importance of precise content and ease of use. Additionally, security (β = 0.76) and trust (β = 0.72) play crucial roles in maintaining user confidence in app transactions. All model indicators were validated at the 0.01 significance level, indicating a good fit for the hypothesized relationships between factors. The study’s novelty lies in highlighting specific app features that enhance user experiences in promoting local products. By focusing on the essential aspects of mobile app functionality, this research provides valuable insights to developers and local businesses for creating effective platforms, ultimately supporting sustainable economic growth.
Stochastic Queuing System Model Design Based on Stakeholder Aspirations Widodo, Imam Djati; Parkhan, Ali; Qurtubi, Qurtubi
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.314

Abstract

A good queuing system will provide satisfaction and trust for consumers and operational cost efficiency for service providers. This study aims to obtain the optimal number of service facilities by considering the aspirations of stakeholders, namely customers and service providers. Using aspiration theory, this research contributes to obtaining a dynamic solution to the number of service facilities with reference to service operating costs that can be determined with certainty and waiting costs that vary based on customer profiles. The study began by designing sampling for arrival time and service time data based on simple random sampling. The probability distributions of arrival time and service time are determined based on the data collection results of the sampling design. Based on the queuing profile and the distributions of the two data sets, a suitable queuing model is built. A Poisson distribution-based multi-channel queue model, (M/M/c):(GD/∞/∞), is constructed, and an optimization analysis is carried out on the number of service facilities provided by considering the aspirations of the two stakeholders. The results showed that, based on stakeholder aspirations, the optimal number of servers is c = 2 when the waiting cost C2 satisfies IDR 0/hour ≤ C2 ≤ IDR 11,076/hour, and c = 3 when IDR 11,076/hour ≤ C2 ≤ IDR 120,690/hour. Given that there are two conditional alternatives, the company can decide subjectively to take preventive and adaptive actions proactively according to the customer's appreciation of the waiting time in the company. Flexibility in opening service facilities will require the availability of workers and facilities to be provided. Multi-skilled workers will significantly help the flexibility of the system being built. Future research should study monthly fluctuations in arrival and service times within that period in more depth.
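To illustrate how the optimal number of servers can shift with the waiting cost, the sketch below evaluates the standard steady-state formulas for an (M/M/c):(GD/∞/∞) queue and minimizes a textbook cost function, TC(c) = C1·c + C2·Ls. This cost function and the parameter values in the example are assumptions for illustration; the paper's aspiration-level procedure and its measured arrival and service rates are not given in the abstract.

```python
from math import factorial, floor


def mmc_metrics(lam, mu, c):
    """Steady-state measures of an M/M/c queue (arrival rate lam, service rate mu)."""
    a = lam / mu          # offered load
    rho = a / c           # utilization per server
    if rho >= 1:
        raise ValueError("unstable queue: need lam < c * mu")
    p0 = 1.0 / (
        sum(a ** n / factorial(n) for n in range(c))
        + a ** c / (factorial(c) * (1 - rho))
    )
    lq = p0 * a ** c * rho / (factorial(c) * (1 - rho) ** 2)  # mean queue length
    ls = lq + a                                               # mean number in system
    return {"rho": rho, "p0": p0, "Lq": lq, "Ls": ls, "Wq": lq / lam, "Ws": ls / lam}


def best_server_count(lam, mu, c1, c2, max_servers=10):
    """Pick c minimizing c1*c + c2*Ls (assumed cost model, not the paper's)."""
    c_min = floor(lam / mu) + 1  # smallest c that keeps the queue stable

    def total_cost(c):
        return c1 * c + c2 * mmc_metrics(lam, mu, c)["Ls"]

    return min(range(c_min, max_servers + 1), key=total_cost)


# Hypothetical rates: 3 arrivals/hour, 2 services/hour per server, C1 = 10 per server-hour.
cheap_waiting = best_server_count(lam=3, mu=2, c1=10, c2=1)
costly_waiting = best_server_count(lam=3, mu=2, c1=10, c2=20)
```

With these hypothetical inputs, a low waiting cost favors two servers and a higher waiting cost favors three, which mirrors the kind of conditional recommendation the abstract reports.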
Efficient Fruit Grading and Selection System Leveraging Computer Vision and Machine Learning Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Thinakaran, Rajermani; Batumalay, Malathy; Habib, Shabana; Islam, Muhammad
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.443

Abstract

Automated fruit grading is crucial to overcoming the time and accuracy challenges posed by manual methods, which are often limited by subjective human judgment. This study introduces an intelligent grading system leveraging computer vision and AI to improve speed and consistency in assessing fruit quality. Using high-resolution imaging and advanced feature extraction, including grayscale processing, binarization, and enhancement, the system achieves non-destructive, efficient sorting for fruits like apples, bananas, and oranges. Grayscale processing reduces image complexity while preserving essential details, binarization isolates the fruit from its background, and enhancement highlights critical features. Notably, the Edge Pixel method proved most effective, achieving 79.20% accuracy in grading, while the Grayscale Pixel method reached 93.94% accuracy for fruit types. Edge Pixel also achieved 80.32% in differentiating grading types, showcasing its ability to capture essential shapes and edges. Fruits are classified into four grades: Grade_01 (highest quality), Grade_02 (minor imperfections), Grade_03 (notable defects but consumable), and Grade_04 (unfit for consumption). A specialized dataset supports model training, ensuring practical real-world application. The study concludes that this automated system offers significant improvements over traditional grading, providing a scalable, objective, and reliable solution for the agricultural sector, ultimately enhancing productivity and quality assurance.
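The grayscale and binarization steps described above can be sketched on a tiny image stored as nested lists of RGB tuples. The luminance weights (0.299, 0.587, 0.114) are the common ITU-R BT.601 coefficients and the fixed threshold is an assumption; the abstract does not state which conversion or thresholding rule the authors used, and this sketch assumes the fruit is brighter than its background.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to 0-255 gray levels (BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def binarize(gray_image, threshold=128):
    """Mark pixels at or above the threshold as foreground (1), others as 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray_image]


# Hypothetical 2x2 image: bright pixels stand in for fruit, dark ones for background.
tiny_image = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 100, 50), (10, 20, 30)],
]
gray = to_grayscale(tiny_image)
mask = binarize(gray, threshold=100)
foreground_pixels = sum(sum(row) for row in mask)
```

The resulting mask could then feed pixel-count features into a classifier, which is one plausible reading of the pixel-based methods the abstract mentions.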
Analyzing Audience Sentiments in Digital Comedy: A Study of YouTube Comments Using LSTM Models Supriyono, Supriyono; Wibawa, Aji Prasetya; Suyono, Suyono; Kurniawan, Fachrul
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.393

Abstract

The main objective of this paper is to analyze audience sentiment towards stand-up comedy content on the YouTube platform, specifically comments on stand-up comedy videos from Kompas TV, using the Long Short-Term Memory (LSTM) method. This research contributes to a deeper understanding of how audiences engage with humorous content through a sentiment analysis approach that uses the LSTM model, which can capture complex nuances in humorous content, such as sarcasm, irony, and cultural references. The research methodology involves crawling data from YouTube, where user comments are extracted and processed through several stages of data cleaning, such as removing duplicate content, normalizing text, and filtering out irrelevant comments. Once the data is prepared, the LSTM model is trained to classify positive, negative, and neutral sentiments, with accuracy rates of 85% for positive sentiment, 80% for negative sentiment, and 78% for neutral sentiment. The main results show that the LSTM model successfully classifies sentiments, although it struggles with the more ambiguous neutral sentiments. The figures and tables included in this study illustrate the relationship between the number of views, likes, and the sentiment classification of the comments. One notable finding is a strong positive correlation between the number of views and video likes. The conclusions of this study underscore the need for model improvements to handle neutral sentiment better and capture the complexity of humor content. The implications of this research are useful for content creators and digital marketers in understanding and responding to audience preferences more effectively. They also pave the way for further research in sentiment analysis on more specific content genres on digital platforms.
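The cleaning stages listed in this abstract (duplicate removal, text normalization, and filtering of irrelevant comments) might look like the sketch below. The specific rules are assumptions: lowercasing, stripping URLs and punctuation, and treating comments shorter than two tokens as irrelevant. The paper's actual rules are not given in the abstract, and the sample comments are invented.

```python
import re

URL_RE = re.compile(r"https?://\S+")
PUNCT_RE = re.compile(r"[^\w\s]")


def normalize(comment):
    """Lowercase, drop URLs and punctuation, and collapse whitespace."""
    text = comment.lower()
    text = URL_RE.sub(" ", text)
    text = PUNCT_RE.sub(" ", text)
    return " ".join(text.split())


def clean_comments(comments, min_tokens=2):
    """Normalize comments, drop very short ones, and keep the first of each duplicate."""
    seen = set()
    cleaned = []
    for raw in comments:
        text = normalize(raw)
        if len(text.split()) < min_tokens:
            continue  # assumed proxy for "irrelevant"
        if text in seen:
            continue  # duplicate after normalization
        seen.add(text)
        cleaned.append(text)
    return cleaned


# Invented sample comments for illustration only.
sample = [
    "Lucu banget!!!",
    "lucu   banget",
    "Cek https://example.com sekarang juga",
    "wkwk",
]
prepared = clean_comments(sample)
```

Deduplicating after normalization, rather than before, catches comments that differ only in case, spacing, or punctuation.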
Health and Socio-Demographic Risk Factors of Childhood Stunting: Assessing the Role of Factor Interactions Through the Development of an AI Predictive Model Hariguna, Taqwa; Sarmini, Sarmini; Azis, Abdul
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.612

Abstract

Stunting is a significant global health problem, especially in developing countries such as Indonesia. This study aims to develop and evaluate an artificial intelligence (AI)-based predictive model to identify the risk of stunting in children using the CatBoost algorithm, which is a combination of Weighted Apriori and XGBoost. This model is designed to utilize the advantages of each algorithm in handling data with variable weights to improve prediction accuracy. Feature analysis shows that "Height (cm)" and "Age (months)" are the main indicators in classifying children's nutritional status. Model evaluation shows a high accuracy of 94.85%, precision of 95%, recall of 94.85%, and F1 score of 94.84%. The Kappa coefficient and Matthews Correlation Coefficient (MCC) reached 93.13% and 93.19%, respectively, while ROC-AUC reached 99.70%. These findings indicate that the CatBoost model can provide highly accurate results in detecting the risk of stunting and offer in-depth insights into risk factors that can improve the effectiveness of health interventions. This study fills the gap in the literature by integrating the Weighted Apriori and XGBoost algorithms, providing a significant contribution to early detection of stunting and supporting government efforts to reduce the prevalence of stunting in Indonesia and other regions.
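Cohen's kappa and the Matthews Correlation Coefficient, both reported above, correct raw agreement for chance and class imbalance. The sketch below computes them for the binary case from confusion counts; the counts are hypothetical, the results are fractions rather than the percentages quoted in the abstract, and the paper's task may well be multi-class, which this simplified version does not cover.

```python
from math import sqrt


def kappa_and_mcc(tp, fp, fn, tn):
    """Return (Cohen's kappa, MCC) for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return kappa, mcc


# Hypothetical counts for 100 children (positive = at risk of stunting).
example_kappa, example_mcc = kappa_and_mcc(tp=40, fp=5, fn=10, tn=45)
```

For these counts the observed agreement is 0.85 and the chance agreement is 0.5, giving a kappa of 0.7, with the MCC landing close to the same value.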
Sentiment Analysis of the Kampus Merdeka Program on Twitter Using Support Vector Machine and a Feature Extraction Comparison: TF-IDF vs. FastText Afuan, Lasmedi; Hidayat, Nurul; Nofiyati, Nofiyati; As'ad, Mohamad Faris
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.436

Abstract

The Kampus Merdeka program, launched by the Indonesian Ministry of Education, Culture, Research, and Technology in 2020, aims to enhance students' skills through hands-on work experience. Considering the rising significance of social media, particularly Twitter, in gauging public opinion, this research focuses on analyzing the sentiment towards the Kampus Merdeka program. The primary objective is to classify the sentiments expressed in tweets related to the program and compare two feature extraction techniques—TF-IDF and FastText—to identify the best approach for transforming text data into numerical vectors. The sentiment classification model was built using the Support Vector Machine (SVM) algorithm, a machine learning technique known for its accuracy in text classification. A total of 16,730 tweets were collected and analyzed, yielding an accuracy of 73% for FastText and 72% for TF-IDF. Results show that FastText is more effective in capturing semantic relationships, leading to higher accuracy in sentiment classification. Findings indicate that the public sentiment towards the Kampus Merdeka program is predominantly positive (60.7%), with negative and neutral sentiments at 33.5% and 5.8%, respectively. The success of the FastText method underscores the importance of advanced feature extraction techniques in text classification. The novelty of this research lies in its use of FastText for educational policy evaluation, providing a new perspective on using sentiment analysis to assess public perception of educational programs.
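As a reminder of what the TF-IDF baseline computes, here is a minimal sketch that turns tokenized texts into sparse weight dictionaries using the classic definition tf × log(N / df). Libraries often apply smoothed or normalized variants, so the exact weights in the paper's pipeline may differ; the two sample tweets are invented.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Invented tweets for illustration only.
tweets = ["kampus merdeka bagus", "kampus merdeka buruk"]
weights = tfidf(tweets)
```

Terms that appear in every document, such as "kampus" here, receive a weight of zero, while the distinguishing words "bagus" and "buruk" carry all the signal. This is also why TF-IDF cannot relate synonyms, a gap that embedding methods like FastText aim to close.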
Spam Feature Selection Using Firefly Metaheuristic Algorithm Abualhaj, Mosleh M; Hiari, Mohammad O; Alsaaidah, Adeeb; Al-Zyoud, Mahran; Al-Khatib, Sumaya
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.336

Abstract

This paper presents a novel method for improving spam detection by utilizing the Firefly Algorithm (FA) for feature selection. The FA, a bio-inspired metaheuristic optimization algorithm, is applied to identify the most relevant features from the ISCX-URL2016 dataset, which contains 72 features. By balancing exploration (searching for new solutions) and exploitation (focusing on the best solutions), FA is able to effectively reduce the feature space from 72 to 31 features. This reduction improves model efficiency without sacrificing performance, as only the most impactful features are retained for the classification task. The selected features were then used to train three machine learning classifiers: Decision Tree (DT), Gradient Boost Tree (GBT), and Naive Bayes (NB). Each classifier's performance was evaluated based on accuracy, with DT achieving the highest accuracy of 99.81%, GBT achieving 99.70%, and NB scoring 90.33%. The superior performance of the DT algorithm is attributed to its ability to handle non-linear relationships and high-dimensional data, making it particularly well-suited for the FA-selected features. This combination of FA for feature selection and DT for classification demonstrates significant improvements in spam detection performance, highlighting the importance of selecting the most relevant features. The results show that by reducing the dimensionality of the dataset, the FA algorithm not only accelerates the classification process but also enhances detection accuracy.
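The attraction-plus-randomness update at the heart of the Firefly Algorithm can be sketched for feature selection as follows. Everything specific here is an assumption: positions live in [0, 1] and are thresholded at 0.5 to obtain a feature mask, the parameter defaults are arbitrary, and the fitness function is a toy score built from made-up relevance values. In the paper's setting the fitness would presumably come from classifier performance on the ISCX-URL2016 features.

```python
import math
import random


def firefly_select(fitness, n_features, n_fireflies=10, n_iter=30,
                   beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Search for a boolean feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)

    def to_mask(position):
        return [p > 0.5 for p in position]

    swarm = [[rng.random() for _ in range(n_features)] for _ in range(n_fireflies)]
    light = [fitness(to_mask(pos)) for pos in swarm]
    best_idx = max(range(n_fireflies), key=light.__getitem__)
    best_mask, best_score = to_mask(swarm[best_idx]), light[best_idx]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue
                # Exploitation: move i toward the brighter j, more strongly when close.
                dist2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                beta = beta0 * math.exp(-gamma * dist2)
                # Exploration: add a small random step, then clamp to [0, 1].
                swarm[i] = [
                    min(1.0, max(0.0, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                    for a, b in zip(swarm[i], swarm[j])
                ]
                light[i] = fitness(to_mask(swarm[i]))
                if light[i] > best_score:
                    best_mask, best_score = to_mask(swarm[i]), light[i]
    return best_mask, best_score


# Toy fitness with hypothetical relevance scores and a per-feature penalty.
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02]


def toy_fitness(mask):
    return sum(r for r, keep in zip(RELEVANCE, mask) if keep) - 0.3 * sum(mask)


selected_mask, selected_score = firefly_select(toy_fitness, n_features=len(RELEVANCE))
```

The per-feature penalty in the toy fitness plays the role of the dimensionality pressure: a feature is worth keeping only if its contribution outweighs the cost of carrying it.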
Data Visualization of Climate Patterns in Indonesia Using Python and Looker Studio Dashboard: A Visual Data Mining Approach Refianti, Rina; Mutiara, Achmad Benny; Ariyanto, Ananda Satria
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.420

Abstract

Climate has a significant impact on the lives of Indonesian people. Information about climate patterns, when presented visually and interactively, can greatly enhance understanding of climate conditions in Indonesia. This study aims to produce a visualization of climate pattern data in Indonesia that can be accessed online by the general public, serving as a valuable resource for climate information. The study highlights the ability to display historical trends for a 10-year period (2010-2020) through interactive visuals, which load information according to user-defined filters, enabling diverse presentations of data. The research employs the Visual Data Mining method, encompassing Project Planning, Data Preparation, and Data Analysis phases. Additionally, Exploratory Data Analysis techniques were utilized in the data analysis phase. The data was cleaned and processed using the Python programming language with libraries such as pandas, numpy, seaborn, and matplotlib. Visualizations were created using Looker Studio tools and published on a website, providing accessible climate pattern information in Indonesia via the Internet. The final results of this research indicate that the developed climate visualization dashboard successfully delivers detailed insights into sunlight duration, temperature, humidity, rainfall, and wind speed across various Indonesian regions. Users can effectively monitor climate trends and weather changes. The dashboard also shows significant seasonal variations and differences in climate patterns between provinces. Performance metrics reveal that the dashboard meets its Key Performance Indicators, achieving a click-through ratio of 40.1% and an average search-engine page position of 4.8, and receiving positive user experience scores. The Climate Pattern Dashboard in Indonesia still has room for further development and research. Important aspects include expanding data coverage to multiple decades for observing significant climate patterns and applying sophisticated prediction methods, such as machine learning algorithms, for future climate change projections.
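A dashboard that compares provinces and seasons typically needs daily observations rolled up to monthly averages before visualization. The study did this kind of preparation with pandas; the sketch below shows the same idea with only the standard library so it runs without dependencies. The record layout, field names, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_means(records, field):
    """Average `field` per (province, 'YYYY-MM'), skipping missing values."""
    groups = defaultdict(list)
    for row in records:
        value = row.get(field)
        if value is None:
            continue  # basic cleaning step: ignore missing measurements
        month = row["date"][:7]  # assumes ISO dates such as '2015-01-31'
        groups[(row["province"], month)].append(value)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical daily observations.
observations = [
    {"date": "2015-01-01", "province": "Banten", "temperature": 27.0},
    {"date": "2015-01-02", "province": "Banten", "temperature": 29.0},
    {"date": "2015-01-03", "province": "Banten", "temperature": None},
    {"date": "2015-02-01", "province": "Banten", "temperature": 28.5},
    {"date": "2015-01-01", "province": "Bali", "temperature": 30.0},
]
temperature_by_month = monthly_means(observations, "temperature")
```

The resulting dictionary maps keys such as ("Banten", "2015-01") to a mean value, which is a convenient shape for exporting to a table that a dashboard tool can filter by province and period.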
Transforming Agriculture: An Insight into Decision Support Systems in Precision Farming Yi, Ding; Jun, Luo; Haodic, Gao; Xing, Zhang; Lie, Ye; Maidin, Siti Sarah; Ishak, Wan Hussain Wan; Wider, Walton
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.274

Abstract

Precision agriculture incorporates advanced technologies and data analysis to improve farming efficiency and sustainability through immediate resource allocation. This study aims to synthesize research findings related to agriculture, Decision Support Systems (DSS), and precision agriculture through a systematic literature review conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The search was performed on the Scopus database, focusing on publications in English from 2017 to 2023. Out of 126 periodicals, a rigorous process was used to determine which publications met the specific criteria for inclusion and exclusion. As a result, only 8 relevant studies were chosen. The review emphasizes the substantial capacity of Decision Support Systems in precision agriculture, demonstrating that DSS can enhance crop yields by 15% and decrease water consumption by 20%. Through the utilization of big data, machine learning, and advanced technologies, Decision Support Systems have the potential to transform the agricultural industry by enhancing productivity, optimizing resource allocation, and enabling early identification of pests and diseases. The utilization of real-time data from Decision Support Systems empowers farmers to make well-informed choices, effectively managing production while upholding environmental sustainability. This, in turn, plays a crucial role in ensuring the economic viability of farms and enhancing global food security. However, addressing challenges like data privacy concerns, enhancing user-friendly interfaces, establishing robust data administration infrastructure, and providing adequate training and support for end-users is imperative for the successful implementation of data-driven Decision Support Systems in precision agriculture.