Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota Adm. Jakarta Pusat,
DKI Jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN : -     EISSN : 27236471     DOI : doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as data descriptors that describe a linked dataset.
Articles 518 Documents
Unveiling Criminal Activity: a Social Media Mining Approach to Crime Prediction Armoogum, Sheeba; Dewi, Deshinta Arrova; Armoogum, Vinaye; Melanie, Nicolas; Kurniawan, Tri Basuki
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.350

Abstract

Social media platforms have become breeding grounds for abusive comments, necessitating the use of machine learning to detect harmful content. This study aims to predict abusive comments within a Mauritian context, focusing specifically on comments written in Mauritian Kreol, a language with limited natural language processing tools. The objective was to build and evaluate four machine learning models—Decision Tree, Random Forest, Naïve Bayes, and Support Vector Machine (SVM)—to accurately classify comments as abusive or non-abusive. The models were trained and tested using k-fold cross-validation, and the Decision Tree model outperformed others with 100% precision and recall, while Random Forest followed with 99% accuracy. Naïve Bayes and SVM, although achieving 100% precision, had lower recall rates of 35% and 16%, respectively, due to imbalanced data in the training set. Pre-processing steps, including stop-word removal and a custom Kreol spell checker, were key in enhancing model performance. The study provides a novel contribution by applying machine learning in a Mauritian context, demonstrating the potential of AI in detecting abusive language in underrepresented languages. Despite limitations such as the absence of a Kreol lemmatization tool and incomplete coverage of Kreol spelling variations, the models show promise for wider application in social media crime detection. Future research could explore expanding this approach to other languages and domains of social media crimes.
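The pattern reported above, perfect precision alongside low recall, can be illustrated with a small worked example. This is only a sketch: the confusion-matrix counts are hypothetical, chosen to reproduce a 100% precision / 35% recall result, and are not taken from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 100 abusive comments exist, the classifier flags
# only 35 of them, and every flagged comment is truly abusive.
precision, recall = precision_recall(tp=35, fp=0, fn=65)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A classifier trained on imbalanced data can behave this way: it flags only the cases it is most sure about, so it rarely raises a false alarm but misses most abusive comments.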
Knowledge Mapping of Digital Leadership and Research Agenda: The Open Knowledge Maps Perspective Zam, Efvy Zamidra; Amin, Shofia; Johannes, Johannes; Rosita, Sry
Journal of Applied Data Sciences Vol 5, No 2: MAY 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i2.207

Abstract

In today's technological era, digital leadership is necessary for organizational success in the face of environmental and technological change. This study aims to map digital leadership research using the Open Knowledge Maps platform to find and build cluster visualizations. The use of Open Knowledge Maps as a research analysis tool is still rare. The data consist of research papers from 2015-2024 with high metadata quality. The study found 15 clusters related to digital leadership, with most of the research carried out in the education field. In addition, the mapped digital leadership research also examines its effect on employee performance. The resulting map can reveal research gaps that may serve as a basis for future research.
Novel Battery Management with Fuzzy Tuned Low Voltage Chopper and Machine Learning Controlled Drive for Electric Vehicle Battery Management: A Pathway Towards SDG P, Vinoth Kumar; S, Priya; D, Gunapriya; Batumalay, M
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.236

Abstract

Electric vehicles have a significant impact on the Sustainable Development Goals (SDGs), specifically climate action, affordable and clean energy, and responsible consumption and production patterns. The present work focuses on a battery management system that effectively utilizes battery power to drive a brushless DC (BLDC) motor by tuning a low-voltage buck-boost converter, used as a chopper circuit, with fuzzy logic. The photovoltaic system acts as an additional source to charge the battery when the battery is not connected to the load; under running conditions, fuzzy logic control enhances efficiency and provides smooth, adaptive control under varying load conditions. A machine learning technique is also used for drive control and automation operations. The energy in the BLDC motor is regulated by managing the voltage and current in a photovoltaic-powered low-voltage chopper and by tuning the proportional-integral-derivative (PID) controller for an ideal balance between reliability and a quicker response. The K-Nearest Neighbour (KNN) machine learning algorithm, chosen for its simplicity and effectiveness in classification, enhances the reliability and efficiency of the BLDC motor system through commutation and speed control. Using fuzzy logic together with the KNN algorithm expedites the development of control and automation systems. The work also presents the results of a study that compared the interoperability of proportionate machine learning and fuzzy control algorithms developed with MATLAB. For critical analysis, the results are compared using graphs. Integrating the Internet of Things (IoT) and cloud technology with KNN-based BLDC motor control can enhance system proficiency through monitoring and display of the motor's observed voltage and current values, sensorless control, fault diagnosis, and predictive maintenance. The work is also linked to the SDGs and to the impacts of efficient electric vehicle operation.
An Effective Investigation of Genetic Disorder Disease Using Deep Learning Methodology Vidhya, B.; Shivakumar, B. L.; Maidin, Siti Sarah; Sun, Jing
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.370

Abstract

This study evaluates the performance of four neural network models—Artificial Neural Network (ANN), ANN optimized with Artificial Bee Colony (ANN-ABC), Multilayer Feedforward Neural Network (MLFNN), and Forest Deep Neural Network (FDNN)—across different iteration levels to assess their effectiveness in predictive tasks. The evaluation metrics include accuracy, precision, Area Under the Curve (AUC) values, and error rates. Results indicate that FDNN consistently outperforms the other models, achieving the highest accuracy of 99%, precision of 98%, and AUC of 99 after 250 iterations, while maintaining the lowest error rate of 2.8%. MLFNN also shows strong performance, particularly at higher iterations, with notable improvements in accuracy and precision, but does not surpass FDNN. ANN-ABC offers some improvements over the standard ANN, yet falls short compared to FDNN and MLFNN. The standard ANN model, though improving with iterations, ranks lowest in all metrics. These findings highlight FDNN's robustness and reliability, making it the most effective model for high-precision predictive tasks, while MLFNN remains a strong alternative. The study underscores the importance of model selection based on performance metrics to achieve optimal predictive accuracy and reliability. 
Improving Publishing: Extracting Keywords and Clustering Topics Soekamto, Yosua Setyawan; Maryati, Indra; Christian, Christian; Kurniawan, Edwin
Journal of Applied Data Sciences Vol 5, No 2: MAY 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i2.199

Abstract

Humans, by nature, are inclined to share knowledge across various platforms, such as educational institutions, media outlets, and specialized research publications like journals and conferences. The consistent oversight and evaluation of these publications by ranking bodies serve to maintain the integrity and quality of scholarly discourse on a global scale. However, there has been a decline in the proliferation of such publications in recent times, partly attributed to ethical misconduct within specific segments of the scholarly community. Despite implementing systems such as the Open Journal System (OJS), publishers grapple with the formidable task of managing editorial and review processes. Compounded by the multifaceted nature of scholarly content, manual review procedures often require a considerable time investment. Thus, a pressing need exists for advanced technological solutions to streamline the article selection process, empowering publishers to prioritize articles for review based on topical relevance. This study advocates adopting a comprehensive framework integrating advanced text analysis techniques such as keyword extraction, topic clustering, and summarization algorithms. These tools can be implemented and integrated by connecting with the database of the existing system. By combining these tools with the expertise of editorial and review teams, publishers can significantly expedite the initial assessment of submitted articles. Given the rapid technological advancements, publishers must embrace robust systems that enhance efficiency and effectiveness, particularly in reviewer assignments and article prioritization. This research employs the neural network approach of BERT and K-Means clustering to perform keyword extraction and topic clustering. Using BERT facilitates accurate semantic understanding and context-aware representation of textual data. Additionally, BERT's pre-trained models can be fine-tuned, which allows customization to specific domains or tasks. By harnessing the power of BERT, publishers can gain deeper insights into the content of scholarly articles, leading to more informed decision-making and improved publication outcomes.
The Determinant Factors For The Issuance Of Central Bank Digital Currency (CBDC) In Malaysia Using Machine Learning Framework Awang Abu Bakar, Normi Sham
Journal of Applied Data Sciences Vol 5, No 2: MAY 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i2.176

Abstract

To identify the factors influencing the establishment of a Central Bank Digital Currency (CBDC) in Malaysia, this study leverages machine learning to determine the most critical factors leading to CBDC issuance. The overall Central Bank Digital Currency Project Index (CBDCPI) was selected as the target variable, while two machine learning algorithms, Random Forest and XGBoost, were used to identify the determining variables. Random Forest achieved an accuracy of 83% and XGBoost 80%. This study explored a new research frontier by creating two machine-learning models that treated retail and wholesale CBDCPI as target variables. The data were gathered from various official sources such as the Bank for International Settlements (BIS), the International Monetary Fund (IMF), and the World Bank. The Circulation of Cash, Prevalence of Cryptocurrencies, Effect of CBDC on International Trade, Search Interest, Financial Development Index, Innovation Value, and Trade Openness are among the most critical factors determining whether CBDC will be issued in Malaysia. The factors identified will eventually be used to develop a framework for the implementation of CBDC in Malaysia.
Development and Research of an Autonomous Device for Sending a Distress Signal Based on a Low-Orbit Satellite Communication System Ondyrbayev, Nurbolat; Zhumagali, Sabyrzhan; Chezhimbayeva, Katipa; Zhumanov, Yelaman; Nurzhauov, Nursultan
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.289

Abstract

Given the importance of reliable communication for sending distress signals, research on the development of an autonomous device that communicates via low-orbit satellites is becoming particularly relevant, offering innovative solutions capable of providing fast and reliable communication in extreme situations. The purpose of this study was to investigate a device capable of operating autonomously in emergency situations and quickly transmitting a signal that help is needed. The comparative method, statistical method, and analysis were used in the research. The results showed the significant potential of Long-Range Wide Area Network (LoRaWAN) technology in the field of wireless communication. It provides high stability and noise immunity of data transmission, which makes it an attractive choice for various applications. Due to its high scalability, LoRaWAN is capable of servicing tens and hundreds of thousands of devices, making it an ideal solution for large-scale projects. LoRaWAN can achieve data transmission rates from 0.3 kbps to 50 kbps, with power consumption as low as 1.2 µA in sleep mode and 28 mA in transmit mode, and communication ranges of up to 15 km in rural environments. Because of its low power consumption, it is ideal for use in battery-powered devices such as smart and distress sensors. In addition, it was found that the use of EBYTE E32 modules in LoRaWAN devices ensures reliable and efficient data transfer. The study confirms the potential of LoRaWAN technology for developing efficient and reliable wireless communication systems for various Internet of Things applications, ensuring reliable data transmission under various conditions. The results obtained are of great practical importance for the creation and further improvement of autonomous devices for the rapid sending of distress signals, contributing to increased safety and responsiveness to emergency situations.
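To show why the sleep and transmit currents quoted above matter for battery-powered devices, the following sketch estimates an idealized battery life from a time-weighted average current. Only the two current figures come from the abstract; the battery capacity and the transmit time per hour are assumptions made for illustration.

```python
SLEEP_CURRENT_MA = 1.2e-3  # 1.2 µA sleep current (figure from the abstract)
TX_CURRENT_MA = 28.0       # 28 mA transmit current (figure from the abstract)


def average_current_ma(tx_seconds_per_hour: float) -> float:
    """Time-weighted average current over one hour, in mA."""
    tx_fraction = tx_seconds_per_hour / 3600.0
    return TX_CURRENT_MA * tx_fraction + SLEEP_CURRENT_MA * (1.0 - tx_fraction)


def battery_life_days(capacity_mah: float, tx_seconds_per_hour: float) -> float:
    """Idealized life: ignores self-discharge, receive windows, and MCU load."""
    hours = capacity_mah / average_current_ma(tx_seconds_per_hour)
    return hours / 24.0


# Assumed values (not from the study): 2000 mAh battery, 2 s of
# transmission per hour, asleep for the rest of the hour.
avg = average_current_ma(2.0)
days = battery_life_days(2000.0, 2.0)
print(f"average current: {avg:.4f} mA, idealized life: {days:.0f} days")
```

Because the device sleeps almost all the time, the average current stays close to the sleep current even though transmitting draws far more. Real devices would fall well short of this idealized figure.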
Multi-Algorithm to Measure the Accuracy Level of Diabetes Status Prediction Zulkifli, Zulkifli; Makkiyah, Feda Anisah; Antoni, Darius; Fitriana, Fitriana; Jamaan, Taufik; Taufik, Ahmad
Journal of Applied Data Sciences Vol 5, No 2: MAY 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i2.250

Abstract

Poor management of diabetes leads to damage in organs and body tissues, impacting crucial organs like the heart, kidneys, eyes, and nerves. Although there is no permanent cure for diabetes, early detection enables effective disease management, which researchers and medical professionals agree enhances recovery prospects. The rapid progress in information technology has facilitated early prediction and diagnosis of diseases through Machine Learning (ML), a subset of Artificial Intelligence (AI) comprising various algorithms such as Neural Network, Support Vector Machine (SVM), kNN, Random Forest, and Naïve Bayes. These algorithms serve as effective tools in handling predictive data. Early prediction of diabetes holds the potential to control the disease and save lives. Therefore, the focus of this research is to develop a predictive model for diabetes status by utilizing various algorithms, but the level of validation of this model still needs to be tested. The dataset utilized consists of information from several diabetic patients, including eight input variables (pregnancies, glucose levels, blood pressure, skin thickness, insulin levels, BMI, age, and diabetes pedigree function) and one output variable (diabetes status). Research findings indicate that the SVM algorithm exhibits superior accuracy (84%) in predicting diabetes status compared to other algorithms such as neural network, Random Forest, Naïve Bayes, and kNN.
Mitigating Healthcare Information Overload: a Trust-aware Multi-Criteria Collaborative Filtering Model Shambour, Qusai Y; Abualhaj, Mosleh M; Abu-Shareha, Ahmad; Hussein, Abdelrahman H; Kharma, Qasem M
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.297

Abstract

The rapid growth of online health information resources has made it difficult for users, as well as providers of healthcare, to cope with large volumes of information that are becoming increasingly complex. Hence, there is an urgent demand for developing new advanced recommendation techniques in the healthcare domain to enhance decision-making processes. However, most current health recommendation systems, which recommend personalized healthcare services and items such as diagnoses, medications, and doctors based on users' health conditions and needs, are hindered by the data sparsity issue that compromises the reliability of their recommendations. In this paper, we intend to address this issue by proposing a Trust-aware Multi-Criteria Collaborative Filtering model for recommendation services in the healthcare domain. This model leverages multi-criteria ratings and integrates user-item trust relationships to improve the precision and coverage of recommendations, thus facilitating more informed healthcare choices that align closely with their individual needs. Our empirical analysis on two healthcare multicriteria rating datasets, including those with sparse data, shows the proposed model's superior performance over existing baseline methods. On the RateMDs dataset, our model improved the average MAE by 24% and RMSE by 19% compared to baseline methods. For the WebMD dataset, it enhanced the average MAE by 6% and RMSE by 2%. In sparse data scenarios, the model boosted the average MAE by 18% and Coverage by 6% compared to baseline approaches.
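The error metrics named above can be sketched as follows. The ratings and predictions are made-up values, and reading "improved the MAE by 24%" as a relative reduction in error against a baseline is an assumption; the abstract does not spell out how the percentage was computed.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline (assumed definition)."""
    return (baseline_error - model_error) / baseline_error * 100.0


# Hypothetical ratings and predictions, for illustration only.
actual = [4.0, 3.0, 5.0, 2.0]
baseline = [3.0, 4.0, 3.0, 3.0]
model = [4.0, 3.5, 4.0, 2.5]

gain = relative_improvement(mae(actual, baseline), mae(actual, model))
print(f"MAE baseline={mae(actual, baseline):.2f} model={mae(actual, model):.2f}")
print(f"RMSE baseline={rmse(actual, baseline):.3f} model={rmse(actual, model):.3f}")
print(f"relative MAE improvement={gain:.1f}%")
```

Lower values are better for both metrics. RMSE penalizes large individual errors more heavily than MAE does.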
Analyzing Factors that Influence Student Performance in Academic Hidayani, Nieta; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki
Journal of Applied Data Sciences Vol 5, No 2: MAY 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i2.221

Abstract

Student performance analysis is a complex and popular study area in educational data mining. Multiple factors affect performance in nonlinear ways, making this topic more appealing to academics. The broad availability of educational datasets adds to this interest, particularly in online learning. Although previous studies have focused on analyzing and predicting students' performance based on their classroom activities, they did not take into account students' outside conditions, such as sleep hours, extracurricular activities, and the sample question papers that students had practiced. These three variables are included, among others, in our study. In this paper, we describe an analysis of 10,000 student records, with each record containing information on numerous predictors and a performance index. The dataset is intended to shed light on the relationship between the predictor variables and the performance index. We use both univariate and bivariate analyses to create a correlation heatmap of the variables and to produce a linear equation. Following that, we perform data preprocessing and modeling to facilitate predictive analysis. Finally, we show the actual and predicted student performance using the model we constructed. The findings demonstrate that our prediction model was 98% accurate, with a mean absolute error of 1.62.
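Each cell of a correlation heatmap like the one described above holds a pairwise correlation coefficient. The sketch below computes Pearson's r for two pairs of variables. The sample values and the `hours_studied` variable are hypothetical, and Pearson's r is assumed here because the abstract does not name the coefficient used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical records for four students (not taken from the dataset).
hours_studied = [2, 4, 6, 8]
sleep_hours = [8, 5, 7, 6]
performance_index = [30, 50, 70, 90]

r_study = pearson(hours_studied, performance_index)
r_sleep = pearson(sleep_hours, performance_index)
print(f"hours studied vs performance: r={r_study:.2f}")
print(f"sleep hours vs performance:   r={r_sleep:.2f}")
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 a weak one. Predictors that correlate strongly with the performance index are natural candidates for a linear model.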