Alamsyah
Computer Science Department, Faculty of Mathematics and Natural Sciences, Universitas Negeri Semarang, Indonesia

Published: 6 Documents

Implementation of the K-Nearest Neighbor Algorithm (KNN) with Principal Component Analysis to Diagnose Tuberculosis Yuliana Putri; Alamsyah Alamsyah
Recursive Journal of Informatics Vol. 3 No. 2 (2025): September 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v3i2.5235

Abstract

Purpose: Tuberculosis (TB) is an infectious disease that primarily attacks the respiratory organs, especially the lungs, although it can also affect organs outside the lungs. Indonesia is one of the largest contributors of TB cases, with around 320,000 new cases every year. Delayed diagnosis can raise the number of deaths through errors in treating sufferers, which makes diagnosing TB as early as possible important. This research aims to implement machine learning techniques to help diagnose TB. Methods: The research uses the K-Nearest Neighbor (KNN) classification algorithm optimized with the Principal Component Analysis (PCA) feature selection technique. The dataset consists of 577 records with 12 attributes, each labeled as a patient with tuberculosis or a patient without tuberculosis. Result: The model that implements KNN with PCA performs better than the model that implements KNN alone: KNN alone achieves an accuracy of 92.528%, while KNN with PCA achieves 98.85%. This shows that combining KNN and PCA produces a good tuberculosis diagnosis model that can assist in early diagnosis. Novelty: Applying PCA in the feature-selection step removes unnecessary attributes; by reducing dimensionality, PCA simplifies the visualization and interpretation of complex datasets. The use of PCA is shown to optimize the performance of the KNN algorithm for tuberculosis detection.
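As a rough illustration of the pipeline this abstract describes, the following Python sketch wires KNN to a PCA step with scikit-learn. The dataset path, label column, number of neighbors, and component count are assumptions, not values reported by the paper.

```python
# Illustrative sketch: KNN with and without a PCA step, as in the abstract.
# File name, "tb" label column, k=5, and n_components=6 are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

# Hypothetical file with 577 rows, 12 feature columns, and a binary "tb" label.
df = pd.read_csv("tb_dataset.csv")
X, y = df.drop(columns=["tb"]), df["tb"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Baseline: KNN alone.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
print("KNN only:", accuracy_score(y_test, knn.predict(X_test)))

# KNN with PCA: project the 12 attributes onto fewer principal components
# before classifying.
knn_pca = make_pipeline(
    StandardScaler(), PCA(n_components=6), KNeighborsClassifier(n_neighbors=5)
)
knn_pca.fit(X_train, y_train)
print("KNN + PCA:", accuracy_score(y_test, knn_pca.predict(X_test)))
```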
Textual Entailment for Non-Disclosure Agreement Contract Using ALBERT Method Abdillah Azmi; Alamsyah Alamsyah
Recursive Journal of Informatics Vol. 3 No. 1 (2025): March 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v3i1.9730

Abstract

Purpose: A Non-Disclosure Agreement (NDA) is a type of contract that binds two or more parties who agree that certain information shared or created by one party is confidential. This type of contract serves to protect sensitive information, maintain patent rights, or control how information is shared. Reading and understanding a contract is a repetitive, time-consuming, and labor-intensive process, yet it remains crucial in the business world because a contract binds two or more parties under the law. The problem is well suited to deep learning. This research therefore aims to test and develop a pretrained language model for understanding contracts through a Natural Language Inference task. Methods: The model is trained to perform the language-inference task of textual entailment using the ContractNLI (CNLI) dataset. An ALBERT-base model tuned for textual entailment is used, together with a LambdaLR learning-rate schedule for early stopping and the AdamW optimizer. The model is fine-tuned on the CNLI dataset several times with multiple hyperparameter settings. Result: The ALBERT-base model achieves an accuracy of 85% and an exact-match (EM) score of up to 85.04%. Although this is not the state of the art on the CNLI benchmark, the trained model outperforms other base-size models built on BERT and BART, such as SpanNLI BERT-base, SCROLLS (BART-base), and Unlimiformer (BART-base). Value: ALBERT focuses on memory efficiency and a small parameter count while maintaining performance, making it suitable for tasks that require long-context understanding on minimal hardware. Such a model is promising for the future of NLP in the legal domain.
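A minimal sketch of that setup, assuming Hugging Face Transformers and PyTorch: ALBERT-base with a three-way entailment head, AdamW, and a LambdaLR schedule. The label mapping, learning rate, decay policy, and the toy premise/hypothesis pair are illustrative assumptions; CNLI preprocessing and the full training loop are omitted.

```python
# Sketch: one fine-tuning step of ALBERT-base for textual entailment.
# Hyperparameters and the LambdaLR decay policy are assumptions.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR
from transformers import AutoTokenizer, AlbertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
model = AlbertForSequenceClassification.from_pretrained(
    "albert-base-v2", num_labels=3  # entailment / contradiction / neutral
)

optimizer = AdamW(model.parameters(), lr=2e-5)
# Linear decay as a stand-in for the paper's LambdaLR policy.
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: max(0.0, 1 - step / 10000))

# Toy NDA-style premise/hypothesis pair; label ids are an assumption.
batch = tokenizer(
    ["Receiving Party shall not disclose Confidential Information."],
    ["The recipient may share the information freely."],
    truncation=True, padding=True, return_tensors="pt",
)
labels = torch.tensor([1])  # e.g. "contradiction"

outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```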
Development of Digital Forensic Framework for Anti-Forensic and Profiling Using Open Source Intelligence in Cyber Crime Investigation Muhamad Faishol Hakim; Alamsyah Alamsyah
Recursive Journal of Informatics Vol. 2 No. 2 (2024): September 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/7ytx8194

Abstract

Cybercrime increases every year, and its growth exploits mobile devices such as smartphones, so a discipline that studies and handles cybercrime is needed. Digital forensics is one such discipline, and mobile forensics is the branch that studies forensic processes on mobile devices. In response, cybercriminals apply techniques intended to thwart the forensic investigation, known as anti-forensics. Purpose: A process or framework is needed that can serve as a reference for handling cybercrime cases in the forensic process; this research modifies the digital forensic investigation process to provide one. Methods/Study design/approach: The stages of the investigation consist of preparation, preservation, acquisition, examination, analysis, reporting, and presentation. Open Source Intelligence (OSINT) and toolset centralization are added at the analysis stage to handle anti-forensics and to extract additional information from the digital evidence obtained in the earlier stages. Testing on scenario data yields additional information extracted from the recovered files, including information related to usernames. Result/Findings: The result is a digital forensic framework that focuses on identifying anti-forensics in media files and uses OSINT to profile crime suspects based on the evidence collected during the investigation; a sketch of one such check follows below. Novelty/Originality/Value: The framework uncovered 3 new findings in the form of string data, one of which is a link, and 7 new findings in the form of usernames, none of which were found using standard digital forensic tools. Against 408 initial data points, the 10 new findings represent an increase of 2.45%.
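To make the anti-forensic identification step concrete, here is a small Python sketch of one check such an analysis stage might run: comparing a media file's magic bytes against its extension to catch file-signature spoofing. The signature table, the evidence directory path, and the check itself are illustrative assumptions; the paper's actual toolset is not specified here.

```python
# Illustrative anti-forensic check: flag files whose header (magic bytes)
# disagrees with their extension. Signature table is abbreviated.
from pathlib import Path

SIGNATURES = {
    ".jpg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".gif": b"GIF8",
    ".pdf": b"%PDF",
}

def check_signature(path: Path) -> bool:
    """Return True if the file header matches its extension's magic bytes."""
    magic = SIGNATURES.get(path.suffix.lower())
    if magic is None:
        return True  # unknown extension: nothing to compare against
    with path.open("rb") as f:
        return f.read(len(magic)) == magic

# Flag every suspicious file under a hypothetical evidence directory.
for p in Path("evidence/media").rglob("*"):
    if p.is_file() and not check_signature(p):
        print(f"possible anti-forensic tampering: {p}")
```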
Comparison of Naive Bayes Classifier and K-Nearest Neighbor Algorithms with Information Gain and Adaptive Boosting for Sentiment Analysis of Spotify App Reviews Meidika Bagus Saputro; Alamsyah Alamsyah
Recursive Journal of Informatics Vol. 2 No. 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/jkrk0n56

Abstract

Technology is developing rapidly, and one consequence is that the volume of data in the world keeps growing. Such large volumes of data can serve many purposes in many fields, and entertainment is one field that draws wide user interest. Spotify, distributed through the Google Play Store, is an example of an entertainment app that streams music online to its users. Because the app is distributed through the Google Play Store, the many user reviews it collects can be classified as positive, negative, or neutral, and sentiment analysis is one way to classify them. In this paper, reviews are classified with a naive Bayes classifier and k-nearest neighbors, compared with and without information gain as feature selection and adaptive boosting as a boosting algorithm for each classifier. Purpose: To measure the accuracy of the naive Bayes classifier and the k-nearest neighbor algorithm when information gain and adaptive boosting are added, and to show how to perform sentiment analysis step by step with the chosen methods. Methods/Study design/approach: This study applies data preprocessing, lexicon-based labeling with TextBlob, normalization, word vectorization using TF-IDF, and classification with naive Bayes and k-nearest neighbor, using information gain for feature selection and adaptive boosting to boost classification accuracy. Result/Findings: Naive Bayes with information gain and adaptive boosting reaches an accuracy of 87.28%, while k-nearest neighbor with information gain and adaptive boosting reaches 80.35%. These results were obtained on 60,000 reviews split 80% for training and 20% for testing. Novelty/Originality/Value: Adding information gain for feature selection and adaptive boosting to the naive Bayes classifier is shown to increase classification accuracy, but the same does not hold for k-nearest neighbor, so future research could apply other classification algorithms or feature-selection methods to get better results.
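A sketch of the better-performing pipeline, assuming scikit-learn: TF-IDF features, information-gain feature selection, and AdaBoost over naive Bayes. Mutual information stands in for information gain (for a discrete class variable the two coincide), and the review texts, labels, and k are placeholders, not the paper's data.

```python
# Sketch: TF-IDF -> information-gain selection -> AdaBoost(naive Bayes).
# Toy reviews/labels stand in for the 60,000 lexicon-labeled Spotify reviews.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import AdaBoostClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

reviews = [
    "love this app, great playlists",
    "keeps crashing, awful update",
    "it opens and plays music",
    "best streaming service ever",
    "ads are unbearable and loud",
    "works fine on my phone",
]
labels = ["positive", "negative", "neutral", "positive", "negative", "neutral"]

X_train, X_test, y_train, y_test = train_test_split(
    reviews, labels, test_size=1 / 3, random_state=0
)

pipeline = make_pipeline(
    TfidfVectorizer(),
    SelectKBest(mutual_info_classif, k=5),  # k is an assumption
    # On scikit-learn < 1.2 this keyword is base_estimator= instead.
    AdaBoostClassifier(estimator=MultinomialNB(), n_estimators=50),
)
pipeline.fit(X_train, y_train)
print("accuracy:", pipeline.score(X_test, y_test))
```

Note that scikit-learn's AdaBoost cannot wrap k-NN this way, since k-NN does not accept sample weights; that variant would need a reweighting or resampling workaround.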
Comparison of Probabilistic Neural Network (PNN) and k-Nearest Neighbor (k-NN) Algorithms for Diabetes Classification Diah Siti Fatimah Azzahrah; Alamsyah Alamsyah
Recursive Journal of Informatics Vol. 1 No. 2 (2023): September 2023
Publisher : Universitas Negeri Semarang

DOI: 10.15294/c363d161

Abstract

Purpose: This study compares two algorithms for diabetes classification to determine their accuracy and speed. Methods: The two algorithms are the Probabilistic Neural Network (PNN) and k-Nearest Neighbor (k-NN). The data used is the Pima Indians Diabetes Database, which contains 768 records with 8 attributes and 1 target class: 0 for no diabetes and 1 for diabetes. The dataset is split into 80% training data and 20% testing data. Result: Accuracy is measured with k-fold cross-validation with k = 4. The results show that the k-Nearest Neighbor algorithm is more accurate and runs faster than the Probabilistic Neural Network, achieving an accuracy of 74.6% with all features and 78.1% with four features. Novelty: The novelty of this paper is improving accuracy by focusing on data preprocessing, feature selection, and k-fold cross-validation in the classification algorithm.
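The k-NN evaluation could look like the following scikit-learn sketch: 4-fold cross-validation on the Pima data with all eight attributes and with a reduced subset. The CSV path, the number of neighbors, and the particular four-feature subset are assumptions; the abstract does not say which four features were kept.

```python
# Sketch: 4-fold CV for k-NN on the Pima Indians Diabetes data
# (768 rows, 8 attributes, binary "Outcome" label).
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

df = pd.read_csv("pima_diabetes.csv")  # hypothetical local copy
X_all, y = df.drop(columns=["Outcome"]), df["Outcome"]
# A plausible four-feature subset; the paper does not list which four it kept.
X_four = df[["Glucose", "BMI", "Age", "DiabetesPedigreeFunction"]]

knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
for name, X in [("all 8 features", X_all), ("4 features", X_four)]:
    scores = cross_val_score(knn, X, y, cv=4)  # k = 4 folds, as in the abstract
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```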
C4.5 Algorithm Optimization and Support Vector Machine by Applying Particle Swarm Optimization for Chronic Kidney Disease Diagnosis Lisa Ariyanti; Alamsyah Alamsyah
Recursive Journal of Informatics Vol. 1 No. 1 (2023): March 2023
Publisher : Universitas Negeri Semarang

DOI: 10.15294/cdnb8v88

Abstract

The kidneys are organs with a vital function in life: their main role is to excrete metabolic waste products. Chronic kidney disease results from the gradual loss of kidney function; it occurs when the kidneys can no longer maintain an internal environment consistent with life and the lost function cannot be restored. Data mining is one of the fastest-growing technologies in biomedical science and research. Purpose: In medicine, data mining can improve hospital information management and telemedicine development. The first stage of the data mining process is pre-processing, which handles missing values and transforms the data. Feature selection is then carried out with the Particle Swarm Optimization algorithm to find the best attributes, after which the dataset is classified. Methods/Study design/approach: The classification algorithms used are the C4.5 algorithm and the Support Vector Machine, both known for fairly good accuracy. This study uses the chronic kidney disease dataset from the UCI Machine Learning Repository. Result/Findings: The approach achieves an accuracy of 100% for the C4.5 algorithm and 98.75% for the Support Vector Machine using 24 attributes and 1 class attribute. Novelty/Originality/Value: The purpose of this study was to determine and compare the accuracy of the C4.5 algorithm and the Support Vector Machine after applying the Particle Swarm Optimization algorithm.
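For readers who want to see how PSO-driven feature selection can wrap a classifier, here is a toy Python sketch under stated assumptions: scikit-learn's DecisionTreeClassifier (a CART implementation) stands in for C4.5, a synthetic dataset stands in for the UCI chronic kidney disease data, and the swarm constants are conventional defaults rather than the paper's values.

```python
# Toy binary-PSO feature selection: each particle's thresholded position is a
# feature mask, scored by cross-validated decision-tree accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for the 24-attribute CKD dataset.
X, y = make_classification(n_samples=400, n_features=24, n_informative=8,
                           random_state=0)

def fitness(mask):
    if not mask.any():
        return 0.0
    clf = DecisionTreeClassifier(random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

n_particles, n_features, n_iter = 10, X.shape[1], 20
pos = rng.random((n_particles, n_features))  # continuous positions in [0, 1]
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_fit = np.array([fitness(m) for m in pos > 0.5])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    # Standard PSO update: inertia 0.7, cognitive/social weights 1.5 (assumed).
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0, 1)
    for i, mask in enumerate(pos > 0.5):  # threshold position -> feature mask
        f = fitness(mask)
        if f > pbest_fit[i]:
            pbest[i], pbest_fit[i] = pos[i], f
    gbest = pbest[pbest_fit.argmax()].copy()

selected = gbest > 0.5
print(f"selected {selected.sum()} of {n_features} features, "
      f"CV accuracy {fitness(selected):.3f}")
```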