Application of Random Forest Method Classification for Glycosylation in Lysine Protein Sequences
Fitriyana, Silfia; Syarif, Admi; Rossyking, Favorisen; Faisal, Mohammad Reza
Integra: Journal of Integrated Mathematics and Computer Science Vol. 1 No. 2 (2024): July
Publisher : Magister Program of Mathematics, Universitas Lampung

DOI: 10.26554/integrajimcs.20241218

Abstract

Glycosylation classification in lysine protein sequences groups glycosylated lysine proteins according to the type of glycosylation observed in the sequence. This work evaluated the sensitivity, specificity, accuracy, and Matthews correlation coefficient (MCC) of the random forest approach for this classification task. The lysine protein dataset, derived from benchmark data, contains 620 proteins in total (214 positive and 406 negative), each with a sequence length of 15. Ninety percent of the dataset was used for training and 10% for testing. Feature extraction used the R package BioSeqClass version 1.44.0 with protein descriptors, specifically AA Index, CTD, and PseAAC, for a total of 60 features. The extracted features were then classified with the Random Forest algorithm using Mtry values of 4, 8, and 16, with the number of trees (ntree) set to 250, 500, 750, and 1000. The best results were achieved with a dataset split of 90% training data and 10% test data, using Mtry of 42 and 1000 trees, resulting in 89.97% sensitivity, 92.79% specificity, 80.76% MCC, and 90.42% accuracy. These results demonstrate that the combination of feature extraction and the Random Forest algorithm is effective in classifying lysine proteins.
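The four metrics reported in this abstract can all be derived from a binary confusion matrix. The following is a minimal Python sketch of the standard definitions, not the authors' R code; the function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy, and MCC from confusion-matrix counts."""
    total = tp + fn + tn + fp
    # Sensitivity: fraction of actual positives predicted as positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives predicted as negative.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: fraction of all samples predicted correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 here when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for a small held-out test set (illustrative only).
for name, value in binary_metrics(tp=18, fn=3, tn=38, fp=3).items():
    print(f"{name}: {value:.4f}")
```

MCC is useful alongside accuracy for a dataset like this one because the classes are imbalanced (214 positive versus 406 negative), and MCC accounts for all four cells of the confusion matrix.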

Sentiment Analysis of Twitter Discussions About Lampung Robusta Coffee: A Comparative Study of Machine Learning Algorithms with SVM as The Optimal Model
Yuniarthe, Yodhi; Syarif, Admi; Shofi, Imam Marzuki; Fatimah Fahurian
JURNAL TEKNIK INFORMATIKA Vol. 18 No. 2: JURNAL TEKNIK INFORMATIKA
Publisher : Department of Informatics, Universitas Islam Negeri Syarif Hidayatullah

DOI: 10.15408/jti.v18i2.41316

Abstract

Lampung Robusta coffee is an important commodity in Indonesia, particularly in terms of local economic potential and global recognition. However, public perception of this product on social media, particularly Twitter, remains underexplored. This study addresses the need for a deeper understanding of consumer sentiment towards Lampung Robusta coffee, which could inform branding and marketing strategies. To approach this issue, we used five supervised machine learning algorithms (KNN, Naive Bayes, SVM, Decision Tree, and Logistic Regression) to perform sentiment classification on a dataset of tweets containing relevant keywords. The dataset was pre-processed using standard natural language processing techniques, including tokenization, stopword removal, and TF-IDF feature extraction. SVM achieved the best performance on the imbalanced dataset across all metrics, with high and consistent accuracy and F1 scores. Logistic Regression followed closely with similarly strong and stable results. Therefore, SVM is recommended as the final model. These results suggest that machine learning approaches can effectively classify sentiment in social media discussions about regional agricultural products and that SVM may provide the most robust performance in this context.
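The preprocessing steps named in this abstract (tokenization, stopword removal, TF-IDF feature extraction) can be illustrated with a short pure-Python sketch. It is an assumption-laden illustration rather than the authors' pipeline: the stopword list, the sample tweets, and the function names are hypothetical, and the weighting is the plain textbook form tf × log(N/df), which differs from the smoothed variants some libraries use.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (hypothetical; a real pipeline would use a fuller one).
STOPWORDS = {"the", "is", "a", "and", "of", "this"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical example tweets (not from the study's dataset).
tweets = ["The coffee is great", "The coffee is bitter"]
for vec in tfidf(tweets):
    print(vec)
```

Terms that appear in every document, such as "coffee" here, receive zero weight, so the resulting vectors emphasize the words that distinguish one tweet from another. Vectors of this kind are what a classifier such as SVM would take as input.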