Found 5 Documents
Journal : Building of Informatics, Technology and Science

POS Tagger Improvisation with the Addition of Foreign Word Labels on Telkom University News
Winkie Setyono; Donni Richasdy; Mahendra Dwifebri Purbolaksono
Building of Informatics, Technology and Science (BITS) Vol 4 No 2 (2022): September 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i2.1983

Abstract

News is a daily source of information for the public. A news article carries a great deal of information and is built from sentence structures. Each language, Indonesian or foreign, has its own sentence structure. Today, however, many media outlets mix Indonesian with foreign languages, producing sentence structures that differ from those of Bahasa Indonesia. Part-of-speech (POS) tagging is needed to determine the class of each word in a sentence, and a POS tagger learns this from a corpus of the language. With the new sentence structures, the POS tagger requires a larger corpus to learn from. The language structure can determine the tagging results, and words that are missing from the corpus can reduce the tagger's accuracy. We sought to improve the research results by adding data with sentence structures that differ from those in the Indonesian Language Corpus, using sentences from online media. About 242 sentences with 7,043 tokens were added to the corpus, focused on Foreign Word (FW) tags, which total 3,819 tags. After several tests and scenarios, the POS tagger reached an accuracy of 94.7% using the Hidden Markov Model method, with an F1-score of 78% for the FW tag.
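As an illustration of the Hidden Markov Model tagging the abstract refers to, the following is a minimal sketch, not the authors' implementation. It assumes a bigram HMM with add-one smoothing decoded by the Viterbi algorithm; the three-sentence corpus, the tag set (NN, VB, FW), and the function names `train_hmm` and `viterbi` are hypothetical and chosen only for the example.

```python
from collections import Counter, defaultdict
import math

def train_hmm(tagged_sentences):
    """Count tag transitions and word emissions from (word, tag) sentences."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo tag
        for word, tag in sentence:
            word = word.lower()
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    return transitions, emissions, vocab

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability, so unseen events keep some mass."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, transitions, emissions, vocab):
    """Return the most probable tag sequence for a non-empty list of words."""
    tags = list(emissions)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for out-of-vocabulary words
    words = [w.lower() for w in words]
    # best[i][tag] = (log score of best path ending in tag at i, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions["<s>"], tag, n_tags)
                 + log_prob(emissions[tag], words[0], n_words))
        best[0][tag] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], n_words)
            prev_tag, score = max(
                ((p, best[i - 1][p][0] + log_prob(transitions[p], tag, n_tags))
                 for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][tag] = (score + emit, prev_tag)
    # Follow back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]

# Hypothetical toy corpus mixing Indonesian words with foreign words (FW).
CORPUS = [
    [("mahasiswa", "NN"), ("mengikuti", "VB"), ("workshop", "FW")],
    [("dosen", "NN"), ("membuka", "VB"), ("webinar", "FW")],
    [("mahasiswa", "NN"), ("membuka", "VB"), ("laptop", "FW")],
]

transitions, emissions, vocab = train_hmm(CORPUS)
print(viterbi(["dosen", "mengikuti", "webinar"], transitions, emissions, vocab))
```

In this toy setting, a word missing from the corpus can only be tagged from the transition probabilities, which is one way to see why the abstract stresses enlarging the corpus with foreign-word examples.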

Telkom University News Topic Modeling Using Latent Semantic Analysis (LSA) Method on Online News Portal
Amala, Ihsan Ahsanu; Richasdy, Donni; Purbolaksono, Mahendra Dwifebri
Building of Informatics, Technology and Science (BITS) Vol 4 No 1 (2022): June 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i1.1584

Abstract

Online news portals are now easy to access. They deliver information about events that have occurred or are occurring through electronic media, and news about Telkom University is likewise easy to reach through them. A system capable of modeling Telkom University news topics has been designed. News topic modeling is an interesting research subject because each individual understands the topics contained in the news differently; topic modeling is therefore needed to find out which topics the news about Telkom University covers. In this study, a Latent Semantic Analysis (LSA) model was designed to carry out topic modeling, with the aim of making it easier for readers to understand news topics related to Telkom University. LSA is a mathematical method for finding hidden topics by analyzing the semantic structure of the text. After several research scenarios, the best coherence score was 0.524 with a total of six topics.

Sentiment Analysis on Movie Review from Rotten Tomatoes Using Modified Balanced Random Forest Method and Word2Vec
Nugraha, Mohamad Rizki; Purbolaksono, Mahendra Dwifebri; Astuti, Widi
Building of Informatics, Technology and Science (BITS) Vol 5 No 1 (2023): June 2023
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v5i1.3596

Abstract

The film industry has been shaped by the rapid development of technology and grows every year. Technological developments also make it easier for the public to access movies from various websites. With so many movies to choose from, people need to judge their quality from other people's reviews. However, the large number of audience reviews of a movie makes it difficult to separate good movies from bad ones. The solution is to perform sentiment analysis on movie reviews. In this research, the classification method is Modified Balanced Random Forest, chosen because it can handle imbalanced data, increase accuracy, and reduce time complexity. Word2Vec is used for feature extraction, chosen because previous research explained that it can show the contextual similarity of two words in the resulting vectors. The best model from this research is built without stemming in the preprocessing stage, uses 300 dimensions in Word2Vec, and uses the Modified Balanced Random Forest classification method, producing an F1-score of 84.15%.

Sentiment Analysis of Practo App Reviews using KNN and Word2Vec
Farhan, Muhammad; Purbolaksono, Mahendra Dwifebri; Astuti, Widi
Building of Informatics, Technology and Science (BITS) Vol 5 No 1 (2023): June 2023
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v5i1.3598

Abstract

People use developments in technology and communication to make daily activities easier, including in the field of health services. Health services are reasonably good, but some obstacles remain common, such as patients being unable to leave the house or short doctor consultation schedules. Health service applications, one of which is Practo, make it easier for people to consult online. This convenience has generated many reviews of the Practo healthcare application, and the diversity of opinions on the internet makes those reviews varied. Sentiment analysis of Practo app reviews is therefore necessary. In this study, the algorithm used is KNN, chosen because it is very effective when the amount of data is large and because it is easy to implement. The feature extraction used is Word2Vec, chosen because it represents each word as a vector and was considered good enough for the task. The best model in this research was built with stemming, a Word2Vec dimension of 300, and K = 3 for the KNN parameter, producing an F1-score of 77.30%.
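The combination of word vectors and KNN described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the 3-dimensional vectors in `TOY_VECTORS` are invented stand-ins for trained 300-dimensional Word2Vec embeddings, a review is represented by the average of its word vectors, neighbors are ranked by cosine similarity, and the names `embed`, `cosine`, and `knn_predict` are hypothetical. Only K = 3 comes from the abstract.

```python
from collections import Counter
import math

# Hypothetical 3-dimensional word vectors standing in for trained Word2Vec output.
TOY_VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "helpful": [0.8, 0.2, 0.1],
    "fast": [0.7, 0.0, 0.2],
    "bad": [-0.9, 0.1, 0.0],
    "slow": [-0.7, 0.0, 0.2],
    "rude": [-0.8, 0.2, 0.1],
    "doctor": [0.0, 0.9, 0.1],
    "app": [0.0, 0.1, 0.9],
}

def embed(text):
    """Represent a review as the average of its known word vectors."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def knn_predict(text, training, k=3):
    """Majority vote among the k training reviews most similar to text."""
    query = embed(text)
    ranked = sorted(training, key=lambda item: cosine(query, embed(item[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labelled reviews.
TRAIN = [
    ("good doctor", "positive"),
    ("helpful app", "positive"),
    ("fast app", "positive"),
    ("bad app", "negative"),
    ("slow app", "negative"),
    ("rude doctor", "negative"),
]

print(knn_predict("helpful doctor", TRAIN))
print(knn_predict("slow rude app", TRAIN))
```

Averaging word vectors is only one common way to turn word-level embeddings into a review-level feature; the abstract does not state which aggregation the study used.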

Sentiment Classification and Interpretation of Tokopedia Reviews: A Machine Learning, IndoBERT, and LIME Approach
Mbake Woka, Adrian Yoris; Purbolaksono, Mahendra Dwifebri; Utama, Dody Qori
Building of Informatics, Technology and Science (BITS) Vol 7 No 2 (2025): September 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i2.8072

Abstract

Sentiment classification of user reviews plays a vital role in business decision-making, especially on e-commerce platforms like Tokopedia. This study evaluates the performance of several sentiment classification models, including Logistic Regression, LinearSVC, and BERT models, both baseline and fine-tuned. The evaluation metrics are accuracy, precision, recall, and F1-score, applied to Tokopedia review data labelled according to user ratings. The fine-tuned BERT model gives the best and most consistent results, with 92% accuracy and an F1-score of 0.92 for each class. This shows that fine-tuned BERT can effectively capture the semantic context of user reviews, and its consistent performance across classes makes it suitable for reliable sentiment classification in real-world applications. Furthermore, the fine-tuned BERT model's predictions are explained with Local Interpretable Model-agnostic Explanations (LIME) to identify the features, in this case words, that indicate positive or negative sentiment. These are shown in color: orange for positive and blue for negative. This makes the model more transparent and more reliable.
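The four evaluation metrics named in the abstract can be computed per class as in the sketch below. It uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of the two); the function name `per_class_metrics` and the six example labels are hypothetical and are not data from the study.

```python
def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, report

# Hypothetical labels for six reviews.
y_true = ["pos", "pos", "pos", "neg", "neg", "neg"]
y_pred = ["pos", "pos", "neg", "neg", "neg", "pos"]

accuracy, report = per_class_metrics(y_true, y_pred)
print(accuracy)
print(report)
```

Reporting F1 for each class, as the abstract does, shows whether a model performs evenly across sentiments rather than favoring the majority class.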