P-Index (2020 - 2025) : 0.702
Journals in which this author has published : Jurnal Infra
Andre Gunawan
Informatics Engineering Study Program, Petra Christian University (Universitas Kristen Petra), Surabaya

Published : 4 Documents
Articles

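Platform Big Data Analytic Berbasis Apache Spark Bagi Pemula Dalam Menyusun Data Analysis Workflow (An Apache Spark-Based Big Data Analytics Platform for Beginners Composing Data Analysis Workflows)
Authors : Daniel Jeremia; Henry Novianus Palit; Andre Gunawan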
Jurnal Infra Vol 10, No 1 (2022)
Publisher : Universitas Kristen Petra

Abstract

Data is a concrete foundation for decision-making. As technology develops, the volume and complexity of data grow, and analyzing that data requires sophisticated methods. This creates a need for big data analytics. Analyzing data quickly, simply, and robustly is now a pressing requirement, especially for beginners. To address this problem, this research proposes a beginner-friendly platform for big data analytics. The platform aims to let beginners analyze data without programming. Data is manipulated through diagrams/workflows that are built in a drag-and-drop fashion. Furthermore, the platform uses industry-leading technology such as Apache Spark to handle big data analytics workloads while keeping that technology hidden from the user. A survey/demo was held with 12 people from 3 different backgrounds, namely laypeople, beginners, and experts. The results indicate a positive experience of doing data analysis without programming. Participants gave an average score of 4.4 out of 5 for how much the platform simplifies data analysis work. This big data analytics platform has considerable potential for beginners and professionals alike.

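Analisis Sentimen Mahasiswa di Surabaya Terhadap Pelayanan Vaksinasi COVID-19 Menggunakan Beberapa Classifier (Sentiment Analysis of University Students in Surabaya Toward COVID-19 Vaccination Services Using Several Classifiers)
Authors : Meliana Kusuma Pangkasidhi; Henry Novianus Palit; Andre Gunawan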
Jurnal Infra Vol 10, No 2 (2022)
Publisher : Universitas Kristen Petra

Abstract

Indonesia is one of the countries currently struggling to deal with the COVID-19 pandemic through vaccination. The government is trying to persuade the public to get vaccinated by maximizing COVID-19 vaccination services. In reality, these services still have problems in some aspects. To gain insight into the vaccination services that have been implemented, this research applies sentiment analysis to public opinion. The classifiers used for text classification are Naïve Bayes, Support Vector Machine (SVM), Random Forest, and Light Gradient Boosting Machine (LGBM), and their performance is compared using evaluation metrics. Two types of datasets are used, namely a questionnaire dataset and a social media dataset. The model trained on the questionnaire dataset is tested on the social media dataset, while the model trained on social media data is tested on a held-out split of the same social media dataset. The test results show that the model trained on the social media dataset performs better than the questionnaire model. Of the four classifiers, the best model for aspect and sentiment classification is Random Forest.

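Prediksi Peringkat Mingguan Lagu Pada Spotify Amerika Serikat Menggunakan Multiple Charts Dataset Dengan Berbagai Metode (Predicting Weekly Song Rankings on Spotify United States Using a Multiple Charts Dataset With Various Methods)
Authors : Christianto Imanuel Aryanto; Henry Novianus Palit; Andre Gunawan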
Jurnal Infra Vol 10, No 2 (2022)
Publisher : Universitas Kristen Petra

Abstract

In 2020, the majority of the music industry's revenue, 62.1%, came from music streaming. As a result, many parties in the music business are striving for a hit song, particularly on the Spotify US chart. However, this is difficult to achieve because nowadays a song's success is determined by its performance on various music charts, not by its quality. For that reason, this study in the field of hit song science forecasts weekly song rankings on Spotify US using data from the Spotify, Shazam, Airplay, and TikTok charts. Multiple linear regression, polynomial regression, gradient boosting tree, and random forest are used to create models, and the models are compared using adjusted R-squared and mean absolute error (MAE) as evaluation metrics. Random forest produced the best model, with adjusted R-squared and MAE values of 93.133% and 11.687, respectively. Using music attributes had a negative impact on model performance. The Shazam chart, on the other hand, was shown to have a positive impact on model performance. Meanwhile, neither the Airplay nor the TikTok chart has a definite positive or negative impact; both were shown to have only a very weak relation with model performance. Overall, the dataset combination of the Spotify, Shazam, Airplay, and TikTok charts produced the best model in this study.

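The two evaluation metrics named in the abstract above can be illustrated with a minimal sketch. This is not the study's code: the function names, the `num_features` parameter, and the sample values in the usage lines are all hypothetical, and the standard textbook definitions of MAE and adjusted R-squared are assumed.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(actual, predicted, num_features):
    """R-squared penalized for the number of predictors (num_features)."""
    n = len(actual)
    if n - num_features - 1 <= 0:
        raise ValueError("need more samples than features + 1")
    r2 = r_squared(actual, predicted)
    return 1 - (1 - r2) * (n - 1) / (n - num_features - 1)


# Hypothetical chart ranks, for illustration only.
actual_ranks = [1, 2, 3, 4, 5]
predicted_ranks = [1, 2, 3, 4, 6]
print(mean_absolute_error(actual_ranks, predicted_ranks))
print(adjusted_r_squared(actual_ranks, predicted_ranks, num_features=1))
```

Adjusted R-squared falls as predictors are added without improving the fit, which makes it suitable for comparing models built from different chart combinations.
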
Deteksi Plagiarisme pada Kode Bahasa Pemrograman Java menggunakan XGBoost (Plagiarism Detection in Java Programming Language Code Using XGBoost)
Authors : Tomy Widjaja; Andre Gunawan; Liliana Liliana
Jurnal Infra Vol 10, No 2 (2022)
Publisher : Universitas Kristen Petra

Abstract

Easy access to information and cloud server technology makes it easier for anyone to access code. Coupled with the Industry 4.0 era, the number of informatics students is also increasing rapidly. This makes code plagiarism easier to commit, especially in academic environments. Manual plagiarism checking is a repetitive, difficult, and time-consuming task. Therefore, automated, high-quality source code plagiarism detection is needed. The dataset used in this research was collected from the “Dasar Pemrograman” class at Petra Christian University. The code first goes through a tokenization preprocessing stage based on the Java grammar. Then, pairwise features are calculated using 3 main algorithms, namely Levenshtein distance, greedy string tiling, and bigram, which produce 12 features and a collection of statistical features. Finally, the features are used for training and inference with an XGBoost model. The test results show that the proposed features achieve better performance metrics than previous research, with an F1-score of 99%. Preprocessing also improves the performance metrics of both the features proposed in this study and those from previous research.
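Two of the pairwise algorithms named in the abstract above, Levenshtein distance and bigram comparison, can be sketched on token sequences as follows. This is a rough illustration, not the paper's feature definitions: the abstract does not say how the 12 features are derived, so the normalization, the Jaccard overlap for bigrams, all function names, and the sample tokens are assumptions. Greedy string tiling is omitted for brevity.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def levenshtein_similarity(a, b):
    """Assumed normalization: 1 - distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def bigrams(tokens):
    """Set of adjacent token pairs."""
    return {(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)}


def bigram_jaccard(a, b):
    """Assumed bigram feature: Jaccard overlap of the two bigram sets."""
    set_a, set_b = bigrams(a), bigrams(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical token streams for two Java statements.
tokens_a = ["int", "x", "=", "1", ";"]
tokens_b = ["int", "y", "=", "1", ";"]
print(levenshtein_similarity(tokens_a, tokens_b))
print(bigram_jaccard(tokens_a, tokens_b))
```

Scores like these, computed for every pair of submissions, would form the feature vector passed to a classifier such as XGBoost.
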