Articles

Twitter Data Sentimen Analysis 2024 Presidential Candidate Using Algorithm Naïve Bayes Classifier By Methods K-Fold Cross Validation
Aldi Prajela; Syafriandi Syafriandi; Dony Permana; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/149

Abstract

Indonesia implements a democratic system by involving the public in General Elections (Pemilu) for specific political positions. Citizens actively express opinions on social media, especially regarding the 2024 Presidential Election (Pilpres) and the respective presidential candidates, which have become trending topics on Twitter. These tweets are turned into information through sentiment analysis using the Naïve Bayes Classifier (NBC) algorithm with the K-Fold Cross-Validation method. The analysis proceeds through pre-processing, weighting, labeling, classification using NBC, and testing using a confusion matrix. The NBC classification results showed that Anies Baswedan received 80% positive and 20% negative tweets out of 1186 tweets, Prabowo Subianto received 78% positive and 22% negative tweets out of 1149 tweets, and Ganjar Pranowo received 77% positive and 23% negative tweets out of 1075 tweets. Testing was carried out using the NBC algorithm with the K-Fold Cross-Validation method using values k=1 to k=10. K-Fold Cross-Validation is used here to find the iteration that gives the best confusion matrix results. It can be concluded that Anies Baswedan had the highest score in iteration 4, with a precision of 90%, a recall of 99%, and an accuracy of 91%. Furthermore, Ganjar Pranowo had the highest score in iteration 9, with a precision of 95%, a recall of 97%, and an accuracy of 92%. Meanwhile, Prabowo Subianto had the highest score in iteration 9, with a precision of 97%, a recall of 99%, and an accuracy of 94%.
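The evaluation step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it only shows how sample indices might be divided into k folds and how precision, recall, and accuracy follow from confusion matrix counts. The function names (k_fold_indices, confusion_counts, metrics) and the label "positive" are assumptions made for the example, and the Naïve Bayes classifier itself is left out.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k (train, test) pairs, without shuffling."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str],
                     positive: str = "positive") -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a two-class problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, accuracy) from confusion matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy
```

In a 10-fold run, a classifier would be trained on each train list, scored on the matching test list, and the fold with the best metrics reported, which appears to be what the abstract calls an "iteration".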
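Klasifikasi Karies Gigi di Rumah Sakit Gigi dan Mulut Baiturrahmah Menggunakan Metode Random Forest (Classification of Dental Caries at Baiturrahmah Dental and Oral Hospital Using the Random Forest Method)
Martia Rosada; Zilrahmi; Syafriandi Syafriandi; Tessy Octavia Mukhti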
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/155

Abstract

The oral cavity is the main gateway through which germs and bacteria enter the body, so maintaining oral hygiene is important. Poor dental and oral hygiene causes problems or diseases such as periodontitis, dental caries, tooth abscess, gingivitis, and other dental and oral health problems. The most common of these is caries, or cavities. West Sumatra has a fairly high prevalence of dental caries. Preventing dental caries requires making the public aware of dental and oral hygiene in order to reduce the problem in West Sumatra. A method that can classify dental caries based on its symptoms is therefore needed. Such a classification method is also useful for identifying the main factors that cause dental caries. One classification method that can be used is random forest. Random forest is an ensemble method that combines many decision trees built on bootstrap samples. In this research, the model is selected by the smallest out-of-bag (OOB) error, and variables are ranked using the Variable Importance Measure (VIM). Random forest classification using dental and oral pain medical record data at Baiturrahmah Padang Hospital produces an OOB error rate of 32.08%, or an accuracy rate of 67.92%. The optimal model is obtained using mtry=2 and ntree=200. From this research it can be concluded that dental plaque, age, and tooth brushing habits are the important variables, or main factors, that influence dental caries.
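To illustrate where an out-of-bag error comes from, the sketch below draws one bootstrap sample and lists the rows left out of it; in a random forest, each tree is evaluated on the rows it never saw. This is an assumption-laden illustration rather than the study's procedure: the function names are hypothetical and no trees are actually grown.

```python
import random
from typing import List, Tuple


def bootstrap_sample(n_rows: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n_rows indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    drawn = set(in_bag)
    # Rows never drawn are "out of bag" for this tree.
    out_of_bag = [i for i in range(n_rows) if i not in drawn]
    return in_bag, out_of_bag


def accuracy_from_oob_error(oob_error_percent: float) -> float:
    """Accuracy (in percent) is the complement of the OOB error rate."""
    return 100.0 - oob_error_percent


if __name__ == "__main__":
    in_bag, oob = bootstrap_sample(100, random.Random(0))
    print(len(set(in_bag)), "distinct rows in bag,", len(oob), "rows out of bag")
    # Figure taken from the abstract: 32.08% OOB error.
    print(round(accuracy_from_oob_error(32.08), 2))
```

The reported accuracy of 67.92% is simply 100% minus the 32.08% OOB error.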
Classification of Unemployment at West Sumatra Province in 2021 using Algorithm Classification and Regression Tree
Nur Fadillah, Nur; Syafriandi Syafriandi; Nonong Amalita; Dony Permana
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/166

Abstract

Unemployment is a problem that often occurs in developing countries. It is caused by an imbalance between the size of the labor force and the number of people employed. According to the Central Bureau of Statistics, West Sumatra Province ranked eighth among provinces for open unemployment in 2021, with a rate of 6.52%, higher than the Indonesian average of 6.49%. Unemployment rose from 2017 to 2021, driven by unemployment among the educated. This is due to the habit of job seekers who tend to pick and choose among the types of jobs available, while demand from businesses is very limited. The problem of unemployment will worsen if it is not resolved, and it can lead to poverty and other social problems. In this study, CART analysis is used to classify unemployment in West Sumatra Province in 2021, with the aim of determining the factors that affect unemployment. CART is a decision tree that shows the relationship between the response variable and one or more predictor variables. The purpose of CART analysis is to obtain the right data groups for classification. Based on the analysis, the variables that affect unemployment in West Sumatra Province in 2021 are marital status, gender, household status, education level, age, and place of residence, with an accuracy of 71.73%.
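A CART tree grows by repeatedly choosing the split that most reduces node impurity. The sketch below shows that calculation for one categorical predictor, assuming the Gini index as the impurity measure (the abstract does not say which measure was used). The function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini(labels: Sequence[Hashable]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gain(values: Sequence[Hashable], labels: Sequence[Hashable],
               category: Hashable) -> float:
    """Impurity reduction from splitting on `value == category` vs. the rest."""
    left = [y for x, y in zip(values, labels) if x == category]
    right = [y for x, y in zip(values, labels) if x != category]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    marital = ["married", "married", "single", "single"]
    status = ["employed", "employed", "unemployed", "unemployed"]
    print(split_gain(marital, status, "married"))
```

On the toy data the split separates the classes perfectly, so the gain equals the parent impurity; a real tree would evaluate every candidate split and keep the one with the largest gain.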
Classification of Program Keluarga Harapan Recipient Households in Padang Using K-Nearest Neighbors
Yurivo Rianda Saputra; Syafriandi Syafriandi; Dony Permana; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/167

Abstract

Program Keluarga Harapan (PKH) is a government social assistance program aimed at providing social protection as part of the central government's efforts to promote social welfare. PKH provides benefits to poor families, especially pregnant women and children, by making use of the health and education services available. PKH benefits also cover people with disabilities and the elderly, maintaining their level of social welfare in accordance with the Constitution and the Nawacita of the Republic of Indonesia. Because PKH implementation suffers from distribution errors, households need to be classified to ensure the aid is properly targeted. Classification is performed by comparing the number of neighbors (k) in K-Nearest Neighbors (KNN). The Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN) is applied to balance the target classes, and Recursive Feature Elimination Cross Validation (RFECV) is applied to select attributes in the dataset. The data come from SUSENAS 2023 for Padang City. The results show that KNN with k = 3 is a good algorithm for classifying households receiving PKH using 10 attributes. KNN with k = 3 achieves an accuracy of 91.12%, a precision of 89.29%, and a recall of 96.77%.
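The core of the KNN step can be sketched as a majority vote among the k closest training points. The example below assumes numeric attributes and Euclidean distance, neither of which the abstract confirms, and it omits the SMOTEENN and RFECV stages entirely. The function name knn_predict and the toy households are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def knn_predict(train_x: Sequence[Tuple[float, ...]], train_y: Sequence[str],
                query: Tuple[float, ...], k: int = 3) -> str:
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy two-attribute households, invented for illustration only.
    x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    y = ["non-recipient"] * 3 + ["recipient"] * 3
    print(knn_predict(x, y, (4, 4), k=3))
```

Comparing several values of k, as the study does, amounts to repeating this prediction over a held-out set for each k and keeping the value with the best scores.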
Drying Characteristics of Cacao Beans using Modified Solar Tunnel Dryer Type Hohenheim
Khathir*, Rita; Kurniawan, Edi; Yunita, Yunita; Syafriandi, Syafriandi
Aceh International Journal of Science and Technology Vol 12, No 3 (2023): December 2023
Publisher : Graduate School of Universitas Syiah Kuala

DOI: 10.13170/aijst.12.3.30246

Abstract

Farmers worldwide dry cacao using open-sun drying systems. Solar dryers can be used to improve cacao drying. The objective of this study was to evaluate the drying characteristics of a modified Hohenheim-type solar tunnel dryer for cacao. As a comparison, sun drying was also carried out. The parameters observed were temperature, relative humidity (RH), weight loss, moisture content, fat content, hardness, and drying rate. Results showed that the average temperature in the Hohenheim dryer was about 10 °C higher than the ambient temperature. However, the dryer's temperature fluctuated with the oscillation of solar irradiation. The drying process took 12 h over 2 days. The humidity in the drying chamber was high, above 50%, indicating that the dryer needed additional fans to improve air circulation. The final moisture content of cacao dried using the Hohenheim dryer and sun drying was 12.7% and 17.4%, respectively. The drying rate of cacao in the Hohenheim dryer was double that of sun drying. Therefore, the dryer can shorten the drying time and protect the cacao from contamination.
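For readers unfamiliar with the quantities reported here, the sketch below shows one common way to compute moisture content and an average drying rate from sample weights. It assumes a wet-basis moisture definition and a simple weight-loss-per-hour rate; the abstract does not state which definitions the authors used, and the function names and sample weights are invented for illustration.

```python
def moisture_content_wet_basis(sample_mass_g: float, dry_mass_g: float) -> float:
    """Moisture content in percent, wet basis: water mass / total sample mass."""
    if sample_mass_g <= 0 or not 0 <= dry_mass_g <= sample_mass_g:
        raise ValueError("masses must satisfy 0 <= dry_mass <= sample_mass, sample_mass > 0")
    return 100.0 * (sample_mass_g - dry_mass_g) / sample_mass_g


def average_drying_rate(initial_mass_g: float, final_mass_g: float, hours: float) -> float:
    """Average weight loss per hour (g/h) over the drying period."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (initial_mass_g - final_mass_g) / hours


if __name__ == "__main__":
    # Hypothetical weights; only the 12 h duration comes from the abstract.
    print(round(moisture_content_wet_basis(100.0, 87.3), 1))
    print(round(average_drying_rate(1000.0, 600.0, 12.0), 2))
```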
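Pertunjukan Teater “Sumpah Satie Bukit Marapalam” Sebagai Media Promosi Pariwisata Puncak Pato Tanah Datar (The Theater Performance “Sumpah Satie Bukit Marapalam” as a Tourism Promotion Medium for Puncak Pato, Tanah Datar)
Eliza, Meria; Syafriandi, Syafriandi; Rachman, Fadlul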
Journal of Tourism Sciences, Technology and Industry Vol 1, No 1 (2022): JTSTI-Journal of Tourism Sciences, Technology and Industry
Publisher : Institut Seni Indonesia Padangpanjang

DOI: 10.26887/jtsti.v1i1.2583

Abstract

Tourism has become one of the most important global economic activities and industries in the world because it contributes substantially to national foreign-exchange earnings. Promotional media and strategies are therefore needed to boost the tourism sector. Theater was chosen as the promotional medium for Puncak Pato tourism in Tanah Datar because of the uniqueness and identity of Puncak Pato as a destination rich in the historical and cultural messages of Minangkabau. Theater, as a new medium derived from the performance formula, becomes a tourism promotion medium with strength in both text and context. This is its advantage as a promotional medium, and it offers a constructive solution for a theater world that tends to struggle to reach the global market. As a medium of tourism promotion, theater presents information and messages, performed and delivered by actors to the public, about the uniqueness of Puncak Pato tourism and the history of the Sumpah Satie Bukit Marapalam agreement behind it. The theater performance Sumpah Satie Bukit Marapalam is staged live at Puncak Pato and streamed live online so that a wide audience can enjoy it, for the purpose of tourism promotion. The script is written on the basis of Aristotelian theory, which describes the dramatic structure of a script in several parts: exposition, inciting action, crisis, climax, and resolution. The foundation of the actor's creation is thus mature thinking about this structure and how the character played contributes to the whole. Stanislavsky calls this the "super-objective".
Keywords: Theater, Promotional Media, Tourism, Puncak Pato
Classification of Dropout Rates in West Sumatra Using the Random Forest Algorithm with Synthetic Minority Oversampling Technique
Anita Fadila; Syafriandi Syafriandi; Yenni Kurniawati; Admi Salma
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/183

Abstract

This study aims to classify school dropout rates in West Sumatra Province using the Random Forest algorithm with the Synthetic Minority Oversampling Technique (SMOTE). Based on 2021 data from the Ministry of Education, Culture, Research, and Technology (Kemdikbudristek), the dropout rate in West Sumatra is above the national average. Despite efforts to reduce dropout rates, results remain suboptimal. Therefore, this study seeks to identify the causes of student dropouts and compare the performance of the Random Forest algorithm with and without SMOTE. The study uses the 2021 dropout data from West Sumatra, which has a significant class imbalance. SMOTE is applied to balance the data. The dataset is split into training and testing sets in an 80%:20% ratio, and parameter tuning is performed to optimize mtry and the number of trees (ntree). The model is evaluated using a confusion matrix to compare performance. The results show that Random Forest with SMOTE outperforms the version without SMOTE, with improvements in precision, recall, and F1-score. The presence of the biological mother is identified as the most significant factor influencing student dropouts, based on the Mean Decrease Gini value. The study concludes that using SMOTE in the Random Forest algorithm helps reduce classification bias and enhances the model's ability to detect students at risk of dropping out.
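The oversampling idea behind SMOTE can be sketched as interpolation between a minority-class point and one of its nearest minority neighbors. The version below is a simplified illustration for numeric features only; the study's data likely include categorical variables, which would need a variant of the technique. The function name smote and its parameters are hypothetical and do not reproduce any particular library's API.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def smote(minority: Sequence[Point], n_new: int, k: int,
          rng: random.Random) -> List[Point]:
    """Create n_new synthetic points by interpolating toward nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbors of the chosen base point.
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbors)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


if __name__ == "__main__":
    minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    for point in smote(minority_points, 3, 2, random.Random(42)):
        print(point)
```

Synthetic points are normally generated from the training portion only, after the 80%:20% split, so that the test set still reflects the original class balance.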
Application of Extreme Learning Machine Algorithm (ELM) in Forecasting Inflation Rate in Indonesia
Yonggi, Yonggi Septa Pramadia; Zamahsary Martha; Syafriandi Syafriandi; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/194

Abstract

One indicator of a country's economic stability is its inflation rate. Inflation is an economic phenomenon in which the prices of goods and services rise generally and continuously. To anticipate the future impact of inflation, forecasting is needed to see how the inflation rate will develop. Extreme Learning Machine (ELM) is a feed-forward artificial neural network (ANN) algorithm with one hidden layer, also called a Single Hidden Layer Feedforward Neural Network (SLFN). In this research, forecasting the inflation rate in Indonesia using the Extreme Learning Machine algorithm gave the best architecture (12,48,1), with a MAPE value of 11%. These results indicate good forecasting because the resulting MAPE is relatively low.
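The MAPE figure quoted above is the mean absolute percentage error between actual and forecast values. A minimal sketch of the standard formula follows; the function name and the sample numbers are illustrative and not taken from the study.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical inflation rates (percent) and forecasts.
    print(round(mape([2.0, 4.0], [2.2, 3.6]), 2))
```

Because the error is divided by the actual value, MAPE becomes unstable when actual inflation is close to zero, which is worth keeping in mind when reading a single summary figure.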
Mixed Geographically Weighted Regression Modeling of Gender Development Index in Indonesia
Nikma Hasanah; Dodi Vionanda; Syafriandi Syafriandi; Tessy Octavia Mukhti
UNP Journal of Statistics and Data Science Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss3/207

Abstract

The Gender Development Index (GDI) is one of the primary measures of gender equality in human development. Indonesia's GDI statistics for 2023 show a development gap between men and women. One approach to closing the gap is to identify the factors influencing GDI using Mixed Geographically Weighted Regression (MGWR), a blend of global regression and Geographically Weighted Regression (GWR) models. The results showed that the MGWR model outperformed the GWR model when the two were compared using the Akaike Information Criterion (AIC). Using the MGWR model with Adaptive Kernel Bisquare weights, population with health complaints and adjusted per capita expenditure were found to be globally influential factors, while female participation in parliament, open unemployment rate, and labor force participation rate were found to be locally influential factors.
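The adaptive bisquare weighting mentioned in this abstract can be sketched as follows. This is one common formulation, in which the bandwidth at each focal location is the distance to its q-th nearest neighbor; the abstract does not give the exact variant used, and the function name, parameters, and coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def adaptive_bisquare_weights(points: Sequence[Tuple[float, float]],
                              focal_index: int, n_neighbors: int) -> List[float]:
    """Bisquare weights of all locations relative to one focal location."""
    if not 1 <= n_neighbors <= len(points) - 1:
        raise ValueError("n_neighbors must be between 1 and len(points) - 1")
    focal = points[focal_index]
    distances = [math.dist(focal, p) for p in points]
    # Adaptive bandwidth: distance to the n_neighbors-th nearest other location.
    others = sorted(d for j, d in enumerate(distances) if j != focal_index)
    bandwidth = others[n_neighbors - 1]
    if bandwidth == 0:
        raise ValueError("bandwidth is zero; locations coincide with the focal point")
    # Weight decays smoothly to zero at the bandwidth and stays zero beyond it.
    return [(1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0
            for d in distances]


if __name__ == "__main__":
    locations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
    print(adaptive_bisquare_weights(locations, 0, 2))
```

In GWR and MGWR these weights enter a weighted least-squares fit at each location, so nearby observations influence the local coefficients more than distant ones.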
Implementation of CART Method with SMOTE for Household Poverty Classification in Mentawai Islands 2023
Dewi Adiningtiyas, Rheizma; Admi Salma; Syafriandi Syafriandi; Fadhilah Fitri
UNP Journal of Statistics and Data Science Vol. 2 No. 4 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss4/232

Abstract

Poverty is a condition in which individuals or groups are unable to fulfill their basic needs due to economic pressure or limited resources. The Classification and Regression Trees (CART) method is a classification technique in the form of a classification tree, which describes the relationship between independent and dependent variables. Data imbalance can lead to low sensitivity and low area under the curve (AUC) values. One method for handling unbalanced data is the Synthetic Minority Oversampling Technique (SMOTE), which adds artificial data to the minority class before the data are analyzed. The purpose of this research is to compare CART models with and without SMOTE. SMOTE is applied to balance the number of observations across the household poverty classes. The accuracy without SMOTE is 89%, while with SMOTE it is 79%. However, the sensitivity increased by 80%, and the AUC of the CART method with SMOTE increased by 31%. This study therefore concludes that CART classification with SMOTE performs better than CART classification without SMOTE.
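The trade-off in this abstract, lower accuracy but higher sensitivity and AUC, is easier to follow with the metric definitions in hand. The sketch below computes sensitivity from predicted labels and AUC from predicted scores using the pairwise-ranking definition. The function names and sample values are hypothetical, and this is not the authors' evaluation code.

```python
from typing import Sequence


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Share of actual positives that the model labels as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if tp + fn else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count winning pairs; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    truth = [1, 1, 0, 0]
    print(sensitivity(truth, [1, 0, 0, 0]))
    print(auc(truth, [0.9, 0.4, 0.5, 0.1]))
```

With a rare poor-household class, a model can reach high accuracy by predicting "non-poor" almost everywhere while its sensitivity stays near zero, which is why the abstract weighs sensitivity and AUC above raw accuracy.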