Articles

Sentiment Analysis about Anti-LGBT Campaign using the Naïve Bayes Classifier
rios; Syafriandi Syafriandi; Dony Permana; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/146

Abstract

As social media grows, news spreads to everyone very quickly. One topic currently discussed on social media is the anti-LGBT campaign. Discussion of the campaign takes the form of opinions that carry positive and negative feelings, and these opinions are shared on Twitter, a microblogging social media site that lets users create short messages and share them easily and quickly. Sentiment analysis of these opinions shows whether they support or reject the anti-LGBT campaign. The algorithm used to perform the sentiment analysis is the Naïve Bayes Classifier. The purpose of this study is to determine the sentiment of anti-LGBT campaign tweets on Twitter. The study uses Python as its tool. The dataset consists of 3103 tweets, split into 80% training data and 20% test data. The results show that Twitter users in Indonesia hold mostly positive opinions. The Naïve Bayes Classifier achieves an accuracy of 68.75%, a precision of 99.6%, and a recall of 92.8%.
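To make the classification step concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is an illustration only: the abstract does not describe the study's preprocessing or implementation, and the function names (`train_nb`, `predict_nb`) and the four English toy sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Count documents per class and words per class from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # Add-one (Laplace) smoothing avoids log(0) for unseen words.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, not taken from the study.
train = [
    ("support the campaign".split(), "positive"),
    ("agree with the campaign".split(), "positive"),
    ("reject the campaign".split(), "negative"),
    ("oppose and reject this".split(), "negative"),
]
model = train_nb(train)
```

Calling `predict_nb(model, ["support", "agree"])` scores each class by its log prior plus the smoothed log likelihood of every token and returns the higher-scoring label.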
Twitter Data Sentimen Analysis 2024 Presidential Candidate Using Algorithm Naïve Bayes Classifier By Methods K-Fold Cross Validation
Aldi Prajela; Syafriandi Syafriandi; Dony Permana; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 1 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss1/149

Abstract

Indonesia implements a democratic system by involving the public in General Elections (Pemilu) for specific political positions. The public actively expresses opinions on social media, especially about the 2024 Presidential Election (Pilpres) and the respective presidential candidates, which have become trending topics on Twitter. The analysis used to turn these tweets into information is sentiment analysis using the Naïve Bayes Classifier (NBC) algorithm with the K-Fold Cross-Validation method. The stages are pre-processing, weighting, labeling, classification using NBC, and testing using a confusion matrix. The NBC classification results show that Anies Baswedan received 80% positive and 20% negative tweets out of 1186 tweets, Prabowo Subianto received 78% positive and 22% negative tweets out of 1149 tweets, and Ganjar Pranowo received 77% positive and 23% negative tweets out of 1075 tweets. Testing was carried out using the NBC algorithm with the K-Fold Cross-Validation method for k = 1 to k = 10. The function of K-Fold Cross-Validation is to maximize the confusion matrix result. It can be concluded that Anies Baswedan has the highest score in iteration 4, with a precision of 90%, a recall of 99%, and an accuracy of 91%. Furthermore, Ganjar Pranowo has the highest score in iteration 9, with a precision of 95%, a recall of 97%, and an accuracy of 92%. Meanwhile, Prabowo Subianto has the highest score in iteration 9, with a precision of 97%, a recall of 99%, and an accuracy of 94%.
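As a rough illustration of how K-fold cross-validation partitions a dataset, the sketch below splits sample indices into k contiguous folds, each used once as the test set. The abstract does not say how the folds were formed or whether the data were shuffled, so the helper `k_fold_indices` and its contiguous, unshuffled folds are assumptions. The sample count of 1186 is the number of Anies Baswedan tweets reported above. The sketch requires k of at least 2, because a single fold would leave no training data.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# 10 folds over 1186 samples; each iteration holds out one fold for testing.
folds = list(k_fold_indices(1186, 10))
```

Each of the ten iterations would train a classifier on `train_idx`, evaluate it on `test_idx`, and record a confusion matrix, which is how per-iteration precision, recall, and accuracy figures like those above can arise.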
Comparison of Modeling Infant Mortality Rate in West Sumatra and West Java Province in 2021 Using Negative Binomial Regression
Afdhal, Afdhal Rezeki; Fadhilah Fitri; Dodi Vionanda; Dony Permana
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/156

Abstract

Poisson regression analysis assumes equidispersion: the variance of the response variable equals its mean. In practice this condition rarely holds, because overdispersion (variance of the response variable greater than its mean) usually occurs. One way to overcome this problem is to use Negative Binomial regression. The aim of this article is to obtain the best Negative Binomial regression model to overcome overdispersion in infant mortality cases in West Sumatra Province and West Java Province. The Negative Binomial regression model yields an AIC of 192.65 for West Sumatra Province, which is smaller than the AIC of 283.47 for West Java Province. Based on the fitted Negative Binomial regression model, the number of health centers (X3) has a significant influence on the infant mortality rate in West Sumatra Province, while the number of medical personnel (X1) has a significant influence on the infant mortality rate in West Java Province.
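The equidispersion assumption and the AIC criterion mentioned above can be sketched in a few lines. This is a rough marginal check only, not the study's procedure: a formal overdispersion test would use the fitted model's residuals. The helper names and the count data are hypothetical.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 under equidispersion."""
    return variance(counts) / mean(counts)


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); smaller is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical count data, not taken from the study.
counts = [2, 0, 5, 1, 12, 3, 0, 9, 1, 7]
ratio = dispersion_ratio(counts)
```

A ratio well above 1, as in this toy data, suggests overdispersion and motivates a Negative Binomial model, whose variance is allowed to exceed the mean.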
Sentiment Analysis Using Support Vector Machine (SVM) of ChatGPT Application Users in Play Store
Muthia Sakhdiah; Admi Salma; Dony Permana; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/158

Abstract

The ChatGPT application is an Artificial Intelligence (AI) technology that responds to conversations in the form of text and voice messages and is accessible via smartphones or computers. ChatGPT provides answers and solutions to the problems it is asked about, and the speed and complexity of its answers add to the application's value. However, it also has negative impacts, one of which is the vulnerability of scientific papers to plagiarism. As a result, many users have posted reviews assessing the application. These reviews can be seen on the Play Store and can serve as a reference before downloading the ChatGPT application. How the community responds can be seen through sentiment analysis, which classifies reviews as positive or negative and makes it easier for companies to evaluate their products. Classification is carried out using a Support Vector Machine (SVM), and the resulting model is used to classify user reviews of the ChatGPT application. The results show an accuracy of 93.9% with a linear kernel, and the sentiment of ChatGPT users is mostly positive.
Classification of Unemployment at West Sumatra Province in 2021 using Algorithm Classification and Regression Tree
Nur Fadillah, Nur; Syafriandi Syafriandi; Nonong Amalita; Dony Permana
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/166

Abstract

Unemployment is a problem that often occurs in developing countries. It is caused by an imbalance between the size of the labor force and the size of the working population. According to the Central Bureau of Statistics, in 2021 West Sumatra Province ranked eighth among provinces with a high open unemployment rate, at 6.52%, which is higher than the Indonesian average open unemployment rate of 6.49%. Unemployment rose from 2017 to 2021, driven by educated unemployment. This is due to the habit of job seekers who tend to pick and choose among the types of jobs available, while business needs are very limited. The unemployment problem will grow if it is not resolved, and unemployment can lead to poverty and other social problems. In this study, CART analysis is used to classify unemployment in West Sumatra Province in 2021, with the aim of determining the factors that affect unemployment. CART is a decision tree that shows the relationship between the response variable and one or more predictor variables. The purpose of CART analysis is to obtain the right data groups for classification. Based on the analysis, the variables that affect unemployment in West Sumatra Province in 2021 are marital status, gender, household status, education level, age, and place of residence, with an accuracy of 71.73%.
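A classification tree such as CART grows by choosing, at each node, the split that most reduces class impurity. The sketch below computes the Gini impurity and the impurity reduction of one candidate split. The abstract does not state which splitting criterion the study used, so Gini (the usual CART default for classification) is an assumption, and the labels and the split are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(parent, left, right):
    """Impurity reduction achieved by splitting parent into left and right."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


# Hypothetical node: U = unemployed, E = employed.
parent = ["U", "U", "U", "E", "E", "E", "E", "E"]
left = ["U", "U", "U", "E"]    # e.g. one category of a predictor
right = ["E", "E", "E", "E"]   # the remaining categories
gain = split_gain(parent, left, right)
```

The tree-building procedure would evaluate `split_gain` for every candidate predictor and threshold, keep the split with the largest gain, and repeat on each child node.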
Classification of Program Keluarga Harapan Recipient Households in Padang Using K-Nearest Neighbors
Yurivo Rianda Saputra; Syafriandi Syafriandi; Dony Permana; Zilrahmi
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/167

Abstract

Program Keluarga Harapan (PKH) is a government social assistance program that provides social protection as part of the central government's efforts to promote social welfare. PKH provides benefits to poor families, especially pregnant women and children, by making use of the various health and education services available. PKH benefits also cover people with disabilities and the elderly by maintaining their level of social welfare in accordance with the Constitution and the Nawacita of the Republic of Indonesia. Because PKH implementation experiences distribution errors, recipients need to be classified to ensure proper distribution. Classification is performed by comparing the number of neighbors (k) in K-Nearest Neighbors (KNN). The Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN) is applied to balance the target classes, and Recursive Feature Elimination Cross Validation (RFECV) is applied to select attributes in the dataset. The data come from the SUSENAS 2023 data for Padang City. The results show that KNN with k = 3 is a good algorithm for classifying households receiving PKH using 10 attributes. KNN with k = 3 achieves an accuracy of 91.12%, a precision of 89.29%, and a recall of 96.77%.
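The core KNN decision rule with k = 3 can be sketched as a majority vote among the three nearest training points. This is a minimal illustration under stated assumptions: it uses Euclidean distance (the abstract does not name the metric), omits the SMOTEENN and RFECV steps, and the two-feature toy data and the function `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature households, not taken from SUSENAS.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_y = ["PKH", "PKH", "PKH", "non-PKH", "non-PKH", "non-PKH"]
```

Comparing values of k, as the study does, would amount to repeating this prediction over a held-out set for each k and keeping the value with the best evaluation metrics.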
Sentiment Analysis of X Application Users on the Conflict between Israel and Palestine Using the Support Vector Machine Algorithm
Carina, Fadhillah Meisya; Admi Salma; Dony Permana; Zamahsary Martha
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/170

Abstract

The conflict between Israel and Palestine is the Middle East's longest-running conflict, dating from 1917 and still ongoing today. It is an international conflict that draws many Arab and Western countries into the dispute. The conflict has divided the world's countries into two camps: those in favor of Palestinian independence and those against it. It has also polarized Indonesians and produced diverse public opinions on the social media application X. The purpose of this research is to classify the sentiment of X users toward the conflict between Israel and Palestine. Sentiment analysis is used to convert text-based public opinion data into information. The algorithm chosen to separate the data classes is the Support Vector Machine, which classifies data by finding the best hyperplane to separate pro-Israel opinions from pro-Palestine opinions. After preprocessing, 1000 tweets were obtained, split into 800 training and 200 testing tweets. The accuracy is 93%, precision is 92.93%, recall is 100%, and F-measure is 96.33%. Of the 200 test tweets, 198 were classified as pro-Palestine and 2 as pro-Israel, which suggests that more individuals support Palestinian independence in the conflict between Israel and Palestine.
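The reported metrics can be related to a confusion matrix. The abstract does not give the matrix itself, so the counts below are an inference: taking pro-Palestine as the positive class, 184 true positives, 14 false positives, 0 false negatives, and 2 true negatives are consistent with the 198 and 2 predicted counts and reproduce the reported accuracy, precision, recall, and F-measure to within rounding.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Inferred counts (an assumption, not reported in the abstract).
accuracy, precision, recall, f1 = classification_metrics(tp=184, fp=14, fn=0, tn=2)
```

With so few pro-Israel tweets in the test set, a recall of 100% for the majority class is easy to achieve, which is why precision and the per-class counts are worth reading alongside accuracy.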
Random Forest Implementation for Air Pollution Standard Index Classification in DKI Jakarta 2022
Hasna, Hanifa; Nonong Amalita; Dony Permana; Admi Salma
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/173

Abstract

Air pollution is a serious challenge in many cities, including DKI Jakarta. Based on measurements of the Air Pollution Standard Index carried out by the DKI Jakarta Environmental Service, the air quality in DKI Jakarta ranges from moderate to unhealthy. Deteriorating air quality in the Jakarta metropolitan area is very dangerous for humans and other living things. To help address the problem, air quality is classified by pollutant content using Random Forest (RF). RF builds many trees, which together can provide better predictions and low error. The study found that the optimal forest uses mtry = 2 (the number of input variables randomly selected at each split node) and ntree = 5000 (the number of trees in the forest). The resulting accuracy is 99.17%, with an OOB error rate of 0.83%. The research identifies particulate pollutants as the main factor causing air pollution in DKI Jakarta. These results show that RF can provide accurate predictions of the air pollution level in DKI Jakarta and can identify the important factors that affect air pollution.
Modeling the Gender Development Index (IPG) of West Java Province Using a Fourier Series Nonparametric Regression Approach
RIZKIA, DHEA PUTRI; Fadhilah Fitri; Dony Permana; Dina Fitria
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/174

Abstract

Gender equality is a development target in many countries. The ideal condition in human development is that male and female population groups have equal access to play a role in development, equal control over development resources, and an equal and fair share of the benefits of development. The gender gap still exists today in all aspects, and its extent can be seen from the Gender Development Index. In the data, the relationship between the Gender Development Index and each independent variable does not follow a particular parametric pattern, and the patterns that do appear tend to repeat. Nonparametric regression is therefore an appropriate solution, and the Fourier series is a nonparametric approach suited to repetitive data. Modeling was performed using 1, 2, and 3 oscillation parameters. Of the three, the best model was obtained with K = 3 oscillation parameters, with a GCV value of 2.8084 and a coefficient of determination of 42.39%.
Comparison of Estimate Method of Moment and Least Trimmed Squares in Models Robust Regression
Tri Wahyuni Nurmulyati; Dony Permana; Nonong Amalita; Martha, Zamahsary
UNP Journal of Statistics and Data Science Vol. 2 No. 2 (2024): UNP Journal of Statistics and Data Science
Publisher : Departemen Statistika Universitas Negeri Padang

DOI: 10.24036/ujsds/vol2-iss2/176

Abstract

The poverty line is the minimum income that a person must earn to be considered to have a decent standard of living in a particular area. In 2022, the poverty line in West Sumatra Province was higher than the poverty line in Indonesia as a whole. An analysis was conducted to identify the factors influencing the poverty line in West Sumatra Province. However, the observational data on the poverty line and its influencing factors contained outliers. Robust regression analysis was therefore performed to handle the outliers, comparing two estimators: MM estimation and LTS estimation. Based on the R² value, the best estimator was found to be MM estimation, with the significant factors being average net wages/salaries, TPT, APM, and AMH. If average net wages/salaries, TPT, APM, and AMH increase, the poverty line in West Sumatra will rise. With an R² of 0.9582, the model explains 95.82% of the variation in the poverty line, while the remaining variation is explained by other factors not included in the model.
Co-Authors 01, Riska Addini, Vidhiya Ade Eriyen Saputri Admi Salma Admi Salma Afdhal, Afdhal Rezeki Afifah Zafirah Ahmad Fauzan Aidillah, Kerin Hagia Alandra, Cindy Resha Aldi Prajela Ali Asmar Andini Diva Luthfiyah april leniati Armiati Arnellis Arnellis Arssita Nur Muharromah Atus Amadi Putra Azma, Meil Sri Dian Bahri Annur Sinaga Bonita Nurul Afifah Carina, Fadhillah Meisya Denny Armelia Dewi Febiyanti Dina Fitria Dina Fitria, Dina Dinul Haq, Asra Dodi Vionanda Dwi Putri Amilia Dwi Ratih Listiani Yusri Dwi Sulistiowati Edwin Musdi Elita Zusti Jamaan Elsa Oktaviani Elvina Catria Emi Suryani Putri Fadhilah Fitri Fadhillah Fitri Fadlan Rafly, Muhammad Fanni Rahma Sari Fauzan Arrahman Febri Ramayanti Fenni Kurnia Mutiya Fishuri, Nufhika Hana Rahma Trifanni Hana Zafirah haniyathul husna Hardi, Afifah Hasna, Hanifa Hefiani Mustika Hasanah Helma Helma Huriati Khaira I Made Arnawa Ibnul farizi, Gilang iin aini fitri Indonesia Irma Surya Anisa Isra Miraltamirus Kamil, Fakhri Kurnia Andrea Diva martha, Ully Martha Media Rosha Meidiani Sandra Meliani Maya Sari Meliani Putri Mohammad Reza febrino Muslimah, Nailul Amani Muthia Sakhdiah Mutiara Amazona Sosiawati nabillah putri Nadya Nadya nazhiroh, hanifah Nilda Yanti Nisa Ulkhairat Asfar Nisa, Farras Luthfyah Nonong Amalita Nur Fadillah, Nur Nurdalia Nurul Afifah Putra, M. 
Farel Rusde rahmad revi fadillah rama novialdi Refenia Usman Refina Rintani Revina Rahmadani Ridha Fajria rios Riry Sriningsih RIZKIA, DHEA PUTRI Ronald Rinaldo roza maylinda Salsabilla Khairani Septrina Kiki Arisandi Siltima Wiska Siregar, Fauzan Al-Hamdani Sofni Fajriani SRI RAHAYU Suherman Suherman Suwanda Risky Syafriandi Syafriandi Syafriandi Tessy Octavia Mukhti Titin Mardianingsih Tri Wahyuni Nurmulyati Vinka Haura Nabilla Wahda Aulia Assara Welgi Okta Irawan Widia Handa Riska Yarman Yarman Yatri Asri Yenni Kurniawati Yerizon Yerizon Yoga Perdana Yuli Andari Wulan Yulia Pertiwi Yulia Utami Putri Yulyanti Harisman Yurivo Rianda Saputra YUSWITA, AULIA Zamahsary Martha Zilrahmi, Zilrahmi