Informatics and Software Engineering (ISE)
Vol. 2 No. 2 (2024): December 2024

Linear Regression Analysis to Predict the Percentage of Smoking in the Population Age 15 Years and Over

Riziq Shihab, Muhammad Alfata
Napiah, Musriatun



Article Info

Publish Date
13 Nov 2024

Abstract

Smoking is a serious public health problem in many countries, including Indonesia, because it can cause diseases such as lung cancer, heart disease, and respiratory disorders. According to data from the Ministry of Health of the Republic of Indonesia, the prevalence of smoking among the population aged 15 years and over remains high. This study uses secondary data from the Central Bureau of Statistics (Badan Pusat Statistik, BPS) recording the percentage of smokers in the population aged 15 years and over, by age group, from 2019 to 2023. A linear regression algorithm was applied to this data in RapidMiner to predict the percentage of smokers in 2024. The analysis showed that, of the 11 age groups, 6 showed an increase in smoking percentage relative to the previous year: 15-19, 20-24, 25-29, 30-34, 55-59, and 60-64. The other 5 age groups showed a decrease: 35-39, 40-44, 45-49, 50-54, and 65+. Evaluation of the prediction model yielded a root mean squared error (RMSE) reported as 0.4 +/- 0.000 in the English abstract and as 0.884 +/- 0.000 in the Indonesian abstract. The authors interpret this RMSE as a low error rate and conclude that the model is reliable for predicting the percentage of smokers by age group in Indonesia.
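The workflow described in the abstract (fit a line to five yearly values per age group, extrapolate to 2024, score with RMSE) can be sketched in plain Python. This is a minimal illustration, not the authors' RapidMiner process: the function names (`fit_line`, `predict`, `rmse`, `forecast`) and the `SAMPLE` values are hypothetical placeholders rather than BPS figures, and the RMSE here is computed on the fitted years themselves, which may differ from the validation scheme used in the study.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Evaluate a fitted (slope, intercept) pair at x."""
    slope, intercept = model
    return slope * x + intercept

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

YEARS = [2019, 2020, 2021, 2022, 2023]

# HYPOTHETICAL placeholder percentages for illustration only.
# These are NOT the BPS figures used in the study.
SAMPLE = {
    "group_a": [10.0, 10.5, 11.0, 11.5, 12.0],
    "group_b": [30.0, 29.0, 29.5, 28.0, 28.5],
}

def forecast(series_by_group, years=YEARS, target_year=2024):
    """Fit one line per age group and extrapolate to target_year."""
    results = {}
    for group, values in series_by_group.items():
        model = fit_line(years, values)
        fitted = [predict(model, year) for year in years]
        prediction = predict(model, target_year)
        results[group] = {
            "prediction": prediction,
            # In-sample RMSE (an assumption; the study may validate differently).
            "rmse": rmse(values, fitted),
            # Direction relative to the last observed year, as in the abstract.
            "direction": "increase" if prediction > values[-1] else "decrease",
        }
    return results

if __name__ == "__main__":
    for group, result in forecast(SAMPLE).items():
        print(group, round(result["prediction"], 2),
              round(result["rmse"], 3), result["direction"])
```

Fitting one line per age group keeps the groups independent, which matches the abstract's per-group report of increases and decreases. With only five points per group, any RMSE estimate is coarse, so it is best read as a rough indicator.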

Copyrights © 2024






Journal Info

Abbrev

ise

Subject

Computer Science & IT

Description

Informatics and Software Engineering is an open-access, peer-reviewed journal that publishes theoretical and empirical research articles, review papers, and case studies on all major Informatics and Software Engineering topics. The journal's mission is to offer a forum for the growing amount ...