CABLE NEWS NETWORK (CNN) ARTICLES CLASSIFICATION USING RANDOM FOREST ALGORITHM WITH HYPERPARAMETER OPTIMIZATION
Saputro, Dewi Retno Sari; Sidiq, Krisna
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 2 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss2pp0847-0854

Abstract

News articles accumulate on the internet quickly and in large volumes, so they need to be grouped into categories for easy access. Classification is one method for grouping news articles, and random forest, which is built on decision trees, is one such classification method. This research applies random forest to classify news articles into six categories: business, entertainment, health, politics, sport, and news. The data are Cable News Network (CNN) articles from 2011 to 2022. The data are text and large in volume, so careful handling is needed to avoid overfitting and underfitting. Random forest suits the data because the algorithm works very well on large amounts of data. However, random forest is difficult to interpret if the combination of parameters is not appropriate for the data processing. Therefore, hyperparameter optimization is needed to find the best combination of parameters for the random forest. This research uses the search cross-validation (SearchCV) method to optimize the random forest hyperparameters by testing the combinations one by one and validating each. The resulting classification of news articles into six categories has an accuracy of 0.81 on training and 0.76 on testing.
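The idea of "testing the combinations one by one and validating those" can be sketched without any library. The code below is a minimal illustration, not the authors' implementation: the abstract names only "SearchCV", so the exhaustive grid, the contiguous k-fold split, and the toy threshold model (a stand-in for a random forest) are all assumptions made for the example.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search_cv(fit, predict, X, y, param_grid, k=3):
    """Try every parameter combination; score each by mean k-fold accuracy."""
    names = sorted(param_grid)
    folds = k_fold_indices(len(X), k)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in range(len(X)) if i not in held_out]
            # Fit on the training folds only, then score on the held-out fold.
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            hits = sum(predict(model, X[i]) == y[i] for i in fold)
            scores.append(hits / len(fold))
        score = mean(scores)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in model (NOT a random forest): the "hyperparameter"
# is a decision threshold, and fitting simply stores it.
def fit_threshold(X_train, y_train, threshold):
    return threshold


def predict_threshold(model, x):
    return int(x >= model)


# Toy data, invented for illustration only.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
best_params, best_score = grid_search_cv(
    fit_threshold, predict_threshold, X, y, {"threshold": [2, 5, 8]}, k=3
)
print(best_params, best_score)
```

With a real classifier, `fit` and `predict` would wrap the model's training and prediction calls, and the grid would hold settings such as the number of trees or the maximum depth.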

BIBLIOMETRIC ANALYSIS OF NEURAL BASIS EXPANSION ANALYSIS FOR INTERPRETABLE TIME SERIES (N-BEATS) FOR RESEARCH TREND MAPPING
Saputro, Dewi Retno Sari; Prasetyo, Heri; Wibowo, Antoni; Khairina, Fadiah; Sidiq, Krisna; Wibowo, Gusti Ngurah Adhi
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 2 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss2pp1103-1112

Abstract

Bibliometrics is the statistical analysis of articles, books, and other forms of publication. Bibliometric analysis uses data on the number and authorship of scientific publications and articles, and on citations, to measure the work of individuals or groups of researchers, organizations, and countries, to identify national and international networks, and to map developments in new multidisciplinary fields of science and technology. In addition, bibliometrics assesses and maps the research, organization, and country of researchers over a given time period. Its advantages include mapping relationships between concepts, mapping research directions or trends, mapping the state of the art (the novelty of research results), and providing insights into fields, topics, and research problems for future work. This study aims to determine the growth and development of N-BEATS publications, their distribution, keyword variables, and author collaboration using a bibliometric network. The research method screens articles obtained from the Scopus database for 2008-2022, uses citation metrics, and visualizes the metadata with VOSviewer. Data was collected from the direct science database with the keyword N-BEATS. The results show that 2022 has the highest number of publications, reaching 310 publications (14.90%). The distribution of research publications on N-BEATS shows a perfect distribution. Terms in the N-BEATS variable often appear and are associated with other variables.

TEXT CLASSIFICATION USING ADAPTIVE BOOSTING ALGORITHM WITH OPTIMIZATION OF PARAMETERS TUNING ON CABLE NEWS NETWORK (CNN) ARTICLES
Saputro, Dewi Retno Sari; Sidiq, Krisna; Rasyid, Harun Al; Sutanto, Sutanto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 18 No 2 (2024): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol18iss2pp1297-1306

Abstract

Advances in communication and information technology have made the exchange of information faster because it is connected to the internet. One platform that provides online news articles is Cable News Network (CNN), which has been broadcasting news on its website since 1995. The number of CNN news articles continues to increase, so the articles are categorized to make it easier for readers to find articles in the category they want. Classification is a technique for determining the class of an object based on its characteristics, where the class labels are known beforehand. One classification algorithm is adaptive boosting (AdaBoost). AdaBoost classifies by building several weighted decision trees (stumps); the predicted class is the one backed by the highest total stump weight. AdaBoost can be combined with parameter tuning to avoid the overfitting or underfitting that results from a weak set of stumps. Therefore, this study implements the AdaBoost algorithm with parameter tuning for CNN news article classification. The data are CNN news articles from 2011 to 2022, sourced from Kaggle. The data are categorized into six classes: business, entertainment, health, news, politics, and sports. This study uses two evaluation metrics, the accuracy value and the confusion matrix, to measure the performance of the AdaBoost algorithm. The accuracy obtained is 0.78763, the precision is 0.91, the recall is 0.85, and the F1 score is 0.88.
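The weighted vote among stumps can be illustrated with a short sketch. It assumes the multi-class SAMME weighting, where a stump with weighted error err among K classes gets weight ln((1 - err) / err) + ln(K - 1); the abstract does not say which AdaBoost variant was used, and the stump predictions and error rates below are invented for illustration.

```python
import math
from collections import defaultdict


def stump_weight(error, n_classes):
    """SAMME weight (an assumed variant): lower error gives a larger weight."""
    return math.log((1.0 - error) / error) + math.log(n_classes - 1)


def weighted_vote(predictions, weights):
    """Return the class whose stumps carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical example: three stumps vote on one article, six classes overall.
predictions = ["sport", "politics", "politics"]
errors = [0.05, 0.40, 0.45]  # invented weighted error rates per stump
weights = [stump_weight(e, n_classes=6) for e in errors]

winner = weighted_vote(predictions, weights)
print(winner)  # "sport": one accurate stump outweighs two weak ones
```

The example shows why the decision depends on total weight, not on a head count of stumps: two stumps predict "politics", but their combined weight is smaller than that of the single low-error stump predicting "sport".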