Implementation of Web Scraping on Google Search Engine for Text Collection Into Structured 2D List
Fahrudin, Tresna Maulana; Riyantoko, Prismahardi Aji; Hindrayani, Kartika Maulida
Telematika Vol 20 No 2 (2023): June 2023 Edition
Publisher : Jurusan Informatika

DOI: 10.31315/telematika.v20i2.9575

Abstract

Purpose: This research proposes an implementation of web scraping on the Google Search Engine to collect text into a structured 2D list.

Design/methodology/approach: Data collection through web scraping proceeds in two stages: an HTML parsing step that extracts links (URLs) from Google Search Engine result pages, and a second HTML parsing step that extracts the body text from the web page behind each collected link.

Findings/result: The input queries were matched to current issues and news in Indonesia, for example prominent figures such as the President, the month of Ramadan and Idul Fitri, the stadium riot tragedy and natural disasters, rising prices of basic commodities, oil, and gold, and other news. The number of links obtained per query ranged from 56 to 151, and the processing time to obtain the links ranged from 1 minute 6.3 seconds for the fastest query to 2 minutes 49.1 seconds for the slowest. The scraped links came from Wikipedia, Detik, Kompas, the Election Supervisory Body (Bawaslu), CNN Indonesia, the General Election Commission (KPU), Pikiran Rakyat, and others.

Originality/value/state of the art: Building on previous research, this study offers an alternative way to collect links and text from web scraping results in the form of a 2D list structure. Lists in the Python programming language can store character sequences as strings, can be accessed by index, and allow text to be manipulated efficiently.
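The two parsing stages described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it uses only the standard-library `html.parser` module, and the names `LinkExtractor`, `BodyTextExtractor`, `build_table`, `RESULTS_HTML`, and `PAGES` are hypothetical. To stay runnable offline, an in-memory dictionary stands in for the network fetch; a real scraper would download each page and would have to respect the search engine's terms of service.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Stage 1: collect absolute http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self.links.append(href)


class BodyTextExtractor(HTMLParser):
    """Stage 2: collect visible text inside <body>, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def extract_body_text(html):
    parser = BodyTextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)


def build_table(results_html, fetch_page):
    """Return a 2D list with one [url, body_text] row per extracted link."""
    return [[url, extract_body_text(fetch_page(url))]
            for url in extract_links(results_html)]


# Hypothetical stand-ins for a results page and the pages it links to.
RESULTS_HTML = (
    '<html><body><a href="https://example.com/a">A</a>'
    '<a href="/local">relative link, ignored</a>'
    '<a href="https://example.com/b">B</a></body></html>'
)
PAGES = {
    "https://example.com/a": "<html><head><title>t</title></head><body>"
                             "<h1>First</h1><script>x=1</script><p>page</p>"
                             "</body></html>",
    "https://example.com/b": "<html><body><p>Second page</p></body></html>",
}

table = build_table(RESULTS_HTML, PAGES.get)
print(table[0][0])  # row 0, column 0: the first collected URL
print(table[1][1])  # row 1, column 1: body text of the second page
```

Each row of `table` pairs a URL with its extracted text, so `table[i][0]` and `table[i][1]` give index-based access of the kind the abstract attributes to Python lists.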
Application of K-Means Clustering for Regency/City Clustering in East Java Based on 2024 Human Development Index Indicators
Emilia, Kholidatus; Rahayu, Ayu Sri; Yuliani, Devina Putri; Prasetya, Dwi Arman; Riyantoko, Prismahardi Aji
Jurnal Aplikasi Sains Data Vol. 1 No. 2 (2025): Journal of Data Science Applications.
Publisher : Program Studi Sains Data UPN "Veteran" Jawa Timur

DOI: 10.33005/jasid.v1i2.21

Abstract

This study applies the K-Means clustering algorithm to group the 38 regencies and cities of East Java Province based on five Human Development Index (HDI) indicators for the year 2024. These indicators include Life Expectancy (UHH), Expected Years of Schooling (HLS), Mean Years of Schooling (RLS), and Real Expenditure Per Capita (PPK). The aim of this research is to uncover hidden patterns and disparities in regional development that can serve as a basis for more targeted, data-driven policy interventions.

The optimal number of clusters was determined using three evaluation methods: the Elbow Method, the Silhouette Score, and the Davies-Bouldin Index. Together, these evaluations identified three distinct clusters. Cluster 0 represents regions with high levels of development across all indicators. Cluster 1 consists of regions with moderate development levels and potential for improvement, while Cluster 2 contains regions with significantly lower values, particularly in the education and income metrics.

In addition to clustering, a correlation analysis was conducted to examine the relationship between HDI and its supporting indicators. The results show that Mean Years of Schooling (RLS) and Real Expenditure Per Capita (PPK) have the strongest positive correlation with HDI across all clusters, which highlights the key role of education and economic well-being in improving human development. The findings emphasize the importance of clustering analysis in shaping equitable, region-specific development strategies.
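The cluster-count selection described in this abstract can be sketched in plain Python. This is an illustration under stated assumptions, not the study's pipeline: the six two-feature points are made-up toy values (not East Java data), the centroids are seeded with a deterministic farthest-first rule chosen here for reproducibility, and only the within-cluster sum of squares (the quantity plotted in an elbow chart) and the mean silhouette score are computed; the Davies-Bouldin Index is omitted for brevity.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def init_centroids(points, k):
    """Farthest-first seeding (an assumption made for determinism)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return [list(c) for c in centroids]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances (elbow-method quantity)."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        def mean_dist(cluster):
            ds = [dist(p, q) for j, q in enumerate(points)
                  if labels[j] == cluster and j != i]
            return sum(ds) / len(ds) if ds else None

        a = mean_dist(labels[i])
        if a is None:
            scores.append(0.0)
            continue
        b = min(mean_dist(c) for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Made-up toy points forming three visible groups.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0),
          (5.2, 4.8), (9.0, 1.0), (9.2, 1.2)]

results = {}
for k in (2, 3, 4):
    labels, centroids = kmeans(points, k)
    results[k] = (inertia(points, labels, centroids),
                  silhouette(points, labels))
    print(k, results[k])

best_k = max(results, key=lambda k: results[k][1])
print("k with the highest silhouette:", best_k)
```

On real indicator data the features would normally be standardized first, since K-Means is distance-based and per-capita expenditure is on a much larger scale than years of schooling.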
A Hybrid Neural Network-Time Series Regression Model for Intermittent Demand Forecasting Data
Amri Muhaimin; Damaliana, Aviolla Terza; Muhammad Nasrudin; Riyantoko, Prismahardi Aji; Nabilah Selayanti; Putri, Shafira Amanda
Journal of Advances in Information and Industrial Technology Vol. 7 No. 2 (2025): Nov
Publisher : LPPM Telkom University Surabaya

DOI: 10.52435/jaiit.v7i2.704

Abstract

Forecasting is a vital tool that supports informed decisions by predicting future events from past data. For forecasts to be accurate, the data must be reliable, complete, and consistent. Intermittent data, however, is a distinct kind of data that is challenging to forecast: it is characterized by long runs of zeros in some periods, and these zero values affect how a forecasting model is fitted. This study aims to tackle that problem with a hybrid approach that integrates a regression model and a neural network into a novel method for forecasting intermittent data. The dataset used in this study comes from Kaggle and contains sales at Walmart supermarkets for a single category. Such sales data produce an intermittent demand pattern because the items are not sold to customers every day. This irregular pattern makes the data difficult to forecast with standard approaches such as the Croston method, exponential smoothing, and ARIMA. To evaluate the performance of the model, three metrics were calculated: mean squared error, root mean squared error, and root mean squared scaled error (RMSSE). The results show that the proposed method outperforms the benchmark models, with an RMSSE of 0.98, the lowest among the models compared. This result is promising for overcoming the challenges posed by irregular data in future forecasting tasks.
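To make the reported metric concrete, here is a minimal Python sketch of the root mean squared scaled error. The abstract does not state which formulation the authors used; the version below is the one popularized by the M5 forecasting competition, which divides the forecast mean squared error by the in-sample mean squared error of a one-step naive forecast. The function name `rmsse` and the toy series are hypothetical.

```python
import math


def rmsse(train, actual, forecast):
    """Root mean squared scaled error (M5-style formulation, assumed here).

    The forecast MSE is scaled by the in-sample MSE of the one-step naive
    forecast, i.e. predicting each training value with the previous one.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(train) < 2:
        raise ValueError("train needs at least two observations")
    scale = sum((train[t] - train[t - 1]) ** 2
                for t in range(1, len(train))) / (len(train) - 1)
    if scale == 0:
        raise ValueError("training series is constant; scale is zero")
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse / scale)


# Made-up intermittent series: mostly zeros with occasional sales.
train = [0, 0, 4, 0, 0, 4, 0, 0, 0]
actual = [0, 4]
forecast = [2, 2]

score = rmsse(train, actual, forecast)
print(round(score, 4))
```

Under this definition, a value below 1 means the forecast errors are smaller than the in-sample errors of the naive one-step forecast, which is one way to read the reported 0.98.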