Journal of Innovation Research and Knowledge
Vol. 1 No. 7: December 2021

OPTIMASI HASIL PENCARIAN PADA WEB SCRAPPING MENGGUNAKAN PEMBOBOTAN KATA TF-IDF
(Optimizing Search Results in Web Scraping Using TF-IDF Term Weighting)

Edy Prayitno (Information Systems Study Program, STMIK AKAKOM Yogyakarta)
Totok Suprawoto (Information Systems Study Program, STMIK AKAKOM Yogyakarta)
Beny Fajar Riyanto (Information Systems Study Program, STMIK AKAKOM Yogyakarta)



Article Info

Publish Date
24 Dec 2021

Abstract

This research is motivated by the large amount of information offered across websites. Because many websites are relevant to any given query, users have to search for the desired information on each site one by one, which lengthens the time required. This research uses web scraping and the TF-IDF method. Web scraping is a technique for extracting information from web pages. For scraping, cURL is needed to fetch the pages and Simple HTML DOM to parse the scraped data. TF-IDF is a term-weighting method used here to search by measuring the similarity between the stored data and the entered keywords, with the expectation of returning information that better matches those keywords. With web scraping, additional data can be added to the system without relying on a web service. Using TF-IDF yields better search results because the search compares the similarity of words between the data in the system and the search keywords.
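The pipeline described in the abstract (extract text from scraped pages, weight terms with TF-IDF, rank pages by similarity to the keywords) can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses cURL and Simple HTML DOM, while this sketch substitutes Python's standard `html.parser` and works on in-memory HTML strings instead of fetched pages. The smoothed IDF formula, the use of cosine similarity, and all names (`TextExtractor`, `rank`, `SAMPLE_PAGES`) are assumptions made for illustration.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment (stand-in for a scraping parser)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text content of an HTML string, text nodes joined by spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vector(tokens, idf):
    """Map each known term to (relative term frequency) * (inverse document frequency)."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(pages, query):
    """Return (score, page_index) pairs, best match first."""
    docs = [tokenize(html_to_text(p)) for p in pages]
    n = len(docs)
    # Document frequency: in how many pages each term occurs.
    df = Counter(t for d in docs for t in set(d))
    # Assumed smoothed IDF variant; keeps every weight positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    doc_vecs = [tfidf_vector(d, idf) for d in docs]
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Hypothetical scraped pages, used only to exercise the sketch.
SAMPLE_PAGES = [
    "<h1>Cheap laptop deals</h1><p>Laptop prices this week.</p>",
    "<h1>Cooking rice</h1><p>How to cook rice at home.</p>",
    "<h1>Phone reviews</h1><p>Latest phone and laptop news.</p>",
]

for score, index in rank(SAMPLE_PAGES, "laptop prices"):
    print(f"page {index}: {score:.3f}")
```

With the query "laptop prices", the first page ranks highest because it contains both keywords, the third page follows because it contains only "laptop", and the second page scores zero because it shares no terms with the query.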

Copyrights © 2021






Journal Info

Abbrev

JIRK

Publisher

Subject

Humanities; Economics, Econometrics & Finance; Education; Health Professions; Law, Crime, Criminology & Criminal Justice; Social Sciences

Description

Journal of Innovation Research and Knowledge is published by Bajang Institute in two formats, print (ISSN: 2798-3471) and online (ISSN: 798-3641), both of which are published every month. The scope of the journal broadly includes: Culture (a ...