Rakhman Halim Satrio
Fakultas Ilmu Komputer, Universitas Brawijaya

Published: 1 document

Klasifikasi Tweets Pada Twitter Menggunakan Metode K-Nearest Neighbour (K-NN) Dengan Pembobotan TF-IDF [Classification of Tweets on Twitter Using the K-Nearest Neighbour (K-NN) Method with TF-IDF Weighting]
Rakhman Halim Satrio; Mochammad Ali Fauzi; Indriati Indriati
Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer, Vol 3 No 8 (2019): August 2019
Publisher: Fakultas Ilmu Komputer (FILKOM), Universitas Brawijaya


Abstract

Twitter is a microblog that is currently favored by many people and has become a very fast channel for spreading information. The information released and circulated through this medium is very free and varied, including news, opinions, questions, criticism, and comments, whether positive or negative. Classification is a text-mining task that groups content based on textual similarity. Classification allows tweets on Twitter to be grouped by category; for example, football, basketball, and chess content are grouped into the sports category. The classification procedure begins with preprocessing, followed by term weighting, and then categorization based on cosine similarity calculations. Preprocessing itself consists of several phases: document cleaning, tokenizing, stopword removal, and stemming. The term-weighting method used in this study is Term Frequency - Inverse Document Frequency (TF-IDF), and K-Nearest Neighbour (K-NN) is used as the classification method. The K-NN method classifies a data point based on training data that has been classified previously. Accuracy testing of the tweet classification with the K-Nearest Neighbour (K-NN) algorithm was carried out on a total of 140 tweets, consisting of 100 training data and 40 testing data, with k values of 1, 3, 5, and 7. The results are: k = 1 gives 75.0% accuracy, k = 3 gives 72.5%, k = 5 gives 62.5%, and k = 7 gives 55.0%.
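To make the described pipeline concrete, below is a minimal sketch of TF-IDF weighting followed by cosine-similarity K-NN classification. The tweets, labels, and k value are illustrative placeholders rather than the paper's 140-tweet dataset, and the preprocessing stage (document cleaning, tokenizing, stopword removal, stemming) is reduced here to scikit-learn's built-in lowercasing and English stopword removal; this is an assumption-laden example, not the authors' implementation.

# Sketch: TF-IDF term weighting + K-NN with cosine distance,
# mirroring the steps named in the abstract. Data and labels
# below are made up for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

tweets = [
    "final match tonight between the two biggest football clubs",
    "the basketball team won the championship game last night",
    "chess players prepare for the national tournament",
    "new phone released with a faster chip and a better camera",
    "the laptop update improves battery life significantly",
    "stock prices fell sharply after the quarterly earnings report",
]
labels = ["sport", "sport", "sport", "technology", "technology", "economy"]

# Term weighting: each tweet becomes a TF-IDF vector.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(tweets)

# K-NN classification using cosine distance; the paper evaluates
# k = 1, 3, 5, and 7, here we use k = 3 as an example.
knn = KNeighborsClassifier(n_neighbors=3, metric="cosine")
knn.fit(X, labels)

new_tweet = vectorizer.transform(
    ["the football team lost the match in the final game"]
)
# Predicts 'sport': the nearest TF-IDF neighbours are the football
# and basketball tweets.
print(knn.predict(new_tweet))

In the paper the same idea is applied to labeled tweets, with accuracy measured on the held-out 40-tweet test set for each value of k.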