Winkie Setyono
Telkom University, Bandung

Articles

POS Tagger Improvisation with the Addition of Foreign Word Labels on Telkom University News
Winkie Setyono; Donni Richasdy; Mahendra Dwifebri Purbolaksono
Building of Informatics, Technology and Science (BITS) Vol 4 No 2 (2022): September 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i2.1983

Abstract

News is a daily source of information for the public. A news article carries a great deal of information and is built from sentence structures. Each language, Indonesian included, has its own sentence structure. Nowadays, however, many media outlets mix Indonesian with foreign languages, producing sentence structures that differ from those of Bahasa Indonesia. To classify the words in such text, Part-of-Speech (POS) tagging is needed; it determines the class of each word in a sentence by learning from a corpus of the language. With the new sentence structures, a POS tagger requires a larger corpus to learn from. Language structure affects the tagging results, and words that do not appear in the corpus can reduce the tagger's accuracy. We sought to improve the research results by adding data, drawn from online media, whose sentence structure differs from that of the Indonesian Language Corpus. About 242 sentences with 7,043 tokens were added to the corpus, focused on the Foreign Word (FW) tag, which totals 3,819 tags. After several tests and scenarios, the POS tagger reached an accuracy of 94.7% using the Hidden Markov Model method, with an F1-score of 78% for the FW tag.
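To make the method named in the abstract concrete, the following is a minimal sketch of a bigram Hidden Markov Model tagger with Viterbi decoding, plus a per-tag F1 computation. It is not the authors' implementation. The toy corpus, the tag set (NN, VB, FW), the add-one smoothing, and the function names `train_hmm`, `viterbi`, and `f1_for_tag` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical; not taken from the paper's data).
TRAIN = [
    [("mahasiswa", "NN"), ("mengikuti", "VB"), ("workshop", "FW")],
    [("dosen", "NN"), ("membuka", "VB"), ("webinar", "FW")],
    [("mahasiswa", "NN"), ("membuka", "VB"), ("startup", "FW")],
]

def train_hmm(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag
    return trans, emit

def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log-probability (smoothing choice is an assumption)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))

def viterbi(words, trans, emit):
    """Return the most probable tag sequence for a non-empty word list."""
    tags = sorted(emit)
    # Vocabulary size plus one slot reserved for unknown words.
    vocab = len({w for c in emit.values() for w in c}) + 1
    n_tags = len(tags)
    # best[i][tag] = (best log-score ending in tag at position i, previous tag)
    best = [{t: (log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0].lower(), vocab), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = log_prob(emit[t], words[i].lower(), vocab)
            column[t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
        best.append(column)
    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

def f1_for_tag(gold, pred, tag):
    """F1-score for a single tag over aligned gold and predicted tag lists."""
    tp = sum(g == tag and p == tag for g, p in zip(gold, pred))
    fp = sum(g != tag and p == tag for g, p in zip(gold, pred))
    fn = sum(g == tag and p != tag for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

trans, emit = train_hmm(TRAIN)
print(viterbi(["dosen", "mengikuti", "webinar"], trans, emit))
```

In this sketch, a word that never occurs in the training corpus receives the same smoothed emission probability under every tag, so its tag is decided by the transition probabilities alone. This illustrates why, as the abstract notes, out-of-corpus words can reduce accuracy and why enlarging the corpus with FW-tagged sentences may help.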