Jurnal Aplikasi Statistika & Komputasi Statistik
Vol 9 No 1 (2017): Journal of Statistical Application and Computational Statistics

Named Entity Recognition on A Collection of Research Titles

Siti Mariyah (The Center of Computational Statistics Study, Institute of Statistics)



Article Info

Publish Date
30 Jun 2017

Abstract

A title gives the reader an overall view of an article and an initial understanding before reading the full content. In technical research papers, the title states essential information. In this study, we aim to develop information extraction techniques that recognize and extract the problem, method, and domain of research contained in a title. We apply supervised learning to 671 research titles in computer science from various online journals and international conference proceedings. We conducted experiments with different schemas to assess the influence of the features and the performance of the algorithms. We examined contextual, syntactic, and bag-of-words feature sets using Naïve Bayes and Maximum Entropy classifiers. The Naïve Bayes classifier trained on the first group of feature sets successfully predicts the category of each token in the title dataset. Its accuracy and F1-score for each class exceed 0.80, because the first group of feature sets considers the location of a token within a sentence, the token and POS tag of several preceding and following tokens, and the rules of a token. The Naïve Bayes classifier trained on the second group of feature sets, by contrast, is better suited to classifying phrase tokens than word tokens.
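
As a rough illustration of the kind of pipeline the abstract describes, the following Python sketch builds contextual features for each token (its position in the title, the token itself, and the tokens and POS tags of its neighbours) and trains a small categorical Naïve Bayes classifier with add-one smoothing. It is not the authors' implementation: the function and class names, the `O` label for tokens outside the three categories, the capitalization flag standing in for a token rule, the window size, and the two training titles with their hand-assigned POS tags are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def token_features(tokens, pos_tags, i, window=1):
    """Contextual features for token i (feature names are hypothetical)."""
    last = len(tokens) - 1
    feats = {
        "token": tokens[i].lower(),
        "pos": pos_tags[i],
        "position": "first" if i == 0 else "last" if i == last else "middle",
        # Assumed stand-in for a "token rule": a simple orthographic flag.
        "is_capitalized": tokens[i][0].isupper(),
    }
    for k in range(1, window + 1):
        for offset in (-k, k):
            j = i + offset
            inside = 0 <= j < len(tokens)
            feats[f"token[{offset:+d}]"] = tokens[j].lower() if inside else "<pad>"
            feats[f"pos[{offset:+d}]"] = pos_tags[j] if inside else "<pad>"
    return feats


class NaiveBayes:
    """Categorical Naive Bayes over feature dicts, with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, feature_dicts, labels):
        self.label_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.values_seen = defaultdict(set)       # feature -> distinct values
        for feats, label in zip(feature_dicts, labels):
            for name, value in feats.items():
                self.value_counts[(label, name)][value] += 1
                self.values_seen[name].add(value)
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)  # log prior
            for name, value in feats.items():
                count = self.value_counts[(label, name)][value]
                vocab = len(self.values_seen[name]) + 1  # +1 slot for unseen values
                score += math.log((count + self.alpha) / (n + self.alpha * vocab))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training titles: (tokens, POS tags, per-token labels).
train = [
    (["Sentiment", "Classification", "using", "Naive", "Bayes", "on", "Twitter", "Data"],
     ["NN", "NN", "VBG", "JJ", "NNP", "IN", "NNP", "NNS"],
     ["PROBLEM", "PROBLEM", "O", "METHOD", "METHOD", "O", "DOMAIN", "DOMAIN"]),
    (["Spam", "Detection", "using", "Decision", "Trees", "on", "Email", "Corpora"],
     ["NN", "NN", "VBG", "NN", "NNS", "IN", "NN", "NNS"],
     ["PROBLEM", "PROBLEM", "O", "METHOD", "METHOD", "O", "DOMAIN", "DOMAIN"]),
]

X, y = [], []
for tokens, tags, labels in train:
    for i, label in enumerate(labels):
        X.append(token_features(tokens, tags, i))
        y.append(label)

model = NaiveBayes().fit(X, y)

# Label every token of an unseen (also made-up) title.
new_tokens = ["Topic", "Detection", "using", "Neural", "Networks", "on", "News", "Articles"]
new_pos = ["NN", "NN", "VBG", "JJ", "NNS", "IN", "NN", "NNS"]
for i, tok in enumerate(new_tokens):
    print(tok, model.predict(token_features(new_tokens, new_pos, i)))
```

With only two training titles the predictions mean little; the point is the shape of the feature dictionary, which makes it easy to switch feature groups on and off when comparing schemas.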

Copyrights © 2017






Journal Info

Abbrev

jurnalasks

Publisher

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management; Mathematics

Description

The editors accept scholarly works or research articles on statistical theory and computational statistics in the fields of economics, social affairs, and population, as well as information technology. The editors reserve the right to edit manuscripts without changing the substantive meaning of the text. The content of Jurnal Aplikasi Statistika dan ...