Siti Mariyah
The Center of Computational Statistics Study, Institute of Statistics

Named Entity Recognition on A Collection of Research Titles
Siti Mariyah
Jurnal Aplikasi Statistika & Komputasi Statistik Vol 9 No 1 (2017): Journal of Statistical Application and Computational Statistics
Publisher : Pusat Penelitian dan Pengabdian Masyarakat Politeknik Statistika STIS

DOI: 10.34123/jurnalasks.v9i1.95

Abstract

A title gives the reader an overall view of an article and an initial understanding before reading the full content. In technical research papers, the title states essential information. In this study, we aim to develop information extraction techniques to recognize and extract the problem, method, and domain of research contained in a title. We apply supervised learning to 671 research titles in computer science from various online journals and international conference proceedings. We conducted several experiments with different schemas to determine the influence of the features and the performance of the algorithms. We examined contextual, syntactic, and bag-of-words feature sets using Naïve Bayes and Maximum Entropy classifiers. The Naïve Bayes classifier trained on the first group of feature sets successfully predicts the category of each token in the title dataset. The accuracy and F1-score for each class exceed 0.80, because the first group of feature sets considers the location of a token within a sentence, the token and POS tag of several tokens before and after it, and the rules of a token. The Naïve Bayes classifier trained on the second group of feature sets, in contrast, is better suited to classifying phrase tokens than word tokens.
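
As a rough illustration of the contextual feature group mentioned in the abstract (token position, plus the token and POS tag of neighbouring tokens), the Python sketch below builds such features for each token of a title and trains a small Naïve Bayes classifier on them. The function and class names, the window size, the label names (PROBLEM, METHOD, O), the toy titles, and the hand-assigned POS tags are all assumptions made for illustration. None of them come from the paper, whose actual feature definitions and implementation may differ.

```python
import math
from collections import Counter, defaultdict


def token_features(tokens, pos_tags, i, window=1):
    """Hypothetical contextual features for token i of a title."""
    last = len(tokens) - 1
    feats = {
        "token": tokens[i].lower(),
        "pos": pos_tags[i],
        "position": "first" if i == 0 else "last" if i == last else "middle",
    }
    for d in range(1, window + 1):
        for step, name in ((-d, "prev"), (d, "next")):
            j = i + step
            inside = 0 <= j < len(tokens)
            feats[f"{name}{d}_token"] = tokens[j].lower() if inside else "<PAD>"
            feats[f"{name}{d}_pos"] = pos_tags[j] if inside else "<PAD>"
    return feats


class CategoricalNaiveBayes:
    """Naive Bayes over categorical features with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, feature_dicts, labels):
        self.label_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.vocab = defaultdict(set)             # feature -> values seen in training
        for feats, label in zip(feature_dicts, labels):
            for name, value in feats.items():
                self.value_counts[(label, name)][value] += 1
                self.vocab[name].add(value)
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in feats.items():
                seen = self.value_counts[(label, name)][value]
                # One extra slot in the denominator reserves mass for unseen values.
                denom = count + self.alpha * (len(self.vocab[name]) + 1)
                score += math.log((seen + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented for this sketch, not from the paper's dataset).
train = [
    (["Named", "Entity", "Recognition", "using", "Naive", "Bayes"],
     ["JJ", "NN", "NN", "VBG", "JJ", "NNP"],
     ["PROBLEM", "PROBLEM", "PROBLEM", "O", "METHOD", "METHOD"]),
    (["Sentiment", "Analysis", "using", "Maximum", "Entropy"],
     ["NN", "NN", "VBG", "JJ", "NN"],
     ["PROBLEM", "PROBLEM", "O", "METHOD", "METHOD"]),
]

X, y = [], []
for tokens, tags, labels in train:
    for i, label in enumerate(labels):
        X.append(token_features(tokens, tags, i))
        y.append(label)

model = CategoricalNaiveBayes().fit(X, y)

# Classify one token of an unseen title.
new_tokens = ["Spam", "Detection", "using", "Decision", "Trees"]
new_tags = ["NN", "NN", "VBG", "NN", "NNS"]
print(model.predict(token_features(new_tokens, new_tags, 2)))
```

With only two toy titles the predictions carry no statistical weight. The point is the shape of the pipeline: each token becomes a dictionary of contextual features, and the classifier assigns one category per token.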