Named Entity Recognition (NER) identifies and classifies the entities that words in a text refer to. The biomedical field produces a large volume of literature, so NER is in high demand in this domain. Because the biomedical domain is very broad, this research focuses only on cell biology documents. It applies a rule-based approach and a Naive Bayes Classifier (NBC) to NER in these documents. Nineteen training documents were processed and annotated manually for Named Entities (NE), yielding 1135 words of training data. Test documents are marked and tagged using a tagger site, after which bigrams and trigrams are searched for. The rule-based process is applied first; if the rules yield no solution, processing continues with feature extraction and the NBC. Using 16 NE classes, 18 rules, and 7 features, three scenarios were tested: rule-based only, NBC only, and a combination of both. With micro-averaging, the highest average precision, recall, and F-measure, 0.85, was obtained by the rule-based scenario. With macro-averaging, the highest recall and F-measure, 0.66 and 0.45 respectively, were obtained by the combination, while the highest precision, 0.39, was obtained by the rule-based scenario.
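The cascade described above (rules first, NBC only when no rule fires) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the two rules, the three features, the class names, and the toy training words are hypothetical and are not the 18 rules, 7 features, or 16 NE classes used in this research.

```python
import math
from collections import Counter, defaultdict

# Hypothetical rules: each returns an NE class, or None when it does not apply.
def rule_organelle(token):
    known = {"nucleus", "mitochondria", "ribosome"}
    return "Organelle" if token.lower() in known else None

def rule_suffix_ase(token):
    return "Enzyme" if token.lower().endswith("ase") else None

RULES = [rule_organelle, rule_suffix_ase]

# Hypothetical features used only when every rule returns None.
def extract_features(token):
    return {
        "is_capitalized": token[:1].isupper(),
        "has_digit": any(c.isdigit() for c in token),
        "suffix3": token[-3:].lower(),
    }

class NaiveBayes:
    """Categorical Naive Bayes with add-one smoothing."""

    def fit(self, samples):
        # samples: list of (feature dict, label) pairs
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.values = defaultdict(set)
        for feats, label in samples:
            self.label_counts[label] += 1
            for name, value in feats.items():
                self.feature_counts[label][(name, value)] += 1
                self.values[name].add(value)
        self.total = sum(self.label_counts.values())
        return self

    def predict(self, feats):
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / self.total)
            for name, value in feats.items():
                num = self.feature_counts[label][(name, value)] + 1
                # the extra +1 reserves probability mass for unseen values
                den = count + len(self.values[name]) + 1
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def classify(token, model):
    """Try each rule in order; fall back to the NBC if none applies."""
    for rule in RULES:
        label = rule(token)
        if label is not None:
            return label, "rule"
    return model.predict(extract_features(token)), "nbc"

# Toy training data (hypothetical words and labels).
TRAIN = [
    ("DNA", "Nucleic_acid"),
    ("RNA", "Nucleic_acid"),
    ("actin", "Protein"),
    ("tubulin", "Protein"),
    ("keratin", "Protein"),
]
model = NaiveBayes().fit([(extract_features(w), y) for w, y in TRAIN])

for word in ["kinase", "nucleus", "myosin"]:
    print(word, classify(word, model))
```

Returning the source of the decision ("rule" or "nbc") alongside the label makes it straightforward to evaluate the three scenarios separately, since the same tokens can be scored with rules only, NBC only, or the combined cascade.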
Copyrights © 2018