This study compares the performance of a spaCy Named Entity Recognition (NER) model and a Bidirectional Encoder Representations from Transformers (BERT) model in identifying the distribution of Bernadya fans based on mentions of Geo-Political Entity (GPE) locations. The dataset was collected from X users' tweets by scraping with Python and was analyzed with both NER models. The spaCy NER model was built from scratch with manual annotation, while the BERT model was built using a transformer-based approach. In the evaluation, the spaCy model achieved a precision of 1.00, a recall of 0.92, and an F1-score of 0.96 on the training data, and a recall of 0.98 and an F1-score of 0.99 on the test data. The BERT model recorded a precision of 1.00, with a recall of 0.95 (training) and 1.00 (testing) and F1-scores of 0.98 and 1.00, respectively. The spaCy model recognized more than two entities well in a single test sentence. However, when applied to the entire dataset, it could not recognize GPE entities consistently. In contrast, the BERT model recognized GPE entities more reliably, identifying four GPE entities as the most frequently mentioned regions: Karanganyar, Indonesia, Mongolia, and Bandung. Therefore, for the dataset used in this study, the BERT model is better at recognizing GPE entities.
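The precision, recall, and F1-score figures above are linked by the standard definition of F1 as the harmonic mean of precision and recall. The following is a minimal sketch of that arithmetic, not the evaluation code used in the study; the function name `f1_score` is an illustrative assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Hypothetical helper for illustration; not taken from the study.
    """
    if precision + recall == 0:
        # Convention: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# spaCy model, training data: precision 1.00, recall 0.92
print(round(f1_score(1.00, 0.92), 2))  # 0.96

# With precision 1.00 and recall 0.98, F1 rounds to 0.99
print(round(f1_score(1.00, 0.98), 2))  # 0.99
```

Because the inputs are themselves rounded to two decimals, a value recomputed this way can differ from a reported F1-score in the last digit.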
Copyrights © 2025