Introduction: Part-of-speech (POS) tagging plays a pivotal role in natural language processing (NLP) tasks such as semantic parsing and machine translation. However, ambiguous and unknown words, together with the limitations of absolute positional encoding in transformers, often reduce tagging accuracy. This study proposes an enhanced POS tagging model that integrates relative positional encoding with a rule-based correction module.

Methods: The model uses a transformer-based architecture with relative positional encoding to better capture token dependencies. Word embeddings, POS tag embeddings, and relative position embeddings are combined and processed through a multi-head attention mechanism. After the transformer's initial classification, a rule-based correction module refines misclassified tokens. The approach was evaluated on the Groningen Meaning Bank (GMB) dataset, comprising over 1.3 million tokens.

Results: The transformer model achieved an accuracy of 98.50% before rule-based corrections. After applying the rule-based module, overall accuracy increased to 99.68%, outperforming a comparable model using absolute positional encoding (98.60%). Additional evaluation metrics, including a precision of 0.92, recall of 0.89, and F1-score of 0.90, further validate the model's effectiveness.

Conclusions: Incorporating relative positional encoding significantly enhances the transformer's contextual understanding and performance in POS tagging. The rule-based correction module further improves classification accuracy, especially for linguistically ambiguous tokens. The proposed hybrid model demonstrates robust performance and adaptability, offering a promising direction for future multilingual POS tagging systems.
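To make the relative-positional-encoding idea in Methods concrete, the sketch below shows one common way such encodings enter a single attention head: a learned embedding for each clipped token offset is added to the attention logits before the softmax. This is a minimal illustration under assumed shapes and names (the paper does not specify its exact formulation), not the authors' implementation.

```python
import numpy as np

def relative_attention(x, W_q, W_k, W_v, rel_emb, max_dist):
    """Single-head attention with additive relative position scores (sketch).

    x:        (seq_len, d) input token representations
    W_q/k/v:  (d, d) projection matrices
    rel_emb:  (2*max_dist + 1, d) learned relative position embeddings,
              one vector per clipped offset j - i in [-max_dist, max_dist]
    """
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    seq_len, d = q.shape

    # Standard scaled dot-product content scores.
    scores = (q @ k.T) / np.sqrt(d)

    # Offset matrix: entry (i, j) indexes the embedding for clip(j - i).
    offsets = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    idx = np.clip(offsets, -max_dist, max_dist) + max_dist

    # Content-to-position term: q_i dotted with the embedding of offset j - i.
    scores += np.einsum("id,ijd->ij", q, rel_emb[idx]) / np.sqrt(d)

    # Row-wise softmax, then weighted sum of values.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because the scores depend only on the offset j - i (not absolute indices), the same learned embedding is reused wherever a given distance occurs, which is the property the abstract credits for better capturing token dependencies.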
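The rule-based correction module described in Methods can be pictured as a post-processing pass over the transformer's predicted tags. The rules below (adverb, infinitival "to", and proper-noun heuristics over Penn Treebank tags) are purely illustrative assumptions for the sketch; the paper's actual rule set is not given in the abstract.

```python
def apply_correction_rules(tokens, tags):
    """Refine predicted POS tags with hand-written rules (illustrative sketch).

    tokens: list of word strings
    tags:   list of predicted Penn Treebank tags, same length as tokens
    """
    corrected = list(tags)
    for i, (tok, tag) in enumerate(zip(tokens, tags)):
        # Rule 1 (assumed): "-ly" tokens mis-tagged as adjectives are
        # usually adverbs.
        if tok.endswith("ly") and tag == "JJ":
            corrected[i] = "RB"
        # Rule 2 (assumed): "to" before a base-form verb is the
        # infinitival marker TO, not a preposition.
        if tok.lower() == "to" and i + 1 < len(tags) and tags[i + 1] == "VB":
            corrected[i] = "TO"
        # Rule 3 (assumed): capitalized non-initial tokens tagged as common
        # nouns are likely proper nouns.
        if i > 0 and tok[:1].isupper() and tag == "NN":
            corrected[i] = "NNP"
    return corrected
```

A correction pass like this is cheap (one linear scan) and only fires on patterns the rules target, which matches the abstract's claim that it helps most on linguistically ambiguous tokens while leaving the bulk of the transformer's predictions untouched.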
Copyright © 2025