The rapid growth of trademark registrations in Indonesia has increased demand for efficient and accurate classification under the internationally recognized Nice Classification. Manual class assignment is time-consuming and prone to human error, which motivates an automated approach. This study investigates Transformer-based language models for identifying trademark classes from product and service descriptions alone. Two models were evaluated: multilingual BERT (mBERT) and the monolingual IndoBERT, both fine-tuned for sequence classification across the 45 Nice classes using 59,948 trademark entries collected from the Directorate General of Intellectual Property (DGIP) database. The methodology comprised data preprocessing, a stratified 80:20 train-test split, and tokenization with a maximum sequence length of 64 tokens. Both models were trained for two epochs with the AdamW optimizer and evaluated using accuracy, precision, recall, F1-score, and per-class (one-vs-all) accuracy. IndoBERT outperformed mBERT, reaching 0.90 on overall accuracy, precision, recall, and F1-score, compared with 0.85 for mBERT. IndoBERT was particularly robust on low-support classes, which suggests that it better captures the domain-specific linguistic nuances of Indonesian trademark descriptions. The findings underscore the potential of monolingual Transformer models for automating trademark classification in national intellectual property systems. Integrating such models can accelerate trademark registration, reduce examiner workload, and improve consistency in class assignment. These results support the deployment of AI in legal and administrative contexts and provide a foundation for future work on multimodal features and explainable AI for comprehensive trademark management.
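The stratified 80:20 split and the per-class (one-vs-all) accuracy mentioned above can be illustrated with a minimal, dependency-free sketch. This is not the study's code: the function names `stratified_split` and `per_class_accuracy`, the seed, and the rounding rule for small classes are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split example indices so each class keeps roughly the same train/test ratio.

    Hypothetical helper: the rounding rule below is an assumption,
    not a detail taken from the study.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        # Assumption: any class with more than one example contributes
        # at least one test example, so rare classes are still evaluated.
        if len(indices) > 1:
            n_test = max(1, n_test)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def per_class_accuracy(y_true, y_pred, classes):
    """One-vs-all accuracy per class: (TP + TN) / N with that class as positive."""
    n = len(y_true)
    scores = {}
    for c in classes:
        # A prediction counts as correct for class c when truth and prediction
        # agree on "is c" versus "is not c".
        agree = sum((t == c) == (p == c) for t, p in zip(y_true, y_pred))
        scores[c] = agree / n
    return scores


if __name__ == "__main__":
    # Toy labels standing in for Nice class numbers.
    labels = [1] * 10 + [2] * 5
    train_idx, test_idx = stratified_split(labels)
    print(len(train_idx), len(test_idx))

    y_true = [1, 1, 2, 3]
    y_pred = [1, 2, 2, 3]
    print(per_class_accuracy(y_true, y_pred, classes=[1, 2, 3]))
```

Note that with 45 classes, one-vs-all accuracy is dominated by true negatives, so it tends to be high even for classes the model handles poorly. Precision, recall, and F1-score per class are usually more informative for low-support classes.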
Copyright © 2025