Bulletin of Electrical Engineering and Informatics
Vol 14, No 3: June 2025

Multiword target-independent transformer-based model for financial sentiment analysis in colloquial Cantonese

Chun Fai Chu, Carlin
So, Raymond
Kan Lam Kwong, Ernest
Chan, Andy



Article Info

Publish Date
01 Jun 2025

Abstract

Tokenization decomposes a multiword instrument name into several tokens, and the transformer attention mechanism handles each token individually, which hinders treating the related tokens as a single entity. The presence of multiple instruments in a single message further exacerbates the problem and results in low predictive performance. This study proposes sequentially tagged, target-independent sentinel tokens that encapsulate multiword instrument aspects for fine-tuning a natural language inference model. The encapsulation not only helped the attention mechanism handle an instrument name as a single entity but also enabled the model to handle unseen instruments effectively. Our empirical analysis was based on 5,178 manually annotated instrument–sentiment pairs drawn from finance discussion board messages, each of which expressed sentiment toward one to four instruments in a single post. The proposed approach consistently outperformed the direct bidirectional encoder representations from transformers (BERT)-based approach in recall, precision, and F1-score on financial commentaries written in colloquial Cantonese. This study demonstrates the potential benefits of target-independent sentinel token encapsulation for natural language inference. The underlying logic of multiword target-independent encapsulation is expected to hold for other languages, including Chinese, Japanese, and Thai.
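The encapsulation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the sentinel format, the hypothesis wording, or the label set, so everything below is an assumption made for illustration: the `[T1]`/`[/T1]` tag shape, the function names `encapsulate` and `nli_pairs`, the English hypothesis template, the three sentiment labels, and the example message are all hypothetical and should not be read as the authors' implementation.

```python
import re
from typing import List, Tuple


def encapsulate(message: str, instruments: List[str]) -> str:
    """Wrap each instrument name in a sequentially numbered sentinel pair.

    The tag shape [T1]...[/T1] is an assumption; the abstract only says the
    sentinels are sequentially tagged and target-independent.
    """
    if not instruments:
        return message
    # The i-th instrument in the list gets sentinel number i (1-based).
    index = {name: i for i, name in enumerate(instruments, start=1)}
    # Try longer names first so a short name never matches inside a longer one.
    ordered = sorted(instruments, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))

    def wrap(match: "re.Match[str]") -> str:
        i = index[match.group(0)]
        return f"[T{i}]{match.group(0)}[/T{i}]"

    # A single pass avoids re-tagging text that has already been wrapped.
    return pattern.sub(wrap, message)


def nli_pairs(
    message: str,
    instruments: List[str],
    labels: Tuple[str, ...] = ("positive", "negative", "neutral"),
) -> List[Tuple[str, str]]:
    """Build (premise, hypothesis) pairs for NLI-style fine-tuning.

    Each hypothesis names only the sentinel, never the instrument itself,
    so the same templates apply to instruments unseen during training.
    The template wording and label set are assumptions.
    """
    premise = encapsulate(message, instruments)
    pairs = []
    for i in range(1, len(instruments) + 1):
        for label in labels:
            pairs.append((premise, f"The sentiment towards [T{i}] is {label}."))
    return pairs


# Made-up example message: "Tencent Holdings will rise, but bearish on HSBC Holdings."
example = "騰訊控股會升，但係匯豐控股睇淡"
names = ["騰訊控股", "匯豐控股"]
print(encapsulate(example, names))
for premise, hypothesis in nli_pairs(example, names):
    print(hypothesis)
```

In a BERT-style pipeline, sentinels like these would typically also be registered with the tokenizer as additional special tokens, with the model's embedding matrix resized to match, so that each tag stays a single token rather than being split into subwords. That step is omitted here to keep the sketch dependency-free.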

Copyrights © 2025






Journal Info

Abbrev

EEI

Subject

Electrical & Electronics Engineering

Description

Bulletin of Electrical Engineering and Informatics (Buletin Teknik Elektro dan Informatika) ISSN: 2089-3191, e-ISSN: 2302-9285 is open to submission from scholars and experts in the wide areas of electrical, electronics, instrumentation, control, telecommunication and computer engineering from the ...