bit-Tech
Vol. 8 No. 3 (2026): bit-Tech - IN PROGRESS

Text Pre-processing Techniques for the Fulfulde Language using NLTK

Raphael Babgai Guidéké (University of Yaounde I)
Olivier Vidémé Bossou (University of Yaounde I)
Missa Habiba (University of Yaounde I)


Article Info

Publish Date
10 Apr 2026

Abstract

The Fulfulde language, spoken by over 60 million people, presents significant challenges for Natural Language Processing (NLP) due to its complex morphology and dialectal diversity. Existing low-resource frameworks and standard rule-based approaches often fail to address these morphosyntactic intricacies adequately, leaving a critical research gap. To bridge this gap, this study introduces the Text Pre-processing Technique for Fulfulde (TPTF). The pipeline adapts conventional NLP techniques (tokenisation, normalisation, lemmatisation, stop-word removal, and POS tagging) to Fulfulde's linguistic structure. The system was evaluated on a compiled corpus of 6,583 sentences from diverse media and literary sources, acknowledging the constraints inherent to such a low-resource dataset. Performance was assessed using ROUGE metrics, which quantify the overlap and fidelity between automatic and reference pre-processing. The proposed technique achieved 99.22% precision, 61.73% recall, and an F-score of 76.11%. These results demonstrate TPTF's superior capacity to handle Fulfulde specificities compared to generic models. The TPTF pipeline provides a robust engineering foundation for downstream NLP tasks. Beyond technical performance, this contribution supports the future development of translation tools, aiding the preservation of Fulfulde's linguistic heritage and enhancing digital information access for its speakers.
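
To make the evaluation figures concrete, the following is a minimal sketch of an overlap-based score in the style of ROUGE. The abstract does not say which ROUGE variant was used, so the unigram (ROUGE-1) form here is an assumption, and the helper names `rouge1` and `f_score` and the token lists are hypothetical placeholders, not the authors' code or data. The sketch also checks that the reported F-score matches the harmonic mean of the reported precision and recall.

```python
from collections import Counter


def f_score(precision, recall):
    """Harmonic mean of precision and recall (the balanced F-score)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def rouge1(candidate, reference):
    """Unigram-overlap precision, recall and F-score for two token lists.

    Assumption: ROUGE-1 style clipped counts; the paper may use another variant.
    """
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    precision = overlap / len(candidate) if candidate else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall, f_score(precision, recall)


# Hypothetical placeholder tokens, not real Fulfulde data.
reference = ["t1", "t2", "t3", "t4", "t5"]  # reference pre-processing output
candidate = ["t1", "t2", "t3"]              # automatic pre-processing output

p, r, f = rouge1(candidate, reference)
print(f"precision={p:.2f} recall={r:.2f} f={f:.2f}")

# Consistency check on the figures reported in the abstract (in percent).
reported_f = round(f_score(99.22, 61.73), 2)
print(reported_f)
```

In the toy example every token the candidate emits appears in the reference, but the candidate misses two reference tokens, so precision is high while recall is lower. The reported figures (99.22% precision, 61.73% recall) follow the same pattern.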

Copyrights © 2026

Journal Info

Abbrev

bt

Subject

Computer Science & IT

Description

The bit-Tech journal was created to accommodate the scientific work of lecturers and students, including both scientific papers and research in the form of literature studies. It is hoped that this journal will increase the knowledge and exchange of scientific ...