International Journal of Electrical and Computer Engineering
Vol 13, No 5: October 2023

Identification of monolingual and code-switch information from English-Kannada code-switch data

Ramesh Chundi (REVA University)
Vishwanath R. Hulipalled (REVA University)
Jay Bharthish Simha (REVA University)



Article Info

Publish Date
01 Oct 2023

Abstract

Code-switching is a very common occurrence in social media communication, predominantly found in multilingual countries like India. Using more than one language in communication is known as code-switching or code-mixing. Some of the important applications of code-switch are machine translation (MT), shallow parsing, dialog systems, and semantic parsing. Identifying code-switch and monolingual information is useful for better communication in online networking websites. In this paper, we performed a character level n-gram approach to identify monolingual and code-switch information from English-Kannada social media data. We paralleled various machine learning techniques such as naïve Bayes (NB), support vector classifier (SVC), logistic regression (LR) and neural network (NN) on English-Kannada code-switch (EKCS) data. From the proposed approach, it is observed that the character level n-gram approach provides 1.8% to 4.1% of improvement in terms of Accuracy and 1.6% to 3.8% of improvement in F1-score. Also observed that SVC and NN techniques are outperformed in terms of accuracy (97.9%) and F1-score (98%) with character level n-gram.

Copyrights © 2023






Journal Info

Abbrev

IJECE

Publisher

Subject

Computer Science & IT Electrical & Electronics Engineering

Description

International Journal of Electrical and Computer Engineering (IJECE, ISSN: 2088-8708, a SCOPUS indexed Journal, SNIP: 1.001; SJR: 0.296; CiteScore: 0.99; SJR & CiteScore Q2 on both of the Electrical & Electronics Engineering, and Computer Science) is the official publication of the Institute of ...