International Journal of Quantitative Research and Modeling
Vol 5, No 3 (2024)

Semantic Classification of Sentences Using SMOTE and BiLSTM

Tanjung, Irvan (Unknown)
Ilyas, Rid (Unknown)
Melina, Melina (Unknown)



Article Info

Publish Date
01 Sep 2024

Abstract

A paraphrase is a sentence re-expressed with a different word arrangement without changing its meaning (semantics). Determining the semantic proximity of a pair of citation sentences that paraphrase each other requires a computational model. Classification often runs into the class imbalance problem, in which the data are unevenly distributed across classes: some classes have fewer samples (minority classes) and others have more (majority classes). Imbalanced real-world data can degrade the performance of classification methods. One way to deal with this is SMOTE, an over-sampling method that generates synthetic data from the minority class until it has as many samples as the majority class. This study applied SMOTE to the classification of semantic proximity between citation sentence pairs, used Word2Vec to convert words into vectors, and used a BiLSTM model for the learning process. The research was conducted through 8 scenarios that differed in the data used, the choice of learning model, and whether SMOTE was applied. The results showed that the scenarios using data from previous research together with the BiLSTM model and SMOTE gave the best accuracy and performance.
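The over-sampling idea described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names `smote` and `balance`, the neighbour count `k=5`, and the toy two-dimensional data are assumptions made for illustration, and in the study the inputs would be vectors derived from Word2Vec rather than hand-written points. The sketch follows the usual SMOTE formulation, in which each synthetic point is interpolated between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(features, labels, k=5, seed=0):
    """Over-sample the smaller of two classes until both have equal size."""
    counts = {c: labels.count(c) for c in set(labels)}
    minority_label = min(counts, key=counts.get)
    majority_label = max(counts, key=counts.get)
    minority = [x for x, y in zip(features, labels) if y == minority_label]
    n_new = counts[majority_label] - counts[minority_label]
    new_points = smote(minority, n_new, k=k, seed=seed)
    return features + new_points, labels + [minority_label] * n_new


# Toy data (hypothetical): six majority points (label 0), two minority (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.3], [0.2, 0.2],
     [1.0, 1.0], [1.2, 0.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y)
print(y_bal.count(0), y_bal.count(1))  # prints: 6 6
```

In a setup like the one the abstract describes, over-sampling of this kind would normally be applied to the training split only, so that synthetic samples do not leak into the evaluation data.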

Copyrights © 2024






Journal Info

Abbrev

ijqrm

Publisher

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management; Engineering; Environmental Science; Physics

Description

International Journal of Quantitative Research and Modeling (IJQRM) is published 4 times a year and is the flagship journal of the Research Collaboration Community (RCC). IJQRM aims to present papers that cover the theory, practice, history or methodology of Quantitative Research (QR) ...