A paraphrase is a sentence re-expressed with different wording or word order without changing its meaning (semantics). Measuring the semantic proximity of a pair of citation sentences that paraphrase each other requires a computational model. Classification often runs into the class imbalance problem, in which the data are unevenly distributed across classes: some classes have fewer samples (minority classes) and others have more (majority classes). Imbalanced real-world data can degrade the performance of classification methods. One way to deal with this is SMOTE, an over-sampling method that generates synthetic data from the existing minority-class data until the minority class has as many samples as the majority class. This study applied SMOTE to the classification of semantic proximity between citation sentence pairs, used Word2Vec to convert words into vectors, and used a BiLSTM model for learning. The research was conducted through 8 scenarios that differed in the data used, the choice of learning model, and whether SMOTE was applied. The results showed that scenarios using data from previous research with the BiLSTM model and SMOTE gave the best accuracy and performance.
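
The SMOTE over-sampling step described above can be illustrated with a minimal sketch. It is not the study's implementation: the function names `smote` and `balance`, the neighbour count `k`, and the toy 2-D feature vectors are assumptions made for illustration only. The sketch follows the usual formulation of SMOTE, in which each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper: a bare-bones version of SMOTE for illustration.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Pick one of the k nearest neighbours (fewer if the class is tiny).
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Over-sample every smaller class up to the size of the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if count < target:
            minority = [x for x, lab in zip(X, y) if lab == label]
            new_points = smote(minority, target - count, k=k, seed=seed)
            X_out.extend(new_points)
            y_out.extend([label] * len(new_points))
    return X_out, y_out


# Toy data (assumed): six majority-class vectors and two minority-class vectors.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.2, 0.4], [0.4, 0.2], [1.0, 1.0], [1.2, 0.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X, y)
print(Counter(y_bal))  # both classes now have the same number of samples
```

In a pipeline like the one the abstract describes, the rows of `X` would be the vector representations of the citation sentence pairs rather than toy points, and over-sampling would normally be applied to the training split only, so that the test data stay untouched.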