This study develops and optimizes a Long Short-Term Memory (LSTM) model to reduce overfitting in text classification on the Kaggle IMDB movie review dataset. Overfitting is a common problem in machine learning in which a model fits the training data too closely, degrading its performance on unseen test data. To improve the LSTM model's generalization, this study applies several optimization techniques, including regularization, dropout, and careful training procedures. The results show that these overfitting-reduction techniques, in particular dropout and the RMSProp optimizer, significantly improve LSTM performance on IMDB review classification. The optimized LSTM model achieves an accuracy of 83.45%, an improvement of 2.07 percentage points over the standard model's 81.38%. Its precision rises to 89.65% from 84.46% in the standard model, although its recall is slightly lower (75.69% versus 76.91%). The optimized model's F1-score is also higher, at 82.07% versus 80.53%. These experimental results show that the techniques improve the accuracy and reliability of the text classification model, with better performance on the test data. This research contributes to understanding and mitigating overfitting in deep learning models in the context of natural language processing, and offers insights into best practices for applying LSTM models to text classification.
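To illustrate the combination of techniques described above (dropout, regularization, the RMSProp optimizer, and careful training), the following is a minimal Keras sketch of such a configuration. The IMDB loader is the one Keras provides; the vocabulary size, layer widths, dropout rates, L2 coefficient, and training schedule are illustrative assumptions, not the paper's exact settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

VOCAB_SIZE = 10_000  # assumed vocabulary cutoff
MAX_LEN = 200        # assumed review length after padding

# Keras ships the IMDB review dataset used in the study.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(
    num_words=VOCAB_SIZE)
x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=MAX_LEN)
x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=MAX_LEN)

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),
    # Dropout on inputs and recurrent connections regularizes the LSTM.
    layers.LSTM(64, dropout=0.5, recurrent_dropout=0.5,
                kernel_regularizer=regularizers.l2(1e-4)),  # weight regularization
    layers.Dropout(0.5),  # dropout on the dense head
    layers.Dense(1, activation="sigmoid"),
])

# RMSProp optimizer, reported to improve generalization in this study.
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Early stopping is one "careful training" safeguard against overfitting.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True)

model.fit(x_train, y_train, validation_split=0.2,
          epochs=10, batch_size=128, callbacks=[early_stop])
model.evaluate(x_test, y_test)
```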