Dataset optimization is an important step in improving the performance of machine learning models. This review discusses the use of Random Forest, Principal Component Analysis (PCA), and feature selection algorithms to optimize datasets. The review finds that combining Random Forest, PCA, and feature selection is effective in improving model performance: the combination can help reduce overfitting, improve prediction accuracy, and shorten model training time. In our experiments on the Amazon Reviews dataset, the optimized approach achieved an accuracy of 91%, a significant improvement over baseline models.
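One way such a combination could be wired together is sketched below with scikit-learn. This is an illustration under stated assumptions, not the configuration evaluated in this review: the ordering of the steps (univariate feature selection, then scaling, then PCA, then a Random Forest classifier), the hyperparameter values, the helper name `build_pipeline`, and the synthetic data from `make_classification` are all hypothetical stand-ins, and the synthetic data does not represent the Amazon Reviews dataset.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(n_selected=20, n_components=10, seed=0):
    """Hypothetical helper: feature selection -> scaling -> PCA -> Random Forest.

    The step order and default values are assumptions made for illustration.
    """
    return Pipeline([
        # Keep the n_selected features with the highest ANOVA F-score.
        ("select", SelectKBest(score_func=f_classif, k=n_selected)),
        # Standardize so that PCA is not dominated by large-scale features.
        ("scale", StandardScaler()),
        # Project the selected features onto n_components principal components.
        ("pca", PCA(n_components=n_components, random_state=seed)),
        # Classify in the reduced space.
        ("forest", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])


# Synthetic stand-in data (assumption): 500 samples, 50 features, 10 informative.
X, y = make_classification(
    n_samples=500, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = build_pipeline()
model.fit(X_train, y_train)  # every step is fitted on the training split only
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Wrapping the steps in a single `Pipeline` means the feature selector, scaler, and PCA projection are fitted only on the training split, which avoids leaking information from the held-out data into the preprocessing.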
Copyright © 2025