Microblogging has become a very popular information medium among internet users, which makes it a rich source of opinions and reviews, especially movie reviews. We propose sentiment analysis of movie reviews using ensemble features and Bag of Words features, with Pearson's correlation feature selection applied to reduce the feature dimensionality and obtain the optimal feature combination, and thereby improve classification performance. Classification uses several Naive Bayes models: Bernoulli Naive Bayes for binary data, Gaussian Naive Bayes for continuous data, and Multinomial Naive Bayes for numeric data. With non-standard words left in the tweets, the evaluation yields 82% accuracy, 86% precision, 79.62% recall, and 82.69% F-measure using 20% feature selection. After manual word standardization, accuracy rises by 8 percentage points to 90%, with 92% precision, 88.46% recall, and 90.19% F-measure using 85% feature selection. These results indicate that word standardization improves classification performance, and that Pearson's correlation feature selection provides an optimal feature combination while reducing the total number of feature dimensions.
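The feature selection step and the reported F-measure can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`pearson`, `select_features`, `f_measure`) and the toy data are hypothetical, the selection percentage is assumed to mean the fraction of features kept, and the Naive Bayes classifiers are omitted. Features are ranked by the absolute Pearson correlation between each feature column and the sentiment label, and the F-measure is taken as the harmonic mean of precision and recall.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant column carries no signal, so score it 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, fraction):
    """Indices of the top `fraction` of features, ranked by |r| with the label."""
    n_features = len(rows[0])
    scores = [
        abs(pearson([row[j] for row in rows], labels))
        for j in range(n_features)
    ]
    keep = max(1, round(fraction * n_features))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy binary Bag of Words matrix (hypothetical): 4 tweets, 5 word features.
rows = [
    [1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 1, 1],
]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

selected = select_features(rows, labels, 0.4)  # keep 40% of the features
```

On this toy matrix only columns 0 and 3 are correlated with the label, so they are the two retained. Applying `f_measure` to the reported precision and recall pairs (86% and 79.62%; 92% and 88.46%) reproduces the reported F-measures of 82.69% and 90.19% up to rounding.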
Copyright © 2018