This study investigates the accuracy of artificial intelligence (AI) scoring of essay answers, the challenges of validity and fairness in automated scoring, the potential biases in the underlying algorithms, and how these factors affect academic assessment results compared to human scoring.

The method consists of several systematic steps. First, a set of relevant questions was compiled. These questions were then submitted to the AI platforms Copilot, Gemini, and Blackbox AI. The resulting answers were analyzed using machine learning algorithms and natural language processing (NLP) to ensure the quality and depth of the analysis, and were then tested to evaluate their accuracy and relevance to the context of each question. Finally, the analysis results were visualized in the form of graphs and tables.

The results show that AI has great potential for the automated assessment of essay answers. Copilot achieved the highest accuracy score at 63.4%, followed by Blackbox AI at 56.9% and Gemini at 50.5%. However, the answer accuracy of every platform remained below 70%, meaning that the platforms' answers still fall short of expectations. A recommendation for future research is to retest the AI platforms using other methods in order to obtain accurate and consistent results.
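The study does not disclose which machine learning or NLP technique was used to score the answers. As a purely illustrative sketch, the Python code below scores hypothetical platform answers against a reference answer using TF-IDF cosine similarity; the reference answer, the sample responses, and the similarity metric itself are assumptions, not details taken from the study.

```python
# Illustrative sketch only: scores each platform's answer against a
# human reference answer using TF-IDF cosine similarity. The metric,
# the reference answer, and the sample responses are assumptions;
# the study does not disclose its actual scoring algorithm.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical data for a single essay question.
reference_answer = (
    "Photosynthesis converts light energy into chemical energy "
    "stored in glucose."
)
platform_answers = {
    "Copilot": "Photosynthesis turns light energy into chemical energy in plants.",
    "Gemini": "Plants make their own food using sunlight.",
    "Blackbox AI": "It is the process by which plants convert sunlight into energy.",
}

# Fit one shared vocabulary over the reference and all answers so the
# resulting vectors are directly comparable.
texts = [reference_answer] + list(platform_answers.values())
tfidf = TfidfVectorizer().fit_transform(texts)

# Cosine similarity of each answer to the reference, reported as a
# percentage score.
for i, name in enumerate(platform_answers, start=1):
    score = cosine_similarity(tfidf[0:1], tfidf[i : i + 1])[0, 0] * 100
    print(f"{name}: {score:.1f}%")
```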
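For the visualization step, the per-platform scores reported above can be rendered as a simple bar chart. The sketch below uses matplotlib (an assumed tool choice; the study does not name its plotting software) with the accuracy values and the 70% threshold cited in the results.

```python
# Sketch of the visualization step: the per-platform accuracy scores
# reported in this study, drawn as a bar chart against the 70%
# threshold mentioned in the text. Matplotlib is an assumed tool choice.
import matplotlib.pyplot as plt

platforms = ["Copilot", "Gemini", "Blackbox AI"]
scores = [63.4, 50.5, 56.9]  # accuracy percentages from the results

plt.bar(platforms, scores)
plt.axhline(70, linestyle="--", color="gray", label="70% expectation threshold")
plt.ylabel("Answer accuracy (%)")
plt.title("Essay-scoring accuracy by AI platform")
plt.legend()
plt.tight_layout()
plt.show()
```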