The emergence of large language models such as ChatGPT has created unprecedented opportunities for automating software development, particularly in the machine learning domain. This study empirically evaluates how effectively ChatGPT generates machine learning code for image classification tasks using the Keras framework. The research follows an experimental methodology based on the MNIST dataset, which comprises 70,000 handwritten digit images. A systematic series of experiments applied progressive prompting strategies, ranging from basic model construction to comprehensive evaluation protocols. All of the code generated by ChatGPT executed without errors, and the resulting models achieved 99% accuracy on the test dataset. A notable finding was an "intelligent deviation" phenomenon, in which ChatGPT autonomously provided Convolutional Neural Network (CNN) architectures despite explicit requests for fully connected layers, demonstrating sophisticated contextual understanding. The generated code met professional quality standards and integrated multiple libraries robustly. This research provides the first systematic empirical contribution on large language model capabilities in machine learning code generation, with significant implications for democratizing access to artificial intelligence technology in educational and research contexts.
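To make the contrast behind the "intelligent deviation" finding concrete, the following is a minimal sketch of the two kinds of Keras model involved: a fully connected classifier and a small CNN for 28x28 MNIST digits. It is not the code produced in the study. The function names (`build_dense_model`, `build_cnn_model`, `train_and_evaluate`), the layer sizes, the optimizer, and the number of epochs are all illustrative assumptions.

```python
from tensorflow import keras


def build_dense_model():
    """Fully connected classifier (hypothetical layer sizes)."""
    return keras.Sequential([
        keras.Input(shape=(28, 28)),
        keras.layers.Flatten(),                        # 28*28 -> 784 features
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),  # one output per digit
    ])


def build_cnn_model():
    """Small convolutional classifier (hypothetical layer sizes)."""
    return keras.Sequential([
        keras.Input(shape=(28, 28)),
        keras.layers.Reshape((28, 28, 1)),             # add a channel axis
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(2),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax"),
    ])


def train_and_evaluate(model, epochs=1):
    """Train on MNIST and return test accuracy (downloads the dataset)."""
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    # Scale pixel values from [0, 255] to [0, 1].
    x_train = x_train.astype("float32") / 255.0
    x_test = x_test.astype("float32") / 255.0
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # labels are integers 0..9
        metrics=["accuracy"],
    )
    model.fit(x_train, y_train, epochs=epochs, batch_size=128, verbose=0)
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    return accuracy
```

Calling `train_and_evaluate(build_cnn_model())` or `train_and_evaluate(build_dense_model())` downloads MNIST on first use and returns the accuracy on the 10,000-image test split. Both builders accept the same input shape, so the two architectures can be compared under identical preprocessing.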
Copyright © 2025