This study compares the performance of Convolutional Neural Networks (CNN) and Vision Transformers (ViT) in classifying brain tumors from Magnetic Resonance Imaging (MRI) data. The MRI images are grouped into four categories: normal, glioma, meningioma, and pituitary, representing healthy brain tissue and three common types of brain tumor. Before classification, the data were preprocessed to improve image quality and consistency, including resizing the images and normalizing intensity values. The dataset was then divided into training and testing sets using three ratios: 70:30, 80:20, and 90:10, allowing the models to be evaluated under varying amounts of training data.

The CNN and ViT models extract features from the medical images in different ways. The CNN uses convolutional and pooling layers to capture local spatial features, making it well suited to identifying texture and structural patterns in MRI images. In contrast, the ViT applies a self-attention mechanism that lets the model learn global relationships across the entire image.

To make the system more user-friendly, a graphical user interface (GUI) based on Tkinter was developed. The interface allows users to select datasets, train the models, and interactively view evaluation results such as graphs and confusion matrices.

Model performance was assessed using accuracy, precision, recall, and F1-score. The results show that the CNN consistently achieved higher and more stable performance across all data splits: at the 90:10 ratio it reached an accuracy of 95%, while the ViT achieved 88%, and similar trends were observed at the 80:20 and 70:30 ratios.
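The preprocessing and splitting steps described above can be sketched as follows. This is a minimal, illustrative NumPy version, not the study's actual pipeline: the function names, the 224x224 target size, and the nearest-neighbour resize are assumptions for illustration (a real pipeline would typically use an image library for resizing).

```python
import numpy as np

def resize_nearest(img, size=(224, 224)):
    """Illustrative nearest-neighbour resize of a 2-D grayscale image
    (target size is an assumption; real pipelines use an image library)."""
    h, w = img.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows[:, None], cols]

def normalize(img):
    """Scale pixel intensities into the [0, 1] range (min-max normalization)."""
    img = img.astype(np.float32)
    return (img - img.min()) / (img.max() - img.min() + 1e-8)

def train_test_split(n, test_fraction, seed=0):
    """Shuffle n sample indices and split them, e.g. test_fraction=0.3
    for a 70:30 ratio, 0.2 for 80:20, 0.1 for 90:10."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test = int(n * test_fraction)
    return idx[n_test:], idx[:n_test]
```

For example, `train_test_split(1000, 0.1)` yields 900 training and 100 test indices, corresponding to the 90:10 ratio evaluated in the study.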
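The architectural contrast between the two models, local convolution versus global self-attention, can be illustrated with a small NumPy sketch. This is not the study's implementation: the convolution is the single-channel cross-correlation used in deep learning, and the self-attention uses identity query/key/value projections purely to keep the example short.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (cross-correlation, as in CNNs): each output
    value depends only on a small local neighbourhood, which is why CNNs
    excel at local texture and structural patterns."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def self_attention(patches):
    """Single-head self-attention over patch embeddings of shape (n, d):
    every patch attends to every other patch, giving a global receptive
    field in one step (identity Q/K/V projections for brevity)."""
    d = patches.shape[1]
    scores = patches @ patches.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over each row
    return weights @ patches, weights
```

Each row of `weights` is a distribution over all patches, so every output embedding mixes information from the whole image, whereas each `conv2d` output pixel sees only one kernel-sized window.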
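The evaluation metrics named above (accuracy, precision, recall, F1) can all be derived from a confusion matrix. A minimal sketch for the four-class setting, using macro-averaging over classes (the averaging scheme is an assumption, as the study does not state it):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=4):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def macro_metrics(cm):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)   # per-class precision
    recall = tp / np.maximum(cm.sum(axis=1), 1)      # per-class recall
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-8)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()
```

With labels 0-3 standing for normal, glioma, meningioma, and pituitary, the diagonal of the matrix counts correct predictions per class, and the off-diagonal cells show which tumor types are confused with each other.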