The increasing volume of household waste requires an accurate and efficient automatic waste sorting system. This study applies the Vision Transformer (ViT) to image-based household waste classification. The dataset was divided into training and validation sets and preprocessed to match the input requirements of the Vision Transformer architecture. The ViT-Base Patch16-224 model was trained using the AdamW optimizer with a learning rate of 0.0002, a batch size of 16, and 15 training epochs. Model performance was evaluated using accuracy, precision, recall, F1-score, and a confusion matrix. Experimental results show that the proposed model achieved an overall accuracy of 95%. The inorganic class obtained a precision of 0.9, a recall of 0.96, and an F1-score of 0.95, while the organic class achieved a precision of 0.94, a recall of 0.93, and an F1-score of 0.94. These results indicate that the self-attention mechanism in the Vision Transformer effectively extracts global visual features and improves classification stability. Therefore, the Vision Transformer demonstrates strong potential for implementation in intelligent automatic waste sorting systems.
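As a minimal sketch of how the reported evaluation metrics relate to a confusion matrix, the following Python snippet derives accuracy and per-class precision, recall, and F1-score from raw counts. The function names and the counts in `example_cm` are hypothetical and chosen only for illustration; they are not the study's data or its evaluation code.

```python
from typing import Dict, List


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def per_class_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-class precision, recall, and F1-score.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    results = {}
    n = len(labels)
    for k, name in enumerate(labels):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[name] = {"precision": precision, "recall": recall, "f1": f1}
    return results


# Hypothetical counts for illustration only (rows: true class, columns: predicted class).
example_labels = ["inorganic", "organic"]
example_cm = [
    [45, 5],   # true inorganic
    [10, 40],  # true organic
]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(example_cm):.3f}")
    for label, m in per_class_metrics(example_cm, example_labels).items():
        print(label, {key: round(value, 3) for key, value in m.items()})
```

Because F1-score is the harmonic mean of precision and recall, it always lies between the two values for a given class, which offers a quick consistency check on reported per-class figures.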
Copyrights © 2026