The growth of fashion e-commerce has led to information overload for users. Traditional collaborative filtering-based recommendation systems often suffer from cold-start problems. This study develops a content-based fashion recommendation system that integrates visual and textual features without relying on historical user data. The proposed hybrid approach combines visual feature extraction using a convolutional neural network (VGG16) with textual feature extraction using Term Frequency-Inverse Document Frequency (TF-IDF). The Fashion Product Images dataset was reduced from 44,400 to 5,000 samples via stratified sampling for computational efficiency. Experimental results show that the hybrid system with 60% visual and 40% textual weights achieved the best performance: Precision@5 of 78%, Recall@5 of 65%, and Accuracy@5 of 88%. The system's response time of 0.82 seconds meets real-time application criteria. The dataset reduction decreased accuracy by only 0.4% relative to the full dataset, while cutting training time by 82% and memory usage by 75%. These findings demonstrate that multimodal integration in content-based systems can produce relevant, personalized, and computationally efficient fashion recommendations.
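The following is a minimal sketch of the late-fusion scoring the abstract describes, not the authors' exact pipeline: per-modality cosine similarities (VGG16 embeddings for images, TF-IDF vectors for descriptions) are combined with the reported 0.6/0.4 weights to rank the top-5 items. The catalogue, feature shapes, and function names are illustrative assumptions; the random array stands in for real VGG16 fc2 features so the snippet runs on its own.

```python
"""Sketch of weighted visual+textual fusion for content-based recommendation.

Assumed setup (not from the paper's code): visual features would come from
VGG16's 4096-d fc2 layer, e.g. with Keras:
    base = VGG16(weights="imagenet", include_top=True)
    extractor = Model(base.input, base.get_layer("fc2").output)
Here a random matrix stands in so the example is self-contained.
"""
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy catalogue; in the study, descriptions come from the
# Fashion Product Images metadata.
descriptions = [
    "blue denim jacket men casual",
    "blue denim jeans men slim fit",
    "red cotton t-shirt women summer",
    "black leather handbag women",
    "white sneakers men sports shoes",
    "red floral dress women summer",
]

# Textual modality: TF-IDF vectors over product descriptions.
tfidf = TfidfVectorizer().fit_transform(descriptions)

# Visual modality: placeholder for 4096-d VGG16 fc2 embeddings.
rng = np.random.default_rng(0)
visual = rng.normal(size=(len(descriptions), 4096))

def recommend(query_idx, w_visual=0.6, w_text=0.4, k=5):
    """Late fusion: weighted sum of per-modality cosine similarities."""
    sim_v = cosine_similarity(visual[query_idx:query_idx + 1], visual)[0]
    sim_t = cosine_similarity(tfidf[query_idx], tfidf).ravel()
    score = w_visual * sim_v + w_text * sim_t
    score[query_idx] = -np.inf          # never recommend the query itself
    return np.argsort(score)[::-1][:k]  # indices of the top-k items

print(recommend(0, k=3))  # top-3 items similar to the denim jacket
```

With real features, Precision@5 and Recall@5 would then be computed by comparing each query's top-5 list against ground-truth relevant items (e.g. same category labels).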