Journal of Computer Networks, Architecture and High Performance Computing
Vol. 8 No. 2 (2026): Research Paper April 2026

Hyperparameter Sensitivity of Vanilla Knowledge Distillation for Compact CNNs on CIFAR-100

Fauzan, Mochamad Rizal (Unknown)
Rachman, Raden Muhammad Rafi (Unknown)
Saputra, Shifa Rangga (Unknown)
Nugraha, Daffa Irsyad (Unknown)



Article Info

Publish Date
27 Apr 2026

Abstract

Knowledge distillation has become an effective strategy for improving compact convolutional neural networks, yet the performance of vanilla knowledge distillation in lightweight image classification is still often reported using default hyperparameter settings without systematic justification. This study addresses the limited empirical understanding of how two core vanilla knowledge distillation hyperparameters, temperature scaling (T) and loss balancing (α), affect compact convolutional neural networks under a unified experimental setting. Using CIFAR-100 as the benchmark dataset, a ResNet-50 teacher was employed to distill knowledge into two lightweight student models, MobileNetV2 and ShuffleNetV2 ×1.0. Performance was evaluated using top-1 accuracy, top-5 accuracy, parameter count, and inference latency. The teacher achieved 81.24% top-1 accuracy and 96.05% top-5 accuracy. Under the default distillation setting, MobileNetV2 improved from 79.18% to 80.83% top-1 accuracy and from 95.77% to 96.40% top-5 accuracy, while reducing latency from 3.98 ms to 3.44 ms. ShuffleNetV2 ×1.0 improved from 77.00% to 78.36% top-1 accuracy and from 94.81% to 95.45% top-5 accuracy, with only a marginal latency increase from 4.23 ms to 4.29 ms. To examine hyperparameter sensitivity, an ablation study was conducted on MobileNetV2 with T = 2, 4, and 6, and α = 0.3, 0.5, and 0.7. The best configuration was obtained at T = 4 and α = 0.3, yielding 80.88% top-1 accuracy and 96.51% top-5 accuracy. These results show that vanilla knowledge distillation consistently improves compact convolutional neural networks, but its effectiveness depends strongly on careful hyperparameter selection rather than inherited default settings.
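
For readers unfamiliar with the two hyperparameters studied, the following is a minimal sketch of a vanilla knowledge distillation loss for a single sample, written in plain Python. It is illustrative only and is not taken from the paper: the function names `softmax` and `kd_loss` are hypothetical, and the weighting convention (the balancing weight applied to the hard-label term, with one minus that weight applied to the soft-target term) is an assumption, since the abstract does not state which term the weight multiplies. The T squared factor follows the common formulation of vanilla distillation and may differ from the authors' implementation.

```python
import math
from itertools import product


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, T, alpha):
    """Vanilla KD loss for one sample.

    Assumed convention (not confirmed by the source):
    loss = alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student)
    """
    # Hard-label cross-entropy, computed at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # KL divergence between temperature-softened teacher and student outputs.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # The T^2 factor keeps the soft-target term on a comparable scale across temperatures.
    return alpha * ce + (1.0 - alpha) * (T ** 2) * kl


# The 3 x 3 ablation grid named in the abstract: T in {2, 4, 6}, alpha in {0.3, 0.5, 0.7}.
grid = list(product([2, 4, 6], [0.3, 0.5, 0.7]))
```

Under this assumed convention, a student whose logits match the teacher's exactly has a zero soft-target term, so the loss reduces to the weighted cross-entropy. Each `(T, alpha)` pair in `grid` corresponds to one training run in an ablation of the kind the abstract describes.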

Copyrights © 2026






Journal Info

Abbrev

CNAPC

Publisher

Subject

Computer Science & IT Education

Description

Journal of Computer Networks, Architecture and Performance Computing is a scientific journal that publishes research by lecturers and researchers, especially in the fields of computer networks, computer architecture, and computing. This journal is published by Information Technology and ...