Journal: JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer)

AUTOMATION OF THE BERT AND RESNET50 MODEL INFERENCE CONFIGURATION ANALYSIS PROCESS
Medi Noviana; Sunny Arief Sudiro
JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer) Vol. 10 No. 2 (2024): JITK Issue November 2024
Publisher : LPPM Nusa Mandiri

DOI: 10.33480/jitk.v10i2.5053

Abstract

Inference is the process of using a model to make predictions on new data. Its performance is measured by throughput, latency, GPU memory usage, and GPU power usage. The models used in this study are BERT and ResNet50. The right configuration can maximise inference performance, so configuration analysis is needed to find out which configuration suits each model. The main challenge is that this analysis is inherently time-intensive and complex. An automation programme was therefore built to make the analysis easier. For the BERT model, the programme analyses 10 configurations, bert-large_config_0 to bert-large_config_9, and identifies bert-large_config_2 as the right configuration, with a throughput of 12.8 infer/sec and a latency of 618 ms. For the ResNet50 model, it analyses 5 configurations, resnet50_config_0 to resnet50_config_4, and identifies resnet50_config_1 as the right configuration, with a throughput of 120.6 infer/sec and a latency of 60.9 ms. The automation programme thus facilitates the process of analysing inference configurations.
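
The abstract does not say how the "right" configuration is chosen from the measurements. As a minimal sketch only, the Python below assumes one plausible rule: pick the highest throughput among the configurations whose latency stays within a budget. The names `Measurement`, `config_names`, and `pick_config`, and the 100 ms budget, are hypothetical and do not come from the paper. The only measurement used is the ResNet50 figure reported in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Measurement:
    """One measured configuration (hypothetical record, not the paper's format)."""
    name: str
    throughput: float   # inferences per second
    latency_ms: float   # latency in milliseconds


def config_names(model: str, count: int) -> List[str]:
    """Build names in the '<model>_config_<i>' pattern that the abstract uses."""
    return [f"{model}_config_{i}" for i in range(count)]


def pick_config(results: Iterable[Measurement],
                latency_budget_ms: Optional[float] = None) -> Optional[Measurement]:
    """Assumed rule: highest throughput among configs within the latency budget."""
    eligible = [m for m in results
                if latency_budget_ms is None or m.latency_ms <= latency_budget_ms]
    # Ties on throughput are broken in favour of the lower latency.
    return max(eligible, key=lambda m: (m.throughput, -m.latency_ms), default=None)


# Only the figure reported in the abstract is used; the budget is an assumption.
reported = [Measurement("resnet50_config_1", 120.6, 60.9)]
best = pick_config(reported, latency_budget_ms=100.0)

print(config_names("resnet50", 5))
print(best.name if best else "no configuration fits the budget")
```

In practice the list passed to `pick_config` would hold one record per measured configuration. GPU memory and power usage, which the abstract also lists as metrics, could be added as extra fields and constraints in the same way.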