JITK (Jurnal Ilmu Pengetahuan dan Komputer)
Vol. 10 No. 2 (2024): JITK Issue November 2024

AUTOMATION OF THE BERT AND RESNET50 MODEL INFERENCE CONFIGURATION ANALYSIS PROCESS

Medi Noviana (Master of Electrical Engineering, Universitas Gunadarma)
Sunny Arief Sudiro (STMIK Jakarta STI&K)



Article Info

Publish Date
19 Nov 2024

Abstract

Inference is the process of using a model to make predictions on new data. Its performance is measured by throughput, latency, GPU memory usage, and GPU power usage. The models used in this study are BERT and ResNet50. The right configuration can maximise inference performance, so configuration analysis is needed to determine which configuration suits each model. The main challenge is that this analysis is inherently time-consuming and complex. An automation programme was therefore built to make the analysis easier. The programme analyses BERT model inference across 10 configurations, bert-large_config_0 to bert-large_config_9, and identifies bert-large_config_2 as the right configuration, with a throughput of 12.8 infer/sec and a latency of 618 ms. For ResNet50, it analyses 5 configurations, resnet50_config_0 to resnet50_config_4, and identifies resnet50_config_1 as the right configuration, with a throughput of 120.6 infer/sec and a latency of 60.9 ms. The automation programme thus simplifies the process of analysing inference configurations.
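As a rough illustration of the selection step that such an automation programme performs, the sketch below picks one configuration from a set of measurements. The abstract does not state the selection rule, so the rule used here (highest throughput, optionally under a latency budget, with ties broken by lower latency) is an assumption. The names `Measurement` and `pick_best` are hypothetical, and only the bert-large_config_2 figures come from the abstract; the other two rows are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Measurement:
    """One measured inference configuration (hypothetical record type)."""
    name: str
    throughput: float   # inferences per second
    latency_ms: float   # latency in milliseconds


def pick_best(measurements: Iterable[Measurement],
              max_latency_ms: Optional[float] = None) -> Measurement:
    """Return the configuration with the highest throughput.

    Assumed rule, not taken from the paper: configurations above the
    optional latency budget are discarded, and ties on throughput are
    broken in favour of the lower latency.
    """
    candidates = [
        m for m in measurements
        if max_latency_ms is None or m.latency_ms <= max_latency_ms
    ]
    if not candidates:
        raise ValueError("no configuration meets the latency budget")
    return max(candidates, key=lambda m: (m.throughput, -m.latency_ms))


measurements = [
    # Placeholder values for illustration only, NOT results from the paper.
    Measurement("bert-large_config_0", 10.0, 700.0),
    # Figures reported in the abstract.
    Measurement("bert-large_config_2", 12.8, 618.0),
    # Placeholder values for illustration only, NOT results from the paper.
    Measurement("bert-large_config_5", 11.5, 650.0),
]

best = pick_best(measurements)
print(best.name, best.throughput, best.latency_ms)
```

Passing `max_latency_ms` restricts the choice to configurations that meet a latency target, which is one common way to trade throughput against responsiveness; GPU memory and power usage could be added as further fields and filters in the same way.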

Copyrights © 2024






Journal Info

Abbrev

jitk

Publisher

Subject

Computer Science & IT

Description

Watching films is a simple way to lift one's spirits when feeling downhearted or to unwind after daily activities. However, for various reasons, people sometimes do not have time to watch films at the cinema. With the help of media ...