Deploying small language models (SLMs) on ultra-low-power edge devices requires careful optimization to meet strict memory, latency, and energy constraints while preserving privacy. This paper presents a systematic approach to adapting SLMs for TinyML, focusing on model compression, hardware-aware quantization, and lightweight privacy mechanisms. We introduce a sparse ternary quantization technique that reduces model size by 5.8× with minimal accuracy loss, together with an efficient federated fine-tuning method for edge deployment. To address privacy concerns, we implement on-device differential noise injection during text preprocessing, adding negligible computational overhead. Evaluations on constrained devices (Cortex-M7 and ESP32) show that our optimized models achieve 92% of the accuracy of full-precision baselines while operating within 256 KB of RAM and reducing inference latency by 4.3×. The proposed techniques enable new applications for SLMs in always-on edge scenarios where both efficiency and data protection are critical.
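To make the sparse ternary quantization idea concrete, the following is a minimal illustrative sketch, not the paper's method: it uses the common ternary-weight heuristic of pruning weights below a magnitude threshold to zero and snapping the rest to ±1 with a per-tensor scale. The function name `ternarize`, the 0.7 threshold factor, and the use of NumPy are assumptions introduced here for illustration only.

```python
import numpy as np

def ternarize(weights: np.ndarray, threshold_factor: float = 0.7):
    """Map a float weight tensor to {-1, 0, +1} with a per-tensor scale.

    Weights with magnitude below the threshold are pruned to 0 (sparsity);
    the rest are snapped to +/-1 and rescaled by the mean surviving magnitude.
    NOTE: threshold rule and scale estimate are illustrative assumptions.
    """
    delta = threshold_factor * np.mean(np.abs(weights))      # pruning threshold
    ternary = np.sign(weights) * (np.abs(weights) > delta)   # values in {-1, 0, +1}
    surviving = np.abs(weights)[ternary != 0]
    scale = surviving.mean() if surviving.size else 0.0      # per-tensor scale
    return ternary.astype(np.int8), np.float32(scale)

# Example: quantize one layer and reconstruct its approximation.
w = np.random.randn(256, 256).astype(np.float32)
q, alpha = ternarize(w)
w_hat = alpha * q                       # dequantized weights used at inference
sparsity = float((q == 0).mean())       # fraction of weights pruned to zero
```

In a deployment like the one described above, only the int8 ternary codes and one scale per tensor would need to be stored on-device, which is where the memory reduction comes from; the exact packing and accuracy-recovery steps are beyond this sketch.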