JUSIFOR : Jurnal Sistem Informasi dan Informatika
Vol 4 No 2 (2025): JUSIFOR - Desember 2025

Epistemological Paradigms of Text Data Compression: Huffman, Arithmetic, and Neural Language Models

Affandi, Luqman
Prasetya, Didik Dwi
Patmanthara, Syaad



Article Info

Publish Date
31 Dec 2025

Abstract

This study examines text data compression as an epistemological paradigm through a comparative analysis of three fundamental approaches: traditional methods (Huffman Coding + LZW), bit-based methods (Arithmetic Coding), and machine learning approaches (Neural Language Models). Using a Project Gutenberg dataset of 15,000 classical literary works (8.5 GB in total, 2.1 billion word tokens), the evaluation measures compression ratio, execution time, and memory usage. The results reveal fundamental trade-offs among the paradigms. Traditional methods achieve the fastest execution (8.3 seconds/GB, 482 MB/s, 52 MB of memory) with a compression ratio of 3.2:1. Arithmetic coding attains near-optimal performance (99.5% of the Shannon bound) with a compression ratio of 3.8:1. Neural language models yield the highest compression ratio, 4.6:1, but require substantially more execution time and memory. The epistemological analysis highlights three distinct conceptions of information (mechanistic, mathematically optimal, and semantic-aware) and provides a conceptual framework for developing adaptive compression systems.
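As a rough illustration of what "percentage of the Shannon bound" means in the abstract, the following minimal sketch compares the order-0 Shannon entropy of a short string with the average code length of a Huffman code built for it. It is not the authors' implementation or evaluation pipeline: the function names, the character-level order-0 model, and the sample string are all assumptions made for illustration only.

```python
import heapq
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Order-0 entropy in bits per symbol (assumes independent characters)."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def huffman_code_lengths(text: str) -> dict:
    """Return the Huffman code length (in bits) for each distinct symbol."""
    counts = Counter(text)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tiebreak, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def average_code_length(text: str) -> float:
    """Average Huffman code length in bits per symbol."""
    counts = Counter(text)
    lengths = huffman_code_lengths(text)
    return sum(counts[s] * lengths[s] for s in counts) / len(text)


if __name__ == "__main__":
    sample = "abracadabra"  # hypothetical input, not from the study's dataset
    h = shannon_entropy(sample)
    avg = average_code_length(sample)
    print(f"entropy: {h:.4f} bits/symbol")
    print(f"Huffman: {avg:.4f} bits/symbol")
    print(f"fraction of the Shannon bound: {h / avg:.1%}")
    print(f"ratio vs. 8-bit characters: {8 / avg:.2f}:1")
```

Huffman codes spend a whole number of bits per symbol, so their average length can exceed the entropy by up to one bit per symbol. Arithmetic coding avoids that rounding, which is consistent with the abstract's report that it comes closer to the bound.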

Copyrights © 2025






Journal Info

Abbrev

jusifor

Publisher

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management; Engineering; Library & Information Science

Description

JUSIFOR is an open-access journal in the fields of Informatics and Information Systems. It is available to researchers who want to deepen their knowledge in a particular field and is intended to disseminate the experience gained from their studies. JUSIFOR is a scientific research journal in the field of informatics ...