Syafrial Fachri Pane
Universitas Logistik dan Bisnis Internasional

Published : 3 Documents
Articles


Optimizing Search Efficiency in Ordered Data: A Hybrid Approach Using Jump Binary Search
Gabriella Youzanna Rorong; Syafrial Fachri Pane; M Amran Hakim Siregar
Indonesian Journal of Data Science, IoT, Machine Learning and Informatics Vol 5 No 1 (2025): February
Publisher : Research Group of Data Engineering, Faculty of Informatics

DOI: 10.20895/dinda.v5i1.1764

Abstract

This research presents a hybrid algorithm called Jump Binary Search (JBS), which combines jump search and binary search to improve search efficiency on sorted data. JBS first uses jumps to locate the block that may contain the target; once that block is identified, a binary search narrows the search within it. The results show that JBS outperforms Jump Linear Search (JLS) on non-uniform and ordered categorical data distributions. On a data set of 400 elements, JBS required execution times in the ranges 0-15 ms and 0-10 ms, which compares favorably with JLS. By minimizing unnecessary data access, JBS is a suitable choice for finding target elements in sorted data.
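The two phases described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `jump_binary_search` and the block size of roughly the square root of the list length (the conventional choice for jump search) are assumptions, since the abstract does not state them.

```python
import math


def jump_binary_search(data, target):
    """Return an index of target in the sorted list data, or -1 if absent."""
    n = len(data)
    if n == 0:
        return -1

    # Assumed block size: floor(sqrt(n)), the usual choice for jump search.
    step = max(1, math.isqrt(n))

    # Jump phase: skip whole blocks while the last element of the
    # current block is still smaller than the target.
    low = 0
    while low < n and data[min(low + step, n) - 1] < target:
        low += step
    if low >= n:
        return -1  # target is larger than every element

    # Binary phase: ordinary binary search restricted to the found block.
    high = min(low + step, n) - 1
    while low <= high:
        mid = (low + high) // 2
        if data[mid] == target:
            return mid
        if data[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1


if __name__ == "__main__":
    values = list(range(0, 800, 2))  # 400 sorted elements, as in the abstract
    print(jump_binary_search(values, 400))
    print(jump_binary_search(values, 401))
```

With a block size near the square root of n, the jump phase inspects at most about sqrt(n) block boundaries, and the binary phase then needs only about log2(sqrt(n)) comparisons inside the block, where a linear scan (as in JLS) could need up to sqrt(n).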
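The Effect of Feature Selection Methods on SVM Model Accuracy in Customer Churn Classification at Telecommunications Companies (Pengaruh Metode Seleksi Fitur terhadap Akurasi Model SVM dalam Klasifikasi Customer Churn pada Perusahaan Telekomunikasi)
Mayke Andani Rohmaniar; Roni Habibi; Syafrial Fachri Pane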
IJAI (Indonesian Journal of Applied Informatics) Vol 9, No 1 (2024)
Publisher : Universitas Sebelas Maret

DOI: 10.20961/ijai.v9i1.92983

Abstract

This study examines the effect of feature selection methods on the accuracy of the Support Vector Machine (SVM) model in predicting customer churn in the telecommunications sector. The research compares four feature selection techniques, of which the abstract names three: Correlation Matrix, Principal Component Analysis (PCA), and Genetic Algorithm (GA). It also evaluates four SVM kernels: Linear, Polynomial, Radial Basis Function (RBF), and Sigmoid. The data is a telecom customer dataset from Kaggle with 7043 entries and 33 features. The study follows the CRISP-DM methodology, which comprises Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Implementation.
The findings indicate that Correlation Matrix feature selection paired with the Linear kernel performs best. This configuration achieves the highest accuracy, 92.48%, with a precision of 0.93, a recall of 0.97, and an F1-score of 0.95. The other feature selection methods, such as PCA and GA, yield lower results. An accurate predictive model is expected to help telecommunications companies develop more effective customer retention strategies.
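As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The sketch below uses only the rounded values quoted in the abstract; the helper name `f1_score` is chosen here for illustration and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (already rounded to two decimals).
reported_precision = 0.93
reported_recall = 0.97

print(round(f1_score(reported_precision, reported_recall), 2))  # prints 0.95
```

The result agrees with the F1-score of 0.95 stated in the abstract.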
Enhancing OCR Accuracy on Indonesian ID Cards Using Dual-Pipeline Tesseract and Post-Processing
Rendy Dwi Reksiyano; Syafrial Fachri Pane; Rolly Maulana Awangga
JEECS (Journal of Electrical Engineering and Computer Sciences) Vol. 10 No. 2 (2025)
Publisher : Fakultas Teknik Universitas Bhayangkara

DOI: 10.54732/jeecs.v10i2.3

Abstract

Manual transcription of data from Indonesian identity cards (KTP) remains prevalent in public institutions, often resulting in inefficiencies and human errors that compromise data accuracy. Although Optical Character Recognition (OCR) technologies such as Tesseract have been widely adopted, their performance on KTP images is still inconsistent due to non-uniform layouts, low contrast, and background noise. This study proposes a dual-pipeline OCR framework designed to improve recognition accuracy on Indonesian KTPs under real-world conditions. The first pipeline performs static region segmentation based on predefined Regions of Interest (ROI); the second uses dynamic keyword heuristics to locate text adaptively across varying layouts. The outputs of both pipelines are merged through a voting and regex-based post-processing mechanism, which includes character normalization and field validation against predefined dictionaries. Experiments were conducted on 78 annotated KTP samples of varying resolution and image quality. Evaluation using Character Error Rate (CER), Word Error Rate (WER), and field-level accuracy yielded an average CER of 69.82%, a WER of 80.20%, and a character-level accuracy of 30.18%. Performance was moderate in free-text areas such as address or occupation, while structured fields achieved accuracy above 60%. The method runs efficiently in a CPU-only environment without requiring large annotated datasets, demonstrating its suitability for low-resource OCR deployment. Compared to conventional single-pipeline approaches, the proposed framework improves robustness across heterogeneous document layouts and illumination conditions. These findings highlight the potential of lightweight, rule-based OCR systems for practical e-KYC digitization and form a foundation for integrating deep-learning-based layout detection in future research.
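For readers unfamiliar with the metrics, CER and WER are commonly computed as the Levenshtein edit distance between the OCR output and the ground truth, divided by the length of the ground truth in characters or words. The sketch below assumes that common definition; the abstract does not state the exact normalization the authors used, and the function names and sample strings are illustrative only. The reported figures are consistent with character-level accuracy being 100% minus CER (100 - 69.82 = 30.18).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits per reference word."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    truth = "JAKARTA SELATAN"   # hypothetical ground-truth field value
    ocr = "JAKARTA SELAT4N"     # hypothetical OCR output with one wrong character
    print(cer(truth, ocr))      # 1 edit over 15 characters
    print(wer(truth, ocr))      # 1 wrong word out of 2
    print(1 - cer(truth, ocr))  # character-level accuracy under this definition
```

Because a single wrong character makes the whole word count as an error, WER is typically higher than CER on the same output, which matches the pattern in the reported results.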