Putra Kusuma, Gede
Unknown Affiliation

Published: 3 Documents

Articles

Serial Multimodal Biometrics Authentication and Liveness Detection Using Speech Recognition with Normalized Longest Word Subsequence Method Andrian, Rafi; Putra Kusuma, Gede
JOIV : International Journal on Informatics Visualization Vol 8, No 3 (2024)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.8.3.2247

Abstract

Biometric authentication aims to verify whether an entity matches a claimed identity based on biometric data. Despite its advantages, vulnerabilities remain, particularly those related to spoofing. Efforts to mitigate these vulnerabilities include multimodal approaches and liveness detection; however, these strategies can increase the resource requirements of the authentication process. This paper proposes a multimodal authentication process incorporating voice and facial recognition, with liveness detection applied to voice data using speech recognition. The paper introduces Normalized Longest Word Subsequence (NLWS), a combination of Intersection over Union (IoU) and the longest common subsequence, to compare the prompted system sentence with the user's spoken sentence during speech recognition. Unlike the Word Error Rate (WER), NLWS is bounded to the range between 0 and 1. Furthermore, the paper introduces decision-level fusion in the multimodal approach, employing two threshold levels in voice authentication. This approach aims to reduce resource requirements while enhancing the overall security of the authentication process. Cosine similarity, Euclidean distance, random forest, and extreme gradient boosting (XGBoost) are used to measure distance or similarity. The results show that the proposed method achieves better accuracy than unimodal approaches, with accuracies of 98.44%, 98.83%, 97.46%, and 99.22% using cosine similarity, Euclidean distance, random forest, and XGBoost, respectively. The proposed method also demonstrates resource savings, reducing usage from 5.19 MB to 0.792 MB, from 7.3294 MB to 1.9437 MB, from 6.6512 MB to 1.3284 MB, and from 7.8632 MB to 2.1517 MB across the different distance or similarity measurements.
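
The abstract does not give the exact NLWS formula, so the following is only a minimal sketch of one plausible reading: an Intersection-over-Union style normalization applied to the word-level longest common subsequence between the prompted sentence and the transcribed spoken sentence. The function names and the normalization are illustrative assumptions, not the authors' published definition.

def lcs_words(a, b):
    # Length of the longest common subsequence between two word lists (dynamic programming).
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, wa in enumerate(a, 1):
        for j, wb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if wa == wb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def nlws(prompted, spoken):
    # IoU-style score: shared subsequence length over the "union" of both word counts.
    a, b = prompted.lower().split(), spoken.lower().split()
    if not a and not b:
        return 1.0
    common = lcs_words(a, b)
    return common / (len(a) + len(b) - common)

# Example: the score stays in [0, 1] regardless of sentence length.
print(nlws("please read the blue sky sentence", "please read blue sky sentence"))  # ~0.833

Under this reading, a score close to 1 indicates the spoken sentence closely follows the prompted sentence, which can then be checked against the voice-authentication thresholds described above.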
Comparative study of pothole detection using deep learning on smartphone Ulul Amri, Achyar; Putra Kusuma, Gede
Indonesian Journal of Electrical Engineering and Computer Science Vol 37, No 2: February 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v37.i2.pp995-1004

Abstract

Potholes present a significant problem in many countries, leading to vehicle damage and traffic accidents. These road imperfections pose safety risks and impose economic burdens. Although existing detection methods use sensors and computer-vision deep learning models processed on PCs, a gap remains in deploying cost-effective, widely accessible solutions. This study aims to bridge this gap by developing deep learning models optimized for smartphones, reducing costs and enhancing deployment feasibility. We developed multiple models for pothole detection, utilizing transfer learning and Bayesian hyperparameter tuning to optimize detection accuracy and resource efficiency. Our evaluations focused on computationally light models such as YOLOv8 small, YOLOv8 nano, YOLOv7 tiny, and Faster R-CNN MobileNetV3. In terms of detection accuracy, YOLOv8 small and YOLOv8 nano stood out, achieving average precisions (AP) of 83.5% and 82.5%, respectively. YOLOv8 nano proved the most efficient, offering high detection accuracy, a file size three times smaller than YOLOv8 small in TFLite format, and the fastest inference time of 0.72 seconds per image. This study highlights the potential of smartphones in urban pothole detection, contributing to improved road maintenance and urban policy.
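
The abstract implies a pipeline of transfer learning on a pothole dataset followed by TFLite export for smartphone deployment. A minimal sketch of that workflow with the Ultralytics YOLOv8 package is shown below; the weights file, dataset YAML, and training settings are placeholders for illustration, not the authors' artifacts.

# pip install ultralytics
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                    # YOLOv8 nano, the lightest variant evaluated
model.train(data="potholes.yaml", epochs=50)  # fine-tune on a pothole dataset (transfer learning)
model.val()                                   # report AP on the validation split
model.export(format="tflite")                 # produce a TFLite model for on-device inference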
Leveraging distillation token and weaker teacher model to improve DeiT transfer learning capability Gavra Reswara, Christopher; Putra Kusuma, Gede
International Journal of Informatics and Communication Technology (IJ-ICT) Vol 15, No 1: March 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijict.v15i1.pp198-206

Abstract

Recently, distilling knowledge from convolutional neural networks (CNN) has positively impacted the data-efficient image transformer (DeiT) model. Thanks to the distillation token, this method boosts DeiT performance and helps DeiT learn faster. However, a distillation procedure based on that token has not yet been applied to DeiT transfer learning on downstream datasets. This study proposes a distillation procedure based on the distillation token for transfer learning, which boosts DeiT performance on downstream datasets. For example, our proposed method improves DeiT B 16 performance by 1.75% on the Oxford-IIIT Pets dataset. Furthermore, we propose using a weaker model as the DeiT teacher, which shortens the transfer learning process for the teacher model without reducing DeiT performance too much. For example, DeiT B 16 performance decreased by only 0.42% on the Oxford 102 Flowers dataset when EfficientNet V2S was used as the teacher instead of RegNet Y 16GF. In several cases, DeiT B 16 performance even improved with the weaker teacher; for example, it improved by 1.06% on the Oxford-IIIT Pets dataset with EfficientNet V2S rather than RegNet Y 16GF as the teacher model.
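
One standard formulation of the distillation-token objective is the hard distillation loss from the original DeiT work, in which the class token is supervised by ground-truth labels and the distillation token by the CNN teacher's hard predictions. The PyTorch sketch below shows that loss with random tensors standing in for model outputs; whether the authors use exactly this variant, and how it is wired into their transfer learning setup, is not stated in the abstract and is assumed here.

import torch
import torch.nn.functional as F

def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, labels):
    # Class token learns from ground-truth labels; distillation token learns
    # from the teacher's hard (argmax) predictions, with equal weighting.
    teacher_labels = teacher_logits.argmax(dim=1)
    return 0.5 * F.cross_entropy(cls_logits, labels) + 0.5 * F.cross_entropy(dist_logits, teacher_labels)

# Toy batch: 8 images, 37 classes (e.g. Oxford-IIIT Pets).
cls_logits = torch.randn(8, 37)      # DeiT class-token head output
dist_logits = torch.randn(8, 37)     # DeiT distillation-token head output
teacher_logits = torch.randn(8, 37)  # CNN teacher output (e.g. EfficientNet V2S)
labels = torch.randint(0, 37, (8,))
print(hard_distillation_loss(cls_logits, dist_logits, teacher_logits, labels))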