Articles

Found 40 Documents
Augmented Reality in STEM Using Personalized Learning to Promote Students’ Understanding Erlangga; Mukhlis, Rizki; Wihardi, Yaya; Raflesia, Sarifah Putri
Computer Engineering and Applications Journal (ComEngApp) Vol. 13 No. 2 (2024)
Publisher : Universitas Sriwijaya


Abstract

The current curriculum emphasizes self-directed learning by students. At the same time, integrating technology into the learning media and materials used in the classroom remains a challenging task. This study investigates the use of augmented reality (AR) in STEM (Science, Technology, Engineering, and Mathematics) combined with personalized learning. It employed a pre-experimental research design, specifically a One-Group Pretest-Posttest Design. The findings show that students' average pretest score of 51.6 improved significantly to 82.67 on the posttest, with a normalized gain score of 0.64, which is considered moderate. Students' perspectives on the use of augmented reality with personalized learning were strongly positive, at 82.1%. The results indicate that augmented reality with personalized learning is a viable option for improving learning outcomes.
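The moderate gain reported above is consistent with Hake's normalized gain formula; a minimal sketch, assuming a 100-point maximum score:

```python
def normalized_gain(pretest: float, posttest: float, max_score: float = 100.0) -> float:
    """Hake's normalized gain: fraction of the possible improvement achieved."""
    return (posttest - pretest) / (max_score - pretest)

# Pretest 51.6 -> posttest 82.67, as reported in the study
print(round(normalized_gain(51.6, 82.67), 2))  # 0.64
```

Gains in the 0.3 to 0.7 range are conventionally classified as moderate, matching the study's interpretation.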
Analysis of Direct Scoring and Similarity-Based Scoring Approaches in Automatic Short Answer Scoring (ASAS) Wicaksono, Bayu; Rasim, Rasim; Wihardi, Yaya
Brilliance: Research of Artificial Intelligence Vol. 5 No. 1 (2025): Brilliance: Research of Artificial Intelligence, Article Research May 2025
Publisher : Yayasan Cita Cendekiawan Al Khwarizmi

DOI: 10.47709/brilliance.v5i1.6275

Abstract

In the era of digital education, the need for automated scoring systems for short text answers has been steadily increasing. Automatic Short Answer Scoring (ASAS) aims to automate this assessment process with efficient and consistent approaches. Two commonly used approaches in ASAS are direct scoring and similarity-based scoring. Although both approaches have been widely used, previous research has mostly relied on metrics such as RMSE and Pearson Correlation to assess model performance. This study provides a more in-depth analysis by comparing both approaches in two evaluation scenarios, specific-prompt and cross-prompt, evaluating the accuracy and stability of the models. The dataset used in this study is the Rahutomo dataset. The results show that direct scoring outperforms similarity-based scoring, with lower RMSE, higher Pearson Correlation, and fewer outliers. In the specific-prompt scenario, an RMSE of 0.0817 and a Pearson Correlation of 0.9504 were obtained, while in the cross-prompt scenario, the RMSE was 0.0917 and the Pearson Correlation was 0.9286. By examining the distribution of residuals and outliers in addition to the evaluation metrics, this study offers a more complete picture of model stability. Based on these findings, direct scoring is recommended for implementation in ASAS systems; future research can extend the analysis to other datasets and languages.
Facial Expression Recognition of Students in Classroom Using Hybrid MobileNetV3-Vision Transformer with Token Downsampling Khaairi, Mochamad; Rasim, Rasim; Wihardi, Yaya
Brilliance: Research of Artificial Intelligence Vol. 5 No. 1 (2025): Brilliance: Research of Artificial Intelligence, Article Research May 2025
Publisher : Yayasan Cita Cendekiawan Al Khwarizmi

DOI: 10.47709/brilliance.v5i1.6323

Abstract

In large classroom environments, teachers often struggle to monitor each student’s facial expression throughout the learning process. Yet, facial expressions are important indicators of students’ emotional states and engagement, which, when detected in real time, can support a more adaptive learning experience. Most previous research on Facial Expression Recognition (FER) has relied on Convolutional Neural Networks (CNN), which tend to be limited in capturing global relationships between facial features. Additionally, many studies focus on model accuracy without evaluating their practical effectiveness in real classroom settings. This study aims to develop a facial expression recognition model that is both accurate and efficient for use in classroom contexts. A hybrid Vision Transformer (ViT) architecture is proposed, which combines MobileNetV3 for local feature extraction and a Vision Transformer for global context modeling. To reduce the number of tokens and computational cost, a Token Downsampling method is introduced within the transformer blocks. The model is trained using the FER2013 dataset and achieves a test accuracy of 71.24%, surpassing the baseline pretrained ViT model, which reached only 70.10%. Additionally, the Token Downsampling method improves inference speed. Furthermore, the model is tested on a custom dataset collected from students in a real classroom setting to evaluate its performance in practical implementation. Although the performance on the classroom dataset is not yet optimal, the results on FER2013 demonstrate the potential of this approach for further development toward real-time and accurate facial expression recognition in educational environments.
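The token downsampling idea can be illustrated with average pooling over the spatial token grid; a minimal NumPy sketch under the assumption of 2x2 pooling (the paper's exact operator may differ):

```python
import numpy as np

def downsample_tokens(tokens: np.ndarray, grid: int, stride: int = 2) -> np.ndarray:
    """Average-pool an (N, D) token sequence laid out on a grid x grid map.

    Cuts the token count by stride**2, shrinking the quadratic
    self-attention cost of the following transformer block.
    """
    n, d = tokens.shape
    assert n == grid * grid, "tokens must form a square grid"
    fmap = tokens.reshape(grid, grid, d)
    g = grid // stride
    pooled = fmap.reshape(g, stride, g, stride, d).mean(axis=(1, 3))
    return pooled.reshape(g * g, d)

tokens = np.random.randn(196, 64)           # 14x14 patch tokens, dim 64
print(downsample_tokens(tokens, 14).shape)  # (49, 64): 4x fewer tokens
```

Since self-attention scales quadratically in sequence length, quartering the tokens roughly cuts attention compute by 16x for that block, which is the mechanism behind the inference speedup the abstract reports.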
The Development of Web-Based Learning using Interactive Media for Science Learning on Levers in Human Body Topic Astuti, Lia; Wihardi, Yaya; Rochintaniawati, Diana; Prima, Eka Cahya
Journal of Science Learning Vol 3, No 2 (2020): Journal of Science Learning
Publisher : Universitas Pendidikan Indonesia

DOI: 10.17509/jsl.v3i2.19366

Abstract

An integrated curriculum is a popular way to develop 21st-century skills, but most materials are still written in separate textbooks. Web-based learning is an online learning medium that can be accessed with an internet connection anytime and anywhere; however, many educational websites do not apply the principles of effective learning, and traditional learning methods tend to be boring for students. To address this problem, this study designed an educational website with interactive content to help students learn the topic of levers in the human body as part of integrated science. The development process consisted of three steps: (1) analysis, (2) design, and (3) construction. The research used a descriptive method, with expert judgment evaluating the website on content, language, and media/IT. Questionnaires based on the technology acceptance model (TAM) and five dimensions of interactivity were used to investigate the subjects' perception responses. The research subjects were three science teachers and 31 students at a private junior high school in Bandung. Overall, the website received a good evaluation in each aspect, although it requires a strong internet connection to avoid long loading times.
Application of Large Language Models in Updating Wikipedia Biography Articles Dwiharani, Najma Qalbi; Yudi Wibisono; Yaya Wihardi
Jurnal Komputer Teknologi Informasi Sistem Informasi (JUKTISI) Vol. 4 No. 2 (2025): September 2025
Publisher : LKP KARYA PRIMA KURSUS

DOI: 10.62712/juktisi.v4i2.499

Abstract

Wikipedia is a highly popular online information source in Indonesia, but updating its articles still depends heavily on contributions from editors. In the biography category, regular updates are particularly important because of ongoing career developments and recent events involving the subjects. This study explores the application of Large Language Models (LLMs) to add new information to Indonesian Wikipedia biography articles, using a single online news article as the reference. The main model used is Gemma 3, compared against a Phi-3-mini baseline. The study also tests the effectiveness of five different prompting strategies, namely simple prompt, system prompt (en), system prompt (id), one-shot, and prompt chaining, in guiding the model to produce output that is relevant and consistent with Wikipedia's style. Fine-tuning was performed on data combining the Wikipedia article before updating, the news article as reference, and text containing the relevant new information to be added to the article as the target output. Evaluation used ROUGE metrics to measure the similarity between the model output and the Wikipedia editors' reference. The results show that fine-tuning the Gemma 4B model significantly improves performance, especially with the prompt-chaining strategy, which achieved an average ROUGE-1 score of 0.3687. Compared to the Phi-3-mini baseline, the Gemma model produced more consistent and relevant results. These findings suggest that an LLM-based approach can be a potential solution for assisting the process of updating Wikipedia biography articles.
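The ROUGE-1 score used to evaluate the model output is a unigram-overlap F1; a minimal sketch with hypothetical example strings (real evaluations typically also apply tokenization and stemming):

```python
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: unigram overlap between model output and reference text."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())   # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f("the singer released a new album",
               "the singer released an album"))
```

A score of 0.3687, as reported for the prompt-chaining setup, means roughly a third of the unigrams (balancing precision and recall) match the editors' reference text.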
Unsupervised Clustering of Handwritten Essay Answer Images Using Vision Transformer Mohamad Asyqari Anugrah; Yaya Wihardi; Rani Megasari
Jurnal Komputer Teknologi Informasi Sistem Informasi (JUKTISI) Vol. 4 No. 2 (2025): September 2025
Publisher : LKP KARYA PRIMA KURSUS

DOI: 10.62712/juktisi.v4i2.517

Abstract

This study explores the use of deep clustering methods to automatically group handwritten essay answer sheets based on their visual patterns. Feature extraction was performed using three backbone models: ResNet-50, Vision Transformer (ViT-base), and Tr-OCR. These features were then clustered using two unsupervised algorithms, K-means (with k=5) and HDBSCAN (with minimum cluster size = 10). To enhance clustering performance, a deep clustering approach was implemented by applying K-means iteratively to refine the feature representations. Evaluation was conducted both quantitatively, using the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Score, and qualitatively, through t-SNE visualizations and cluster content inspection. The ViT and Tr-OCR backbones outperformed the CNN-based ResNet-50, achieving higher cluster cohesion and separation. Notably, the final clustering result using ViT with HDBSCAN reached a Silhouette Score of 0.772, a Davies-Bouldin Index of 0.369, and a Calinski-Harabasz Score of 408.006. The findings indicate that vision transformer-based models are more effective for unsupervised grouping of handwritten visual data. This approach can help educators make the grading process faster and more objective, and may serve as a foundation for future automated essay evaluation systems integrating OCR and NLP techniques.
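The Silhouette Score used as the primary quantitative metric can be computed directly from its definition; a minimal NumPy sketch on hypothetical toy data (the study's pipeline operates on high-dimensional backbone features):

```python
import numpy as np

def silhouette_score(X: np.ndarray, labels: np.ndarray) -> float:
    """Mean silhouette: (b - a) / max(a, b) per point, where a is the mean
    intra-cluster distance and b the mean distance to the nearest other cluster."""
    dists = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    scores = []
    for i, lab in enumerate(labels):
        same = labels == lab
        a = dists[i, same & (np.arange(len(X)) != i)].mean()
        b = min(dists[i, labels == other].mean()
                for other in set(labels) if other != lab)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Two well-separated toy clusters -> score close to 1
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = np.array([0, 0, 0, 1, 1, 1])
print(round(silhouette_score(X, labels), 3))
```

Scores near 1 indicate tight, well-separated clusters, so the reported 0.772 for ViT with HDBSCAN reflects strong cluster structure.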
Facial Expression Recognition of Students in the Classroom Using Vision Transformer (ViT) Muhammad Fakhri Fadhlurrahman; Munir; Yaya Wihardi
Jurnal Komputer Teknologi Informasi Sistem Informasi (JUKTISI) Vol. 4 No. 2 (2025): September 2025
Publisher : LKP KARYA PRIMA KURSUS

DOI: 10.62712/juktisi.v4i2.531

Abstract

Facial expressions serve as an essential form of non-verbal communication for understanding students' emotional states in the classroom. This understanding enables educators to adjust their teaching methods according to students' emotions, thus improving the effectiveness of the learning process. This study aims to develop and implement a real-time facial expression recognition system in classroom settings using the Vision Transformer (ViT) architecture. Two system approaches were developed: a dual-stage system combining a YOLOv11s face detection model with a HybridViT (ResNet-50) facial expression recognition model, and a single-stage system using a YOLOv11s model to detect emotions directly from facial images. The datasets used include the Real-world Affective Faces Database (RAF-DB), a face detection dataset, and a Facial Expression in Classroom dataset, employed for initial training and fine-tuning. Evaluation results show that the dual-stage system achieves superior classification performance with a mean Average Precision (mAP) of 0.2846, compared to the single-stage system's mAP of 0.1603. However, in terms of inference efficiency, the single-stage system outperforms the dual-stage system, achieving a lower average latency per face of 0.290 ms (6,539 FPS) on GPU and 1.862 ms (545 FPS) on CPU. The evaluation also highlights an imbalance in classification performance across emotion classes, primarily due to the uneven distribution of training and fine-tuning data. Overall, both approaches show promising potential for facial expression recognition in classroom environments; accuracy, generalization across emotions, and inference efficiency can be further improved through enhanced dataset quality, balanced emotion representation, and exploration of advanced training techniques.
Keywords: Facial Expression Recognition, Vision Transformer, YOLOv11s, Real-Time, Classroom, Dual-Stage, Single-Stage
Violence Detection in CCTV Video Based on Skeletons and Frame Grouping Using ConvLSTM Kafilli, Muhammad Fikri; Riza, Lala Septem; Wihardi, Yaya
JATISI Vol 12 No 3 (2025): JATISI (Jurnal Teknik Informatika dan Sistem Informasi)
Publisher : Universitas Multi Data Palembang

DOI: 10.35957/jatisi.v12i3.12873

Abstract

Manual monitoring of surveillance video (CCTV) is inefficient and prone to human oversight. This drives the need for an automated violence detection system that is fast and accurate. Existing deep learning models are often too computationally heavy for real-time implementation, creating a dilemma between accuracy and efficiency. This research proposes a lightweight two-stream ConvLSTM architecture to address this dilemma. The method efficiently models spatio-temporal relationships by combining skeleton representation and change detection, which is then packaged through a frame grouping technique. The ConvLSTM layer serves as the main temporal model, supported by a SeparableConv2D backbone for efficient feature extraction. The model is trained on the RWF-2000 dataset and evaluated using cross-dataset validation on the Surveillance Camera Fight Dataset to test its generalization capability. The results show that the proposed model achieves superior performance with an accuracy and F1-Score of 74.00%, and is highly efficient with an inference speed of 518.45 FPS. This research demonstrates that the two-stream architecture combining skeleton representation, frame grouping, and ConvLSTM modeling successfully creates a robust, fast violence detection system, offering a practical solution for real-world monitoring applications.
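The frame grouping technique can be sketched as partitioning a clip's frames into fixed-size groups and collapsing each group, so the temporal model steps through a shorter, denser sequence; a minimal NumPy illustration that averages within groups (an assumption, since the paper's exact packaging may differ):

```python
import numpy as np

def group_frames(clip: np.ndarray, group_size: int) -> np.ndarray:
    """Collapse every `group_size` consecutive frames into one by averaging,
    shortening the sequence the ConvLSTM has to step through."""
    t, h, w, c = clip.shape
    t = (t // group_size) * group_size          # drop trailing frames
    groups = clip[:t].reshape(t // group_size, group_size, h, w, c)
    return groups.mean(axis=1)

# 64 frames, two channels standing in for the two streams
# (skeleton map + change-detection map)
clip = np.random.rand(64, 112, 112, 2)
print(group_frames(clip, 4).shape)  # (16, 112, 112, 2)
```

Shortening the temporal sequence this way reduces the number of ConvLSTM steps per clip, which is one route to the high inference throughput the abstract reports.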
Classification of Student Head Poses Using EfficientNetV2 with Seat Position Embedding Marhelio, Ananda Myzza; Munir, Munir; Wihardi, Yaya
JATISI Vol 12 No 3 (2025): JATISI (Jurnal Teknik Informatika dan Sistem Informasi)
Publisher : Universitas Multi Data Palembang

DOI: 10.35957/jatisi.v12i3.12987

Abstract

Understanding the direction of students' visual attention is essential for evaluating their engagement during classroom learning. Head Pose Estimation (HPE) is an effective method for identifying attention focus; however, its application in real classroom settings is often hindered by low image quality and varied student seating positions, which makes regression-based methods that predict facial landmarks or Euler angles suboptimal. This study adopts an image-based classification approach as an alternative and proposes a modification of the EfficientNetV2-S architecture that integrates Seat Position Embedding (SPE) as spatial context to improve the accuracy of head pose classification. The dataset was developed from direct classroom recordings and processed into 4,574 head pose images with five directional labels (up, down, front, right, left). Several CNN architectures were evaluated with and without SPE. The results show that the proposed model with SPE achieved an accuracy of 83.25%, surpassing the baseline model's accuracy of 82.53%. This approach proved effective in reducing visual ambiguity and providing a more accurate interpretation of students' attention.
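One simple way to inject seat position as spatial context is to append a normalized (row, column) coordinate to the backbone's pooled feature vector before the classifier head; a minimal NumPy sketch, where the fusion-by-concatenation and the normalization scheme are assumptions, not the paper's exact SPE design:

```python
import numpy as np

def fuse_seat_position(features: np.ndarray, row: int, col: int,
                       n_rows: int, n_cols: int) -> np.ndarray:
    """Append a normalized (row, col) seat position to the backbone feature
    vector so the classifier can condition on where the student sits."""
    spe = np.array([row / (n_rows - 1), col / (n_cols - 1)])
    return np.concatenate([features, spe])

feats = np.random.rand(1280)   # e.g. EfficientNetV2-S pooled features
fused = fuse_seat_position(feats, row=2, col=5, n_rows=6, n_cols=8)
print(fused.shape)  # (1282,)
```

The intuition is that the same head image can mean different attention targets depending on where in the room it was captured, so giving the classifier the seat coordinates helps disambiguate visually similar poses.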
Artificial Intelligence-Based Leveling System for Determining Severity Level of Autism Spectrum Disorder Rasim, R; Munir, M; Wihardi, Yaya; Ningrayati Amali, Lanto
Scientific Journal of Informatics Vol. 12 No. 4: November 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i4.14440

Abstract

Purpose: The aim of this research is to analyze the use of an artificial intelligence (AI)-based leveling system to determine the severity of autism spectrum disorders (ASD). Methods: The research method is a systematic literature review. This study addresses three key questions: (i) What factors are used to determine ASD severity? (ii) What algorithms or AI models are used in classifying ASD severity? (iii) What are the results of this AI-based leveling system in terms of severity levels or categories? Results: The study results identified several key factors that influence ASD severity, including age, IQ, genetic and neurological factors, co-occurring mental health conditions, and sociodemographic variables. Various AI algorithms, including machine learning and deep learning techniques, are used to classify the severity of ASD. The results of this study highlight the effectiveness of AI in providing objective, consistent, and measurable assessments of ASD severity, although challenges such as data quality and ethical considerations remain. AI-based leveling systems show significant potential in improving assessment and intervention processes for ASD. Novelty: This research systematically synthesizes studies on AI-driven ASD severity assessment, providing insights into crucial variables for AI-based evaluation tools. By analyzing the factors influencing severity and the effectiveness of AI models, this study identifies promising approaches for classification. The findings offer valuable contributions to the development of AI-based tools in clinical and educational applications. Further research is necessary to improve AI reliability, address biases, and maximize its potential in ASD assessment and intervention.