Articles

Artificial intelligence multilingual image-to-speech for accessibility and text recognition
Rosalina, Rosalina; Fahmi, Hasanul; Sahuri, Genta
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 3: June 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i3.pp1743-1751

Abstract

The primary challenge for visually impaired and illiterate individuals is accessing and understanding visual content, which hinders their ability to navigate environments and engage with text-based information. This research addresses this problem by implementing an artificial intelligence (AI)-powered multilingual image-to-speech technology that converts text from images into audio descriptions. The system combines optical character recognition (OCR) and text-to-speech (TTS) synthesis, using natural language processing (NLP) and digital signal processing (DSP) to generate spoken outputs in various languages. Tested for accuracy, the system demonstrated high precision, recall, and an average accuracy rate of 0.976, proving its effectiveness in real-world applications. This technology enhances accessibility, significantly improving the quality of life for visually impaired individuals and offering scalable solutions for illiterate populations. The results also provide insights for refining OCR accuracy and expanding multilingual support.
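
The abstract reports precision, recall, and accuracy but does not say how they were computed. As a minimal sketch, assuming the OCR output is scored against a ground-truth transcription as a bag of words, the two ratios could be calculated as follows. The function name and the word-level comparison are illustrative assumptions, not the authors' evaluation procedure.

```python
from collections import Counter


def word_level_scores(recognized: str, reference: str) -> dict:
    """Score OCR output against a reference transcription, word by word.

    Assumption: case-insensitive bag-of-words matching; word order is ignored.
    """
    rec = Counter(recognized.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each word at most as often as it
    # appears in both texts.
    true_pos = sum((rec & ref).values())
    precision = true_pos / sum(rec.values()) if rec else 0.0
    recall = true_pos / sum(ref.values()) if ref else 0.0
    return {"precision": precision, "recall": recall}


scores = word_level_scores("the quick brown fax", "the quick brown fox jumps")
print(scores)
```

Here three of the four recognized words are correct (precision 0.75) and three of the five reference words were recovered (recall 0.6).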

Crowdfunding platform integrated with cryptocurrency payment support
Rosalina, Rosalina; Sahuri, Genta
International Journal of Advances in Applied Sciences Vol 14, No 2: June 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijaas.v14.i2.pp598-608

Abstract

Crowdfunding platforms often face challenges such as high transaction fees, limited global accessibility, and reliance on traditional banking systems, which restrict participation and efficiency. These limitations hinder the full potential of crowdfunding, particularly for global contributors and projects. This research addresses these issues by proposing the development of a mobile crowdfunding platform integrated with cryptocurrency payment support. By incorporating cryptocurrency, the platform aims to reduce transaction costs, remove geographical barriers, and enhance transaction security through blockchain technology. The platform is built using a cross-platform mobile framework to ensure broad accessibility while integrating cryptocurrency gateways for decentralized financial transactions. This allows for real-time, secure, and low-cost payments, offering a transparent and efficient process for both contributors and fundraisers. Additionally, the platform's design supports scalability to accommodate various cryptocurrencies and an expanding user base. The findings demonstrate that cryptocurrency payment integration significantly improves transaction speed, reduces fees, and enhances security compared to traditional payment methods. It also fosters global participation, increasing engagement in crowdfunding initiatives.

MIDI-based generative neural networks with variational autoencoders for innovative music creation
Rosalina, Rosalina; Sahuri, Genta
International Journal of Advances in Applied Sciences Vol 13, No 2: June 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijaas.v13.i2.pp360-370

Abstract

By utilizing variational autoencoder (VAE) architectures in musical instrument digital interface (MIDI)-based generative neural networks (GNNs), this study explores the field of creative music composition. The study evaluates the success of VAEs in generating musical compositions that exhibit both structural integrity and a resemblance to authentic music. Despite achieving convergence in the latent space, the degree of convergence falls slightly short of initial expectations. This prompts an exploration of contributing factors, with a particular focus on the influence of training data variation. The study acknowledges the optimal performance of VAEs when exposed to diverse training data, emphasizing the importance of sufficient intermediate data between extreme ends. The intricacies of latent space dimensions also come under scrutiny, with challenges arising in creating a smaller latent space due to the complexities of representing data in N dimensions. The neural network tends to position data further apart, and incorporating additional information necessitates exponentially more data. Despite the suboptimal parameters employed in the creation and training process, the study concludes that they are sufficient to yield commendable results, showcasing the promising potential of MIDI-based GNNs with VAEs in pushing the boundaries of innovative music composition.
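
The abstract does not give the training objective, so the following is only a sketch of the standard VAE ingredients that shape the latent space it discusses: the reparameterized sample and the KL term that pulls each encoded point toward a standard normal prior. It assumes a diagonal Gaussian encoder; the function names are illustrative.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) for each latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# A code that already matches the prior costs nothing; moving its mean
# away from zero increases the penalty.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
print(kl_to_standard_normal([1.0], [0.0]))
```

In this formulation, a weak KL term lets the encoder place points far apart, which is consistent with the spreading behavior the abstract describes.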

Generating intelligent agent behaviors in multi-agent game AI using deep reinforcement learning algorithm
Rosalina, Rosalina; Sengkey, Axel; Sahuri, Genta; Mandala, Rila
International Journal of Advances in Applied Sciences Vol 12, No 4: December 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijaas.v12.i4.pp396-404

Abstract

Games are used to train reinforcement learning (RL) agents because they approximate complex, high-dimensional real-world data. By using games, RL researchers can avoid the high experimental costs of training an agent to perform intelligent tasks. The objective of this research is to generate intelligent agent behaviors in multi-agent game artificial intelligence (AI) using a deep reinforcement learning (DRL) algorithm. A basic RL algorithm, the deep Q-network, is implemented. The agent is trained on the environment's raw pixel images and the action list information. Experiments with this algorithm show the agent's decision-making ability in choosing a favorable action. In the algorithm's default setting, training uses 1 epoch and a learning rate of 0.0025. The number of training iterations is set to one because the training function is called repeatedly, once every 4 timesteps. The authors also experimented with two different training scenarios and compared the results. The experimental findings demonstrate that the agents learn correctly and successfully while actively participating in the game in real time. Additionally, the agent can quickly adjust to a different enemy on a different map because of the knowledge gained from prior training.
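
As a rough sketch of the training signal behind a deep Q-network, the code below computes the usual temporal-difference targets and applies the 4-timestep training schedule mentioned in the abstract. The discount factor `gamma`, the transition layout, and the `q_next` callable (standing in for the network's Q-value output) are assumptions for illustration; the abstract does not specify them.

```python
def td_targets(batch, q_next, gamma=0.99):
    """Compute y = r + gamma * max_a' Q(s', a'), or y = r for terminal steps.

    batch: list of (reward, next_state, done) transitions.
    q_next: callable returning a list of Q-values, one per action (assumed).
    """
    targets = []
    for reward, next_state, done in batch:
        if done:
            targets.append(reward)
        else:
            targets.append(reward + gamma * max(q_next(next_state)))
    return targets


def should_train(timestep, train_every=4):
    """Trigger one training call every `train_every` environment steps."""
    return timestep % train_every == 0


# Toy stand-in for the network: the same three Q-values for every state.
toy_q = lambda state: [0.0, 1.0, 2.0]
print(td_targets([(1.0, "s1", False), (5.0, "s2", True)], toy_q, gamma=0.9))
```

In a full implementation the network would be fitted to these targets with the stated learning rate of 0.0025; that step is omitted here.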

SME Business Intelligence Support Using Retrieval-Augmented Generation and RFM Segmentation
Rosalina Rosalina; Noor Lees Ismail; Genta Sahuri; Joseph Tedja Nugraha Wibawa
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1163

Abstract

This study presents the design and evaluation of a cloud-based business intelligence support system for small and medium enterprises that integrates retrieval-grounded text generation with recency–frequency–monetary customer segmentation to enhance digital customer communication and promotional decision making. The primary objective is to assist individual small businesses in responding accurately to customer inquiries while simultaneously leveraging historical transaction data to identify actionable customer groups, all within their existing messaging workflows through a mobile keyboard interface. The proposed framework combines two complementary components. The first component automatically generates customer replies by retrieving semantically relevant information from a structured business knowledge base and using it to produce grounded, context-aware responses. The second component analyzes invoice records to segment customers into loyal, moderate, and at-risk groups, enabling sellers to tailor promotional strategies based on observed purchasing behavior. The system is implemented as a cloud service accessed by individual enterprises without requiring local infrastructure or model training. System evaluation was conducted using real small business data collected over several weeks. Experimental procedures included retrieval faithfulness analysis, response correctness evaluation with confidence intervals, customer cluster validation using silhouette analysis, end-to-end latency measurement, and structured user acceptance testing. Performance results demonstrate that the retrieval mechanism consistently provides accurate knowledge grounding, while the segmentation module effectively distinguishes high-value and churn-risk customers. The average response time remained within a range perceived as acceptable for real-time mobile conversations, and user testing confirms that the keyboard-based interface does not disrupt normal communication practices. 
The findings indicate that embedding retrieval-grounded generation and lightweight customer analytics directly into daily messaging tools can significantly improve the operational efficiency of small enterprises. This integrated approach reduces the burden of manual response handling while enabling data-driven promotional decision making. The framework offers a practical pathway for adopting artificial intelligence in small business environments and provides a foundation for future enhancements such as temporal behavior modeling and multilingual support.
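
To make the recency, frequency, and monetary features concrete, here is a small sketch that derives them from invoice records and assigns the three group labels named in the abstract. The abstract's use of silhouette analysis suggests the authors cluster customers rather than apply fixed rules, so the threshold-based `segment` function below is a simplified stand-in, and its cut-off values are hypothetical.

```python
from datetime import date


def rfm_table(invoices, today):
    """Aggregate (customer_id, invoice_date, amount) records into RFM features."""
    table = {}
    for customer, day, amount in invoices:
        row = table.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0})
        row["last"] = max(row["last"], day)
        row["frequency"] += 1
        row["monetary"] += amount
    return {
        customer: {
            "recency_days": (today - row["last"]).days,
            "frequency": row["frequency"],
            "monetary": row["monetary"],
        }
        for customer, row in table.items()
    }


def segment(rfm, max_recency_days=30, min_frequency=3):
    """Rule-based labels; both thresholds are hypothetical examples."""
    if rfm["recency_days"] <= max_recency_days and rfm["frequency"] >= min_frequency:
        return "loyal"
    if rfm["recency_days"] > 3 * max_recency_days:
        return "at-risk"
    return "moderate"


invoices = [
    ("a", date(2024, 6, 1), 10.0),
    ("a", date(2024, 6, 15), 20.0),
    ("a", date(2024, 6, 25), 30.0),
    ("b", date(2024, 1, 1), 50.0),
    ("c", date(2024, 6, 1), 15.0),
]
features = rfm_table(invoices, today=date(2024, 7, 1))
labels = {customer: segment(f) for customer, f in features.items()}
print(labels)
```

A clustering approach would replace `segment` with a model fitted on the same three features, leaving `rfm_table` unchanged.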

Unraveling Indonesian heritage through pattern recognition using YOLOv5
Rosalina Rosalina; Genta Sahuri
Computer Science and Information Technologies Vol 5, No 3: November 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/csit.v5i3.p265-271

Abstract

This research focuses on three iconic Indonesian batik patterns (Kawung, Mega Mendung, and Parang) because of their cultural significance and recognition. Kawung symbolizes harmony, Mega Mendung represents power, and Parang signifies protection and spiritual power. Using the YOLOv5 deep learning model, the study aimed to accurately identify these patterns. Results showed mean average precision (mAP) scores of 77% for Kawung, 80% for Parang, and an impressive 99% for Mega Mendung. The highest precision results were 91% for Kawung, 88% for Parang, and 77% for Mega Mendung. These findings highlight the potential of pattern recognition in preserving cultural heritage. Understanding these designs contributes to the appreciation of Indonesia's culture. The research suggests applications in cultural studies, digital archiving, and the textile industry, ensuring the legacy of these patterns endures.
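
Detection metrics such as precision and mAP depend on deciding whether a predicted box matches a labeled one, which is usually done with intersection over union (IoU). The sketch below shows that calculation for axis-aligned boxes. The 0.5 matching threshold is a common convention and an assumption here; the abstract does not state which threshold was used.

```python
def iou(box_a, box_b):
    """Intersection over union for boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp to zero so that non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(predicted, labeled, threshold=0.5):
    """Count a detection as correct when IoU meets the (assumed) threshold."""
    return iou(predicted, labeled) >= threshold


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Precision is then the share of detections that count as true positives, and average precision summarizes precision across recall levels for one class.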

Tower Defense Game based on 2D Grid Using Goal-Based Pathfinding Method
Genta Sahuri; Rosalina Rosalina; Hardwin Welly Tulili Panandu
International Journal of Management Science and Information Technology Vol. 3 No. 1 (2023): January - June 2023
Publisher : Lembaga Komunitas Informasi Teknologi Aceh (KITA), Indonesia

DOI: 10.35870/ijmsit.v3i1.819

Abstract

At present, agents in tower defense games cannot choose their own paths with any flexibility. A single level of a tower defense game may contain many enemies. Most in-game characters tend to move toward goals or objectives, though most have distinctive numbers and behaviors. In an AI movement system, a pathfinding method can determine the route between the source coordinates and the destination coordinates. In this study, a goal-based pathfinding technique is used in a tower defense game where players can choose their own route. Based on the test results, the game can change the destination, which forces the adversary to alter course to reach the new location. By placing units that block paths, the game can also alter the paths available on the map.
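
The abstract does not name the exact algorithm, so the following is one common reading of goal-based pathfinding on a 2D grid: a breadth-first search from the goal produces a distance field, and every enemy simply steps to the neighboring cell with the smallest distance. Changing the goal or blocking a cell only requires recomputing the field. The grid encoding and function names are assumptions for illustration.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def distance_field(grid, goal):
    """Breadth-first search from the goal; '#' cells are blocked.

    Returns a dict mapping each reachable (row, col) to its step count.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def next_step(dist, cell):
    """Move to the reachable neighbor nearest the goal, or stay in place."""
    r, c = cell
    candidates = [cell] + [(r + dr, c + dc) for dr, dc in STEPS]
    reachable = [p for p in candidates if p in dist]
    return min(reachable, key=dist.get, default=cell)


open_map = ["...", "..."]
blocked_map = [".#.", "..."]  # a unit placed at (0, 1) blocks the direct route
goal = (0, 2)
print(next_step(distance_field(open_map, goal), (0, 0)))
print(next_step(distance_field(blocked_map, goal), (0, 0)))
```

Because all enemies share one field per goal, the cost of pathfinding does not grow with the number of enemies on the level, only with the grid size.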