Found 3 Documents
Search

Understanding of Convolutional Neural Network (CNN): A Review
Purwono, Purwono; Ma'arif, Alfian; Rahmaniar, Wahyu; Fathurrahman, Haris Imam Karim; Frisky, Aufaclav Zatu Kusuma; Haq, Qazi Mazhar ul
International Journal of Robotics and Control Systems Vol 2, No 4 (2022)
Publisher : Association for Scientific Computing Electronics and Engineering (ASCEE)

DOI: 10.31763/ijrcs.v2i4.888

Abstract

The application of deep learning technology has increased rapidly in recent years. Deep learning technologies increasingly emulate natural human abilities such as knowledge learning, problem-solving, and decision-making. In general, deep learning systems can train themselves without repetitive programming by humans. The convolutional neural network (CNN) is a deep learning algorithm used in a wide range of applications, including image classification, segmentation, object detection, video processing, natural language processing, and speech recognition. A CNN has four main layer types: the convolutional layer, the pooling layer, the fully connected layer, and the non-linear (activation) layer. The convolutional layer applies kernel filters to the input image to extract its fundamental features. The pooling layer, placed between two successive convolutional layers, downsamples the resulting feature maps. The fully connected layer commonly serves as the output layer of the network. The activation function defines the output of a neuron, for example 'yes' or 'no'. The most common and popular CNN activation functions are Sigmoid, Tanh, ReLU, Leaky ReLU, Noisy ReLU, and parametric linear units. The organization and function of the visual cortex greatly influence CNN architecture because it is designed to resemble the neuronal connections in the human brain. Popular CNN architectures include LeNet, AlexNet, and VGGNet.
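The convolution → non-linearity → pooling pipeline described in the abstract can be sketched with plain NumPy. This is a minimal illustration, not the review's own code; the toy image, kernel values, and pooling window size are made up for the example:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """ReLU non-linearity: max(0, x)."""
    return np.maximum(0, x)

def max_pool(x, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
kernel = np.array([[-1., 0.], [0., 1.]])           # toy 2x2 diagonal-difference filter
feat = max_pool(relu(conv2d(image, kernel)))       # conv -> ReLU -> pool
print(feat.shape)  # (2, 2)
```

A real CNN would learn the kernel weights by backpropagation and stack many such layers before the fully connected output layer; this sketch only shows the forward pass of one layer group.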
Regularized Xception for facial expression recognition with extra training data and step decay learning rate
Azrien, Elang Arkanaufa; Hartati, Sri; Frisky, Aufaclav Zatu Kusuma
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 4: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i4.pp4703-4710

Abstract

Despite extensive research on facial expression recognition, achieving high accuracy remains challenging. The objective of this study is to improve the accuracy of current models by adjusting the model structure, the training data, and the training procedure. Incorporating regularization into the Xception architecture, augmenting the training data, and applying a step decay learning rate together address and surpass current limitations. Evaluation on the facial expression recognition (FER2013) dataset demonstrates a substantial improvement in accuracy, reaching 94.34%. This study suggests avenues for enhancing facial expression recognition systems, specifically targeting the need for higher accuracy in this domain.
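A step decay learning rate, as mentioned in the abstract, drops the rate by a fixed factor at regular epoch intervals. The sketch below uses placeholder values (initial rate 0.001, halving every 10 epochs), not the paper's actual hyperparameters:

```python
def step_decay(epoch, initial_lr=0.001, drop=0.5, epochs_per_drop=10):
    """Step-decay schedule: multiply the learning rate by `drop`
    every `epochs_per_drop` epochs (illustrative values only)."""
    return initial_lr * (drop ** (epoch // epochs_per_drop))

print(step_decay(0))   # 0.001
print(step_decay(10))  # 0.0005
print(step_decay(25))  # 0.00025
```

In a Keras-style training loop, a function like this would typically be wrapped in a learning-rate scheduler callback so the rate is updated at the start of each epoch.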
Computer vision syndrome prevention: detection of expression and eye distance with monitor screens
Frisky, Aufaclav Zatu Kusuma; Azrien, Elang Arkanaufa; Sumiharto, Raden; Hartati, Sri
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 6: December 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i6.pp4533-4540

Abstract

Computer vision syndrome (CVS) is a vision-related complaint caused by computer usage. CVS can be analyzed through facial expressions detected by a camera. Detected expressions are categorized into two groups: safe and dangerous. The safe category comprises the happy, neutral, disgusted, angry, and surprised expressions, while the dangerous category comprises the sad and fearful expressions. This division is based on the similarity of CVS symptoms to facial emotion characteristics. In addition, a feature is implemented that detects the distance between the screen and the user's eyes using the FaceMeshModule, preventing the user's eyes from getting too close to the screen. Both detections trigger warning notifications: when a dangerous-category expression is detected ≥70% of the time each minute, and when the distance between the screen and the eyes is ≤40 cm. Notifications in this program use the Tkinter library message box as a graphical user interface (GUI). Facial expressions are detected using the CascadeClassifier for face detection and extreme inception (Xception) as the facial expression classifier. Expression detection achieved an accuracy of 94%, an F1-score of 94%, a precision of 95%, and a recall of 94%.
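The two warning rules from the abstract (dangerous expressions ≥70% of the time in a minute, eye-to-screen distance ≤40 cm) can be sketched as below. The function name and the per-frame label format are assumptions made for illustration, not the paper's implementation:

```python
def cvs_warning(expressions, distance_cm, danger_ratio=0.70, min_distance_cm=40.0):
    """Return warning flags from one minute of per-frame expression labels
    and the current eye-to-screen distance (assumed input format)."""
    dangerous = {"sad", "fearful"}  # dangerous category per the abstract
    ratio = sum(e in dangerous for e in expressions) / len(expressions)
    return {
        "expression_warning": ratio >= danger_ratio,          # >=70% dangerous frames
        "distance_warning": distance_cm <= min_distance_cm,   # <=40 cm
    }

frames = ["sad"] * 45 + ["neutral"] * 15   # 75% dangerous over the last minute
print(cvs_warning(frames, 35.0))
# {'expression_warning': True, 'distance_warning': True}
```

In the described system, a flag of True would trigger a Tkinter message box rather than a printed dictionary.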