Articles

Found 20 Documents

OPTIMIZATION OF PORTFOLIO USING FUZZY SELECTION Wardani, Rahmania Ayu; Surono, Sugiyarto; Wen, Goh Kang
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 16 No 4 (2022): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol16iss4pp1325-1336

Abstract

The problem of portfolio optimization concerns the allocation of an investor's wealth among several security alternatives so that maximum profit can be obtained. One method for addressing it is Fuzzy Portfolio Selection. This method separates the objective function of return from the objective function of risk to determine the limits of the membership functions to be used. The goal of this study is to apply the Fuzzy Portfolio Selection method to chosen shares in a portfolio optimization problem, to determine the portfolio's return and risk, and to determine the budget proportion of each share. The subject of this study is the shares of 20 companies listed on the Bursa Efek Indonesia from 1 January 2021 until 1 January 2022. The results show that of the 20 shares, 10 are suitable for forming the optimal portfolio: ADRO (0%), ANTM (43.3%), ASII (0%), BBCA (0%), BBRI (0%), BBTN (0%), BRPT (0%), BSDE (0%), ERAA (16%), and INCO (40.7%). The portfolio's expected return is 0.0878895207, or about 8.8%, with a risk of 0.0226022117, or about 2.3%.
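The weight, return, and risk bookkeeping can be sketched as below. This is only the classical mean-variance arithmetic, not the paper's fuzzy two-objective formulation, and the return series is synthetic:

```python
import numpy as np

# Sketch of the portfolio bookkeeping only -- NOT the paper's fuzzy
# two-objective method. The returns below are synthetic, for illustration.
rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(250, 3))  # hypothetical daily returns

# The non-zero weights reported in the abstract: ANTM, ERAA, INCO
w = np.array([0.433, 0.160, 0.407])

port_return = float(w @ returns.mean(axis=0))          # expected return
port_risk = float(np.sqrt(w @ np.cov(returns.T) @ w))  # portfolio std. dev.
```

The budget proportions sum to one, and the risk term is the square root of the quadratic form of the weights with the sample covariance matrix.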
FUZZY TIME SERIES BASED ON THE HYBRID OF FCM WITH CMBO OPTIMIZATION TECHNIQUE FOR HIGH WATER PREDICTION Irsalinda, Nursyiva; Laely, Dera Kurnia; Surono, Sugiyarto
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 17 No 3 (2023): BAREKENG: Journal of Mathematics and Its Applications
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol17iss3pp1245-1256

Abstract

Time series data represent measurements taken over a specific period and are often employed for forecasting. The typical approach in forecasting involves analyzing relationships among estimated variables. In this study, we apply Fuzzy Time Series (FTS) to water level data collected every 10 minutes at the Irish Achill Island observation station. The FTS, which is based on Fuzzy C-Means (FCM), is hybridized with the Cat and Mouse Based Optimizer (CMBO). This hybridization aims to address a weakness inherent in FTS, namely the determination of interval lengths, with the ultimate goal of enhancing prediction accuracy. Before forecasting, we execute the FCM-CMBO process to determine the optimal centroids used to define interval lengths within the FTS framework. Our study utilizes a dataset comprising 52,562 data points obtained from the Kaggle website. We then assess forecasting accuracy using the Mean Absolute Percentage Error (MAPE), where a smaller percentage indicates better performance. The proposed methodology effectively mitigates the limitations of interval length determination and improves forecasting accuracy: the MAPE of FTS-FCM before optimization is 20.180%, while that of FCM-CMBO is notably lower at 18.265%. These results highlight the superior performance of the FCM-CMBO hybrid, which achieves a forecasting accuracy of 81.735% against the actual data.
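The accuracy metric used in this comparison can be sketched as follows; the sample water-level values are hypothetical:

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (lower is better)."""
    actual = np.asarray(actual, float)
    forecast = np.asarray(forecast, float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

# Hypothetical water levels (metres) and forecasts, for illustration only
actual = [2.10, 2.25, 2.40, 2.30]
forecast = [2.00, 2.30, 2.35, 2.45]
error_pct = mape(actual, forecast)
```

A MAPE of 18.265%, as reported for FCM-CMBO, corresponds to the quoted accuracy of 100% − 18.265% = 81.735%.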
Chi-Square Feature Selection with Pseudo-Labelling in Natural Language Processing Afriyani, Sintia; Surono, Sugiyarto; Solihin, Iwan Mahmud
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 8, No 3 (2024): July
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v8i3.22751

Abstract

This study evaluates the effectiveness of the Chi-Square feature selection method in improving the classification accuracy of linear Support Vector Machine, K-Nearest Neighbors, and Random Forest in natural language processing, and introduces Pseudo-Labelling to improve semi-supervised classification performance. This research matters for NLP because accurate feature selection can significantly improve model performance by reducing data noise and focusing on the most relevant information, while Pseudo-Labelling helps exploit unlabelled data, which is particularly useful when labelled data is sparse. The methodology involves collecting relevant datasets, applying the Chi-Square method to filter out significant features, and applying Pseudo-Labelling to train semi-supervised models. The dataset used is text data of public comments related to the 2024 Presidential General Election, obtained by scraping Twitter. It comprises a variety of public comments and opinions about the presidential candidates, including political views, support, and criticism. The experimental results show a significant improvement in classification accuracy to 0.9200, with a precision of 0.8893, recall of 0.9200, and F1-score of 0.8828. Integrating Pseudo-Labelling prominently improves semi-supervised classification performance, suggesting that the combination of Chi-Square and Pseudo-Labelling can improve classification systems in various natural language processing applications. This opens opportunities to develop more efficient methodologies for improving classification accuracy in natural language processing tasks, particularly with linear Support Vector Machine, K-Nearest Neighbors, and Random Forest, as well as in semi-supervised learning.
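The Chi-Square scoring step can be sketched as below. The contingency counts are hypothetical, and this shows only the per-term statistic, not the full pipeline:

```python
import numpy as np

def chi2_term(observed):
    """Chi-square statistic for a term/class contingency table:
    sum((O - E)^2 / E), with E computed from the row/column marginals."""
    observed = np.asarray(observed, float)
    row = observed.sum(axis=1, keepdims=True)
    col = observed.sum(axis=0, keepdims=True)
    expected = row * col / observed.sum()
    return float(((observed - expected) ** 2 / expected).sum())

# Hypothetical counts: rows = [docs with term, docs without],
# columns = [positive class, negative class]
score = chi2_term([[30, 10],
                   [20, 40]])
```

Terms with higher scores depend more strongly on the class label and are retained as features.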
Dynamic Weighted Particle Swarm Optimization - Support Vector Machine Optimization in Recursive Feature Elimination Feature Selection Sya'idah, Irma Binti; Surono, Sugiyarto; Khang Wen, Goh
MATRIK : Jurnal Manajemen, Teknik Informatika dan Rekayasa Komputer Vol. 23 No. 3 (2024)
Publisher : Universitas Bumigora

DOI: 10.30812/matrik.v23i3.3963

Abstract

Feature selection is a crucial step in data preprocessing to enhance machine learning efficiency, reduce computational complexity, and improve classification accuracy. The main challenge in feature selection for classification is identifying the most relevant and informative subset of features to enhance prediction accuracy. Previous studies often resulted in suboptimal subsets, leading to poor model performance and low accuracy. This research aims to enhance classification accuracy by combining Recursive Feature Elimination (RFE) with the Dynamic Weighted Particle Swarm Optimization (DWPSO) and Support Vector Machine (SVM) algorithms. The method uses 12 datasets from the University of California, Irvine (UCI) repository; features are selected via RFE and passed to the DWPSO-SVM algorithm. RFE iteratively removes the weakest features, building a model on the most relevant ones to enhance accuracy. The findings indicate that DWPSO-SVM with RFE significantly improves classification accuracy: for example, accuracy on the Breast Cancer dataset increased from 58% to 76%, and on the Heart dataset from 80% to 97%. The highest accuracy achieved was 100%, on the Iris dataset. These findings lead to the conclusion that RFE in DWPSO-SVM offers consistent and balanced results in True Positive Rate (TPR) and True Negative Rate (TNR), providing reliable and accurate predictions for various applications.
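The elimination loop itself can be sketched as below; here a simple correlation score stands in for the SVM-derived feature weights the study uses, and the data are synthetic:

```python
import numpy as np

def rfe(X, y, n_keep):
    """Minimal RFE sketch: repeatedly drop the feature whose absolute
    correlation with the target is weakest (a stand-in for model weights)."""
    keep = list(range(X.shape[1]))
    while len(keep) > n_keep:
        scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in keep]
        keep.pop(int(np.argmin(scores)))  # remove the weakest feature
    return keep

# Synthetic data where only features 0 and 3 influence the target
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = 2 * X[:, 0] + X[:, 3] + rng.normal(scale=0.1, size=100)
selected = rfe(X, y, n_keep=2)
```

A real implementation would refit the classifier after each removal and rank features by the retrained model's weights.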
Distance Functions Study in Fuzzy C-Means Core and Reduct Clustering Eliyanto, Joko; Surono, Sugiyarto
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol. 7 No. 1 (2021): April
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v7i1.20516

Abstract

Fuzzy C-Means is a distance-based clustering method that applies the concept of fuzzy logic. The clustering process iterates to minimize the objective function, which is the sum, over all data points, of the product of each point's distance to its nearest cluster centroid and its membership degree. As the iterations proceed, the objective function should decrease monotonically. The objective of this research is to observe whether commonly used distance functions satisfy this hypothesis, in order to determine the most suitable distance for Fuzzy C-Means clustering. Seven different distance functions were applied to the same data: five standard datasets and two random datasets were used to test Fuzzy C-Means clustering performance. Accuracy, purity, and the Rand Index were applied to measure the quality of the resulting clusters. The observations show that the distance functions yielding the best cluster quality are the Euclidean, Average, Manhattan, Minkowski, Minkowski-Chebyshev, and Canberra distances. These six distances satisfy the basic hypothesis about the behavior of the objective function in the Fuzzy C-Means clustering method. The only distance that does not satisfy the hypothesis is the Chebyshev distance.
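Most of the compared distance functions have compact definitions; a sketch of five of them (the Average and Minkowski-Chebyshev hybrid variants are omitted for brevity):

```python
import numpy as np

# Distance functions compared in the study; x and y are 1-D arrays.
def euclidean(x, y): return float(np.sqrt(((x - y) ** 2).sum()))
def manhattan(x, y): return float(np.abs(x - y).sum())
def minkowski(x, y, p=3): return float((np.abs(x - y) ** p).sum() ** (1 / p))
def chebyshev(x, y): return float(np.abs(x - y).max())
def canberra(x, y): return float((np.abs(x - y) / (np.abs(x) + np.abs(y))).sum())

x, y = np.array([1.0, 2.0]), np.array([4.0, 6.0])
```

Note that Minkowski with p = 2 reduces to Euclidean and with p = 1 to Manhattan, while Chebyshev is the p → ∞ limit.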
Comparative Evaluation of Feature Selection Methods for Heart Disease Classification with Support Vector Machine Bidul, Winarsi J.; Surono, Sugiyarto; Kurniawan, Tri Basuki
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol. 10 No. 2 (2024): June
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v10i2.28647

Abstract

The purpose of this study is to compare the effectiveness of a variety of feature selection techniques for enhancing the performance of Support Vector Machine (SVM) models in classifying heart disease data, particularly in the context of big data. The main challenge lies in managing large datasets, which necessitates feature selection techniques to streamline the analysis. Several feature selection methods were therefore explored to identify the most efficient approach: Logistic Regression-Recursive Feature Elimination (LR-RFE), Logistic Regression-Sequential Forward Selection (LR-SFS), Correlation-based Feature Selection (CFS), and Variance Threshold. Existing research shows that these methods have a great impact on improving classification accuracy. In this study, combining the SVM model with LR-RFE, LR-SFS, and Variance Threshold yielded the best evaluation results, achieving the highest accuracy of 89%. Comparing the other evaluation metrics, including precision, recall, and F1-score, model performance varied with the feature selection method chosen and the distribution of data used for training and testing. In general, however, LR-RFE-SVM and Variance Threshold-SVM tend to provide better evaluation values than LR-SFS-SVM and SVM-CFS. In terms of computation time, SVM classification with Variance Threshold as the feature selection method was fastest at 118.1540 seconds while retaining 23 important features. It is therefore very important to choose a suitable feature selection technique, taking into account the number of retained features and the computation time. This research underscores the significance of feature selection in addressing big data challenges, particularly in heart disease classification. In addition, this study highlights practical implications for healthcare practitioners and researchers by recommending methods that can be integrated into real-world healthcare settings or existing clinical decision support systems.
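The fastest of the compared selectors, Variance Threshold, is also the simplest; a minimal sketch with toy data:

```python
import numpy as np

def variance_threshold(X, threshold=0.0):
    """Keep the indices of columns whose variance exceeds the threshold
    (a sketch of the Variance Threshold selector compared in the study)."""
    variances = X.var(axis=0)
    return [j for j in range(X.shape[1]) if variances[j] > threshold]

# Toy data: only the first column varies; the other two are constant
X = np.array([[1.0, 5.0, 0.0],
              [2.0, 5.0, 0.0],
              [3.0, 5.0, 0.0]])
kept = variance_threshold(X, threshold=0.0)
```

Because it needs only one pass over the data and no model fitting, it is unsurprising that it recorded the shortest computation time in the comparison.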
Perbandingan 5 Jarak K-Nearest Neighbor pada Analisis Sentimen Mujhid, Almuzhidul; Thobirin, Aris; Firdausy, Salma Nadya; Surono, Sugiyarto; Rahmadani, Lanova Ade
Jurnal Ilmiah Matematika Vol 8, No 2 (2021)
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/konvergensi.v0i0.23170

Abstract

K-Nearest Neighbor (KNN) is an algorithm commonly used for classification. This study uses reviews of the Maxim application on the Google Play Store. Users who have downloaded the Maxim application may leave reviews on the Google Play Store to share information with other users. Applying K-Nearest Neighbor (KNN) to sentiment analysis of Maxim application reviews can determine whether a review is positive, neutral, or negative. The researchers compared five different distances for the KNN method: the Euclidean, Manhattan, Minkowski, Chebyshev, and Canberra distances. The tests yielded different accuracies for KNN classification under each distance: Euclidean 84 percent, Manhattan 79 percent, Minkowski 84 percent, Chebyshev 7 percent, and Canberra 44 percent.
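The classification step with a pluggable distance can be sketched as below, on toy one-dimensional features rather than the study's text vectors:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3,
                dist=lambda a, b: np.abs(a - b).sum()):
    """Minimal KNN sketch: majority label among the k nearest training
    points under a pluggable distance (Manhattan by default)."""
    d = np.array([dist(row, x) for row in X_train])
    nearest = y_train[np.argsort(d)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical 1-D sentiment scores with labels
X_train = np.array([[0.1], [0.2], [0.9], [1.0], [0.8]])
y_train = np.array(["neg", "neg", "pos", "pos", "pos"])
pred = knn_predict(X_train, y_train, np.array([0.85]), k=3)
```

Swapping the `dist` argument for any of the five distance functions reproduces the comparison described in the abstract.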
Hybrid feature fusion from multiple CNN models with bayesian-optimized machine learning classifiers Rismawati, Dewi; Surono, Sugiyarto; Thobirin, Aris
Computer Science and Information Technologies Vol 6, No 3: November 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/csit.v6i3.p315-325

Abstract

Information technology advancements have created big data, necessitating efficient techniques to retrieve useful information. With its capacity to recognize and categorize patterns in data, especially the growing amount of image data, deep learning has become a viable option. This research aims to develop a medical image classification model using chest X-ray images with four classes: Covid-19, Pneumonia, Tuberculosis, and Normal. The proposed method combines the advantages of deep learning and machine learning. Three pre-trained CNN models, VGG16, DenseNet201, and InceptionV3, extract features from the images, and the features from each model are fused to enrich the relevant information. Principal component analysis (PCA) is then applied to reduce the dimensionality of the features, and Bayesian optimization is used to tune the hyperparameters of the machine learning algorithms: support vector machine (SVM), decision tree (DT), and k-nearest neighbors (k-NN). The resulting classification models were evaluated on accuracy, precision, recall, and F1-score. The results showed that FF-SVM, the proposed model, achieved an accuracy of 98.79%, with precision, recall, and F1-score of 98.85%, 98.82%, and 98.84%, respectively. In conclusion, fusing feature extraction from multiple CNN models improved the classification accuracy of each machine learning model and provided reliable, accurate predictions for lung image diagnosis using chest X-rays.
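The fusion and reduction steps can be sketched as below. The per-backbone feature matrices are random stand-ins for the real CNN outputs, and PCA is done directly via SVD:

```python
import numpy as np

# Sketch of the fusion step: feature vectors from several (here simulated)
# CNN backbones are concatenated per image, then reduced with PCA via SVD.
rng = np.random.default_rng(0)
f_vgg, f_dense, f_incep = (rng.normal(size=(8, d)) for d in (16, 12, 10))
fused = np.hstack([f_vgg, f_dense, f_incep])   # one row per image

# PCA: center, take right singular vectors, project onto top components
centered = fused - fused.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ vt[:5].T                  # keep 5 principal components
```

The reduced matrix would then feed the Bayesian-optimized SVM, DT, or k-NN classifier.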
Fuzzy Support Vector Machine Using Linear and Exponential Membership Functions with Mahalanobis Distance Sukeiti, Wiwi Widia; Surono, Sugiyarto
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 6, No 2 (2022): April
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v6i2.6912

Abstract

Support vector machine (SVM) is an effective binary classification technique based on the structural risk minimization (SRM) principle and is known as a successful classification method. Real-life data, however, often contain noise and outliers, which confuse the SVM during processing. In this research, SVM is extended with a fuzzy membership function to lessen the effect of noise and outliers when solving for the hyperplane. Distance calculation is also considered in determining fuzzy membership values, since it is fundamental to measuring the proximity between data elements; membership is generally built on the distance between a point and its true class center. The fuzzy support vector machine (FSVM) here uses the Mahalanobis distance with the goal of finding the best hyperplane separating the defined classes. The data were split into training and testing sets at several partition percentages. Although FSVM is theoretically able to overcome noise and outliers, the results show that the accuracy of FSVM, namely 0.017170689 and 0.018668421, is lower than that of the classical SVM method, which is 0.018838348. The fuzzy membership function is extremely influential in deciding the best hyperplane, so determining the correct fuzzy membership is critical in FSVM problems.
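The Mahalanobis distance used for the membership assignment can be sketched as follows; the covariance matrix and points are illustrative:

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance from x to a class centre: it scales each
    direction by the class covariance, unlike the Euclidean distance."""
    diff = np.asarray(x, float) - np.asarray(mean, float)
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Illustrative class with more spread along the first axis
cov = np.array([[2.0, 0.0],
                [0.0, 0.5]])
d = mahalanobis([2.0, 1.0], [0.0, 0.0], cov)
```

With the identity covariance it reduces to the Euclidean distance; memberships are then typically assigned by a linear or exponential function decreasing in this distance.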
Comparative study of unsupervised anomaly detection methods on imbalanced time series data Hanifa, Riza Aulia; Thobirin, Aris; Surono, Sugiyarto
Jurnal Ilmiah Kursor Vol. 13 No. 2 (2025)
Publisher : Universitas Trunojoyo Madura

DOI: 10.21107/kursor.v13i2.431

Abstract

Anomaly detection in time series data is essential, especially when dealing with imbalanced datasets such as air quality records. This study addresses the challenge of identifying point anomalies (rare and extreme pollution levels) within a highly imbalanced dataset. Failing to detect such anomalies may lead to delayed environmental interventions and poor public health responses. To this end, we present a comparative analysis of three families of unsupervised learning methods: K-means clustering, Isolation Forest (IForest), and Autoencoder (AE), including its LSTM variant. These algorithms are applied to monthly air quality data collected in 2023 from 2,110 cities across Asia. The models are evaluated using Area Under the Curve (AUC), precision, recall, and F1-score to assess their effectiveness in detecting anomalies. Results indicate that the Autoencoder and LSTM Autoencoder outperform the others with an AUC of 98.23%, followed by K-means (97.78%) and IForest (96.01%). The Autoencoder's reconstruction capability makes it highly effective at capturing complex temporal patterns, while K-means and IForest offer efficient and interpretable solutions for structured data. This research highlights the potential of unsupervised anomaly detection techniques for environmental monitoring and provides practical insights into handling imbalanced time series data.
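The reconstruction-error idea behind the Autoencoder results can be sketched with a linear stand-in: a one-component PCA reconstruction on synthetic data, where a real AE or LSTM-AE would instead learn a nonlinear encoding:

```python
import numpy as np

# Sketch of reconstruction-based anomaly scoring: a rank-1 linear
# "autoencoder" (PCA) reconstructs each point; points with large
# reconstruction error are flagged as anomalies. Data are synthetic.
rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(200, 2)) @ np.array([[1.0, 0.9],
                                                     [0.0, 0.1]])
data = np.vstack([normal, [[8.0, -8.0]]])       # one injected outlier

mean = data.mean(axis=0)
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
recon = (data - mean) @ vt[:1].T @ vt[:1] + mean
errors = np.linalg.norm(data - recon, axis=1)   # reconstruction error
anomaly = int(np.argmax(errors))                # index of flagged point
```

Because the injected point lies far off the dominant direction of the data, its reconstruction error dwarfs those of the normal points.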