Found 2 Documents

Scalable Machine Learning Approaches for Real-Time Anomaly and Outlier Detection in Streaming Environments
Dewi, Deshinta Arrova; Singh, Harprith Kaur Rajinder; Periasamy, Jeyarani; Kurniawan, Tri Basuki; Henderi, Henderi; Hasibuan, M. Said
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.444

Abstract

The prevalence of streaming data across various sectors poses significant challenges for real-time anomaly detection due to its volume, velocity, and variability. Traditional data processing methods often fall short in such dynamic environments, necessitating robust, scalable, and efficient real-time analysis systems. This study compares two advanced machine learning approaches, LSTM autoencoders and Matrix Profile algorithms, to identify the most effective method for anomaly detection in streaming environments using the NYC taxi dataset. Existing literature on anomaly detection in streaming data highlights various methodologies, including statistical tests, window-based techniques, and machine learning models. Traditional methods such as the Generalized ESD test have been adapted for streaming data but often require a full historical dataset to function effectively. In contrast, machine learning approaches, particularly those using LSTM networks, are noted for their ability to learn complex patterns and dependencies, offering promising results in real-time applications. In a comparative analysis, the LSTM autoencoder significantly outperformed the other methods, achieving an F1-score of 0.22 for anomaly detection, notably higher than the other techniques. This model demonstrated superior capability in capturing temporal dependencies and complex data patterns, making it highly effective for the dynamic and varied data in the NYC taxi dataset. The LSTM autoencoder's advanced pattern recognition and anomaly detection capabilities confirm its suitability for complex, high-velocity streaming data environments. Future research should explore the integration of LSTM autoencoders with other machine learning techniques to further enhance the accuracy, scalability, and efficiency of anomaly detection systems. This study advances our understanding of scalable machine learning approaches and underscores the critical importance of selecting models suited to the specific characteristics and challenges of the data involved.
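
As a rough illustration of the approach this abstract describes (not the authors' implementation), the sketch below trains a small LSTM autoencoder on sliding windows of a univariate series and flags windows whose reconstruction error exceeds a simple threshold. The window length, layer sizes, and synthetic stand-in data are assumptions chosen for demonstration only.

```python
# Illustrative sketch: LSTM autoencoder for window-based anomaly detection.
# Assumes a univariate stream such as half-hourly NYC taxi ride counts;
# all hyperparameters here are hypothetical.
import numpy as np
from tensorflow.keras import layers, models

WINDOW = 48  # hypothetical window length (e.g. one day of half-hourly counts)

def make_windows(series, window=WINDOW):
    """Slice a 1-D series into overlapping windows shaped (n, window, 1)."""
    idx = np.arange(window)[None, :] + np.arange(len(series) - window + 1)[:, None]
    return series[idx][..., None].astype("float32")

def build_autoencoder(window=WINDOW):
    """Encoder compresses each window to a latent vector; decoder reconstructs it."""
    inp = layers.Input(shape=(window, 1))
    z = layers.LSTM(32)(inp)                        # encoder -> latent vector
    x = layers.RepeatVector(window)(z)              # repeat latent per timestep
    x = layers.LSTM(32, return_sequences=True)(x)   # decoder
    out = layers.TimeDistributed(layers.Dense(1))(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

# Synthetic stand-in for the streaming signal, with an injected anomaly.
series = np.sin(np.linspace(0, 60, 3000)) + 0.05 * np.random.randn(3000)
series[1500:1510] += 3.0

X = make_windows(series)
model = build_autoencoder()
model.fit(X, X, epochs=5, batch_size=64, verbose=0)

# High reconstruction error ~ anomalous behaviour; threshold is a simple heuristic.
err = np.mean((model.predict(X, verbose=0) - X) ** 2, axis=(1, 2))
threshold = err.mean() + 3 * err.std()
print("anomalous windows:", np.where(err > threshold)[0])
```

In a streaming setting the same model would be applied to each incoming window as it arrives, with the threshold periodically recalibrated on recent error statistics.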
Incorporate Transformer-Based Models for Anomaly Detection
Dewi, Deshinta Arrova; Singh, Harprith Kaur Rajinder; Periasamy, Jeyarani; Kurniawan, Tri Basuki; Henderi, Henderi; Hasibuan, M. Said; Nathan, Yogeswaran
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.762

Abstract

This paper explores the effectiveness of Transformer-based models, specifically the Time-Series Transformer (TST) and the Temporal Fusion Transformer (TFT), for anomaly detection in streaming data. We review related work on anomaly detection models, highlighting the limitations of traditional methods in speed, accuracy, and scalability. While LSTM autoencoders are known for their ability to capture temporal patterns, they suffer from high memory consumption and slower inference times. The Matrix Profile, though memory-efficient, delivers lower performance in detecting anomalies. To address these challenges, we propose Transformer-based models, which leverage the self-attention mechanism to capture long-range dependencies, process sequences in parallel, and achieve superior performance in both accuracy and efficiency. Our experiments show that the TFT outperforms the other models with an F1-score of 0.92 and a Precision-Recall AUC of 0.71, demonstrating significant improvements in anomaly detection. The TST model also shows competitive performance, with an F1-score of 0.88 and a Precision-Recall AUC of 0.68, offering a more efficient alternative to LSTMs. The results underscore that Transformer models, particularly the TST and TFT, provide a robust solution for anomaly detection in real-time applications, offering improved performance, faster inference times, and lower memory usage than traditional models. In conclusion, Transformer-based models stand out as the most effective and scalable solution for large-scale, real-time anomaly detection in streaming time-series data, paving the way for their broader application across various industries. Future work will focus on further optimizing these models and exploring hybrid approaches to enhance detection capabilities and real-time performance.
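
For illustration only (the TST and TFT architectures studied in the paper are considerably richer), the sketch below wires a single self-attention block into a reconstruction-based detector and scores it with the same metrics the abstract reports, F1-score and Precision-Recall AUC, via scikit-learn. The window length, model size, and synthetic labelled data are assumptions.

```python
# Illustrative sketch: a minimal self-attention (Transformer-encoder) block
# trained to reconstruct windows of a series; anomalies are scored by
# reconstruction error and evaluated with F1 and PR-AUC. Not the paper's
# TST/TFT implementations; all settings are hypothetical.
import numpy as np
from tensorflow.keras import layers, models
from sklearn.metrics import f1_score, average_precision_score

WINDOW, D_MODEL = 48, 32   # hypothetical window length and embedding size

def build_transformer(window=WINDOW, d_model=D_MODEL, heads=4):
    """One self-attention block over the window, trained to reconstruct it."""
    inp = layers.Input(shape=(window, 1))
    x = layers.Dense(d_model)(inp)                                   # per-step embedding
    attn = layers.MultiHeadAttention(num_heads=heads, key_dim=d_model)(x, x)
    x = layers.LayerNormalization()(x + attn)                        # residual + norm
    ff = layers.Dense(d_model, activation="relu")(x)                 # feed-forward
    x = layers.LayerNormalization()(x + ff)
    out = layers.Dense(1)(x)                                         # reconstructed series
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

def make_windows(series, window=WINDOW):
    idx = np.arange(window)[None, :] + np.arange(len(series) - window + 1)[:, None]
    return series[idx][..., None].astype("float32")

# Synthetic series with a labelled anomalous span (stand-in for real taxi data).
series = np.sin(np.linspace(0, 60, 3000)) + 0.05 * np.random.randn(3000)
series[2000:2010] += 3.0
X = make_windows(series)
labels = np.zeros(len(X), dtype=int)
labels[2000 - WINDOW + 1:2010] = 1          # windows overlapping the injected spike

model = build_transformer()
model.fit(X, X, epochs=5, batch_size=64, verbose=0)

score = np.mean((model.predict(X, verbose=0) - X) ** 2, axis=(1, 2))
pred = (score > score.mean() + 3 * score.std()).astype(int)
print("F1-score:", f1_score(labels, pred))
print("PR-AUC  :", average_precision_score(labels, score))
```

Because self-attention processes all timesteps of a window in parallel rather than sequentially, this style of model tends to infer faster than recurrent alternatives, which is the efficiency argument the abstract makes for TST and TFT.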