Unlike human speech, industrial machine sounds (IMS) have energy spread across a wide range of frequencies. The Mel scale, which was developed to model human auditory perception, fails to capture the complete information present in IMS. To improve performance, we propose using an inverse-Mel scale, along with the concatenation and combination of Mel and inverse-Mel scale based spectrograms, as feature vectors for audio anomaly detection (AAD) in industrial machines. We adapt the Librosa Python package and the DCASE 2022 Challenge Task 2 baseline system to construct inverse-Mel scale spectrograms. Experiments are conducted on the Malfunctioning Industrial Machine Investigation and Inspection for Domain Generalization (MIMII DG) datasets. Systems based on the inverse-Mel scale achieve a maximum improvement of 37% for the bearing machine and an average improvement of up to 9% in the area under the curve (AUC) score across all machines in the MIMII DG datasets. The proposed features also enhance domain generalization (DG), mitigating the effects of environmental and operational domain shifts caused by variations in recording setup, load, background noise, and operational patterns. The official challenge evaluators assessed the proposed system on the evaluation datasets and ranked it three positions higher than the baseline system.
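
The idea of an inverse-Mel filterbank and of concatenated Mel and inverse-Mel features can be illustrated with a minimal NumPy sketch. It assumes that the inverse-Mel filterbank is obtained by mirroring a triangular Mel filterbank along the frequency axis, which is one common construction; the adapted Librosa code used in this work may differ in details such as normalization. All function names and parameter values below are hypothetical and chosen only for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style Mel formula (an assumption; Librosa defaults to the Slaney variant)
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filterbank of shape (n_mels, 1 + n_fft // 2)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, 1 + n_fft // 2)
    # n_mels + 2 band edges, equally spaced on the Mel axis
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def inverse_mel_filterbank(sr, n_fft, n_mels):
    """Mirror the Mel filterbank so that resolution is finest at high frequencies.

    Columns are flipped to map frequency f to sr/2 - f; rows are flipped so
    that filter 0 still covers the lowest band.
    """
    return mel_filterbank(sr, n_fft, n_mels)[::-1, ::-1].copy()

def concat_log_features(power_spec, sr, n_fft, n_mels, eps=1e-10):
    """Stack log-Mel and log-inverse-Mel energies along the feature axis.

    power_spec has shape (1 + n_fft // 2, n_frames).
    """
    mel = mel_filterbank(sr, n_fft, n_mels) @ power_spec
    imel = inverse_mel_filterbank(sr, n_fft, n_mels) @ power_spec
    return np.log(np.concatenate([mel, imel], axis=0) + eps)
```

For a power spectrogram with 513 frequency bins (`n_fft=1024`) and `n_mels=32`, `concat_log_features` returns 64 feature rows per frame: 32 from the Mel scale, which resolves low frequencies finely, and 32 from the inverse-Mel scale, which resolves high frequencies finely. Only the concatenation variant is sketched here.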
Copyrights © 2025