Chandrasekar, Divya
Unknown Affiliation

Published: 2 Documents


Feature separation of music across diverse dataset: a comparative perspective
Shunmugalingam Parvathi, Sakthidevi; Chandrasekar, Divya
Bulletin of Electrical Engineering and Informatics Vol 14, No 5: October 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v14i5.9962

Abstract

In music, feature separation is the process of separating distinguishable auditory characteristics, such as pitch, timbre, rhythm, and harmonic content, from a complex mixed signal. The topic has attracted considerable attention in areas including virtual reality (VR), gaming, music transcription, karaoke systems, audio restoration, music information retrieval (MIR), music education, and audio forensics. Feature extraction is crucial in music separation because it identifies and isolates sound elements, improving accuracy and reducing noise. It simplifies raw audio into meaningful data for efficient processing and effective model learning; without it, clean separation of audio components is very difficult. In this research, extracting features from mixed audio sources enables clean and accurate isolation of musical elements, enhancing quality, supporting precise evaluation, and boosting neural network performance across varied datasets including DSD100, MUSDB, and MUSDB18-HQ, which collectively provide rich musical content for evaluation and benchmarking. Evaluation metrics such as F1-score, precision, and recall are used to report the performance of the extracted features. The MUSDB18-HQ dataset yielded an overall increase of 17.86% in F1-score, with significant increases for drums (+25.05%) and vocals (+20.04%), showing that the dataset was highly effective for feature separation.
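
The abstract names precision, recall, and F1-score but does not say how they are computed for separated sources. As a reminder of the standard definitions only, here is a minimal sketch; the function name and the true-positive, false-positive, and false-negative counts are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw detection counts.

    tp, fp, fn are hypothetical counts of true positives, false positives,
    and false negatives; how the paper derives them is not stated.
    """
    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is often reported alongside the two individual metrics.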

DeepFloyd-IF via diffusion and U-Net based cross-model attention for semantic coherence
Veilumuthu, Kowsalya; Chandrasekar, Divya; Parvathi, Sakthidevi Shunmugalingam
Bulletin of Electrical Engineering and Informatics Vol 15, No 2: April 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v15i2.9927

Abstract

Text-to-image synthesis is an increasingly challenging problem in artificial intelligence, with impact on gaming, advertising, and multimedia. The practical use of current text-to-image models is limited by the trade-off between semantic coherence and visual quality. To address this, this work presents stable diffusion cross-modal attention with multi-head attention (SD-CMA-MHA), a framework for the DeepFloyd-IF task. It combines stable diffusion with U-Net based cross-modal attention and multi-head attention (MHA) to improve DeepFloyd-IF, a standard for high-quality image synthesis. This allows the model to capture subtle semantic relationships between text and images while dynamically focusing on relevant input features. Experiments on the LAION-1.2B and MS-COCO datasets show that the model achieves 80% generation accuracy, 70% text-image alignment similarity, and reduced divergence from real images, outperforming previous methods. These results indicate that SD-CMA-MHA improves semantic alignment and fidelity. By enabling more reliable and context-aware visual generation, this work bridges the gap between text and visual modalities and has implications for creative industries, education, and human-computer interaction.
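
The abstract does not describe the internals of SD-CMA-MHA, so the following is only a generic sketch of multi-head cross-attention in the usual diffusion U-Net arrangement, where image features supply the queries and text features supply the keys and values. All function names are hypothetical, and the learned projection matrices a real model would use are omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; each query row attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of value rows, one output feature at a time.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def multi_head_cross_attention(image_feats, text_feats, num_heads):
    """Split the feature dimension into heads, attend per head, concatenate.

    image_feats: one feature row per image position (queries).
    text_feats: one feature row per text token (keys and values).
    Simplification: no learned Q/K/V or output projections.
    """
    dim = len(image_feats[0])
    if dim % num_heads != 0 or len(text_feats[0]) != dim:
        raise ValueError("feature size must match and divide by num_heads")
    head_dim = dim // num_heads
    outputs = [[] for _ in image_feats]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q = [row[lo:hi] for row in image_feats]
        kv = [row[lo:hi] for row in text_feats]
        for out_row, head_row in zip(outputs, attention(q, kv, kv)):
            out_row.extend(head_row)
    return outputs


if __name__ == "__main__":
    image = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
    text = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
    for row in multi_head_cross_attention(image, text, num_heads=2):
        print([round(x, 3) for x in row])
```

Each output row is a weighted average of the text rows, so every image position ends up carrying a mixture of text information weighted by similarity; splitting into heads lets different slices of the feature vector attend to different tokens.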