Understanding the underlying themes in presidential speeches is critical for analyzing political discourse and determining public policy direction. However, topic modeling in this context presents difficulties, particularly when clustering semantically rich topics from high-dimensional embeddings. This study seeks to improve topic modeling performance by incorporating a Neural Network Clustering (NNC) approach into the BERTopic pipeline. We analyze 2,747 speeches delivered by U.S President Joe Biden (2021-2025) and compare three clustering techniques: HDBSCAN, KMeans, and the proposed Autoencoder-based NNC. The evaluation metrics (UMass, NPMI, Topic Diversity) show that NNC produces the most coherent and diverse topic clusters (UMass = -0.4548, NPMI = 0.0234, Diversity = 0.3950, ). These findings show that NNC can overcome the limitations of density and centroid-based clustering in high-dimensional semantic spaces. The study contributes to the field of Natural Language Processing by demonstrating how neural-based clustering can improve topic modeling, particularly for complex, real-world political corpora.
Copyrights © 2025