This study addresses the challenge of accurately detecting emerging technology topics in large-scale, heterogeneous data sources. The method comprises four sequential stages: (1) multi-source data acquisition from academic papers, patents, policies, and technical reports; (2) BERTopic-based topic modeling, which uses BERT embeddings and c-TF-IDF for richer semantic representation; (3) topic consolidation through cosine similarity analysis of topic vectors; and (4) emerging topic identification via a weighted evaluation system covering four dimensions: novelty, growth, continuity, and impact. Applied to the new energy vehicle domain with data from 2010 to 2022, the framework identified 16 candidate emerging technology topics from 27,058 academic papers and 54,572 patents. Validation shows that 12 of the 16 identified topics (75%) align with technological priorities outlined in government policies and industry reports. The method also captures cross-domain technological convergence: four topics are common to the academic and patent datasets, and these are concentrated mainly in battery technology.
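Stages (3) and (4) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the greedy grouping strategy, the 0.8 similarity threshold, the equal weights, and the function names `consolidate_topics` and `emergence_score` are all assumptions made for illustration, and the indicator values are assumed to be normalized to the range 0 to 1 beforehand.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def consolidate_topics(topic_vectors, threshold=0.8):
    """Greedily group topics by similarity to each group's first member.

    topic_vectors maps a topic name to its vector. The threshold is a
    hypothetical value, not one taken from the study.
    """
    groups = []
    for name, vec in topic_vectors.items():
        for group in groups:
            representative = topic_vectors[group[0]]
            if cosine_similarity(vec, representative) >= threshold:
                group.append(name)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([name])
    return groups


def emergence_score(indicators, weights):
    """Weighted sum over the novelty, growth, continuity, and impact indicators."""
    return sum(weights[key] * indicators[key] for key in weights)


# Hypothetical equal weights; the study's actual weighting is not given here.
EQUAL_WEIGHTS = {"novelty": 0.25, "growth": 0.25, "continuity": 0.25, "impact": 0.25}
```

In use, `consolidate_topics` would take the topic vectors produced by the topic model for each corpus, and `emergence_score` would then rank the consolidated topics so that the highest-scoring ones become candidates. A greedy pass is order-dependent; hierarchical clustering over the same similarity matrix is a common alternative.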
Copyrights © 2025