Maulidya, Luthfi
Unknown Affiliation

Journal : Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI)

Multi-Source Data Fusion for Data Extraction and Integration of Scientific Publications in Academic Institution STIS
Maulidya, Luthfi; Suadaa, Lya Hulliyyatus; Wijayanto, Arie Wahyu; Ridho, Farid
Jurnal Nasional Pendidikan Teknik Informatika: JANAPATI Vol. 14 No. 2 (2025)
Publisher : Prodi Pendidikan Teknik Informatika Universitas Pendidikan Ganesha

DOI: 10.23887/janapati.v14i2.87050

Abstract

Scientific publication data are among the most important data for academic and research institutions: they serve as a reference for measuring lecturers' research performance, assessing study programs and university accreditation, identifying research trends, and planning research development policies and strategies. Because these data are spread across many databases, meeting this need requires collecting and integrating records from several sources. One portal that provides scientific publication data for universities in Indonesia is Sinta (Science and Technology Index), which integrates Scopus, Web of Science (WoS), Garba Rujukan Digital (Garuda), and Google Scholar. Sinta has limitations, however: some publication metadata are still not covered, such as the Digital Object Identifier (DOI), abstract, author's full name, publication or journal name, publication type, and number of citations. In addition, each data source uses a different data format, so the data must be processed before they can be integrated, and doing this manually rather than computationally is very inefficient. This study therefore proposes a data engineering pipeline framework for extracting and integrating scientific publication data from multiple sources, using the multi-source data fusion method with the Unified Cube methodology, and implements it as a web interface. Politeknik Statistika STIS, Jakarta, serves as the case study. The framework follows the data engineering lifecycle and a multi-source data fusion method based on abstraction levels. The transformed data are then classified using rule-based classification. The results show that the accuracy of the framework was more than 90% and the accuracy of the classification results was 87.5%.
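
To make the integration step more concrete, the following is a minimal sketch of record-level fusion across sources. It is not the authors' Unified Cube implementation: the field names, the `SOURCE_PRIORITY` ordering, and the matching rule (DOI first, then a normalized title) are all assumptions made for illustration.

```python
import re

# Hypothetical trust order: records from earlier sources fill fields first.
SOURCE_PRIORITY = ["scopus", "wos", "garuda", "scholar"]


def normalize_title(title):
    """Lowercase and collapse punctuation so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def fuse(records):
    """Merge per-source records that describe the same publication.

    Each record is a dict with at least "source" and "title"; "source"
    must be one of SOURCE_PRIORITY. Records match on DOI when present,
    otherwise on normalized title. A field is filled by the first
    (highest-priority) source that has a non-empty value for it.
    """
    fused = []     # merged publications, in order of first appearance
    by_doi = {}    # lowercased DOI -> merged publication
    by_title = {}  # normalized title -> merged publication

    ordered = sorted(records, key=lambda r: SOURCE_PRIORITY.index(r["source"]))
    for rec in ordered:
        doi = (rec.get("doi") or "").strip().lower()
        title_key = normalize_title(rec["title"])

        target = by_doi.get(doi) if doi else None
        if target is None:
            target = by_title.get(title_key)
        if target is None:
            target = {"sources": []}
            fused.append(target)

        for field, value in rec.items():
            if field == "source" or value in (None, ""):
                continue
            if not target.get(field):
                target[field] = value  # fill gaps only, never overwrite

        target["sources"].append(rec["source"])
        if doi:
            by_doi[doi] = target
        by_title[title_key] = target
    return fused


if __name__ == "__main__":
    sample = [
        {"source": "scholar", "title": "A Study of X!", "doi": "", "citations": 3},
        {"source": "scopus", "title": "A study of X", "doi": "10.1000/example"},
    ]
    for publication in fuse(sample):
        print(publication)
```

Filling gaps rather than overwriting keeps the highest-priority source authoritative while still recovering metadata, such as citation counts or abstracts, that only a lower-priority source provides. A real pipeline would likely need fuzzier title matching and author-name disambiguation, which this sketch omits.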