Data Science Insights
Vol. 2 No. 1 (2024): Journal of Data Science Insights

Database-Specific Keyword Frequency Analysis in Merged Web Log Data: A Preprocessing Method

Wan Ishak, Wan Hussain
Nurul Farhana Ismail
Fadhilah Mat Yamin
Husin, Abdullah



Article Info

Publish Date
29 Feb 2024

Abstract

This study examines web log data from the Electronic Resources module of the Perpustakaan Sultanah Bahiyah (PSB) website at Universiti Utara Malaysia (UUM). The module is a core part of the university's academic infrastructure, serving as the gateway that connects the UUM academic community to a large repository of scholarly information. To handle the size and complexity of the web log data, the research applies a preprocessing method that restructures the raw data, cleans outliers, and identifies user sessions, laying the foundation for further analysis. The study then identifies the search keywords embedded in the log file through a systematic process that transforms the data into a structured format. The subsequent extraction of databases and keywords yields notable findings, with the IEEE and Serial Solution databases featuring prominently. The analysis of 19,146 keywords associated with 11 databases offers insights into user behavior, preferences, and the overall effectiveness of the Electronic Resources module. Identifying frequent keywords not only provides analytical insight but can also speed up users' searches, reducing cognitive load and supporting a more efficient research experience. This research contributes to the optimization of user experiences and the ongoing refinement of digital library services, aligning them with the evolving needs of the academic community.
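The pipeline outlined in the abstract (session identification, keyword extraction, and per-database frequency counting) might look roughly like the minimal Python sketch below. It is an illustration under stated assumptions, not the authors' implementation: the record layout, the host names, the `q` query parameter, the host-to-database mapping, and the 30-minute session timeout are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta
from urllib.parse import urlparse, parse_qs

# Hypothetical merged-log records: (user id, ISO timestamp, requested URL).
# The layout, the host names, and the "q" parameter are assumptions.
SAMPLE_LOG = [
    ("u1", "2024-01-05T09:00:00", "https://ieeexplore.example/search?q=Neural+Networks"),
    ("u1", "2024-01-05T09:10:00", "https://ieeexplore.example/search?q=neural%20networks"),
    ("u1", "2024-01-05T11:00:00", "https://serials.example/find?q=data+mining"),
    ("u2", "2024-01-05T09:05:00", "https://ieeexplore.example/search?q=data+mining"),
    ("u2", "2024-01-05T09:06:00", "https://ieeexplore.example/favicon.ico"),
]

# Hypothetical mapping from request host to database label.
HOST_TO_DATABASE = {
    "ieeexplore.example": "IEEE",
    "serials.example": "Serial Solution",
}

# Inactivity timeout used to split sessions; a common heuristic,
# not a value taken from the paper.
SESSION_TIMEOUT = timedelta(minutes=30)


def extract_keyword(url):
    """Return (database, keyword) for a search URL, or None otherwise."""
    parsed = urlparse(url)
    database = HOST_TO_DATABASE.get(parsed.hostname)
    terms = parse_qs(parsed.query).get("q")
    if database is None or not terms:
        return None
    # Normalize case and whitespace so variants count as one keyword.
    keyword = " ".join(terms[0].lower().split())
    return (database, keyword) if keyword else None


def count_sessions(records):
    """Count sessions per user, starting a new one after a long gap."""
    last_seen = {}
    sessions = Counter()
    for user, stamp, _url in sorted(records, key=lambda r: (r[0], r[1])):
        moment = datetime.fromisoformat(stamp)
        previous = last_seen.get(user)
        if previous is None or moment - previous > SESSION_TIMEOUT:
            sessions[user] += 1
        last_seen[user] = moment
    return sessions


def keyword_frequencies(records):
    """Build a per-database Counter of normalized search keywords."""
    counts = defaultdict(Counter)
    for _user, _stamp, url in records:
        hit = extract_keyword(url)
        if hit is not None:
            database, keyword = hit
            counts[database][keyword] += 1
    return counts


if __name__ == "__main__":
    for database, counter in keyword_frequencies(SAMPLE_LOG).items():
        print(database, counter.most_common(3))
    print(dict(count_sessions(SAMPLE_LOG)))
```

Requests without a recognizable search term (such as the favicon request in the sample) are simply skipped, which stands in for the cleaning step; a real log would likely need additional filtering rules.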

Copyright © 2024

Journal Info

Abbrev

jdsi

Publisher

PT Visi Media Network

Subject

Computer Science & IT Engineering

Description

Data Science Insights (ISSN 3031-1268, online), published by PT Visi Media Network, is a journal that publishes research articles within its focus and scope, which include Data Science and Machine Learning; Data Science and AI; Blockchain and Advance Data Science; Cloud computing and Big Data; Business ...