This paper proposes a big data pipeline for analyzing user behavior on Last.fm, a popular music streaming platform that collects large amounts of user activity data. The aim is to make data-driven recommendations for improving user engagement and attracting new users. The study presents a comprehensive analysis of user behavior in the music streaming industry using the Hadoop ecosystem and data analytics techniques. The proposed pipeline uses the Hadoop Distributed File System (HDFS) for data storage and Apache Pig for data transformation, leading to improved data preprocessing and analysis. Several analyses are conducted: identifying the most-listened-to artists, ranking top users by song consumption and by social connections, measuring artist popularity by tag, and finding the most recently tagged artists. The findings provide insight into user preferences and current trends, and point to opportunities for enhancing the recommendation algorithm and user engagement. The paper concludes with recommendations for personalized marketing strategies and curated playlists intended to increase user satisfaction and revenue.
Copyrights © 2024