International Journal of Informatics and Data Science
Vol. 1 No. 2 (2024): June 2024

Clustering of YouTube Viewer Data Based on Preferences using Leiden Algorithm

Erlin Windia Ambarsari (Unknown)
Aulia Paramita (Unknown)
Desyanti (Unknown)



Article Info

Publish Date
29 Jun 2024

Abstract

This study analyzes YouTube viewer engagement patterns by applying the Leiden algorithm for clustering based on user interactions such as likes, dislikes, and subscription behaviors in correlation with video duration. The dataset is sourced from the YouTube content creator @ArmanVesona and includes 237 instances with ten features: Shares, Comments Added, Dislikes, Likes, Subscribers Lost, Subscribers Gained, Views, Watch Time (hours), Impressions, and Click-Through Rate (%). The method begins with data cleaning to ensure completeness, followed by selection of relevant features and z-score normalization to equalize their contributions. A similarity graph is then constructed using cosine similarity, representing instances as nodes and their relationships as edges. The Leiden algorithm is applied to this graph to optimize modularity and extract clusters, and the resulting cluster assignments are integrated into the original dataset for analysis. Dimensionality reduction using PCA facilitates cluster visualization, while statistical summaries and distribution plots provide deeper insight into cluster characteristics. The analysis reveals two distinct clusters: Cluster 0, characterized by lower engagement and a stable audience, and Cluster 1, which exhibits higher engagement but also higher subscriber churn. The findings highlight the effectiveness of the Leiden algorithm in detecting well-connected communities and provide insights into viewer behavior that can support improved content strategies and targeted marketing approaches.
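The preprocessing and graph-construction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy rows (three hypothetical features per instance), and the 0.5 similarity threshold for keeping an edge are all assumptions made for illustration, since the abstract does not say how edges were selected. The sketch does not run the Leiden search itself; it only computes the weighted modularity score that the search is meant to optimize, and in practice a library such as `leidenalg` would perform that search over the graph.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature column to mean 0 and population std 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_graph(rows, threshold=0.5):
    """Weighted edges {(i, j): w}, i < j, for pairs whose cosine similarity
    exceeds the threshold. The threshold value is an assumption."""
    edges = {}
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            w = cosine(rows[i], rows[j])
            if w > threshold:
                edges[(i, j)] = w
    return edges


def modularity(edges, labels):
    """Weighted modularity: sum over clusters of
    internal_weight / m - (cluster_strength / 2m) ** 2."""
    m = sum(edges.values())
    if m == 0:
        return 0.0
    strength = defaultdict(float)
    internal = defaultdict(float)
    for (i, j), w in edges.items():
        strength[labels[i]] += w
        strength[labels[j]] += w
        if labels[i] == labels[j]:
            internal[labels[i]] += w
    return sum(internal[c] / m - (strength[c] / (2 * m)) ** 2 for c in strength)


# Hypothetical toy rows: [likes, dislikes, subscribers gained].
toy_rows = [
    [10, 1, 2], [12, 1, 3], [11, 2, 2],       # lower-engagement instances
    [90, 9, 20], [95, 8, 22], [88, 10, 19],   # higher-engagement instances
]
scaled = zscore_columns(toy_rows)
edges = similarity_graph(scaled)
two_clusters = [0, 0, 0, 1, 1, 1]
one_cluster = [0, 0, 0, 0, 0, 0]
print(sorted(edges))
print(modularity(edges, two_clusters), modularity(edges, one_cluster))
```

On this toy input the two-cluster labelling scores a higher modularity than placing every instance in one cluster, which is the kind of comparison a modularity-optimizing method makes when choosing between partitions.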

Copyrights © 2024

Journal Info

Abbrev

ijids

Subject

Computer Science & IT

Description

International Journal of Informatics and Data Science publishes manuscripts in Computer Science, including but not limited to the fields of: 1. Natural Language Processing Pattern Classification, 2. Speech recognition and synthesis, 3. Robotic Intelligence, 4. Big Data, 5. Informatics Techniques, 6. Image ...