Blockchain networks such as Solana generate enormous amounts of data that exhibit the 5Vs of Big Data: volume, velocity, value, veracity, and variety. This scale makes it difficult to distinguish transactions carried out by humans from those issued by automated bots, which are often used for market manipulation or Sybil attacks. This research therefore aims to detect bot activity on the Solana network by applying a data mining technique, namely the K-Means Clustering algorithm. A portion of the transaction data is extracted from the public Solana dataset in BigQuery and passed through a preprocessing stage that normalizes the data and reduces complex records to simpler variables before grouping. Because the extracted data is unlabeled, clustering is used: as an unsupervised learning method, it can recognize groups of data based on behavioral or characteristic similarities without requiring initial labels. The main variables used for grouping are transaction frequency, inter-arrival time (the time between consecutive transactions), and the number of unique program interactions. The analysis is expected to map transaction accounts into several clusters based on their transaction patterns, allowing accounts to be classified as bots or humans. This research is expected to demonstrate that Big Data infrastructure such as Google Cloud, combined with data mining techniques (clustering), can be used to help maintain the security and integrity of the blockchain ecosystem.
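The pipeline described above (per-account feature extraction, normalization, K-Means grouping) can be illustrated with a minimal sketch. It is not the implementation used in this research. It assumes transactions are already available as hypothetical `(account, unix_timestamp, program_id)` tuples rather than being read from BigQuery, uses z-score standardization as one possible normalization, fixes `k = 2`, and replaces random centroid initialization with a deterministic farthest-first choice so the toy run is reproducible. All function names and the sample data are invented for illustration.

```python
import statistics
from collections import defaultdict


def account_features(transactions):
    """Build [frequency, mean inter-arrival time, unique programs] per account.

    `transactions` is an iterable of (account, unix_timestamp, program_id);
    this tuple layout is an assumption, not the BigQuery schema.
    """
    by_account = defaultdict(list)
    for account, ts, program in transactions:
        by_account[account].append((ts, program))
    features = {}
    for account, rows in by_account.items():
        times = sorted(ts for ts, _ in rows)
        gaps = [b - a for a, b in zip(times, times[1:])]
        features[account] = [
            float(len(rows)),                               # transaction frequency
            float(statistics.mean(gaps)) if gaps else 0.0,  # mean inter-arrival time (s)
            float(len({p for _, p in rows})),               # unique program interactions
        ]
    return features


def standardize(vectors):
    """Z-score each column so no single feature dominates the distance."""
    columns = list(zip(*vectors))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [[(x - m) / s for x, m, s in zip(v, means, stds)] for v in vectors]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain K-Means with deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append([statistics.mean(col) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Toy data: two high-frequency, single-program accounts and three slower,
# more varied ones. Account names and values are invented.
transactions = []
transactions += [("bot_a", t, "dex") for t in range(60)]
transactions += [("bot_b", 2 * t, "dex") for t in range(50)]
transactions += list(zip(["human_a"] * 4, (0, 3600, 7200, 10800),
                         ("dex", "nft", "stake", "dex")))
transactions += list(zip(["human_b"] * 3, (0, 5000, 10000),
                         ("nft", "token", "dex")))
transactions += list(zip(["human_c"] * 5, (0, 3000, 6000, 9000, 12000),
                         ("dex", "nft", "dex", "nft", "dex")))

features = account_features(transactions)
accounts = list(features)
labels, centroids = kmeans(standardize([features[a] for a in accounts]), k=2)
for account, label in zip(accounts, labels):
    print(account, "-> cluster", label)
```

K-Means only assigns cluster indices; deciding which cluster corresponds to bot-like behavior still requires inspecting the cluster centroids (for example, very high frequency combined with very short inter-arrival times). On real data a library implementation such as scikit-learn's `KMeans` would likely be preferable to this hand-written loop.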
Copyrights © 2026