Identification and Detection of Illegal Gambling Websites and Analysis of User Behavior
- College of Information Engineering, Shanghai Maritime University
201306 Shanghai, China
zhangzhimin@stu.shmtu.edu.cn, dzhan@shmtu.edu.cn, shishuxin@stu.shmtu.edu.cn
- Network Security Center, The Third Research Institute of the Ministry of Public Security
200031 Shanghai, China
wusongyang@stars.org.cn, sunwenqi@gass.ac.cn
Abstract
Illegal gambling websites use advanced techniques to evade regulation, which poses a challenge for cybersecurity. To address this, we propose a machine learning method that accurately identifies these sites and analyzes the behavior of their users. The method extracts key data from post messages captured in a real-world network environment and generates word vectors using Word2Vec combined with TF-IDF; a Stacked Denoising Autoencoder (SDAE) then reduces their dimensionality and extracts features. Next, Agglomerative Clustering, improved with a combination of distance caching and heap optimization, performs an initial clustering that groups post-template websites of the same type into the same cluster. Within each website cluster, multiple algorithms are then integrated in a secondary clustering step, which uses voting with a consensus function based on cosine similarity to separate users' different operational behaviors into different clusters. The results show improved detection of illegal gambling sites and classification of user activities, offering new insights for combating these sites.
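As a rough illustration of the first stage of this pipeline, the sketch below builds a TF-IDF-weighted average of word vectors for each message and compares messages by cosine similarity. It is a minimal sketch under stated assumptions, not the method evaluated in the paper: the tokenized messages, the two-dimensional `embeddings` table (a stand-in for trained Word2Vec vectors), the smoothed IDF formula, and all function names are hypothetical, and the SDAE and clustering stages are omitted.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Smoothed IDF (an assumption; other variants are common).
            tok: (count / len(doc)) * (math.log((1 + n) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return weights


def weighted_doc_vector(weights, embeddings, dim):
    """TF-IDF-weighted average of the word vectors of one document."""
    vec = [0.0] * dim
    total = 0.0
    for tok, w in weights.items():
        if tok in embeddings:  # skip tokens without a vector
            total += w
            for i, x in enumerate(embeddings[tok]):
                vec[i] += w * x
    return [x / total for x in vec] if total else vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical tokenized messages: two betting requests and one login request.
posts = [
    ["bet", "amount", "odds", "token"],
    ["bet", "amount", "game", "token"],
    ["username", "password", "token"],
]

# Toy 2-D vectors standing in for trained Word2Vec embeddings (assumption).
embeddings = {
    "bet": [1.0, 0.0], "amount": [0.9, 0.1], "odds": [0.8, 0.2],
    "game": [0.7, 0.3], "username": [0.0, 1.0], "password": [0.1, 0.9],
    "token": [0.5, 0.5],
}

vectors = [weighted_doc_vector(w, embeddings, 2) for w in tfidf_weights(posts)]
sim_same = cosine(vectors[0], vectors[1])  # betting vs. betting
sim_diff = cosine(vectors[0], vectors[2])  # betting vs. login
```

With these toy inputs the two betting messages end up much closer to each other than either is to the login message, which is the kind of signal a downstream clustering step could exploit.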
Key words
Gambling websites, post messages, feature extraction, illegal website identification, cluster analysis
How to cite
Zhang, Z., Han, D., Wu, S., Sun, W., Shi, S.: Identification and Detection of Illegal Gambling Websites and Analysis of User Behavior. Computer Science and Information Systems