Spam Detection of Twitter Traffic: A Framework based on Random Forests and nonuniform feature sampling
Zunino, Rodolfo; Gianoglio, Christian; Ragusa, Edoardo; Meda, Claudia; Surlinelli, Roberto
2016-01-01
Abstract
Social networks are an expression of freedom of thought and give people a way to speak freely. Unfortunately, among the huge number of internet users, some abuse this power and use microblogs to harass other people or spread malicious content. Identifying these accounts, limiting the damage they cause, and classifying as many spammers as possible are increasingly important. This work proposes a framework that exploits non-uniform feature sampling inside a gray-box machine learning system, using a variant of the Random Forests algorithm to identify spammers in Twitter traffic at the lowest possible computational cost. This work also provides a dataset of Twitter users, labeled as spammers or legitimate users and described by 54 features. Experimental results demonstrate the effectiveness of the enriched feature sampling method.

File | Size | Format
---|---|---
PaperFOSINT2016.pdf (closed access; type: publisher's version) | 453.85 kB | Adobe PDF
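The abstract names non-uniform feature sampling inside a Random Forests variant but does not describe the sampling scheme. To illustrate the general idea only, here is a minimal sketch in which each tree split draws its candidate features with probability proportional to a per-feature weight instead of uniformly. The function name `sample_features`, the weight values, and the choice of `k = 7` (roughly the square root of 54, a common default for classification forests) are all assumptions for illustration and are not taken from the paper.

```python
import random


def sample_features(weights, k, rng):
    """Draw k distinct feature indices; each pick is proportional to its weight.

    Illustrative helper (hypothetical, not the paper's method): sampling is
    done without replacement by removing each picked index from the pool.
    """
    if not 0 < k <= len(weights):
        raise ValueError("k must be between 1 and the number of features")
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        pool_weights = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=pool_weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


N_FEATURES = 54  # feature count stated in the abstract

# Assumed prior for illustration: the first 10 features are treated as
# five times more likely to be offered to a split than the others.
weights = [5.0 if i < 10 else 1.0 for i in range(N_FEATURES)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
subset = sample_features(weights, k=7, rng=rng)
print(subset)
```

With uniform weights this reduces to the standard Random Forests behaviour of picking a random feature subset at each split. Skewing the weights offers the favoured features to splits more often.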