Sunday, December 4, 2022

Cleaning up social media with machine learning



Adult, or pornographic, content spam is a growing problem on social media. New research in the International Journal of Business Intelligence and Data Mining discusses how such content might be detected and removed quickly.

Deepali Dhaka, Surbhi Kakar, and Monica Mehrotra of Jamia Millia Islamia (Central University) in Jamia Nagar, New Delhi, India, explain how the general user experience might be improved if obscene spam content can be filtered effectively and quickly. Machine learning tools are often the way forward in detecting particular types of content, and the team has demonstrated that one such tool, XGBoost, can detect adult spam content with more than 90% accuracy. It was the most effective of the six classification algorithms the team tested and adapted for detecting pornographic spam on Twitter.
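
As a rough illustration of what an accuracy figure of this kind measures, the sketch below computes accuracy for a binary spam classifier. The labels are made-up values for illustration only, not data from the study, and the `accuracy` helper is a hypothetical name rather than anything taken from the paper.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of updates whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one labelled update")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for ten updates: 1 = adult spam, 0 = ordinary update.
true_labels = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
predicted_labels = [1, 1, 0, 0, 0, 0, 0, 1, 0, 0]

# Nine of the ten predictions agree with the true labels.
print(accuracy(true_labels, predicted_labels))
```

Accuracy counts every correct decision, both spam caught and ordinary updates left alone, so it is a summary of overall agreement rather than a statement about flagged updates alone.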

As such, fewer than ten in every hundred updates flagged as adult spam would be misclassified. The team's approach needed to analyze just a small number of features (value system, the entropy of words, lexical diversity, and word embeddings) to pluck adult spam updates from the general stream of updates on Twitter, one of the most well-known social media platforms.
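
Two of the features named above, the entropy of words and lexical diversity, have common textbook definitions that can be sketched in a few lines. The version below assumes Shannon entropy over word frequencies and a type-token ratio for diversity; the paper may define them differently, and the tokenizer and function names are illustrative choices, not the authors' code. Word embeddings and the value system feature are not covered here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_entropy(tokens):
    """Shannon entropy, in bits, of the word frequency distribution."""
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def lexical_diversity(tokens):
    """Type-token ratio: distinct words divided by total words."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Made-up example texts, not taken from the study's dataset.
repetitive = tokenize("click here click here click here")
varied = tokenize("Heading to the market before the rain starts this afternoon")

print(word_entropy(repetitive), lexical_diversity(repetitive))
print(word_entropy(varied), lexical_diversity(varied))
```

Under these definitions, a repetitive text scores lower on both measures than a varied one of similar length, which is the kind of signal a classifier can pick up.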

Positive detection relies on the fact that everyday users of the platform generally discuss a wide variety of topics in different contexts, and write and share in what might be called an organic manner. In contrast, spammers, pornographic spammers in this case, tend to have a fixed or even entirely automated approach to their updates, limited diversity of subject matter, as one would expect, and a very limited lexicon. These and other characteristics of spam messages make them recognizable to the algorithm.
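
The contrast between organic and automated posting can also be looked at per account rather than per update. The sketch below is a simplified illustration, not the method from the paper: it summarizes a list of updates by the share of exact repeats and the size of the pooled vocabulary. The `account_profile` name and the sample updates are hypothetical.

```python
def account_profile(updates):
    """Summarize how repetitive a list of updates from one account is."""
    # Normalize case and whitespace so trivially varied copies count as repeats.
    normalized = [" ".join(u.lower().split()) for u in updates]
    if not normalized:
        return {"duplicate_ratio": 0.0, "vocabulary_size": 0}
    duplicates = len(normalized) - len(set(normalized))
    vocabulary = {word for update in normalized for word in update.split()}
    return {
        "duplicate_ratio": duplicates / len(normalized),
        "vocabulary_size": len(vocabulary),
    }


# Made-up updates for illustration only.
templated = ["Hot pics here", "hot pics  here", "HOT PICS HERE", "hot pics now"]
organic = [
    "Train delayed again this morning",
    "Trying a new bread recipe tonight",
    "Great match yesterday, what a finish",
    "Anyone know a good bike repair shop",
]

print(account_profile(templated))
print(account_profile(organic))
```

A high share of repeats combined with a small vocabulary is consistent with the fixed, automated style described above, though a real system would weigh many more signals than these two.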




More information:
Monica Mehrotra et al, Detection of Spammers disseminating obscene content on Twitter, International Journal of Business Intelligence and Data Mining (2021). DOI: 10.1504/IJBIDM.2022.10040432

Citation:
Cleaning up social media with machine learning (2022, September 7)
retrieved 17 October 2022
from https://techxplore.com/news/2022-09-social-media-machine.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.


