Using Neural Networks to Predict the Spread of Tweets Containing Disinformation

Jeremy Swack, Jinyang Liu, Aaraj Vij, Atharv Gupta


In Summer 2021, DisinfoLab established a Technical Analyst team to grow the research capacity of our lab. Our Technical Analysts build tools to collect large swaths of data for analysis and employ artificial intelligence to identify salient trends in disinformation. For our pilot project, we built a neural network that predicts the engagement of tweets containing disinformation based on thirteen variables. Such research may aid social media companies in implementing measures to stop the spread of disinformation on their platforms.

Data Collection

To train our neural network, we first collected Twitter data from two online platforms: BotSentinel and Hoaxy.

BotSentinel is a program that tracks inauthentic Twitter account activity. Using artificial intelligence and machine learning, BotSentinel generates authenticity ratings for Twitter accounts based on their adherence to Twitter's user guidelines. These ratings are not binary: they do not directly label an account as a "bot" or a "real" person. Instead, the system rates each account on a scale from 1 to 100 and categorizes it as normal, questionable, disruptive, or problematic. From this data, BotSentinel compiles a list of the top hashtags, two-word phrases, URLs, and mentions tweeted by likely inauthentic accounts, updated every hour. More information about how BotSentinel analyzes accounts can be found on its website.

Hoaxy is an online tool developed by the Indiana University Observatory on Social Media that visualizes the spread of articles and phrases on Twitter. Hoaxy tracks the sharing of links to stories from low-credibility sources and offers a Live Search function where users can search for the network spread of specific phrases and links. Hoaxy also calculates a bot score for users sharing a target piece of information, using a machine learning algorithm called Botometer.

For data collection, our analyst team uses three Python scripts.

Script A

This script scrapes the trending two-word phrases identified by BotSentinel and feeds them as Live Search terms into Hoaxy. Hoaxy retrieves relevant tweets and exports the results into spreadsheets, which are saved locally.
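A minimal sketch of what Script A might look like. The BotSentinel page markup and the Hoaxy query URL below are hypothetical placeholders for illustration, not the lab's actual code.

```python
# Sketch of Script A: scrape trending two-word phrases and build Hoaxy
# Live Search queries. The HTML structure and the Hoaxy endpoint are
# assumptions, not the real ones.
import re
from urllib import parse

def extract_two_word_phrases(html):
    """Pull phrases out of a (hypothetical) BotSentinel trending page."""
    return re.findall(r'<li class="phrase">([^<]+)</li>', html)

def hoaxy_search_url(phrase):
    """Build a (hypothetical) Hoaxy Live Search query URL for a phrase."""
    base = "https://hoaxy.example/api/tweets"
    return base + "?" + parse.urlencode({"query": phrase, "sort_by": "recent"})

sample = '<ul><li class="phrase">ballot harvesting</li></ul>'
phrases = extract_two_word_phrases(sample)
urls = [hoaxy_search_url(p) for p in phrases]
```

In practice, each query's results would then be written to a local spreadsheet, as described above.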

Script B

This script uses the Twitter API to collect all publicly available user and tweet information, including the full text of each tweet.
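As a sketch, collecting that information could look like the following, using the Twitter API v2 tweet-lookup endpoint with only the standard library. The field selection mirrors the variables listed later in this post, but the exact query the lab used is an assumption.

```python
# Sketch of Script B: look up tweets and their authors via the Twitter
# API v2. The field list is illustrative, not the lab's exact query.
import json
from urllib import parse, request

API = "https://api.twitter.com/2/tweets"

def lookup_url(tweet_ids):
    """Build a tweet-lookup URL requesting tweet text plus author info."""
    params = {
        "ids": ",".join(tweet_ids),
        "tweet.fields": "created_at,lang,public_metrics,text",
        "expansions": "author_id",
        "user.fields": "created_at,public_metrics,verified,profile_image_url",
    }
    return API + "?" + parse.urlencode(params)

def fetch_tweets(tweet_ids, bearer_token):
    """Perform the lookup (requires a valid bearer token)."""
    req = request.Request(
        lookup_url(tweet_ids),
        headers={"Authorization": "Bearer " + bearer_token},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```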

Script C

This script uses Valence Aware Dictionary and sEntiment Reasoner (VADER) natural language processing to determine the polarity, subjectivity, and negative, neutral, or positive sentiment of each tweet.
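For a sense of how VADER arrives at a sentiment score, here is a toy version of its scoring step: words are looked up in a valence lexicon, summed, and squashed into (-1, 1) with VADER's normalization (alpha = 15). The miniature lexicon is illustrative only; the actual script would simply call the analyzer from the vaderSentiment package, as shown in the comment.

```python
# Toy illustration of VADER-style scoring. The real script would use:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   SentimentIntensityAnalyzer().polarity_scores(tweet_text)
import math

# Tiny stand-in lexicon with made-up valences; VADER's real lexicon
# contains several thousand human-rated tokens.
TOY_LEXICON = {"great": 3.1, "good": 1.9, "bad": -2.5, "terrible": -2.1}

def toy_compound(text):
    """Sum word valences, then normalize to (-1, 1) as VADER does."""
    total = sum(TOY_LEXICON.get(w, 0.0) for w in text.lower().split())
    return total / math.sqrt(total * total + 15)  # alpha = 15
```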

Using these scripts, our team collected approximately 20,000 data points between July and August 2021.

Data Analysis

To analyze the collected data, we developed a neural network that predicts the number of Twitter interactions—retweets, mentions, and quote tweets—that a tweet will receive based on thirteen variables:

  • Hoaxy Data

      • Hoaxy bot score of account

  • Twitter API Data

      • Date account was created

      • Number of users the account follows

      • Number of users following the account

      • Number of public lists that include the account

      • Number of tweets the account has posted

      • Number of tweets the account has liked

      • Verification status of the account

      • Default or custom profile picture

      • Language of target tweet

  • VADER Natural Language Processing Data

      • Polarity of target tweet

      • Subjectivity of target tweet

      • Sentiment score of target tweet

To train the model, we split the data into 15,000 training data points and 5,000 testing data points. We built the model using Keras’ Sequential class to produce a single numerical output. To improve the model’s accuracy and prevent overfitting, we added batch normalization and dropout layers alongside the dense layers. The results are shown in the plot below.
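The architecture described above might be sketched like this in Keras. The layer widths, dropout rates, and optimizer settings are assumptions, since the post does not specify them.

```python
# Sketch of a Sequential regression model with batch normalization and
# dropout layers alongside the dense layers, producing a single numeric
# output. Sizes and rates are illustrative guesses.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_features=13):
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(64, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(32, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(1),  # single output: predicted interaction count
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model
```

Training would then be a single `model.fit` call on the 15,000 training points, with the 5,000 held-out points used for evaluation.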

This plot shows the actual number of Twitter interactions a tweet received on the x-axis against the predicted number of interactions on the y-axis for the 5,000 testing data points. The closer a point is to the blue line, the more accurate the prediction for that data point. Most points fall between zero and one hundred interactions, and the model generally had a high degree of accuracy in this range. However, at the extremes, where there was less data to train on, the model struggled to produce consistently accurate predictions.

To gauge the significance of each variable, we generated a variable importance chart using the package SHAP. This package allows us to calculate SHapley Additive exPlanations (SHAP) values: the average marginal contribution of a variable to the model's predictions across all possible orderings of the variables. The variable importance chart is displayed below.

The top ten most impactful variables are shown on this chart. A blue bar means that higher values of that variable increase the model's predicted number of engagements for a tweet, while a red bar means that higher values lower the prediction.


Based on our findings, the most important variable in predicting a tweet's engagement was the Hoaxy bot score of the account that posted it. Tweets from accounts with a higher probability of being a bot, and thus a higher bot score, received a significant positive boost to their predicted number of interactions. The second most important variable was the date the account was created: its red bar indicates that tweets from older accounts received a significant negative adjustment to their predicted interactions. Taken together, tweets from newer accounts that were more likely to be bots tended to receive higher levels of interaction. The third most important variable, sentiment score, comes from the VADER natural language processing applied to the data. This variable ranges from -1 (most negative) through 0 (neutral) to 1 (most positive) and indicates the sentiment of a given tweet. Its blue bar indicates that more positive tweets received a boost in their predicted interactions.
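The Shapley values behind a chart like this can be computed exactly by brute force on a toy model: a variable's value is its marginal contribution to the prediction, averaged over every ordering of the variables. The toy predictor below (a bot score that boosts engagement, an account age that lowers it) is purely illustrative, not our network; SHAP approximates the same quantity efficiently for real models.

```python
# Brute-force Shapley values: average each feature's marginal contribution
# to the prediction over all orderings of the features.
from itertools import permutations

def shapley_values(predict, features):
    names = list(features)
    totals = {f: 0.0 for f in names}
    orderings = list(permutations(names))
    for order in orderings:
        present = {}
        prev = predict(present)
        for f in order:
            present[f] = features[f]   # add this feature to the coalition
            cur = predict(present)
            totals[f] += cur - prev    # its marginal contribution
            prev = cur
    return {f: totals[f] / len(orderings) for f in names}

def toy_predict(present):
    """Toy model: bot score boosts engagement, account age lowers it."""
    return (10.0 * present.get("bot_score", 0.0)
            - 2.0 * present.get("account_age_years", 0.0))
```

By construction, the Shapley values of all features sum to the difference between the full prediction and the empty-coalition baseline.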

Interestingly, the number of tweets or statuses an account had, the number of accounts it followed, and the number of followers it had were not among the most significant predictors of a tweet's number of interactions. A likely explanation is that viral tweets can originate from accounts of any size, so audience-size metrics alone are weak predictors of spread.

Future Work

We have begun examining the performance of this model on emerging articles containing disinformation. We identified 15 articles containing disinformation and utilized Hoaxy’s Live Search feature to collect 400 data points. We then separated these articles and data points into three distinct categories: COVID-19, International Relations, and United States Politics. We tested these data points on our existing model, which produced the following result.

For these known disinformation data points, our existing model performed poorly. Two possible explanations for this performance are:

  1. The test data set is small, and therefore subject to increased variance

  2. Our metric for collecting data—tweets containing flagged phrases from BotSentinel—is an inadequate proxy for the true spread of disinformation on Twitter

DisinfoLab will continue to investigate these possibilities and create new training and test sets for an updated model.


