SUMMARY: Pollchatter’s “Hashtag warnings” and “District warnings” pages give an indication of which hashtags and districts might be the target of deliberate traffic manipulation.
Numbers matter on Twitter. Retweets, likes, followers, networks of influence. But it’s not always obvious how fluctuations in these factors should be interpreted.
It’s thus helpful to have a measure of the degree to which specific traffic flows on Twitter are being manipulated.
However, defining such a measure isn’t easy. The Atlantic Council’s Ben Nimmo has offered one useful approach, built around a measure he calls the “coefficient of traffic manipulation.” It focuses on three variables:
- Average number of posts by each account (U)
- Percentage of retweets vs. original posts (R)
- Percentage of traffic produced by the 50 most active accounts (using a particular hashtag) (F)
By comparing a number of innocuous hashtags (#Monday, #Tuesday) with several that were fairly clearly targeted for influence (#LaFranceVoteMarine, #DigDoug), Nimmo observed that concerted campaigns of traffic manipulation – that is, coordinated efforts by a relatively small group of users to boost traffic – led to higher values within these variables.
He then sought to balance the effect of each factor in a single figure, the coefficient of traffic manipulation (CTM):
CTM = U + R/10 + F
From Nimmo’s paper:
The CTM can highlight subject flows which were generated by a disproportionately small user base, as would be the case if a small group of individuals had tried to “game” the algorithm by each posting many times in a short period; traffic featuring a disproportionate number of retweets, as would be the case if it had been amplified by retweet bots; traffic featuring an overall number of users disproportionate to the volume of traffic, as would be the case in a planned and coordinated posting campaign by a small group; and any combination of the above.
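As a rough illustration, Nimmo’s formula could be computed along the following lines. This is a minimal sketch, not Nimmo’s own code: the function name `nimmo_ctm`, the `(account, is_retweet)` input format, and the `top_n` parameter are assumptions made for the example.

```python
from collections import Counter


def nimmo_ctm(posts, top_n=50):
    """Compute CTM = U + R/10 + F for one sample of posts.

    posts: list of (account, is_retweet) tuples (hypothetical input format).
    """
    if not posts:
        raise ValueError("need at least one post")

    per_account = Counter(account for account, _is_retweet in posts)
    total = len(posts)

    # U: average number of posts per account
    u = total / len(per_account)
    # R: percentage of posts that are retweets
    r = 100 * sum(1 for _account, is_retweet in posts if is_retweet) / total
    # F: percentage of posts produced by the top_n most active accounts
    f = 100 * sum(count for _account, count in per_account.most_common(top_n)) / total

    return u + r / 10 + f
```

Note that when fewer than `top_n` accounts post at all, F is 100 by construction, which is the problem with low-traffic samples discussed next.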
Unfortunately, Pollchatter can’t in practice use this model without modification. Most of the districts and hashtags we’re observing see far less traffic on a daily basis than did the high-intensity hashtags sampled in Nimmo’s study. In many districts, the total number of accounts posting over the course of a single (ordinary) day is well under 500, and not infrequently even under 50.
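This makes the activity of the top 50 accounts an impractical measure, for example: when fewer than 50 accounts post in a day, the “top 50” necessarily account for 100% of the traffic.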
Nevertheless, relatively low-traffic districts and hashtags may still be subject to manipulation, and in an election where a few hundred votes could make the difference between a Republican and a Democratic Senate, it is vital to identify these campaigns of influence.
We have thus modified Nimmo’s original model in several ways, with the aim of allowing it to function for lower-traffic tweet samples.
Modifying F
- Rather than using the 50 most active accounts as the “top group,” we use the 10% most active accounts. F is thus the percentage of traffic produced by the top 10% of accounts.
Rebalancing coefficient components
In Nimmo’s samples, the “normal” F and U values were generally under 3. In other words, the “normal” average number of posts per account was generally under 2, while the top 50 users were generally responsible for less than 3% of total posts.
In our lower-traffic districts and hashtags, we find that the top 10% of users is often responsible for a considerably larger share of posts than in Nimmo’s “normal” hashtags: as much as 30% to 40% on an ordinary day. For example, in a sample of 1,000 hashtags over 60 days (excluding any day with fewer than 10 posts in that hashtag), the median was 21%, with an interquartile range of 17 percentage points.
- Because this F variable is often very high even on normal days, we are using F/10 as a coefficient component, rather than F. This helps keep fluctuations in F from overpowering fluctuations in the other variables.
- Similarly, retweet shares prove very high even on normal days: the median for a sample of 1,000 hashtags over 60 days (excluding any day with fewer than 10 posts in that hashtag) was 84%. Thus, we are using R/20 rather than Nimmo’s R/10, again seeking to keep the “normal” weighting of the individual components at comparable levels.
Thus, our CTM is derived using the following equation:
CTM = U + R/20 + F/10
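The modified equation could be sketched as follows. This is an illustrative example rather than Pollchatter’s production code: the function name `pollchatter_ctm`, the `(account, is_retweet)` input format, and the choice to round the top 10% group up to a whole number of accounts are all assumptions.

```python
from collections import Counter


def pollchatter_ctm(posts, top_percent=10):
    """Compute U, R, F and CTM = U + R/20 + F/10 for one sample of posts.

    posts: list of (account, is_retweet) tuples (hypothetical input format).
    """
    if not posts:
        raise ValueError("need at least one post")

    per_account = Counter(account for account, _is_retweet in posts)
    total = len(posts)
    n_accounts = len(per_account)

    # U: average number of posts per account
    u = total / n_accounts
    # R: percentage of posts that are retweets
    r = 100 * sum(1 for _account, is_retweet in posts if is_retweet) / total
    # Assumption: round the top group up (integer ceiling), so that it
    # always contains at least one account.
    top_n = max(1, (n_accounts * top_percent + 99) // 100)
    # F: percentage of posts produced by the top 10% most active accounts
    f = 100 * sum(count for _account, count in per_account.most_common(top_n)) / total

    return {"U": u, "R": r, "F": f, "CTM": u + r / 20 + f / 10}
```

For instance, if ten accounts produce 20 posts, and the single most active account is responsible for 11 of them (all retweets), then U = 2, R = 55, F = 55, and CTM = 2 + 2.75 + 5.5 = 10.25.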
Establishing a “typical” range for comparison
A measure of possible manipulation is most useful when we have a reasonably good sense of what is normal and what is not. Values falling outside this typical range don’t necessarily mean a deliberate or nefarious attempt at traffic manipulation is underway. They do, however, mean a closer look is worthwhile.
Given the substantial differences in demographics and internet usage among districts, we assume that “normal” Twitter use may vary from district to district. We thus created a separate threshold and “typical” use range for each district.
To do so, we sampled the three key variables – average posts per user, retweet share, and top 10% share – for each district over a period of 60 days, from late July through late September.
The distribution of these samples and the corresponding CTM did not in every case follow a normal or “bell curve” distribution. Thus, to identify the region in which values might be deemed excessively high, we used a common method of finding outliers based on the interquartile range (IQR), or the middle 50% of values.
For each individual district, we calculated the IQR for the three main coefficient components and the CTM, based on our 60-day data sample. To find the threshold for possible outliers, we multiplied the IQR by 1.5, and then added that figure on top of the 75th percentile mark in the sample data for each factor, and for the final CTM.
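The threshold calculation for a single district might look like the sketch below. The function name `outlier_thresholds` and the dictionary layout are hypothetical, and the quartile convention (Python’s “inclusive” method) is an assumption; other conventions give slightly different cut points.

```python
import statistics


def outlier_thresholds(samples, multiplier=1.5):
    """Return the upper outlier threshold for each measure in `samples`.

    samples: dict mapping a measure name (e.g. "U", "R", "F", "CTM") to the
    list of daily values observed for one district (hypothetical layout).
    """
    thresholds = {}
    for name, values in samples.items():
        # Assumption: "inclusive" quartiles; the source does not say which
        # quartile convention was used.
        q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        # 75th percentile plus 1.5 times the interquartile range
        thresholds[name] = q3 + multiplier * iqr
    return thresholds
```

With daily CTM values of 1, 2, 3, 4 and 5, the quartiles are 2 and 4, the IQR is 2, and the threshold is 4 + 1.5 × 2 = 7.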
We conducted a similar exercise for hashtags. Over the same 60-day period, we selected 1,000 hashtags at random, with one crucial exception: we discarded any hashtag that had received fewer than 100 posts over the period.
We then sampled the key variables – average posts per user, retweet share, and top 10% share – for each day, for each hashtag over the period. Hashtag-days in which fewer than 10 posts were made were discarded.
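The selection and filtering steps could be expressed roughly as follows. This is a sketch under stated assumptions: the function name `select_hashtag_days` and the nested-dictionary input are hypothetical, and we assume here that low-volume hashtags are removed before the random draw.

```python
import random


def select_hashtag_days(daily_counts, sample_size=1000, min_total=100,
                        min_daily=10, seed=None):
    """Pick a random sample of hashtags and keep only usable hashtag-days.

    daily_counts: dict mapping hashtag -> {day: number of posts that day}
    (hypothetical layout).
    """
    # Discard hashtags with fewer than min_total posts over the whole period.
    eligible = sorted(
        tag for tag, days in daily_counts.items()
        if sum(days.values()) >= min_total
    )

    # Assumption: the random sample is drawn from the eligible hashtags.
    rng = random.Random(seed)
    chosen = rng.sample(eligible, min(sample_size, len(eligible)))

    # Discard hashtag-days with fewer than min_daily posts.
    return {
        tag: {day: n for day, n in daily_counts[tag].items() if n >= min_daily}
        for tag in chosen
    }
```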
This data was then used to find the median, interquartile range, and outlier range for hashtags as a whole, for each of the three components and for the CTM derived from them. These figures describe our electoral-Twitter universe and will not necessarily apply to other types of content.
For both the district and general-hashtag measures, the “yellow” alert threshold is the 75th percentile plus 1 times the IQR. The “red” threshold is the traditional outlier mark: the 75th percentile plus 1.5 times the IQR.
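A value could be checked against the two thresholds along these lines. This is only a sketch: the function name `alert_level`, the “normal” label, the use of strict greater-than comparisons, and the “inclusive” quartile convention are assumptions made for the example.

```python
import statistics


def alert_level(value, baseline):
    """Classify `value` against the yellow and red thresholds.

    baseline: list of historical values for the same measure
    (for example, 60 days of CTM values for one district).
    """
    # Assumption: "inclusive" quartiles; other conventions differ slightly.
    q1, _median, q3 = statistics.quantiles(baseline, n=4, method="inclusive")
    iqr = q3 - q1

    yellow = q3 + 1.0 * iqr   # 75th percentile plus the IQR
    red = q3 + 1.5 * iqr      # traditional outlier mark

    # Assumption: a value must exceed a threshold to trigger that alert.
    if value > red:
        return "red"
    if value > yellow:
        return "yellow"
    return "normal"
```

With a baseline of 1, 2, 3, 4 and 5, the yellow threshold is 6 and the red threshold is 7, so a value of 6.5 would be flagged yellow and a value of 7.5 red.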
We’ll discuss what these thresholds mean in more detail in future posts.