Tuesday, November 1, 2016

The shape of virality

Tweets rarely go viral. Again, as Goel et al. tell us in the paper I blogged about yesterday, almost all tweets we see are either one-shot posts (95%) or first-round retweets (3%). But that means 2% are viral-ish, being retweeted at least twice down the chain.

But what do those viral cascades look like? Are they like the spread of the flu, slowly working their way from person to person, infecting a few at a time but eventually hitting large swathes of the population? Or do they spread in bursts, propelled by super-tweeters? And is there something about the inherent tweet-worthiness of a post that makes it more likely to go viral?

Goel takes on these questions, again with Duncan Watts (and adding on Ashton Anderson and Jake Hofman), in an extremely impressive paper that tracks over a billion tweets and simulates diffusion on model networks with 25 million nodes. (Woah.)

First, they find that what we might call tweet cascades are even rarer than stated above. If you consider cascades that have at least 100 retweets, those make up only 0.025% of all initial tweets. (What's less clear is what share of the tweets we see in our feeds are initial tweets versus retweets. The fact that the authors track 600 million initial tweets and a total of 1.2 billion "adoptions" suggests that about half of the tweets we see are retweets, assuming "adoptions" counts the initial posts too.)

When they do go viral, to get back to the questions above, they look neither like pure flu epidemics nor like pure broadcasts, but rather like a mash of both kinds of cascade. Since pictures will save me a thousand words:


The images are in order of their "structural virality" - the most "viral" cascade being in the bottom right corner - but all of them show a combination of central tweeters broadcasting a tweet and lots of little tweeters passing it along.
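To make "structural virality" a bit more concrete: as I understand the paper, it is the average distance between all pairs of nodes in the retweet tree. Here is a minimal sketch under that assumption. The function name `structural_virality` and the edge-list input format are my own inventions for illustration, not anything from the authors' code. A pure broadcast (one tweeter, everyone retweets directly) scores low; a person-to-person chain scores higher.

```python
from collections import deque


def structural_virality(edges):
    """Average shortest-path distance over all pairs of nodes in a tree.

    `edges` is a list of (parent, child) pairs describing who retweeted whom.
    This input format is an assumption made for this sketch.
    """
    # Build an undirected adjacency list from the retweet edges.
    adjacency = {}
    for parent, child in edges:
        adjacency.setdefault(parent, []).append(child)
        adjacency.setdefault(child, []).append(parent)

    n = len(adjacency)
    if n < 2:
        return 0.0

    # Breadth-first search from every node, summing distances to all others.
    total = 0
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total += sum(distance.values())

    # Each unordered pair was counted twice, so divide by n * (n - 1).
    return total / (n * (n - 1))


# A broadcast: user 0 tweets, users 1-4 each retweet directly from 0.
star = [(0, i) for i in range(1, 5)]
# A chain: each user retweets from the previous one.
chain = [(i, i + 1) for i in range(4)]

print(structural_virality(star))   # 1.6
print(structural_virality(chain))  # 2.0
```

Both toy cascades have five users, but the chain comes out more "viral" than the broadcast, which matches the intuition behind ordering the pictures this way.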

A more interesting finding, though, is that there doesn't have to be anything particularly "sticky" about a tweet to see super cascades like the ones above. The authors do some impressive modeling on "scale-free" networks (ones that look like Twitter) and find that even if you fix the "stickiness" of every tweet (i.e., the probability that it will be retweeted), simulations on those models produce an array of cascades similar to what you find in reality on Twitter. In other words, whether a tweet turns out to be a dud or a super-virus could just be a function of randomness. Cool stuff.
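For the curious, here is a toy version of that idea. It is emphatically not the authors' model: the network is a small preferential-attachment graph built from scratch, the network size, retweet probability, and number of tweets are values I picked arbitrarily, and the function names are hypothetical. The point is only that every tweet gets the same fixed retweet probability, yet the cascade sizes still vary.

```python
import random
from collections import Counter


def preferential_attachment_graph(n, m, rng):
    """Grow an undirected graph in which each new node links to m existing
    nodes chosen with probability proportional to their current degree."""
    neighbors = {i: set() for i in range(n)}
    endpoints = []  # each node appears once per edge it touches

    # Start from a small fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i):
            neighbors[i].add(j)
            neighbors[j].add(i)
            endpoints += [i, j]

    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))  # degree-weighted pick
        for target in targets:
            neighbors[new].add(target)
            neighbors[target].add(new)
            endpoints += [new, target]
    return neighbors


def cascade_size(neighbors, seed, p, rng):
    """Spread one tweet from `seed`: each not-yet-adopting neighbor of a
    new adopter retweets with the same fixed probability p."""
    adopted = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in neighbors[node]:
                if neighbor not in adopted and rng.random() < p:
                    adopted.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(adopted)


# All of these values are arbitrary choices for the toy example.
rng = random.Random(2016)
NUM_USERS, LINKS_PER_USER, RETWEET_PROB, NUM_TWEETS = 2000, 2, 0.05, 2000

graph = preferential_attachment_graph(NUM_USERS, LINKS_PER_USER, rng)
sizes = [
    cascade_size(graph, rng.randrange(NUM_USERS), RETWEET_PROB, rng)
    for _ in range(NUM_TWEETS)
]

histogram = Counter(sizes)
for size in sorted(histogram):
    print(f"cascade size {size}: {histogram[size]} tweets")
print("largest cascade:", max(sizes))
```

Every tweet here is equally "sticky", so any spread in the printed cascade sizes comes purely from where a tweet happens to start and how the coin flips land.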
