Visualizing reddit’s activity and uptime over the past 8 years
How hard would it be to visualize reddit’s uptime from 2008 to 2016? Let’s do it here, while going through most of its history.
We’ll be looking at the number of comments per minute that reddit gets, from 1 in 2008 to more than 2300 in 2016.
It’s a tight fit, as there are a lot of minutes in these 9 years. My favorite things to look at are the dashes of white space (when reddit was down) and how reddit keeps growing steadily and rhythmically.
2008 — getting started
I don’t know if reddit was up or down, but I can take a look at how many comments were posted per minute during that year (thanks /u/stuck_in_the_matrix for collecting these comments!).
Let’s look at how many comments were posted per minute during 2008 (all times are UTC):
(How cool is that?)
- The “peak rate” of comments during 2008 was above 40 per minute.
- The white pixels show minutes when reddit got zero comments. The longer the line, the longer reddit was down — with the longest period being around 5 hours!
- 2008 trivia: reddit got its first sub-reddits in January.
- Minutes down: 1586.
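The "longest line" in the bullets above can also be measured instead of eyeballed. Here is a minimal sketch, assuming we already have the list of minutes that got at least one comment; the function name `longest_gap` and the toy data are made up for illustration and are not part of the original analysis:

```python
def longest_gap(minutes_with_comments, total_minutes):
    """Return (start, length) of the longest run of minutes with no comments."""
    seen = set(minutes_with_comments)
    best_start, best_len = None, 0
    run_start, run_len = None, 0
    for m in range(total_minutes):
        if m in seen:
            # A minute with comments ends the current gap.
            run_start, run_len = None, 0
            continue
        if run_len == 0:
            run_start = m
        run_len += 1
        # Strict comparison keeps the earliest gap when two gaps tie.
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return best_start, best_len


# Toy example: 10 minutes, comments seen in minutes 0, 1, 5 and 9.
start, length = longest_gap([0, 1, 5, 9], 10)
print(start, length)  # 2 3
```

For a real year, `total_minutes` would be the number of days times 1440, and each minute index would be the day offset times 1440 plus the minute of the day.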
Many of the white dots during the night might not be reddit being down, but just reddit having very few users at that time. Let’s see how this changes through time:
2009: reddit is growing
- Peak rate is going up — from more than 40 comments/minute in 2008 to more than 105 comments/minute in 2009.
- The longest period of downtime we can see here is more than 7 hours during May.
- Can you notice the chart getting brighter towards the bottom? That’s reddit growing, and getting more comments every minute.
- 2009 trivia: /r/IAmA is created in January. October marks the departure of Alexis Ohanian and Steve Huffman.
- Minutes down: 1079.
2010: The downtime is too high
- Reddit is growing, more than doubling the rate of comments per minute every year — but also showing a lot of growing pains.
- This was the year that reddit migrated its servers to AWS. Reddit also published an explanation of why the site was down for 71 minutes in March.
- 2010 trivia: Reddit gold was created in July.
- Minutes down: 3004.
2011: reddit keeps growing, despite downtime
- Reddit looks unstoppable: this year the comment rate keeps going up, more than doubling the 2010 record.
- In March reddit was down for more than 6 hours. The post-mortem is not on the internet anymore, but the comments show many redditors being skeptical of ‘the cloud’.
- Worse days were ahead: in April an AWS outage took down many websites for more than 24 hours, including reddit.
- 2011 trivia: Reddit becomes operationally independent of Condé Nast.
- Minutes down: 4966.
2012: better uptime and growing up
- Most downtime periods seem to be planned now: in the middle of the night, when the comment rate is at its lowest.
- That ugly 12-hour downtime period during January? Totally planned: reddit and other major websites joined forces to stop SOPA.
- Periods of unplanned downtime seem surrounded by periods of very low commenting rate — probably users were facing lots of errors before and after total shutdown.
- 2012 trivia: Yishan Wong becomes CEO in March. President Obama dropped by /r/IAmA during August (try to spot in the chart how reddit behaved).
- Minutes down: 1681 - 720 (SOPA) = 961.
2013: sweet stable growth
- 2013 looks pretty, though reddit stopped doubling its growth rate.
- In April reddit went through a severe DDoS attack, while many redditors were busy trying to identify the Boston bomber.
- Minutes down: 247.
2014: the year of uptime
- More of that stable growth: not a lot of downtime visible, while the growth rate stays up (but not doubling as in the earlier years).
- 2014 trivia: Ellen Pao becomes reddit’s CEO in November. Alexis Ohanian returns.
- Minutes down: 71.
2015: turbulent waters, nice uptime
- Reddit starts getting consistently more than 2000 comments per minute.
- In early July reddit’s Victoria is fired, and redditors unite in protest, turning many subreddits off. The commenting rate stayed visibly up that night.
- 2015 trivia: Steve Huffman returns as CEO. Reddit comments loaded into BigQuery that July.
- Minutes down: 231.
2016: present, future, and beyond
- Reddit stays on its “slower” growth curve, and shows some nice stable uptime.
- Reddit goes down for 90 minutes in August and they publish a post about it. Said post inspires me to write this one :).
- Minutes “down”: 198 (and counting).
Everything together
Get the full picture at http://imgur.com/a/qejIw.
How-to:
BigQuery queries
2008–2014 are archived in one table per year in BigQuery (the %s placeholder stands for the table name, here the year):
SELECT
  INTEGER(created_utc/(60*60*24)) day,
  INTEGER(created_utc/60) - INTEGER(created_utc/(60*60*24))*60*24 minute,
  COUNT(*) c
FROM [fh-bigquery:reddit_comments.%s]
GROUP BY 1,2
ORDER BY 1,2
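To make the bucketing arithmetic in that query easier to follow, here is the same computation as a small Python sketch. The helper name `day_and_minute` and the sample timestamp are my own illustration, not part of the original queries:

```python
import datetime


def day_and_minute(created_utc):
    """Split epoch seconds into (days since 1970-01-01, minute of that day)."""
    day, seconds_into_day = divmod(created_utc, 60 * 60 * 24)
    return day, seconds_into_day // 60


# A made-up comment timestamp: 2016-08-11 15:30:45 UTC.
ts = int(datetime.datetime(2016, 8, 11, 15, 30, 45,
                           tzinfo=datetime.timezone.utc).timestamp())
day, minute = day_and_minute(ts)

# Minute 930 of the day is 15:30, the same label the chart's x axis uses.
print(minute, '%02i:%02i' % divmod(minute, 60))  # 930 15:30
```

Every comment in the same UTC minute lands in the same (day, minute) bucket, which is what GROUP BY 1,2 then counts.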
For 2016 I mixed the monthly tables with the real-time comments table that /u/Stuck_in_the_Matrix maintains:
SELECT
  INTEGER(created_utc/(60*60*24)) day,
  INTEGER(created_utc/60) - INTEGER(created_utc/(60*60*24))*60*24 minute,
  COUNT(*) c
FROM
  TABLE_QUERY([fh-bigquery:reddit_comments], "table_id CONTAINS '2016' AND LENGTH(table_id)<8"),
  (
    SELECT TIMESTAMP_TO_SEC(created_utc) created_utc
    FROM [pushshift:rt_reddit.comments]
    WHERE YEAR(created_utc)=2016
      AND MONTH(created_utc)>5
  )
GROUP BY 1,2
ORDER BY 1,2
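The TABLE_QUERY filter is a bit cryptic, so here is a rough Python mirror of what it selects. The table names below are hypothetical, picked only to exercise the two conditions; check the actual dataset for the real ones:

```python
def is_2016_monthly(table_id):
    """Mirror of: table_id CONTAINS '2016' AND LENGTH(table_id) < 8."""
    return "2016" in table_id and len(table_id) < 8


# Hypothetical table names, not a listing of the real dataset.
tables = ["2014", "2015_12", "2016_01", "2016_05", "all_starting_201601"]
selected = [t for t in tables if is_2016_monthly(t)]
print(selected)  # ['2016_01', '2016_05']
```

The length check is what keeps out longer table names that happen to contain '2016'.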
Charting
Python, pandas, and matplotlib magic:
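import datetime
import matplotlib.pyplot as plt

# df is assumed to be a dict mapping each year to a DataFrame with the query results (day, minute, c)
def mymap(year):
    # One row per day, one column per minute; clip outliers at the 99.9th percentile
    pivoted = df[year].pivot(index='day', columns='minute', values='c').clip(0, df[year].c.quantile(0.999))
    fig, ax = plt.subplots(figsize=(26, 26))
    ax.grid(False)
    plt.imshow(pivoted, origin='upper').set_cmap('copper')
    # Label columns as HH:MM and rows as month-day
    ax.set_xticklabels(['%02i:%02i' % divmod(x, 60) for x in ax.get_xticks()])
    ax.set_yticklabels([(datetime.datetime(year, 1, 1) + datetime.timedelta(x)).strftime('%m-%d') for x in ax.get_yticks()])
    ax.set_title('reddit uptime: comments per minute %s by @felipehoffa reddit.com/r/bigquery' % year)
    plt.colorbar(fraction=0.0125, pad=0.001, label='comments per minute')
    plt.show()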
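If the pivot and clip steps in mymap look opaque, this toy example shows what they do to the data. It is only a sketch on made-up rows, and it uses a 0.75 quantile instead of 0.999 so the effect is visible with four values:

```python
import pandas as pd

# Toy rows shaped like the query output: (day, minute, c).
rows = pd.DataFrame({
    "day": [0, 0, 1, 1],
    "minute": [0, 1, 0, 2],
    "c": [5, 7, 1000, 6],
})

# One row per day, one column per minute.
# (day, minute) pairs with no comments become NaN, the white pixels in the chart.
grid = rows.pivot(index="day", columns="minute", values="c")

# Cap outliers at a high quantile so a single spike does not wash out the colors.
cap = rows["c"].quantile(0.75)
clipped = grid.clip(0, cap)

missing = int(grid.isnull().sum().sum())
print(grid.shape, missing)  # (2, 3) 2
```

The spike of 1000 is pulled down to the cap, while the NaN cells stay NaN and remain blank when plotted.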
Counting downtime
(If you have a better way, please share)
for year in range(2008, 2017):
    # NaN cells in the day-by-minute grid are minutes with no comments
    down = df[year].pivot(index='day', columns='minute', values='c').isnull().sum().sum()
    print('%s %s' % (year, down))
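One possible refinement, offered as a sketch rather than a correction: the pivot only has rows for days that got at least one comment, so a fully silent day would not be counted. Counting against the full grid of expected minutes avoids that. The function name `minutes_down` and the toy data are my own:

```python
def minutes_down(observed, first_day, last_day):
    """Count minutes with no comments between first_day and last_day, inclusive.

    observed: iterable of (day, minute) pairs that had at least one comment.
    """
    expected = (last_day - first_day + 1) * 24 * 60
    # Ignore pairs outside the requested range and collapse duplicates.
    seen = {(d, m) for d, m in observed
            if first_day <= d <= last_day and 0 <= m < 1440}
    return expected - len(seen)


# Toy example: two days, only three minutes saw comments.
observed = [(100, 0), (100, 1), (101, 1439)]
print(minutes_down(observed, 100, 101))  # 2877
```

For a year still in progress, such as 2016 here, `last_day` should be the last day with data rather than December 31.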
More?
Want more stories? Check my medium, add me on twitter, and subscribe to reddit.com/r/bigquery.
Thanks for your likes, favs, and comments!