Reddit top domains: The news sources that reddit prefers

What sources does each subreddit prefer for its news? Look at these screenshots, and try out the interactive version.

Felipe Hoffa
Mar 7, 2017

Based on a post by /u/subroutines on /r/dataisbeautiful. Data in BigQuery, provided by pushshift.io. Interactive version hosted on Google Data Studio. Try it and share: Data Studio is now globally available, and for free!

(see also: top Hacker News domain)

/r/politics + /r/The_Donald + /r/worldnews + /r/news

/r/EnoughTrumpSpam + /r/conspiracy + /r/uncensorednews + /r/Conservative

/r/technology + /r/Futurology + /r/science

/r/unitedkingdom + /r/australia + /r/canada + /r/europe

/r/nfl + /r/soccer + /r/hockey + /r/nba

/r/WayOfTheBern + /r/hillaryclinton + /r/democrats

/r/movies + /r/television

/r/nottheonion + /r/UpliftingNews + /r/environment

/r/Conservative + /r/progressive + /r/Libertarian

Reverse: Top subs for media

/r/The_Donald favorites:

Same, without /r/The_Donald:

MSM news:

Same, without /r/politics:

Source

Make your own (w/ Data Studio), or change the rules by running a variation of this BigQuery query:

#standardSQL
-- Posts per (domain, subreddit), alongside each domain's total across all subreddits.
SELECT domain, subreddit, count_dom, COUNT(*) AS posts
FROM (
  -- count_dom: number of qualifying posts for the domain, over all subreddits
  SELECT id, domain, subreddit, COUNT(*) OVER(PARTITION BY domain) AS count_dom
  FROM `fh-bigquery.reddit_posts.2017_01`
  WHERE score > 25
    -- Exclude image, video, social, and other non-news hosts
    AND domain NOT IN (
      'puu.sh', 'zkillboard.com', 'gifsound.com', 'youtu.be', 'bato.to', 'archive.is', 'archive.fo',
      'pbs.twimg.com', 'streamable.com', 'cdn.awwni.me')
    AND NOT domain LIKE 'self.%'  -- self (text) posts
    AND NOT domain LIKE '%redd.it%'
    AND NOT domain LIKE '%sli.mg%'
    AND NOT domain LIKE '%instagram%'
    AND NOT domain LIKE '%steamcommunity%'
    AND NOT domain LIKE '%gfycat%'
    AND NOT domain LIKE '%fav.me%'
    AND NOT domain LIKE '%steampower%'
    AND NOT domain LIKE '%amazon%'
    AND NOT domain LIKE '%twitch%'
    AND NOT domain LIKE '%blogspot%'
    AND NOT domain LIKE '%mixtape%'
    AND NOT domain LIKE '%spotify%'
    AND NOT domain LIKE '%prntscr%'
    AND NOT domain LIKE '%akamai%'
    AND NOT domain LIKE '%vid.me%'
    AND NOT domain LIKE '%github%'
    AND NOT domain LIKE '%google%'
    AND NOT domain LIKE '%vimeo%'
    AND NOT domain LIKE '%medium%'
    AND NOT domain LIKE '%upload%'
    AND NOT domain LIKE '%imgur%'
    AND NOT domain LIKE '%youtube%'
    AND NOT domain LIKE '%giphy%'
    AND NOT domain LIKE '%reddit%'
    AND NOT domain LIKE '%twitter%'
    AND NOT domain LIKE '%tumblr.com'
    AND NOT domain LIKE '%flickr%'
    AND NOT domain LIKE '%deviantart.com'
    AND NOT domain LIKE '%facebook%'
    AND NOT domain LIKE '%pinimg%'
    AND NOT domain LIKE '%gyazo%'
    AND NOT domain LIKE '%artstation%'
    AND NOT domain LIKE '%ytimg%'
    AND NOT domain LIKE '%imgflip%'
    AND NOT domain LIKE '%soundcloud%'
    AND NOT domain LIKE '%ppy.sh%'
    AND NOT over_18
)
WHERE count_dom > 20  -- keep only domains with more than 20 qualifying posts
GROUP BY 1, 2, 3
ORDER BY 4 DESC
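
If you want to see what the query does before spending BigQuery quota, here is a minimal Python sketch of the same two-step logic: filter the posts, count them per domain, then count them per (domain, subreddit) pair for domains above a threshold. The sample rows, the shortened exclusion list, and the names `keep` and `top_domains` are all made up for illustration; they are not part of the dataset or of any library.

```python
from collections import Counter

# Hypothetical rows standing in for the reddit_posts table:
# (id, domain, subreddit, score, over_18)
SAMPLE_POSTS = [
    ("a1", "nytimes.com", "politics", 120, False),
    ("a2", "nytimes.com", "politics", 90, False),
    ("a3", "nytimes.com", "news", 40, False),
    ("a4", "bbc.co.uk", "worldnews", 300, False),
    ("a5", "bbc.co.uk", "unitedkingdom", 10, False),    # dropped: score too low
    ("a6", "i.imgur.com", "pics", 500, False),          # dropped: excluded host
    ("a7", "self.AskReddit", "AskReddit", 900, False),  # dropped: self post
    ("a8", "nytimes.com", "news", 60, True),            # dropped: over_18
]

# Shortened, illustrative stand-in for the long list of LIKE '%...%' filters.
EXCLUDED_SUBSTRINGS = ("imgur", "youtube", "reddit")


def keep(post, min_score):
    """Mirror the inner WHERE clause: score, over_18, and domain filters."""
    _id, domain, _subreddit, score, over_18 = post
    if score <= min_score or over_18:
        return False
    if domain.startswith("self."):
        return False
    return not any(s in domain for s in EXCLUDED_SUBSTRINGS)


def top_domains(posts, min_score=25, min_domain_posts=20):
    """Return (domain, subreddit, count_dom, posts) rows, most posts first."""
    kept = [p for p in posts if keep(p, min_score)]
    # Equivalent of COUNT(*) OVER(PARTITION BY domain)
    per_domain = Counter(p[1] for p in kept)
    # Equivalent of WHERE count_dom > N followed by GROUP BY domain, subreddit
    per_pair = Counter(
        (p[1], p[2]) for p in kept if per_domain[p[1]] > min_domain_posts
    )
    rows = [(d, s, per_domain[d], n) for (d, s), n in per_pair.items()]
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows


if __name__ == "__main__":
    # The sample is tiny, so lower the domain threshold from 20 to 2.
    for row in top_domains(SAMPLE_POSTS, min_domain_posts=2):
        print(row)
```

Changing `min_score`, `min_domain_posts`, or the exclusion list here corresponds to editing `score>25`, `count_dom>20`, and the `NOT LIKE` lines in the query.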

Want more?

Want more stories? Check my Medium, follow me on Twitter, and subscribe to reddit.com/r/bigquery. And try BigQuery: every month you get a full terabyte of analysis for free.

Felipe Hoffa

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.