“OK Boomer” escalated quickly — a reddit+BigQuery report
Let’s use BigQuery to find the first time that someone commented “OK Boomer” on reddit. Turns out it happened hours and even years before the alleged first tweet. Was this all an attempt by teenagers to replace the word “old” with “boomer”? Let’s dig in.
Important update: I left Google and joined Snowflake in 2020 — so I’m unable to keep my older posts updated. If you want to try Snowflake, join us — I’m having a lot of fun ❄️.
“OK boomer” growth by subreddit:
The first “OK boomer”
According to Know Your Meme this tweet marks the first appearance of “ok boomer”:
And the author has been celebrating it:
But was this the real first appearance? 90 minutes earlier, on that same day — April 12, 2018 — someone else had immortalized “OK BOOMERS” on reddit:
It turns out there are 9 reddit comments even earlier than that, with the first “Ok boomer” appearing in September 2009:
And things didn’t escalate quickly back then. It took 9 years to accumulate the first 9 “OK boomer” comments on reddit, and then we have to jump ahead to October 2018. That’s when we start seeing an almost daily dose of the phrase:
And it had spread through different subreddits! From /r/Tinder, to /r/worldnews, to /r/xboxone, to /r/golang — redditors started embracing it.
And something happened in March 2019. Until then we had a spotty supply of the phrase, but on March 24 six different accounts decided to comment “OK boomer”. From that day on, the trend kept climbing through June 2019, with more than 30 accounts commenting “OK boomer” per day.
And then we have a gap in our data — you can make your own guess about how usage kept escalating to ~200 accounts commenting daily by mid October — with a huge peak on September 9, 2019:
What happened on September 9? This post by /r/teenagers:
Oh yeah. This was a petition by teenagers to replace the word “old” with “boomer” — and 30k upvotes agreed with it.
But for the real explosion in usage, you have to wait until October 30, 2019:
That’s a day after the NY Times reported on the phrase:
And what happened next? “OK boomer” escalated quickly. With a peak on November 6, when more than 12k accounts posted the phrase:
And you can find these comments all over reddit:
You can now play with the interactive dashboard, to find all sorts of patterns within these comments:
How-to
I used two different sources of data:
- Historical reddit archives stored in BigQuery, for all comments before June 2019.
- The live pushshift.io API for all newer comments, extracted as described in my previous post (see the quick sanity check below).
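Before combining the two, it’s worth checking that they line up in time. This is just a quick sanity-check sketch over the two extract tables used in the queries below — it only assumes the `ts` column those queries rely on:
SELECT 'archive' AS source, MIN(ts) first_ts, MAX(ts) last_ts, COUNT(*) comments
FROM `reddit_extracts.201906_okboomer_all`
UNION ALL
SELECT 'live', MIN(ts), MAX(ts), COUNT(*)
FROM `reddit_extracts.201910_live_okboomer`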
To extract all of the historical reddit comments, I used this query:
-- Comments from March 2018 onwards, out of the monthly reddit archive tables.
CREATE TABLE `reddit_extracts.201906_okboomer_all`
PARTITION BY fake_date
CLUSTER BY subreddit, ts
AS
SELECT TIMESTAMP_SECONDS(created_utc) ts, *, DATE('2000-01-01') fake_date
FROM `reddit_comments.2*`
WHERE REGEXP_CONTAINS(body, r'(?i)\bok boomer')
AND _table_suffix >= '018_03';

-- Everything up to February 2018, appended into the same table.
INSERT INTO `reddit_extracts.201906_okboomer_all`
SELECT TIMESTAMP_SECONDS(created_utc) ts, *, DATE('2000-01-01') fake_date
FROM `reddit_comments.2*`
WHERE REGEXP_CONTAINS(body, r'(?i)\bok boomer')
AND _table_suffix BETWEEN '0' AND '018_02'
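The REGEXP_CONTAINS filter is case insensitive and anchored at a word boundary. A couple of throwaway literal tests (illustrative strings only, not real comments) show what it does and doesn’t pick up:
SELECT
  REGEXP_CONTAINS('OK Boomer, have a seat', r'(?i)\bok boomer') AS mixed_case  -- true: (?i) makes it case insensitive
, REGEXP_CONTAINS('look boomer', r'(?i)\bok boomer') AS mid_word               -- false: no word boundary before "ok"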
Find the shared table here: console.cloud.google.com/bigquery?p=fh-bigquery&d=reddit_extracts&t=201906_okboomer_all&page=table
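For example, the earliest comments mentioned above come straight out of that shared table — a quick query over the public dataset (columns as produced by the extract query above):
-- The earliest "ok boomer" comments in the shared table
SELECT ts, subreddit, author, score, body
FROM `fh-bigquery.reddit_extracts.201906_okboomer_all`
ORDER BY ts
LIMIT 10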
To create the table summarizing live and archived comments for Data Studio:
CREATE OR REPLACE TABLE `reddit_extracts.201911_okboomer_day`
AS
-- Combine the live pushshift.io extract with the historical archive extract.
WITH data AS (
SELECT ts, author, subreddit, score, body
, permalink
FROM `reddit_extracts.201910_live_okboomer`
UNION ALL
SELECT ts, author, subreddit, score, body
, CONCAT('/r/',subreddit,'/comments/', REGEXP_REPLACE(link_id, 't3_', ''), '//', id, '/') permalink
FROM `reddit_extracts.201906_okboomer_all`
)
SELECT day, authors
, (SELECT STRING_AGG(value, ', ') FROM UNNEST(top_reddit)) top_subs
, top_comm
FROM (
-- One row per day: distinct authors, top 3 subreddits, and the highest scored comment.
SELECT TIMESTAMP_TRUNC(ts, DAY) day, COUNT(DISTINCT author) authors
, APPROX_TOP_COUNT(subreddit, 3) top_reddit
, ARRAY_AGG(STRUCT(score, subreddit, permalink, body) ORDER BY score DESC LIMIT 1)[OFFSET(0)] top_comm
FROM data
WHERE body NOT IN ('[deleted]', '[removed]')
AND REGEXP_CONTAINS(body, r'(?i)\bok boomer[s]?\b')
AND ts < TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), DAY)
GROUP BY 1
)
ORDER BY 1 DESC
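Once that table exists, the Data Studio report just reads from it. A quick peek at the most recent days (column names come from the query above):
-- Most recent days, with author counts and each day's top comment
SELECT day, authors, top_subs, top_comm.subreddit, top_comm.score
FROM `reddit_extracts.201911_okboomer_day`
ORDER BY day DESC
LIMIT 7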
The animated chart
I used a Data Studio custom visualization created by Michael Whitaker.
The query to count uniques uses some HLL magic for efficiency:
CREATE OR REPLACE TABLE `reddit_extracts.201911_okboomer_subgrowth`
AS
-- Combine the live pushshift.io extract with the historical archive extract.
WITH data AS (
SELECT ts, author, subreddit, score, body
, permalink
FROM `reddit_extracts.201910_live_okboomer`
UNION ALL
SELECT ts, author, subreddit, score, body
, CONCAT('/r/',subreddit,'/comments/', REGEXP_REPLACE(link_id, 't3_', ''), '//', id, '/') permalink
FROM `reddit_extracts.201906_okboomer_all`
)
-- One HLL sketch of distinct authors per day and subreddit.
, sketches AS (
SELECT TIMESTAMP_TRUNC(ts, DAY) day
, subreddit
, HLL_COUNT.INIT(author) sketch
, COUNT(DISTINCT author) authors
FROM data
WHERE body NOT IN ('[deleted]', '[removed]')
AND REGEXP_CONTAINS(body, r'(?i)\bok boomer[s]?\b')
GROUP BY 1,2
)
-- Subreddits where the phrase showed up before April 15, 2018.
, notable_early AS (
SELECT DISTINCT subreddit
FROM data
WHERE ts < '2018-04-15'
)
-- The 30 subreddits with the most distinct authors overall.
, overall_notable AS (
SELECT subreddit, COUNT(DISTINCT author) c
FROM data
GROUP BY 1
ORDER BY c DESC
LIMIT 30
)
SELECT *
FROM (
SELECT *, IFNULL(authors - LAG(authors) OVER(PARTITION BY subreddit ORDER BY day), authors) authors_diff
FROM (
-- Cumulative distinct authors per subreddit: merge every daily sketch up to each frame's day.
SELECT a.day
, CASE
WHEN subreddit IN (SELECT * FROM notable_early UNION ALL (SELECT subreddit FROM overall_notable)) THEN subreddit
-- ELSE '[other]'
END AS subreddit
, COUNT(*) c, HLL_COUNT.MERGE(sketch) authors
FROM (
-- Animation frames: quarterly before July 2019, then monthly, weekly, and finally daily.
SELECT DISTINCT CASE
WHEN day < '2019-07-01' THEN TIMESTAMP_TRUNC(day, QUARTER)
WHEN day < '2019-09-01' THEN TIMESTAMP_TRUNC(day, MONTH)
WHEN day < '2019-10-30' THEN TIMESTAMP_TRUNC(day, WEEK)
ELSE day
END AS day
FROM sketches) a
JOIN sketches b
ON b.day <= a.day
GROUP BY 1,2
)
)
WHERE subreddit IS NOT NULL
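To see why the sketches help: HLL_COUNT.INIT builds a small reusable summary of the distinct authors in each group, and HLL_COUNT.MERGE combines those summaries without re-scanning the raw comments — which is what makes the cumulative join above affordable. A minimal standalone illustration with toy values (not the real data):
-- Toy example: merging per-day sketches gives the cumulative distinct count
WITH toy AS (
  SELECT DATE '2019-11-01' day, 'alice' author UNION ALL
  SELECT DATE '2019-11-01', 'bob' UNION ALL
  SELECT DATE '2019-11-02', 'alice' UNION ALL
  SELECT DATE '2019-11-02', 'carol'
), daily AS (
  SELECT day, HLL_COUNT.INIT(author) sketch
  FROM toy
  GROUP BY day
)
SELECT HLL_COUNT.MERGE(sketch) distinct_authors_overall  -- ~3: alice, bob, carol
FROM daily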
Want more?
Check my previous post for more details on collecting live data from pushshift.io. Thanks to Jason Baumgartner for the constant supply of data!
I’m Felipe Hoffa, a Developer Advocate for Google Cloud. Follow me on @felipehoffa, find my previous posts on medium.com/@hoffa, and all about BigQuery on reddit.com/r/bigquery.