Felipe Hoffa
2 min readMar 7, 2020

--

Great post! Please keep sharing your knowledge on DAU/MAU and other metrics relevant to startups.

My contribution to the effort: Can I clean up the shared query?

This looks much cleaner, and should have the same numbers:

DECLARE STARTDATE DATE DEFAULT '2020-03-01';
DECLARE NDAYS INT64 DEFAULT 30;
DECLARE STARTDATE DATE DEFAULT '2020-03-01';
DECLARE NDAYS INT64 DEFAULT 30;
WITH data AS (
SELECT * FROM (
SELECT *, PARSE_DATE('%Y%m%d', '2020'||_TABLE_SUFFIX) day, actor.id user_pseudo_id
FROM `githubarchive.day.2020*` -- `PROJECT.DATABASE.events_*`
) WHERE day <= STARTDATE
), daysN AS (
SELECT * FROM data
WHERE day >= DATE_ADD(STARTDATE, INTERVAL -NDAYS DAY)
), days60 AS (
SELECT * FROM data
WHERE day >= DATE_ADD(STARTDATE, INTERVAL -60 DAY)
), stopdays AS (
SELECT DISTINCT day stopday FROM data
WHERE day >= DATE_ADD(STARTDATE, INTERVAL -NDAYS DAY)
), maus AS (
SELECT stopday, COUNT(DISTINCT(user_pseudo_id)) mau
FROM days60 JOIN stopdays
ON day BETWEEN DATE_ADD(stopday, INTERVAL -NDAYS DAY) AND stopday
GROUP BY stopday
)
SELECT day, dau, mau, 100*dau/mau daumau
FROM (
SELECT day, COUNT(DISTINCT user_pseudo_id) dau
, (SELECT mau FROM maus WHERE a.day=maus.stopday) mau
FROM daysN a
GROUP BY day
)
ORDER BY day

DAU/MAU for GitHub:

DAU/MAU for GitHub

The cleaned-up query, compared to the original one:

Thanks, and looking forward to your next post!

--

--

Felipe Hoffa
Felipe Hoffa

Written by Felipe Hoffa

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.

Responses (1)