2 min readMar 7, 2020
Great post! Please keep sharing your knowledge on DAU/MAU and other metrics relevant to startups.
My contribution to the effort: Can I clean up the shared query?
This looks much cleaner, and should have the same numbers:
DECLARE STARTDATE DATE DEFAULT '2020-03-01';
DECLARE NDAYS INT64 DEFAULT 30;DECLARE STARTDATE DATE DEFAULT '2020-03-01';
DECLARE NDAYS INT64 DEFAULT 30;WITH data AS (
SELECT * FROM (
SELECT *, PARSE_DATE('%Y%m%d', '2020'||_TABLE_SUFFIX) day, actor.id user_pseudo_id
FROM `githubarchive.day.2020*` -- `PROJECT.DATABASE.events_*`
) WHERE day <= STARTDATE
), daysN AS (
SELECT * FROM data
WHERE day >= DATE_ADD(STARTDATE, INTERVAL -NDAYS DAY)
), days60 AS (
SELECT * FROM data
WHERE day >= DATE_ADD(STARTDATE, INTERVAL -60 DAY)
), stopdays AS (
SELECT DISTINCT day stopday FROM data
WHERE day >= DATE_ADD(STARTDATE, INTERVAL -NDAYS DAY)
), maus AS (
SELECT stopday, COUNT(DISTINCT(user_pseudo_id)) mau
FROM days60 JOIN stopdays
ON day BETWEEN DATE_ADD(stopday, INTERVAL -NDAYS DAY) AND stopday
GROUP BY stopday
)SELECT day, dau, mau, 100*dau/mau daumau
FROM (
SELECT day, COUNT(DISTINCT user_pseudo_id) dau
, (SELECT mau FROM maus WHERE a.day=maus.stopday) mau
FROM daysN a
GROUP BY day
)
ORDER BY day
DAU/MAU for GitHub:
The cleaned-up query, compared to the original one:
Thanks, and looking forward to your next post!