Source: https://commons.wikimedia.org/wiki/File:Basketball_game.jpg

Easy pivot() in BigQuery, in one step

Introducing the easiest way to get a pivot done in BigQuery. This has been one of the most recurrent BigQuery open questions. And now, thanks to dynamic SQL and stored procedures, we can finally make it easy for everyone.

Quick pivots

CALL fhoffa.x.pivot(
'bigquery-public-data.iowa_liquor_sales.sales' # source table
, 'fh-bigquery.temp.test' # destination table
, ['date'] # row_ids
, 'store_number' # pivot_col_name
, 'sale_dollars' # pivot_col_value
, 30 # max_columns
, 'SUM' # aggregation
, '' # optional_limit
);
Original table to be pivoted
Pivoted table, with one store per column
CALL fhoffa.x.pivot(
'bigquery-public-data.iowa_liquor_sales.sales' # source table
, 'fh-bigquery.temp.test_pivot' # destination table
, ['DATE_TRUNC(date, MONTH) AS month'] # row_ids
, 'store_number' # pivot_col_name
, 'sale_dollars' # pivot_col_value
, 30 # max_columns
, 'SUM' # aggregation
, '' # optional_limit
);
CALL fhoffa.x.pivot(
'bigquery-public-data.ghcn_d.ghcnd_2019' # source table
, 'fh-bigquery.temp.test_pivotted' # destination table
, ['id', 'date'] # row_ids
, 'element' # pivot_col_name
, 'value' # pivot_col_value
, 30 # max_columns
, 'AVG' # aggregation
, 'LIMIT 10' # optional_limit
);
CREATE OR REPLACE VIEW `fh-bigquery.temp.a` AS (
SELECT * EXCEPT(SensorName), REGEXP_REPLACE(SensorName, r'.*/', '') SensorName
FROM `data-sensing-lab.io_sensor_data.moscone_io13`
);
CALL fhoffa.x.pivot(
'fh-bigquery.temp.a'
, 'fh-bigquery.temp.delete_pivotted' # destination table
, ['MoteName', 'TIMESTAMP_TRUNC(Timestamp, HOUR) AS hour'] # row_ids
, 'SensorName' # pivot_col_name
, 'Data' # pivot_col_value
, 8 # max_columns
, 'AVG' # aggregation
, 'LIMIT 10' # optional_limit
);
CREATE VIEW `fh-bigquery.temp.stack_unnest_tags`
AS
SELECT *
FROM `bigquery-public-data.stackoverflow.posts_questions`, UNNEST(SPLIT(tags,'|')) tag;
CALL fhoffa.x.pivot(
'fh-bigquery.temp.stack_unnest_tags' # source table
, 'fh-bigquery.temp.stack_pivot' # destination table
, ['EXTRACT(YEAR FROM creation_date) AS year'] # row_ids
, 'tag' # pivot_col_name
, 'view_count' # pivot_col_value
, 10 # max_columns
, 'COUNT' # aggregation
, '' # optional_limit
);
Visualizing the # of Stack Overflow questions per year on Sheets

How-to

CREATE OR REPLACE FUNCTION 
`fhoffa.x.normalize_col_name`(col_name STRING) AS (
REGEXP_REPLACE(col_name,r'[/+#|]', '_'
)
CREATE OR REPLACE PROCEDURE `fhoffa.x.pivot`(
table_name STRING
, destination_table STRING
, row_ids ARRAY<STRING>
, pivot_col_name STRING
, pivot_col_value STRING
, max_columns INT64
, aggregation STRING
, optional_limit STRING
)
BEGIN
DECLARE pivotter STRING; EXECUTE IMMEDIATE (
"SELECT STRING_AGG(' "||aggregation
||"""(IF('||@pivot_col_name||'="'||x.value||'", '||@pivot_col_value||', null)) e_'||fhoffa.x.normalize_col_name(x.value))
FROM UNNEST((
SELECT APPROX_TOP_COUNT("""||pivot_col_name||", @max_columns) FROM `"||table_name||"`)) x"
) INTO pivotter
USING pivot_col_name AS pivot_col_name, pivot_col_value AS pivot_col_value, max_columns AS max_columns;
EXECUTE IMMEDIATE (
'CREATE OR REPLACE TABLE `'||destination_table
||'` AS SELECT '
||(SELECT STRING_AGG(x) FROM UNNEST(row_ids) x)
||', '||pivotter
||' FROM `'||table_name||'` GROUP BY '
|| (SELECT STRING_AGG(''||(i+1)) FROM UNNEST(row_ids) WITH OFFSET i)||' ORDER BY '
|| (SELECT STRING_AGG(''||(i+1)) FROM UNNEST(row_ids) WITH OFFSET i)
||' '||optional_limit
);
END;

Next steps

Want more?

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Felipe Hoffa

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.