I used to load Reddit comments into BigQuery; now it’s time to upgrade my pipelines to Snowflake, and to share some of the nice surprises I found. Let’s get started with 261GB of them.

  • Working with semi-structured data in Snowflake is fast, easy, and fun.
  • Snowflake understands JSON objects on load, and optimizes the syntax and storage for them — without the need to predefine a schema.
  • Snowflake stores data compressed — in this case with a ratio better than 1:10 compared with the original files.
  • Snowflake supports advanced SQL syntax for recursive queries — which is awesome to analyze threaded…
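
To make the threaded-comments point concrete: a recursive query over a comment table typically follows each comment's parent link, one level at a time, until it reaches a top-level comment. Here is a minimal Java sketch of that same walk, outside SQL. The class name `ThreadDepth`, the comment ids, and the parent map are made up for illustration and do not come from the original post or the Reddit dataset.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not from the original post: computes how deep a
// comment sits in its thread by following parent links up to the root.
class ThreadDepth {

    // parentOf maps a comment id to its parent comment id.
    // Top-level comments have no entry. Assumes the links contain no cycles.
    static int depthOf(String commentId, Map<String, String> parentOf) {
        int depth = 0;
        String current = commentId;
        // Each step up the chain is one level of nesting, the same
        // child-to-parent relationship a recursive query joins on.
        while (parentOf.containsKey(current)) {
            current = parentOf.get(current);
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("c2", "c1"); // c2 replies to c1
        parentOf.put("c3", "c2"); // c3 replies to c2

        System.out.println(depthOf("c1", parentOf)); // 0: top-level comment
        System.out.println(depthOf("c3", parentOf)); // 2: reply to a reply
    }
}
```

In SQL the database does this walk for every row at once; the sketch only shows the relationship being traversed.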

The Braze Engagement Benchmarks give Snowflake users access to industry-by-industry data on message engagement, app retention, user acquisition, and purchasing behavior, updated daily. All data in Benchmarks are anonymized and aggregated. The data are pulled from Braze’s customer base of over 1,000 global brands across 14 major industries and cover the past year, counted back from the current date. Find out here how to query them.

Photo by Anne Nygård on Unsplash

NOAA GSOD’s daily worldwide weather data is updated daily in Snowflake, and in this post we’ll make it even more useful. Check inside for pivots, geo-joins, finding the closest station to each city, and pattern matching with MATCH_RECOGNIZE().

Video on Youtube

The source

Knoema’s Environment Data Atlas in Snowflake
  • Having this data automatically refreshed in your account is cool!
  • Making this table useful is not that easy, because:
      • Stations have a lat, lon, but no way to tell which city or zip code they belong to.
      • The original NOAA rows for each day have been split into multiple rows for each day, with each row containing only one value for…
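
Since stations only carry a lat, lon, matching them to a city comes down to a distance calculation. The post does this in SQL; the sketch below only illustrates the underlying idea in Java, using the haversine great-circle formula and a linear scan. The class `NearestStation`, the station ids "A" and "B", and all coordinates are made-up sample values, not real NOAA stations, and a mean Earth radius of 6371 km is assumed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: pick the station closest to a city.
class NearestStation {

    static final double EARTH_RADIUS_KM = 6371.0; // assumed mean radius

    // Minimal stand-in for a station row; the real table has more columns.
    static class Station {
        final String id;
        final double lat;
        final double lon;

        Station(String id, double lat, double lon) {
            this.id = id;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Great-circle distance between two points, in kilometers.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Linear scan over all stations; returns null for an empty list.
    static Station closest(double cityLat, double cityLon, List<Station> stations) {
        Station best = null;
        double bestKm = Double.MAX_VALUE;
        for (Station s : stations) {
            double km = haversineKm(cityLat, cityLon, s.lat, s.lon);
            if (km < bestKm) {
                bestKm = km;
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up sample points: one near the city, one far away.
        List<Station> stations = Arrays.asList(
                new Station("A", 37.62, -122.37),
                new Station("B", 34.05, -118.24));

        Station nearest = closest(37.77, -122.42, stations);
        System.out.println(nearest.id); // A
    }
}
```

A scan like this compares every city with every station, which is fine for a sketch; at the scale of the full table a geo-join in the database is the more practical route.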

Snowflake now lets you easily create Java UDFs, which is an incredibly powerful and versatile feature. Let’s check it out by running a library written in Kotlin to detect written languages. Out of GitHub and into your SQL code, in 3 easy steps.

Detect written languages with a Java UDF

Quick example: Detect written languages

create function add(x integer, y integer)
returns integer
language java
handler = 'Test.add'
as
$$
class Test {
    // The handler named above: a plain static Java method
    public static int add(int x, int y) {
        return x + y;
    }
}
$$;

select add(1, 3); -- 4

create or replace function detect_lang(x string)
returns string
language java
imports = ('@~/lingua-1.1.0-with-dependencies.jar')

It’s exactly 10 years since my first day at Google. What I’ve learnt:

Same story, but as a Twitter thread
  1. In 2011 there was a career path for me, but I had to leave Chile to find it.
  2. There is a career path outside management: At Google I discovered that software engineers could keep moving forward without the need to become a manager.
  3. There are jobs that combine passions with skills: I wasn’t the best SWE at Google, but someone saw my passions and skills, and invited me to join Developer Relations.
  4. Know your strengths: Taking the Clifton StrengthsFinder test changed how I viewed myself. …

Car rental rates are surprisingly high — and I have the data to prove it. In this post learn where to find all this data and how to analyze it with SQL, thanks to QL2 and Snowflake.

Watch on Youtube
Average car rental rates in Las Vegas, LA, Miami, Chicago, and San Francisco — starting in 2019.

By creating a Snowflake external function we can get predictions out of Facebook Prophet for any time series. Learn how to easily integrate these forecasts into your SQL pipelines — with Snowflake connecting to Prophet running on a Docker container inside Google Cloud Run.

Photo: Pixabay


Your customers trust you to protect their privacy and identities — and Snowflake makes this easy with Dynamic Data Masks. With this you can get rid of complex systems of secure views or copies of data: Just create policies that allow everyone to share the same tables and queries, while protecting sensitive data.

Watch on Youtube #SnowflakeBytes

There’s plenty of research establishing that boards should refresh their members regularly. Let’s see the numbers for the Free Software Foundation, Apache Software Foundation, Python Software Foundation and Open Source Initiative.

Rotation in open source boards: Average # of years served by members

Why boards should rotate

Eva’s challenge: Produce a new daily visualization with Snowflake and Tableau for 30 days. Let’s check on her progress during her first week of this ongoing challenge.

Visualizing Snowflake’s Marketplace Data in Tableau

The visualizations

Felipe Hoffa

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.
