Static JavaScript code analysis inside a SQL query: JSHint+GitHub+BigQuery

Can we run a static code analysis tool for JavaScript inside BigQuery? Yes we can.

This is a BigQuery query that runs JSHint, and reports on the most common errors in a sample of all open source JavaScript code in GitHub:

#set UDF Source URI option to "gs://fh-bigquery/js/jshint-2.5.11.js"SELECT x error, COUNT(*) files_affected
FROM js(
(
SELECT content, sample_path, sample_repo_name
FROM [fh-bigquery:github_extracts.contents_js]
WHERE LENGTH(content) BETWEEN 1000 AND 1800
AND ABS(HASH(id))%1000=0 # sampling
),
content, sample_path, sample_repo_name,
"[
{name: 'x', type:'string'},
{name: 'sample_path', type:'string'},
{name: 'sample_repo_name', type:'string'},
{name: 'content', type:'string'}]",
"function(r, emit) {
JSHINT(r.content, {'maxdepth':2});
// data = JSHINT.data();
errors = JSHINT.errors;
set_errors=new Set(errors.map(
function(x) {
if(x && 'raw' in x) {return x.raw}}));
set_errors.forEach(function(x) {
if(!x) {return;}
emit({
x: x,
sample_repo_name: r.sample_repo_name,
sample_path: r.sample_path,
});
});
}")
GROUP BY 1
ORDER BY 2 DESC
LIMIT 100

7.4s elapsed, 103 GB processed

Some notes:

  • I’m running this code over a sample of all JS files (see query for current filters). There’s a lot that you can do with BigQuery and SQL, and as we push the boundaries some code will run better if we work over smaller datasets. In the meantime it would be nice if there was a lighter weight JSHint equivalent.
  • I’m using JSHint 2.5.11 as newer versions fail. Ping me if you find out how to solve this.
  • The above query does not following the official BigQuery UDF supported syntax. See the docs for the correct format, but I’m using this style as it’s easier to share this way.

More resources for GitHub on BigQuery: https://medium.com/@hoffa/b3576fd2b150

--

--

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.

Love podcasts or audiobooks? Learn on the go with our new app.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Felipe Hoffa

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.