The real top Stack Overflow questions

It’s easy to find the top Stack Overflow questions of all time, but the *current* top questions show what’s important now. Below you’ll find the how-to, plus an interactive dashboard for exploring the top trends.

Update 2020: Check my new post “Stack Overflow in 2023: Predicting with ARIMA and BigQuery”.

Interactive dashboard. Play with it to find the top questions for any tag.

The top Stack Overflow questions — all time vs current

The top Stack Overflow questions of all time. What’s missing?
The top Stack Overflow questions of 2018Q4. What tag is missing now?
Top 10 and top 30 Stack Overflow questions. Current vs all time

So what are the top questions for each tag?

JavaScript, Python, Go: Different challenges for each

Top 10 questions 2018Q4 for Python
Top 10 questions 2018Q4 for JavaScript
Top 10 questions 2018Q4 for Go

TensorFlow struggles

Top 10 questions 2018Q4 for TensorFlow

Kotlin top 10: Now and then

Top 10 Kotlin questions in 2017Q2 vs 2018Q4

Redis top questions — focusing on Java vs C#

Top 10 questions for Redis
Top 10 questions for Redis + Java
Top 10 questions for Redis + C#

One question, so many replies

Top questions for Rust and Go

Dig beyond the top 10

Top 21 to 30 questions for jQuery

How-to: Queries

You can find all this data in BigQuery. Every 3 months Stack Overflow publishes a snapshot of their latest data, and we make sure to have a fresh copy ready to be queried.

Top Stack Overflow questions, current vs all time

SELECT (
    -- Label each question with its most-viewed tag
    SELECT tag
    FROM UNNEST(tags)
    ORDER BY view_count DESC LIMIT 1
  ) tag
  , * EXCEPT(tags)
FROM (
  SELECT quarter_views, view_count
    -- q_ranking: rank by views in the latest quarter; ranking: rank by all-time views
    , ROW_NUMBER() OVER(ORDER BY quarter_views DESC) q_ranking
    , ROW_NUMBER() OVER(ORDER BY view_count DESC) ranking
    , ARRAY(
        SELECT AS STRUCT tag, b.view_count
        FROM UNNEST(tags) tag
        JOIN `fh-bigquery.stackoverflow_archive_questions.merged_aux_tags` b
        ON tag=b.tag
      ) tags, title
  FROM `fh-bigquery.stackoverflow_archive_questions.merged`
  WHERE quarter='2018-12-01'
    AND view_count > 50000
)
-- Keep a question if it is in the top 30 on either ranking
WHERE q_ranking<=30 OR ranking<=30
ORDER BY 1 DESC
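
To make the two rankings easier to picture, here is a minimal Python sketch of the same idea on toy data: rank every question once by quarterly views and once by all-time views, then keep anything that lands in the top n on either list. The question labels, the numbers, and the helper names `rank_by` and `top_either` are all made up for illustration; this is not how BigQuery runs the query.

```python
# Toy rows: (title, quarter_views, view_count). All values are invented.
questions = [
    ("question A", 120, 9000),
    ("question B", 40, 7000),
    ("question C", 300, 2500),
    ("question D", 90, 5000),
    ("question E", 200, 8000),
]


def rank_by(rows, key_index):
    """Map title -> rank, where rank 1 is the largest value
    (similar in spirit to ROW_NUMBER() OVER(ORDER BY ... DESC))."""
    ordered = sorted(rows, key=lambda row: row[key_index], reverse=True)
    return {row[0]: position for position, row in enumerate(ordered, start=1)}


def top_either(rows, n):
    """Keep questions ranked in the top n by quarterly OR all-time views."""
    q_ranking = rank_by(rows, 1)  # current quarter
    ranking = rank_by(rows, 2)    # all time
    return [
        (title, q_ranking[title], ranking[title])
        for title, _, _ in rows
        if q_ranking[title] <= n or ranking[title] <= n
    ]


result = top_either(questions, 2)
for title, q_rank, all_time_rank in result:
    print(title, q_rank, all_time_rank)
```

In this toy set, "question C" is first for the quarter but last of all time, which is exactly the kind of question an all-time ranking hides.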

Finding the number of pageviews for each question through time

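CREATE OR REPLACE TABLE `stackoverflow_archive_questions.merged`
AS
SELECT
  -- Views gained since the previous snapshot of the same question;
  -- the first snapshot has no predecessor, so it keeps its full view_count
  IFNULL(
    view_count -
    LAG(view_count) OVER(PARTITION BY id ORDER BY view_count)
    , view_count) quarter_views, *
FROM (
  -- One archived table per snapshot; the table suffix encodes its year and month
  SELECT PARSE_DATE('%Y%m',_table_suffix) quarter
    , id, view_count
    , SPLIT(tags, '|') tags
    , score, creation_date, answer_count
    , accepted_answer_id, title
  FROM `fh-bigquery.stackoverflow_archive_questions.q*`
)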
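
If the LAG() step feels opaque, this small Python sketch does the same subtraction on invented snapshots: for each question, the views for a quarter are the cumulative count minus the previous snapshot's count, and the first snapshot keeps its full count. One assumption to flag: the sketch orders snapshots by quarter, while the query above orders by view_count; the two agree as long as a question's view count never decreases. The function name `add_quarter_views` and the data are hypothetical.

```python
from collections import defaultdict

# Toy snapshots: (quarter, question_id, cumulative view_count). Invented values.
snapshots = [
    ("2018-06-01", 1, 1000),
    ("2018-09-01", 1, 1400),
    ("2018-12-01", 1, 2100),
    ("2018-09-01", 2, 50),
    ("2018-12-01", 2, 80),
]


def add_quarter_views(rows):
    """Return (quarter, id, view_count, quarter_views) for every snapshot."""
    history_by_id = defaultdict(list)
    for quarter, question_id, views in rows:
        history_by_id[question_id].append((quarter, views))

    result = []
    for question_id, history in history_by_id.items():
        history.sort()  # ISO dates sort chronologically as strings
        previous = None
        for quarter, views in history:
            # No earlier snapshot: keep the full count, like IFNULL(..., view_count)
            quarter_views = views if previous is None else views - previous
            result.append((quarter, question_id, views, quarter_views))
            previous = views
    return result


for row in add_quarter_views(snapshots):
    print(row)
```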

Top questions per tag

#standardSQL
SELECT title, quarter_views, view_count
FROM `fh-bigquery.stackoverflow_archive_questions.merged`
WHERE 'google-cloud-dataflow' IN UNNEST(tags)
  AND quarter='2018-12-01'
ORDER BY quarter_views DESC
LIMIT 10

Top current questions for Dataflow

Top current questions that haven’t been updated in more than a year

#standardSQL
WITH top_questions AS (
  SELECT id, title, quarter_views, view_count
  FROM `fh-bigquery.stackoverflow_archive_questions.merged`
  WHERE 'google-cloud-dataflow' IN UNNEST(tags)
    AND quarter='2018-12-01'
), latest_answer AS (
  -- Most recent edit (or activity, or creation) date across each question's answers
  SELECT parent_id
    , DATE(MAX(COALESCE(last_edit_date, last_activity_date, creation_date))) answer_last_edit_date
  FROM `bigquery-public-data.stackoverflow.posts_answers` b
  GROUP BY parent_id
)
SELECT SUBSTR(title, 1, 80) title, quarter_views, view_count, answer_last_edit_date
FROM top_questions a
JOIN latest_answer b
ON a.id=b.parent_id
WHERE DATE_DIFF(CURRENT_DATE(), answer_last_edit_date, DAY)>360
ORDER BY quarter_views DESC
LIMIT 10

Top current questions for Dataflow, that haven’t been updated in a year
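
The staleness filter boils down to three steps: find the latest date across each question's answers, join that back to the questions, and keep only those whose latest date is more than 360 days old. Here is a rough Python sketch of that logic on invented rows. The function name `stale_top_questions`, the fixed `today`, and all data are assumptions for illustration only.

```python
from datetime import date

# Toy questions: (id, title, quarter_views). Invented values.
questions = [
    (1, "question A", 500),
    (2, "question B", 300),
    (3, "question C", 900),
    (4, "question D", 100),  # has no answers, so the join drops it
]

# Toy answers: (parent_id, last_edit_date, last_activity_date, creation_date)
answers = [
    (1, None, date(2017, 5, 1), date(2016, 1, 1)),
    (1, date(2017, 11, 20), date(2017, 11, 20), date(2016, 2, 1)),
    (2, date(2018, 10, 1), date(2018, 10, 1), date(2018, 9, 1)),
    (3, None, None, date(2015, 3, 3)),
]


def stale_top_questions(questions, answers, today, min_days=360, limit=10):
    """Top questions by quarter_views whose answers were all last touched
    more than min_days before today."""
    latest = {}
    for parent_id, last_edit, last_activity, created in answers:
        # First non-missing date, like COALESCE(...)
        touched = last_edit or last_activity or created
        if parent_id not in latest or touched > latest[parent_id]:
            latest[parent_id] = touched

    rows = [
        (title, quarter_views, latest[question_id])
        for question_id, title, quarter_views in questions
        if question_id in latest  # inner join: unanswered questions drop out
        and (today - latest[question_id]).days > min_days
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


stale = stale_top_questions(questions, answers, today=date(2019, 1, 15))
for row in stale:
    print(row)
```

Note that, as in the query, a question with no answers at all never shows up, because the join needs at least one answer row.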

Go beyond

Next steps
