
BigQuery Top Queries by Duration

Long-running queries tie up resources, delay downstream pipelines, and frustrate users waiting for results. This query identifies the longest-running queries in your project with their execution time, cost, and a runtime-to-bytes ratio that helps distinguish legitimate large scans from poorly optimized queries.

Why This Matters

Execution time matters beyond just cost. A query that runs for 30 minutes blocks a data pipeline, delays dashboard refreshes, or causes an analyst to context-switch. Under Editions pricing, long-running queries also consume more slot-hours (minimum 1-minute billing). Reducing query duration improves both cost and productivity.
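To put a number on the slot-hours point, the slot-milliseconds recorded in INFORMATION_SCHEMA can be converted to an approximate Editions cost. This is a hedged sketch: the 0.04 USD per slot-hour rate below is an assumed illustrative figure, not your contract price, and `your-project` / `region-us` are placeholders.

```sql
-- Rough Editions cost estimate from slot time (illustrative only).
-- 0.04 USD/slot-hour is an assumed example rate; substitute your
-- edition's actual price and your region.
SELECT
  job_id,
  total_slot_ms / (1000 * 60 * 60) AS slot_hours,
  ROUND(total_slot_ms / (1000 * 60 * 60) * 0.04, 4) AS approx_cost_usd
FROM `your-project`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND job_type = 'QUERY' AND state = 'DONE';
```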

How It Works

The query calculates execution time as the TIMESTAMP_DIFF between end_time and start_time. It also computes a seconds-per-GiB ratio: high values indicate queries that are slow relative to the amount of data they process, suggesting optimization potential.
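The two calculations can be shown in isolation on a hypothetical job (a 120-second query over 10 GiB), before they appear inside the full query below:

```sql
-- Duration and seconds-per-GiB on a made-up job:
-- 120 s over 10 GiB gives sec_per_gib = 12.
WITH example_job AS (
  SELECT
    TIMESTAMP '2024-01-01 00:00:00' AS start_time,
    TIMESTAMP '2024-01-01 00:02:00' AS end_time,
    10 * POW(1024, 3) AS total_bytes_billed
)
SELECT
  TIMESTAMP_DIFF(end_time, start_time, SECOND) AS duration_sec,  -- 120
  SAFE_DIVIDE(
    TIMESTAMP_DIFF(end_time, start_time, SECOND),
    total_bytes_billed / POW(1024, 3)   -- bytes -> GiB
  ) AS sec_per_gib                      -- 12.0
FROM example_job;
```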

SQL Query


SQL
-- Slowest queries ranked by wall-clock execution time

DECLARE lookback_days INT64 DEFAULT 7;

WITH jobs AS (
  SELECT
    user_email,
    query,
    project_id,
    start_time,
    end_time,
    TIMESTAMP_DIFF(end_time, start_time, SECOND) AS duration_sec,
    COALESCE(total_bytes_billed, 0) AS bytes_billed,
    total_slot_ms,
    ROW_NUMBER() OVER (PARTITION BY job_id ORDER BY end_time DESC) AS rn
  FROM `your-project`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
  WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL lookback_days DAY)
    AND job_type = 'QUERY' AND state = 'DONE' AND total_slot_ms IS NOT NULL
),
deduplicated AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY query ORDER BY bytes_billed DESC) AS qrn
  FROM jobs WHERE rn = 1
)
SELECT
  user_email,
  query,
  project_id,
  start_time,
  duration_sec,
  ROUND(SAFE_DIVIDE(total_slot_ms, duration_sec * 1000), 0) AS avg_slots,
  ROUND(bytes_billed / POW(1024, 3), 2) AS gib_billed,
  ROUND(SAFE_DIVIDE(duration_sec, bytes_billed / POW(1024, 3)), 2) AS sec_per_gib
FROM deduplicated
WHERE qrn = 1
ORDER BY duration_sec DESC
LIMIT 100;
Replace your-project and region-us with your GCP project and dataset region.

Query Explanation

Two levels of deduplication: first by job_id to remove duplicate job entries, then by query text to show only the most expensive execution of each unique query. The sec_per_gib metric divides execution time by bytes billed — high values mean the query is slow relative to how much data it reads.
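The deduplication pattern itself is easy to see in miniature. The rows below are made up; the point is that `ROW_NUMBER()` partitioned by a key, ordered by a "best row" criterion, lets a `qrn = 1` filter keep exactly one row per key:

```sql
-- Dedup pattern in miniature: keep the most expensive run of each query.
WITH runs AS (
  SELECT 'q1' AS query, 100 AS bytes_billed UNION ALL
  SELECT 'q1', 500 UNION ALL
  SELECT 'q2', 300
)
SELECT query, bytes_billed
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY query ORDER BY bytes_billed DESC) AS qrn
  FROM runs
)
WHERE qrn = 1;
-- Returns q1 with 500 and q2 with 300.
```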

Key Insights

  • Queries over 10 minutes are likely scanning unpartitioned tables or performing expensive shuffles.

  • A high sec_per_gib means the query is compute-bound (complex JOINs, sorts, window functions) rather than I/O-bound.

  • A low sec_per_gib means the query is scanning a lot of data quickly; it could benefit from partition pruning.

  • Long-running queries in pipelines can cascade delays to all downstream dependencies.
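One way to act on these insights is to bucket queries by their ratio. This is a sketch: the cutoffs (2 and 30 seconds per GiB) are illustrative rather than universal, and `ranked_queries` stands in for the result of the main query above.

```sql
-- Bucket queries by sec_per_gib to suggest a likely bottleneck.
-- Thresholds are assumptions; tune them to your workload.
SELECT
  query,
  sec_per_gib,
  CASE
    WHEN sec_per_gib >= 30 THEN 'compute-bound: review JOINs, sorts, windows'
    WHEN sec_per_gib <= 2  THEN 'I/O-bound: check partition and cluster pruning'
    ELSE 'balanced'
  END AS likely_bottleneck
FROM ranked_queries;  -- hypothetical table holding the main query's output
```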

Best Practices

  1. Set query timeout limits to prevent runaway queries from consuming resources indefinitely.

  2. For ETL pipelines, break long-running monolithic queries into smaller, staged transformations.

  3. Review the query execution plan (the Execution details graph in the console, or the queryPlan stages in job statistics) to identify the slowest stages and optimize them specifically.

  4. Consider approximate aggregation functions (APPROX_QUANTILES, APPROX_TOP_COUNT) for exploratory queries.
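As a sketch of the last practice, approximate aggregates trade a small accuracy loss for much cheaper computation on large inputs. The table name below is a placeholder for wherever you keep query history:

```sql
-- Approximate aggregation for exploration: p95 duration and top users.
SELECT
  APPROX_QUANTILES(duration_sec, 100)[OFFSET(95)] AS approx_p95_duration_sec,
  APPROX_TOP_COUNT(user_email, 5) AS top_users
FROM `your-project.dataset.query_history`;  -- placeholder table
```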

Want CloudClerk to find these savings automatically?

Our platform connects to your BigQuery project, runs these analyses automatically, and delivers AI-powered optimisation recommendations — all with your data fully anonymised.
