
BigQuery Top Queries by Cost

This is the single most valuable query for BigQuery cost optimization. It returns every query execution ranked by on-demand cost, showing you exactly which SQL statements are consuming the most resources in your project.

Why This Matters

In most BigQuery projects, the Pareto principle applies: 20% of queries account for 80% of costs. Finding and optimizing those top queries delivers the highest ROI. Without this visibility, teams waste time optimizing queries that barely move the needle.

How It Works

The query reads job metadata from INFORMATION_SCHEMA.JOBS_BY_PROJECT, deduplicates by job_id, and sorts by on-demand cost descending. Cost is estimated from total_bytes_billed at the US on-demand rate of $6.25/TiB. It also calculates the approximate average slot usage for each query.
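The cost arithmetic the query performs can be sketched outside SQL. This is a minimal illustration, assuming the $6.25/TiB on-demand rate used throughout this guide; the byte counts are made up:

```python
# Sketch of the on-demand cost estimate the query computes.
# The rate is the $6.25/TiB assumed in this guide; inputs are illustrative.

TIB = 1024 ** 4                  # bytes per tebibyte
ON_DEMAND_USD_PER_TIB = 6.25

def on_demand_cost_usd(total_bytes_billed: int) -> float:
    """Estimate on-demand cost from bytes billed, mirroring the SQL."""
    return round(total_bytes_billed / TIB * ON_DEMAND_USD_PER_TIB, 2)

# A query that billed 2 TiB costs $12.50 on demand.
print(on_demand_cost_usd(2 * TIB))  # → 12.5
```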

SQL Query

-- Most expensive queries ranked by on-demand cost

DECLARE lookback_days INT64 DEFAULT 14;

WITH jobs AS (
  SELECT
    user_email,
    query,
    job_id,
    project_id,
    start_time,
    end_time,
    COALESCE(total_bytes_billed, 0) AS bytes_billed,
    total_slot_ms,
    ROW_NUMBER() OVER (PARTITION BY job_id ORDER BY end_time DESC) AS rn
  FROM `your-project`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
  WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL lookback_days DAY)
    AND job_type = 'QUERY' AND state = 'DONE' AND total_slot_ms IS NOT NULL
)
SELECT
  user_email,
  query,
  project_id,
  start_time,
  end_time,
  ROUND(bytes_billed / POW(1024, 4), 4) AS tib_scanned,
  ROUND(bytes_billed / POW(1024, 4) * 6.25, 2) AS on_demand_cost_usd,
  ROUND(SAFE_DIVIDE(total_slot_ms,
    TIMESTAMP_DIFF(end_time, start_time, MILLISECOND)), 0) AS avg_slots
FROM jobs
WHERE rn = 1
ORDER BY on_demand_cost_usd DESC;
Replace your-project and region-us with your GCP project and dataset region.

Query Explanation

INFORMATION_SCHEMA.JOBS_BY_PROJECT is filtered to completed query jobs within the lookback window. ROW_NUMBER() deduplicates on job_id. On-demand cost is calculated as total_bytes_billed / 1024^4 * $6.25. Approximate slot count shows computational intensity: total_slot_ms / execution_time_ms.

Key Insights

  • The top 10 most expensive queries typically account for 50-80% of total project costs.

  • Queries appearing multiple times in the top results are candidates for materialization or caching.

  • High-cost queries with low slot counts are scanning too much data — add partition filters or select fewer columns.

  • High-cost queries with high slot counts are compute-bound — simplify JOINs, reduce DISTINCT operations, or use approximate functions.
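The scan-bound vs compute-bound distinction above can be turned into a simple triage heuristic. The thresholds below are hypothetical defaults, not BigQuery guidance:

```python
# Hypothetical triage of expensive queries into scan-bound vs compute-bound,
# following the insights above. The cost and slot thresholds are made-up
# defaults -- tune them to your own project's cost distribution.

def triage(cost_usd: float, avg_slots: float,
           cost_threshold: float = 50.0, slot_threshold: float = 500.0) -> str:
    if cost_usd < cost_threshold:
        return "low-cost: deprioritize"
    if avg_slots < slot_threshold:
        # Expensive but few slots: most time is spent reading bytes.
        return "scan-bound: add partition filters, prune columns"
    # Expensive and slot-hungry: the computation itself is heavy.
    return "compute-bound: simplify JOINs, reduce DISTINCT, use approximations"

print(triage(cost_usd=120.0, avg_slots=40))   # scan-bound
print(triage(cost_usd=120.0, avg_slots=900))  # compute-bound
```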

Best Practices

  1. Review the top 10 queries weekly and create optimization tickets for each one.

  2. Add partition filters to eliminate full table scans on time-partitioned tables.

  3. Replace SELECT * with explicit column lists to reduce bytes scanned.

  4. Use BigQuery's query plan explanation to identify stages that process the most data.
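The payoff from column pruning (practice 3) is easy to estimate from the same $6.25/TiB rate. This sketch uses made-up byte counts and run frequency:

```python
# Illustrative estimate of monthly savings from replacing SELECT * with an
# explicit column list. Byte counts and run frequency are hypothetical.

TIB = 1024 ** 4
RATE_USD_PER_TIB = 6.25

def monthly_savings(bytes_before: int, bytes_after: int,
                    runs_per_month: int) -> float:
    """Dollars saved per month from scanning fewer bytes per run."""
    saved_tib = (bytes_before - bytes_after) / TIB
    return round(saved_tib * RATE_USD_PER_TIB * runs_per_month, 2)

# A query scanning 1 TiB that drops to 0.2 TiB after pruning,
# run 100 times a month:
print(monthly_savings(TIB, int(0.2 * TIB), 100))  # → 500.0
```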

Want CloudClerk to find these savings automatically?

Our platform connects to your BigQuery project, runs these analyses automatically, and delivers AI-powered optimization recommendations — all with your data fully anonymized.
