Background

As a data analyst on critical projects, I optimize PostgreSQL queries so that end users receive accurate data on time, because even a single second can make a difference in time-sensitive analyses. In resource-constrained environments, cloud computing costs can also limit access to essential tools, slowing decision-making, reducing team transparency, and even threatening a project’s success. That’s why SQL performance tuning matters. Over the past year, I’ve gained substantial experience in query optimization, yet I was surprised by how scarce practical resources were when I was actively tackling these challenges. Many nuances were obscure, and some even eluded more senior colleagues.

One tool that has become indispensable in my tuning toolkit is EXPLAIN (ANALYZE, BUFFERS), which surfaces critical query performance metrics. This guide walks you through identifying common areas for query optimization, reading and interpreting EXPLAIN (ANALYZE, BUFFERS) output, and measuring improvements effectively.
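
For example, prefixing a query with EXPLAIN (ANALYZE, BUFFERS) executes it and returns the chosen plan annotated with actual timings and buffer counts. The orders table below is hypothetical; because ANALYZE really runs the statement, wrap anything that modifies data in a transaction you can roll back.

-- ANALYZE executes the query for real; BUFFERS adds shared/temp buffer counts per plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.customer_id, count(*)
FROM orders o
WHERE o.created_at >= '2024-01-01'
GROUP BY o.customer_id;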

The Query Optimization Workflow

Query optimization is an iterative process. Here’s the workflow I follow:

  1. Baseline measurement: Run EXPLAIN (ANALYZE, BUFFERS) on the original query 3-5 times to establish baseline performance
  2. Identify bottlenecks: Analyze the output to find slow operations, memory spills, or excessive disk I/O
  3. Apply optimization techniques: Implement targeted improvements (see common techniques below)
  4. Re-measure: Run EXPLAIN (ANALYZE, BUFFERS) again to verify the impact
  5. Iterate: Repeat steps 2-4 until the query meets performance requirements

Common Optimization Techniques

Based on your query plan analysis, consider these approaches:

Indexing

  • Add indexes on columns used in WHERE, JOIN, and ORDER BY clauses (see the sketch after this list)
  • Use composite indexes for multi-column conditions
  • Consider hash indexes for equality comparisons on large tables
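
As a sketch of the indexing bullets above (table and column names are hypothetical), the single-column, composite, and hash variants look like this:

-- Single-column B-tree index for a common WHERE / JOIN column
CREATE INDEX idx_orders_customer_id ON orders (customer_id);

-- Composite index for queries filtering on status and sorting by created_at
CREATE INDEX idx_orders_status_created_at ON orders (status, created_at);

-- Hash index for pure equality lookups (crash-safe since PostgreSQL 10)
CREATE INDEX idx_orders_external_ref ON orders USING hash (external_ref);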

Query rewriting

  • Replace NOT IN with LEFT JOIN and IS NULL checks for better performance with large exclusion lists (see the rewrite sketch after this list)
  • Avoid wrapping indexed columns in functions in WHERE clauses (e.g., lower(email_column) = 'x'), as this prevents the planner from using a plain index on that column
  • Use EXISTS instead of IN for subqueries when appropriate
  • Use COUNT(*) instead of COUNT(DISTINCT column) when the column is a primary key; the counts are identical and COUNT(*) avoids the deduplication step
  • Remove unnecessary OUTER JOINs—use INNER JOIN when you don’t need unmatched rows
  • Avoid calculated fields in JOIN and WHERE clauses (e.g., WHERE EXTRACT(YEAR FROM date_column) = 2024); rewrite them as range filters such as date_column >= '2024-01-01' AND date_column < '2025-01-01'
  • Limit your working data set—filter early and often to reduce rows processed
  • Select only the columns you need rather than using SELECT *
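
A minimal sketch of the NOT IN rewrite mentioned above (the excluded_companies table is hypothetical):

-- Original: NOT IN with a large exclusion subquery
SELECT c.id
FROM companies c
WHERE c.id NOT IN (SELECT e.company_id FROM excluded_companies e);

-- Rewrite: the same anti-join expressed as LEFT JOIN ... IS NULL
SELECT c.id
FROM companies c
LEFT JOIN excluded_companies e ON e.company_id = c.id
WHERE e.company_id IS NULL;

Note that the two forms differ when the subquery can return NULLs: NOT IN yields no rows in that case, so verify the semantics before rewriting.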

Resource configuration

  • Increase work_mem if you see “Batches > 1” in hash operations (memory spilling)
  • Adjust shared_buffers to improve cache hit rates
  • Update table statistics with ANALYZE to help the query planner (see the sketch after this list)
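
A minimal sketch of these knobs (the values are illustrative, not recommendations):

-- Raise work_mem for the current session only, before running a heavy query
SET work_mem = '256MB';

-- Refresh planner statistics for one table
ANALYZE companies;

-- shared_buffers is a server-wide setting; inspect it here, but changing it
-- requires editing postgresql.conf and restarting the server
SHOW shared_buffers;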

Data structure improvements

  • Implement table partitioning for very large tables (see the sketch after this list)
  • Consider physically ordering frequently-joined tables with CLUSTER on a relevant index
  • Remove unnecessary columns from SELECT when aggregating millions of rows
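
For the partitioning point, a minimal range-partitioned table might look like this (the names and the monthly scheme are assumptions):

CREATE TABLE events (
    id         bigint      NOT NULL,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');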

The key is to make one change at a time and measure its impact with EXPLAIN (ANALYZE, BUFFERS) before proceeding to the next optimization.

Example Output

-- Source - https://stackoverflow.com/q/79846354
-- Posted by Oliver S, modified by community. See post 'Timeline' for change history
-- Retrieved 2025-12-21, License - CC BY-SA 4.0
-- Table and column names modified for educational purposes
Aggregate  (cost=80.48..80.49 rows=1 width=8) (actual time=0.530..0.532 rows=1 loops=1)
  Buffers: shared hit=22
  ->  Hash Join  (cost=70.41..80.05 rows=172 width=8) (actual time=0.395..0.507 rows=110 loops=1)
        Hash Cond: (loc.company_id = comp.id)
        Buffers: shared hit=22
        ->  Hash Join  (cost=14.74..23.88 rows=188 width=16) (actual time=0.098..0.185 rows=187 loops=1)
              Hash Cond: (dev.location_id = loc.id)
              Buffers: shared hit=15
              ->  Seq Scan on devices dev  (cost=0.00..8.64 rows=188 width=16) (actual time=0.009..0.062 rows=187 loops=1)
                    Filter: ((name)::text !~~ '%test%'::text)
                    Rows Removed by Filter: 24
                    Buffers: shared hit=6
              ->  Hash  (cost=11.55..11.55 rows=255 width=16) (actual time=0.084..0.084 rows=267 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 21kB
                    Buffers: shared hit=9
                    ->  Seq Scan on locations loc  (cost=0.00..11.55 rows=255 width=16) (actual time=0.003..0.054 rows=267 loops=1)
                          Buffers: shared hit=9
        ->  Hash  (cost=49.05..49.05 rows=530 width=8) (actual time=0.293..0.293 rows=542 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 30kB
              Buffers: shared hit=7
              ->  Seq Scan on companies comp  (cost=0.00..49.05 rows=530 width=8) (actual time=0.004..0.237 rows=542 loops=1)
                    Filter: (id <> ALL ('{161,162,163,...}'::bigint[]))
                    Rows Removed by Filter: 50
                    Buffers: shared hit=7
Planning:
  Buffers: shared hit=29
Planning Time: 0.393 ms
Execution Time: 0.563 ms

Understanding Performance Variation

Before diving into metrics, recognize that query runtimes naturally fluctuate between executions. Minor differences often reflect normal variation rather than meaningful changes.

Rule of thumb: Consider an optimization meaningful when runtime consistently improves by 20-30% across multiple executions (3-5 runs), or when variance tightens noticeably. For example:

  • 10s → 9.4s: Likely noise
  • 10s → 7s (repeatedly): Real improvement
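
If the pg_stat_statements extension is available, it can quantify this run-to-run variance for you. The column names below are the PostgreSQL 13+ ones, and the filter pattern is only an illustration:

SELECT calls,
       round(mean_exec_time::numeric, 2)   AS mean_ms,
       round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE query ILIKE '%FROM devices%'
ORDER BY calls DESC
LIMIT 5;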

Key Metrics to Track

1. Execution Time

Found at the bottom of the output:

Execution Time: 0.563 ms

This is your primary performance indicator—the total wall-clock time for query execution.

2. Memory Usage

Memory appears in hash nodes and aggregation operations:

Batches: 1  Memory Usage: 30kB

How to interpret:

  • Focus on the largest Memory Usage among all hash nodes
  • This determines whether the query fits in work_mem without spilling to disk
  • Memory is shown per plan node as peak usage, not cumulative
  • Smaller nodes are usually negligible
  • Watch for Batches > 1, which indicates memory spilling (a performance penalty); a quick way to reproduce this follows the list
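
One way to see spilling in action is to shrink work_mem for the current session and re-run a hash-heavy query against a reasonably large table (the tables here are hypothetical, and small datasets may still fit even at the 64kB minimum):

SET work_mem = '64kB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT o.customer_id, count(*)
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY o.customer_id;

-- On a sufficiently large table the Hash / HashAggregate nodes now report
-- "Batches" greater than 1 (or temp buffers), signalling a spill to disk
RESET work_mem;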

3. Buffer Activity (I/O)

Buffers track data access patterns:

Buffers: shared hit=22

PostgreSQL tracks how queries access data in memory and on disk using buffers. The Buffers: line in EXPLAIN (ANALYZE, BUFFERS) tells you where the data came from and how much I/O occurred.

Key buffer types:

  • shared hit: The data was already in PostgreSQL’s shared memory cache. This is very fast because the query didn’t need to touch the disk.
  • shared read: The data had to be read from disk because it wasn’t in memory. Disk reads are much slower than memory hits.
  • temp read/written: PostgreSQL had to create temporary files on disk because the query exceeded work_mem. This is usually a sign of heavy sorting, hashing, or joins that didn’t fit in memory, resulting in slower performance.

How to interpret:

  • High “shared hit” with zero “shared read” means all data came from cache (ideal)
  • The top node’s buffer count covers the entire query
  • More reads = more disk I/O = slower performance. If shared read or temp read/written counts are high, your query is hitting the disk often; that is a signal you may need to increase work_mem, optimize indexes, or reduce the data scanned (see the query after this list)
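
Beyond a single plan, the pg_stat_database view exposes the same hit-versus-read split server-wide; this query computes a rough cache hit ratio for the current database:

SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 3) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();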

4. Actual Time vs Cost

Each node shows both estimated cost and actual timing:

HashAggregate  (cost=49.84..51.77 rows=193 width=45) 
               (actual time=0.493..0.504 rows=45 loops=1)

The planner is PostgreSQL’s internal optimizer that decides how to run your query. In this example, the planner thought it would output 193 rows, but the query actually produced 45. Large differences here often indicate outdated statistics or a non-representative data distribution, which can lead to suboptimal plans (see the sketch after the list below). Small differences between the estimates and the actual figures mean the planner is accurate and the query is likely well-optimized.

  • cost: Planner’s estimate (unitless)
  • actual time: Real wall-clock milliseconds (start..end)
  • rows: Estimated vs actual row counts
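
When estimated and actual row counts diverge badly, refreshing statistics is usually the first step. The statistics-target bump below is optional, and the column name is hypothetical:

-- Recompute planner statistics for the table whose estimates are off
ANALYZE companies;

-- Optionally gather finer-grained statistics on a skewed column, then re-analyze
ALTER TABLE companies ALTER COLUMN country SET STATISTICS 500;
ANALYZE companies;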

Common Misconceptions

CPU Time: The “actual time” field shows wall-clock time, not CPU time. True CPU metrics require pg_stat_statements or OS-level monitoring tools, which may require elevated privileges.

Planning vs Execution: Don’t overlook planning overhead:

Planning Time: 0.393 ms

For frequently-executed queries, consider prepared statements to amortize planning cost.
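
A minimal sketch of a prepared statement against the example tables (the parameter list is hypothetical):

PREPARE device_count (bigint[]) AS
SELECT count(*)
FROM devices dev
JOIN locations loc  ON loc.id  = dev.location_id
JOIN companies comp ON comp.id = loc.company_id
WHERE dev.name NOT LIKE '%test%'
  AND comp.id <> ALL ($1);

EXECUTE device_count('{161,162,163}');

After several executions PostgreSQL may switch to a cached generic plan, which is exactly how the planning cost gets amortized.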

Practical Example Analysis

Given this output summary:

Execution Time: 0.563 ms
Memory Usage: 30 kB (largest node)
Buffers: shared hit=22 (all cache hits, no disk reads)

Interpretation:

  • Query is very fast (sub-millisecond)
  • Memory footprint is small, no spilling concerns
  • All data served from cache—excellent I/O profile
  • Minimal optimization needed unless running millions of times

Optimization Workflow Recap

  1. Baseline: Run query 3-5 times, record metrics
  2. Identify bottleneck:
    • Slow node with high actual time?
    • Memory spilling (Batches > 1)?
    • Excessive disk reads (shared read)?
  3. Apply change: Add index, increase work_mem, rewrite query
  4. Measure: Run 3-5 times again
  5. Validate: Improvement ≥20-30%? Variance tighter?

Quick Reference

  • Execution Time: look at the Execution Time line in the summary (bottom of output, includes triggers). Good: under 10 ms for simple queries. Warning: seconds or more for small datasets.
  • Memory: look at Memory Usage in hash / aggregate nodes and check Batches. Good: fits in work_mem. Warning: Batches > 1 (spills to disk).
  • Buffers / I/O: top-most node for the query total, child nodes for the detailed view. Good: mostly shared hit. Warning: high shared read or temp read/written (disk I/O).
  • Row estimates: rows in each node (estimated vs actual). Good: within ~2×. Warning: off by 10× or more (may indicate outdated statistics or skewed data).

Remember: Focus on the metrics that matter for your specific workload. A query that runs once per hour can tolerate higher latency than one executing thousands of times per second.

I’m not an expert in this area, so I welcome corrections and insights from readers or advanced users. When interpreting PostgreSQL’s EXPLAIN (ANALYZE, BUFFERS) output, I encountered differing opinions from SQL practitioners: some favored using the summary for total elapsed time, others suggested focusing on top nodes; memory could be reported as the largest hash node usage or roughly summed across nodes; and I/O is best viewed at the top node but may omit triggers. I followed the broadest consensus I could find, acknowledging that exact measurement can be tricky.