Subqueries and CTEsLesson 5.2

Correlated subqueries - what they are and when they are slow

correlated subquery, outer query reference, EXISTS, NOT EXISTS, performance implications, when to use vs rewrite

Correlated Subqueries Reference the Outer Query

A regular subquery runs once. A correlated subquery references a column from the outer query, so it runs once per outer row. This makes them powerful but potentially slow on large tables.

-- For each order, find orders with higher amount by the same customer
SELECT o1.id, o1.amount
FROM orders o1
WHERE o1.amount > (
  SELECT AVG(o2.amount)
  FROM orders o2
  WHERE o2.customer_id = o1.customer_id  -- references outer query
);

EXISTS Is the Best Use of Correlated Subqueries

EXISTS stops as soon as it finds one matching row - it doesn't build the full result set. Use it to check for the presence of related rows:

-- Customers who have at least one completed order
SELECT c.id, c.name
FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_id = c.id
  AND o.status = 'completed'
);

The SELECT 1 inside EXISTS is conventional - EXISTS only cares whether rows exist, not what columns are returned. NOT EXISTS is also safe with NULLs, unlike NOT IN.