Denormalization: when to break the rules deliberately
denormalization definition, read vs write tradeoff, reporting tables, materialized views, redundant columns for performance, when not to denormalize
When 3NF Slows You Down
Normalization optimizes for write correctness. Denormalization trades write safety for read performance. You denormalize deliberately, after profiling proves a query is too slow, not as a default approach.
Practical Scenarios
Pre-aggregated totals: An orders table stores an order_total column even though it could be derived by summing order_items. The aggregate query runs millions of times per day; storing it once is cheaper.
Flattened reporting tables: An analytics schema might duplicate customer name and region into every fact row to avoid joining a 200-million-row fact table to a dimension table on every dashboard load.
Rules for Safe Denormalization
-- Redundant computed column (denormalized)
ALTER TABLE orders
ADD COLUMN item_count INT NOT NULL DEFAULT 0;
-- Keep it consistent with a trigger or application logic
-- Document the redundancy explicitly in a comment
COMMENT ON COLUMN orders.item_count IS
'Denormalized: mirrors COUNT(*) from order_items. Updated by app layer.';- Always document why the redundancy exists.
- Enforce consistency through triggers or application-layer logic.
- Never denormalize as a first step โ only after measuring.
