What is database normalization and why it matters
normalization definition, update anomaly, insert anomaly, delete anomaly, normal form levels overview, denormalization tradeoff
What Normalization Solves
Normalization is the process of restructuring a relational schema to reduce data redundancy and eliminate update anomalies. It works by decomposing wide tables into smaller, focused tables and connecting them with foreign keys.
The Three Anomalies
Without normalization, a flat table suffers from three problems:
- Update anomaly: A customer's address is stored in every order row. Change it in one row; forget the others. Now you have contradictory data.
- Insert anomaly: You cannot record a new supplier until they have at least one product in the table, because the supplier data lives in the product row.
- Delete anomaly: Deleting the last order from a customer destroys the only record of that customer's contact details.
Normal Forms
Each normal form is a rule that, when followed, eliminates a specific class of redundancy:
- 1NF โ atomic values, no repeating groups
- 2NF โ no partial dependencies on a composite key
- 3NF โ no transitive dependencies
BCNF and 4NF exist for edge cases in advanced schemas. For 95% of production databases, 3NF is the right target. Denormalization (intentionally violating normal forms for read performance) is a later, deliberate optimization โ not a design starting point.
