Font Awesome Free 5.13.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free (Icons: CC BY 4.0, Fonts: SIL OFL 1.1, Code: MIT License)
Skip to main content

Database normalization

An overview of database normalization for relational databases

Basic idea

Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity.

Basic idea: avoid the existence of redundant data in your database because it is a source of possible inconsistencies.

The way to get rid of this redundant data is generally to introduce new tables.

Example: Table with of items with item id, description, manufacturer name and manufacturer country. Storing the manufacturer’s country in every row is a form of redundant data: if we know the manufacturer, we also know the country. And what if we change the manufacturer for a row but forget to change the country? The solution here is to create a separate table for manufacturers, with their country, and refer to that table from the items table.

Normal forms

Normal forms = formally defined "levels of database normalization"

Normal forms from least normalized to most normalized (see below for ones in bold)

1NF (first normal form)

Meaning: values in columns are atomic (each cell has only a single value)

Example:

Book table

TitleSubject
Beginning MySQL Database Design and OptimizationMySQL, Database, Design

Problem: Subject has multiple values in a single cell

Solution: make Subject into its own table

Book table

Title
Beginning MySQL Database Design and Optimization

Subject table

IDName
1MySQL
2Database
3Design

Book - Subject table (many-to-many relationship, needed because book can have multiple subjects and multiple books can share the same subject)

TitleSubject ID
Beginning MySQL Database Design and Optimization1
Beginning MySQL Database Design and Optimization2
Beginning MySQL Database Design and Optimization3

2NF (second normal form)

Meaning: 1NF + no partial dependencies (values depend on the whole of every candidate key)

Candidate key: minimal set of columns whose values uniquely identify a single row in the table

Example:

Book table

TitleFormatAuthorPrice
Beginning MySQL Database Design and OptimizationHardcoverChad Russell49.99
Beginning MySQL Database Design and OptimizationE-bookChad Russell22.34
The Relational Model for Database Management: Version 2E-bookE.F.Codd13.88
The Relational Model for Database Management: Version 2PaperbackE.F.Codd22.34

This table has only one candidate key ({ Title, Format })

Problem: only price depends on the entire candidate key, author does not

Solution: extract formats and prices into different table

Book table

TitleAuthor
Beginning MySQL Database Design and OptimizationChad Russell
The Relational Model for Database Management: Version 2E.F.Codd

Format - Price table

TitleFormatPrice
Beginning MySQL Database Design and OptimizationHardcover49.99
Beginning MySQL Database Design and OptimizationE-book22.34
The Relational Model for Database Management: Version 2E-book13.88
The Relational Model for Database Management: Version 2Paperback39.99

3NF (third normal form)

Meaning: 2NF + no transitive dependencies (see example below)

Note: Database is often considered "normalized" if it meets 3NF

Example:

Book table

TitleAuthorGenre IDGenre Name
Beginning MySQL Database Design and OptimizationChad Russell1Tutorial
The Relational Model for Database Management: Version 2E.F.Codd2Popular science

Problem: Genre ID and Genre Name both depend on the primary key { Title } but are not independent of each other. Dependency Title -> Genre Name can be deduced from Title -> Genre ID and Genre ID -> Genre Name (this means we have a transitive dependency)

Solution: separate Genre table

Book table

TitleAuthorGenre ID
Beginning MySQL Database Design and OptimizationChad Russell1
The Relational Model for Database Management: Version 2E.F.Codd2

Genre table

Genre IDGenre Name
1Tutorial
2Popular science

Resources