Normalize a database


The goal of normalization is to remove these dangerous features from the database:

  • Duplication of data.
  • Inconsistent data.
  • Ambiguous data.

Normalization starts as a mechanical process where you just follow the rules but the final design will need to be adjusted by an intelligent hand.

Stages

There are six stages to Normalization but the first three - 1st, 2nd, and 3rd Normal Form - are adequate for most situations. The other stages - Boyce-Codd, 4th and 5th - are more relevant to academic study.

1st Normal Form (1NF)
There is a Primary Key that uniquely identifies each record.
There are no repeating fields - you don't have a series of fields named 'Item1', 'Item2', 'Item3', etc.
Remove any such fields into a separate table and include a foreign key to refer back to the parent table.
2nd Normal Form (2NF)
The table is in 1st Normal Form and all fields depend on the entire primary key. This happens automatically if you are using a simple primary key.
If you are using a composite key made up of two or more fields then you have to check that no other field relies on just one of them.
Remove any such fields into a separate table, together with that part of the primary key on which they depend.
3rd Normal Form (3NF)
The table in 2nd Normal Form and no fields depend on anything but the primary key.
Remove any such fields to another table, leaving the key in the original table.

Result of normalization

You start the process with all your information in a single flat table and normalization gives you a number of smaller tables that hold the same information but which hold it in an efficient and unambiguous way. The process so far has been mechanical - in fact Access has a Wizard to do it for you - but the next three steps need some intelligence.

One-to-one relations

Normalization will never generate two tables in a one-to-one relationship. There is no theoretical reason to separate fields like this but you might want to split a table into two for practical reasons such as:

  • to reduce the number of fields in a table and meet some limit in the programming language.
  • to separate rarely-used fields so that you do not waste time transferring and processing them in routine operations.
  • to separate sensitive information from commonly-used fields so that the tables can be stored and backed-up separately.

Many-to-many relations

A many-to-many relationship cannot be implemented directly and has to be represented by an intermediate table that shows the many links between the two tables. Sometimes this intermediate table will just hold these two foreign keys, for example the student ID and the course code. Sometimes more information will be held, for example the mark that this student achieved on this course.

Normalization will tell you that Students and Courses should be separated from the original Enrolments table but normalization will not tell you how best to implement the link between these tables.

Denormalization

Normalized tables store data efficiently as safely without duplication but you may need to denormalize for the following reasons:

  • to make reporting easier for non-technical staff.
  • to create a Data Warehouse.
  • to improve the speed of execution.
  • to reduce the length of an SQL statement and comply with language restrictions.

Remember that the theory of normalization was developed in a time when a megabyte of disk storage space cost over £100. Prices now are less than a tenth of a penny a megabyte and there is no longer the pressure for extreme efficiency. Developer time is more expensive than storage space.

As an extreme example, you could normalize an address table into county, town, district, street, house name, and house number tables. This would save disk space but a simple address list would require an awkward and inefficient six-table query. This normalization is correct in theory but impractical.


Hints & tips

The textbox class in Visual FoxPro 9 has a new Autocomplete property which shows the user the previous values that have been entered in that textbox.
Autocomplete in VFP 9

Your Access database will look more impressive if you add custom toolbars...
Custom toolbars

FoxPro has always had functions to read and write files at a low level...
Foxpro low level file functions

More...
More pages of hints and tips for users of Microsoft FoxPro and Access databases.

Site Map