Normalization is the process of converting a database design into a standard format. The conversion is done into a normal form - which is the standard of designing a database.
There are 3 common normal forms: First Normal Form (1NF), Second Normal Form (2NF) and Third Normal Form (3NF). There are several other normal forms that follow after the 3NF.
The first normal form (1NF) is the foundation of database normalization.
Atomic Values: Each cell in a table should contain a single, indivisible value. This means no lists, multiple values separated by commas, or nested tables within a cell.
No Repeating Groups: A table shouldn't have repeating groups of columns to store related data. This avoids data redundancy and makes it difficult to manage and update the data.
Unique Rows: Each row in a table should be uniquely identifiable. This usually requires having a primary key, which is a column or a set of columns whose values uniquely identify each row in the table.
Example: Name is not a unique key because it can be the same for several people. It doesn't uniquely identify a row - a set of information - a person inside a table.
688 x 281, 13.9 KB, PNG
Each table should have a primary key that uniquely identify a record/row in a table
A Primary Key can be a combination of two or more other fields if when combined, can uniquely identify a row in a table. In this case, NO additional field for Primary Key need to be added.
For example: a combination of Date of Birth and Phone Number could be unique, so it can be a potential candidate for PK. However, when considering the immutability of the PK, we may not choose them , see below.
When choosing a Primary Key, consider the followings:
Uniqueness: This is the absolute requirement; no two rows can have the same primary key.
Non-null values: The PK should never contain null values.
Immutability: The values in the PK should ideally never change. Updates to the PK can cause issues with Foreign Key (FK) relationship and data integrity. Example: Phone Number should not be a PK even though it may be unique because if user changes their number, other pieces of information that link to this phone number may be affected.
An order can contain many items, and an item also can belong to many orders.
Many-to-many relationship is not ideally good, because it introduces data redundancy and potential for inconsistency. So, it is common to create a joining table that contains the FK of two original tables.
The joining table separate many-to-many relationship to two one-to-many relationships.
3005 x 1405, 109.7 KB, PNG
A joining table contain foreign keys of two connected tables.
This is when a table joins to itself, for example, an employee can have a manager, or a manager can have many employees, but they are in the same table "User".
“Fulfil the requirements of First Normal Form and Each non-key attribute must be functionally dependent on the primary key.
In the Second Normal Form, the 1NF must be met, and then all partial dependency must be eliminated.
Partial dependency occurs when a non-key attribute relies on only a portion of the primary key to determine its value.
3350 x 1205, 152.4 KB, PNG
Course Name (non-key attribute) causes this Enrollment table to violate the 2NF because it partly depends on the primary key (Student ID + Course ID)
To resolve, we could separate those attributes that partly depend on the primary key to a separate table, so they can fully/solely depend on their primary key and put only the ID of the new table as a foreign key in the original tables.
Determine which one has many of the other, then put the Foreign Key in the table of the other's table.
For example, a subject can have a teacher, but a teacher can teach many subjects, in this case, the foreign key should be put inside the subject table.
2805 x 705, 57.4 KB, PNG
In this case, a teacher has many courses, so the Course table would store the Teacher ID as a foreign key.
Just following up on the 2NF, Third Normal Form requires the Second Normal Form to be fulfilled (thus the 1NF), and all transitive dependencies should be eliminated.
This means that, no non-key attribute should be indirectly dependent on the primary key through another non-key attribute.
4740 x 1565, 303.1 KB, PNG
An example of transitive dependency
In this example, it follows the Second Normal Form, because there is no non-key attribute that depends on a portion the primary key. However, it violates the Third Normal Form.
3655 x 705, 80.5 KB, PNG
Split the transitive dependency attributes to a separate table.
To resolve this, we split the table into two, the new one is to store customer data including Customer Name.
Thanks for reading and please let me know if you have any questions or any comments.