最近看见了 database normalization. 之前老师上课没好好讲, 现在看见了就自己再总结一下.

Edgar F. Codd 有三个 definations, 甚至更多???

ZNF - zero normalized from Link to heading

A non-normalized table structure is considered to be zero normal form (ZNF)

1NF - first normalized form Link to heading

Atomicity: cannot be decomposed into smaller pieces by the DBMS (excluding certain special functions)

  • no repeating groups
  • Identify each set of related data with a primary key
  • Create a separate table for each set of related data.

compound/multiple primary keys doesn’t voilate 1NF

2NF - second normalized form Link to heading

no partial dependencies

the table is in 1NF with no partial dependencies; that is, fields in a table that are dependent on part of the primary key and all fields are dependent on a single primary key or on all of the fields in a composite primary key.

for example:

CREATE TABLE CUSTOMERS(
   CUST_ID    INT              NOT NULL,
   CUST_NAME VARCHAR (20)      NOT NULL,
   ORDER_ID   INT              NOT NULL,
   ORDER_DETAIL VARCHAR (20)  NOT NULL,
   SALE_DATE  DATETIME,
   PRIMARY KEY (CUST_ID, ORDER_ID)
);

the table is in 1nf form. Noticed that CUST_ID and ORDER_ID, some others related to each of them but not the other one. CUST_NAME 依靠 CUST_ID, ORDER_DETAIL 不依靠 CUST_ID 而依靠 ORDER_ID. 所以, 如果想要 database 符合2nf, 我们需要把一张表分为3张表.

CREATE TABLE CUSTOMERS(
   CUST_ID    INT              NOT NULL,
   CUST_NAME VARCHAR (20)      NOT NULL,
   PRIMARY KEY (CUST_ID)
);

CREATE TABLE ORDERS(
   ORDER_ID   INT              NOT NULL,
   ORDER_DETAIL VARCHAR (20)  NOT NULL,
   PRIMARY KEY (ORDER_ID)
);

CREATE TABLE CUSTMERORDERS(
   CUST_ID    INT              NOT NULL,
   ORDER_ID   INT              NOT NULL,
   SALE_DATE  DATETIME,
   PRIMARY KEY (CUST_ID, ORDER_ID)
);

这样, database 符合 2nf

3NF - third normalization form Link to heading

the table is in 2NF with no transitive dependencies

  • It is in second normal form.
  • All nonprimary fields are dependent on the primary key. (cannot be fields dependent on nonprimary keys)

for example:

CREATE TABLE CUSTOMERS(
   CUST_ID       INT              NOT NULL,
   CUST_NAME     VARCHAR (20)      NOT NULL,
   DOB           DATE,
   STREET        VARCHAR(200),
   CITY          VARCHAR(100),
   STATE         VARCHAR(100),
   ZIP           VARCHAR(12),
   EMAIL_ID      VARCHAR(256),
   PRIMARY KEY (CUST_ID)
);

in this table STREET, CITY, STATE all depends on ZIP. it is transitive dependencies.

break down to two tables.

CREATE TABLE ADDRESS(
   ZIP           VARCHAR(12),
   STREET        VARCHAR(200),
   CITY          VARCHAR(100),
   STATE         VARCHAR(100),
   PRIMARY KEY (ZIP)
);

CREATE TABLE CUSTOMERS(
   CUST_ID       INT              NOT NULL,
   CUST_NAME     VARCHAR (20)      NOT NULL,
   DOB           DATE,
   ZIP           VARCHAR(12),
   EMAIL_ID      VARCHAR(256),
   PRIMARY KEY (CUST_ID)
);

credit to