Wednesday, 9 May 2012

Normalizing for OLTP and OLAP Databases

Its easy to figure out which tables are better for OLAP databases by
just looking at the size and simplicity of your SQL Query.

If the SQL Query has to many joins then for sure you sql engine is
going to be doing a lot of reads/fetches from different tables.

If the SQL Query has simple select statements with just WHERE
clauses that is the one you want to use to design your OLAP tables.



In OLAP repeated attribute values are acceptable but for OLTP
databases, it means to many redundant rows filling up your table.

In OLAP where and when ever there is to much redundant and un-unique
rows its time to start thinking of Normalization as below:



1st Normal Form

  1. Every row should be unique :- Achive this with setting primary
    key.
  2. No Repeating Groups


Supplier



Col Name

Data Type

Allow Nulls

PK

supplierid

int

NO



suppliername varchar(50) NO













Product



Col Name

Data Type

Allow Nulls



productid

int

NO



productname varchar(50) NO



supplierid

int





2nd Normal Form

  1. This has to be in complient with 1st Normarl Form.
  2. There should be no partial key dependencies
  3. All the Attributes must be dependant on the key




Products



Col Name

Data Type

Allow Nulls



productid

int

NO



productname varchar(50) NO



store

varchar(50)

NO



price

money

NO



  1. We will have multiple redundant entries for a similar
    productid,productname and price in different stores combination.
    Hence this is not in 1st Normal Form.
  2. We will need to remove the store and give it a seperate table
    as below:




Products



Col Name

Data Type

Allow Nulls



productid

int

NO



productname varchar(50) NO



price

money

NO





Stores



Col Name

Data Type

Allow Nulls



storeid

int

NO



storename varchar(50) NO



productname

varchar(50)

NO



  1. Hence we have broken it down so that store has its own storeid
    with storename and its productname
  2. In 2nd Normal Form its ok to have productname duplicated in
    two tables as above which is ok till 2nd normal form but doesn't
    conform to 3rd Normal Form

  3. This good in OLAP databases but it causes redundancy hence
    still need to be removed and normalized more for 3rd Normal
    Form.



3rd Normal Form

  1. This has to be in complient with 1st Normarl Form.
  2. There should be no partial key dependencies
  3. In 3rd Normal Form all the attributes must be *solely*
    dependant on the key


Products



Col Name

Data Type

Allow Nulls



productid

int

NO



productname varchar(50) NO



price

money

NO



Stores



Col Name

Data Type

Allow Nulls



storeid

int

NO



storename varchar(50) NO













Inventory



Col Name

Data Type

Allow Nulls



storeid

int

NO



productid char(10) NO











  • The Above tables have been normalized for OLTP databases which
    is good only for OLTP transactions.
  • When coming to OLAP databases the 2nd Normal Form tables are
    better suited with the redundant productname columns, since it
    makes for simpler query statements.

No comments:

Post a Comment