Tuesday, February 4, 2014

And the fact continues..!

So after about 3 weeks I think I am getting a hang of Business Intelligence. I am going to continue from where I left in my last blog. So as far as I understand, it all comes down to identifying grain, facts and dimensions.

       Grain: Identifying the grain pretty much defines the type of data that we are modelling. The grain establishes exactly what a single fact row table represents. The grain have to be identified before capturing the facts and dimensions required for the modelling. Atomic grain refers to the lowest level at which data is captured by the business.

         Facts: Facts are the things we measure as a result of a business process and is almost always a numeric value. Within a fact table, only facts consistent with the declared grain are allowed. It is the fact table that gives us the insight into all the aggregation and reports. The measures stored in the fact table is used for the calculations. Fact table mostly consists of foreign keys.

        Dimensions: it is a collection of reference information about a measurable fact. The data that doesn’t change very often are stored in the dimension table. Dimension table will have one primary key and all the related data.

Star schema typically consists of a fact table and all the related dimension tables. The facts and dimensions are joined together in a star schema. Below is an example of a star schema. Retail sales is the fact table with dimensions product, promotion, clerk, customer, store and date.



Four-step Dimensional Design Process
The major steps involved during the design phase for dimensional modelling are:
  •  Select the business process: Understanding of the business process is required for us to understand the information that we are trying to model.
  • Declare the grain: After understanding the business, we should be able to identify and declare the grain.
  • Identify the dimensions: Model the data which is going to be either constant or going to vary after a long time. This will be stored in the dimension table. Also assign the primary key.
  • Identify the facts: Understand and identify the measurable data which goes as measures and dimensions which needs to be stored as foreign keys in the fact table.
References:
The Data Warehouse Toolkit Third Edition - Ralph Kimball, Margy Ross

No comments:

Post a Comment