So after about 3
weeks I think I am getting a hang of Business Intelligence. I am going to continue from
where I left in my last blog. So as far as I understand, it all comes down to
identifying grain, facts and dimensions.
Grain: Identifying the grain pretty much defines the type of
data that we are modelling. The grain establishes exactly what a single fact
row table represents. The grain have to be identified before capturing the
facts and dimensions required for the modelling. Atomic grain refers to the
lowest level at which data is captured by the business.
Facts: Facts are the things we measure as a result of a
business process and is almost always a numeric value. Within a fact table,
only facts consistent with the declared grain are allowed. It is the fact table
that gives us the insight into all the aggregation and reports. The measures
stored in the fact table is used for the calculations. Fact table mostly
consists of foreign keys.
Dimensions: it is a collection of reference information
about a measurable fact. The data that doesn’t change very often are stored in
the dimension table. Dimension table will have one primary key and all the
related data.
Star schema typically consists of
a fact table and all the related dimension tables. The facts and dimensions are
joined together in a star schema. Below is an example of a star schema. Retail
sales is the fact table with dimensions product, promotion, clerk, customer,
store and date.
Four-step Dimensional Design Process
The major steps involved during
the design phase for dimensional modelling are:
- Select the business process: Understanding of the business process is required for us to understand the information that we are trying to model.
- Declare the grain: After understanding the business, we should be able to identify and declare the grain.
- Identify the dimensions: Model the data which is going to be either constant or going to vary after a long time. This will be stored in the dimension table. Also assign the primary key.
- Identify the facts: Understand and identify the measurable data which goes as measures and dimensions which needs to be stored as foreign keys in the fact table.
References:
The Data Warehouse Toolkit Third Edition - Ralph Kimball, Margy Ross
No comments:
Post a Comment