After having
gone through the basics of dimensional modelling, it is time to get into a bit
of details. It’s hard to sequentially go over all the topics of this course
because the volume is enormous. Hence I am going to pick one topic (somewhat
related to what I spoke about in my last blog) and then go deep into it.
There are
certain dimensions that change over time and they are referred to as Slowly Changing Dimensions. Some of the
dimensions that could change over time are selling prices, names of people,
cities and so on. There are different techniques used to accommodate these
changes into our dimensional model. They are called as different types of
slowly changing dimensions.
Type 0: Retain original
With type 0, the
dimension value never changes and hence the original always remains the same.
An example of a type 0 is date dimension. Hence there is no need to change
anything in the model.
Type 1: Overwrite
In this method,
the new value overwrites the old value. It always reflects the most recent
data. Hence no history is maintained. This method is easy to implement.
For example, if
the state of a customer changes:
Customer ID
|
Name
|
State
|
C001
|
James
|
Tucson
|
After update:
Customer ID
|
Name
|
State
|
C001
|
James
|
Phoenix
|
Type 2: Add New Row
Type 2 requires
us to add an additional row with the updated information. The old information
is retained in the table. This helps us to track the history of the data. Three
new columns will be added to indicate the latest information.
- Row effective date or date timestamp
- Row expiration date or date timestamp
- Current row indicator
Customer ID
|
Name
|
State
|
C001
|
James
|
Tucson
|
After update:
Customer ID
|
Name
|
State
|
Row effective date
|
Row expiration date
|
Current row indicator
|
C001
|
James
|
Tucson
|
2/2/2014
|
2/11/2014
|
0
|
C002
|
James
|
Phoenix
|
2/11/2014
|
1
|
Type 3: Add new
attribute
A new attribute
is added to the dimension to preserve the old attribute values; the new value
overwrites the old value just like in type 1. This is also referred to as
alternate reality and this technique is used infrequently.
Customer ID
|
Name
|
State
|
C001
|
James
|
Tucson
|
After update:
Customer ID
|
Name
|
Original state
|
Current state
|
C001
|
James
|
Tucson
|
Phoenix
|
Kimball talks about another 4 types of slowly changing dimensions in his book.
Type 4: Mini-dimension
When a group of attributes are changed
suddenly and they kind of form another group from the dimension, it is referred
to as mini-dimension. Using this we capture the base dimension and the
mini-dimension. Also called as rapidly changing monster dimension.
Mini-dimension also requires its own primary key.
Type 5: Add
mini-dimension and Type 1 Outrigger
Type 5 builds on
type 4 by embedding a current type 1 reference to the mini-dimension in the base
table. This technique is used to preserve the historical attribute values.
Type 6: Add Type 1 Attributes to Type 2 dimension
Similar to type
5, type 6 builds on the type 2 technique by embedding the current type 1
versions of the same attributes in the dimension row so that fact rows can be
filtered or grouped.
Type 7: Dual Type 1 and
Type 2 Dimensions
This is
the final hybrid technique used to support both history and current attribute
values. A fact table can be accessed from the type 1 dimension model which will
give the current values or from the type 2 dimension model which will give the
historical profiles.
References:
The Data Warehouse Toolkit Third Edition - Ralph Kimball & Margy Ross
http://dwhlaureate.blogspot.com/2012/09/
References:
The Data Warehouse Toolkit Third Edition - Ralph Kimball & Margy Ross
http://dwhlaureate.blogspot.com/2012/09/
comprehensive!! Though in type 2 can you keep the row expiration date empty?
ReplyDeleteConsidering that we do not know when the current data is going to expire, I believe it should be alright to leave it as a N/A value. But good question to think about!
ReplyDeleteI think putting dummy data like 12/31/9999 would be a better option.
ReplyDeleteData warehouses can help you gain valuable insight into your business by integrating multiple data sources. This makes them invaluable tools for business intelligence and data analysis. However, in order to take full advantage of your data warehouse, you'll need to understand thearchitecture of a data warehouse
ReplyDelete