Tuesday, February 11, 2014

Slowly Changing !

After having gone through the basics of dimensional modelling, it is time to get into a bit of details. It’s hard to sequentially go over all the topics of this course because the volume is enormous. Hence I am going to pick one topic (somewhat related to what I spoke about in my last blog) and then go deep into it.

There are certain dimensions that change over time and they are referred to as Slowly Changing Dimensions. Some of the dimensions that could change over time are selling prices, names of people, cities and so on. There are different techniques used to accommodate these changes into our dimensional model. They are called as different types of slowly changing dimensions.

Type 0: Retain original
With type 0, the dimension value never changes and hence the original always remains the same. An example of a type 0 is date dimension. Hence there is no need to change anything in the model.

Type 1: Overwrite
In this method, the new value overwrites the old value. It always reflects the most recent data. Hence no history is maintained. This method is easy to implement.

For example, if the state of a customer changes:

Customer ID
Name
State
C001
James
Tucson

After update:

Customer ID
Name
State
C001
James
Phoenix

Type 2: Add New Row
Type 2 requires us to add an additional row with the updated information. The old information is retained in the table. This helps us to track the history of the data. Three new columns will be added to indicate the latest information.
  • Row effective date or date timestamp
  • Row expiration date or date timestamp
  • Current row indicator

Customer ID
Name
State
C001
James
Tucson

After update:

Customer ID
Name
State
Row effective date
Row expiration date
Current row indicator
C001
James
Tucson
2/2/2014
2/11/2014
0
C002
James
Phoenix
2/11/2014

1


Type 3: Add new attribute
A new attribute is added to the dimension to preserve the old attribute values; the new value overwrites the old value just like in type 1. This is also referred to as alternate reality and this technique is used infrequently.

Customer ID
Name
State
C001
James
Tucson

After update:

Customer ID
Name
Original state
Current state
C001
James
Tucson
Phoenix

Kimball talks about another 4 types of slowly changing dimensions in his book.

Type 4: Mini-dimension
 When a group of attributes are changed suddenly and they kind of form another group from the dimension, it is referred to as mini-dimension. Using this we capture the base dimension and the mini-dimension. Also called as rapidly changing monster dimension. Mini-dimension also requires its own primary key.

Type 5: Add mini-dimension and Type 1 Outrigger
Type 5 builds on type 4 by embedding a current type 1 reference to the mini-dimension in the base table. This technique is used to preserve the historical attribute values.

Type 6:  Add Type 1 Attributes to Type 2 dimension
Similar to type 5, type 6 builds on the type 2 technique by embedding the current type 1 versions of the same attributes in the dimension row so that fact rows can be filtered or grouped.

Type 7: Dual Type 1 and Type 2 Dimensions
This is the final hybrid technique used to support both history and current attribute values. A fact table can be accessed from the type 1 dimension model which will give the current values or from the type 2 dimension model which will give the historical profiles. 

References:

The Data Warehouse Toolkit Third Edition - Ralph Kimball & Margy Ross
http://dwhlaureate.blogspot.com/2012/09/

4 comments:

  1. comprehensive!! Though in type 2 can you keep the row expiration date empty?

    ReplyDelete
  2. Considering that we do not know when the current data is going to expire, I believe it should be alright to leave it as a N/A value. But good question to think about!

    ReplyDelete
  3. I think putting dummy data like 12/31/9999 would be a better option.

    ReplyDelete
  4. Data warehouses can help you gain valuable insight into your business by integrating multiple data sources. This makes them invaluable tools for business intelligence and data analysis. However, in order to take full advantage of your data warehouse, you'll need to understand thearchitecture of a data warehouse

    ReplyDelete