Thursday, September 30, 2010
In SQL Server 2008 and SQL Server 2005, there are three storage modes available for use: Multidimensional OLAP, Relational OLAP and Hybrid OLAP. I have tried to compare merits and demerits of each storage option in this post. With storage option, SQL Server analysis services provides one more useful functionality “Proactive caching” which I will cover in my next blog.
Well now question arises, when each storage modes are useful?
There are few factors that plays important role in deciding storage modes approach-
- Query Response time
- Data Volume (Analysis Server)
- Processing Time
Detailed Data, Aggregation and Metadata are stored on Analysis Server in MOLAP. Analysis Services pulls complete detailed data from relational database on analysis server and then creates aggregations. MOLAP keep data (detailed data and aggregations) on same analysis server so query response time is significantly better than other two approaches. MOLAP has high latency because it remains disconnected from relational DB and connects again for pulling new changed data. If data volume is huge then latency will be higher. We can overcome at certain extent by using incremental processing.
Detailed data and aggregations will reside in relational database and metadata will be on Analysis server. In ROLAP, analysis server sends request to relational database for each query. Among all storage modes Query performance will be lowest in ROLAP (in most of the cases) because each query needs to be addressed by relation DB and these will be needed extra time to pass result over the network to OLAP server. Latency will be low in ROLAP as data and aggregations are on relational database.
HOLAP is middle approach which is somewhere between MOLAP and ROLAP. Detailed data reside in relational database but aggregations and metadata reside in analysis server. If query hits aggregations (query can be addressed by aggregations) then HOLAP query response time is similar to MOLAP but if query needs to be addressed by relational database then Query response time will be similar to ROLAP. Latency is somewhere between ROLAP and MOLAP.
If relational database size in few terabytes (<=5TB), enough hardware (memory and CPU power) and incremental frequency in days, week or month etc. then prefer MOLAP. It provides best Query response time.
ROLAP and HOLAP are mostly useful when latency is quite low (in seconds, minutes and hours). Mostly project go with MOLAP and rarely with ROLAP approach. I don’t have any industry stats but if I find something, I will provide information in this post.
I have not seen HOLAP implantation related case studies/articles so far on the net. If you have come across HOLAP implementation then please share the details.