SSAS: xVelocity Technology (Vertipaq)

References:

http://msbiacademy.com/?p=2861 //Part 1 (nice)

http://msbiacademy.com/?p=2581 //Part 2 (nice)

xVelocity Technology (Vertipaq)

Summary:

The Vertipaq (xVelocity) engine is the underlying data storage and query engine used for tabular data models, which includes Analysis Services running in Tabular mode as well as the PowerPivot for Excel and PowerPivot for SharePoint data models that we can query.

Traditionally on multidimensional cubes we read data from the data warehouse and put it into a multidimensional cube and store it on disk in that cube.

On the Tabular side, we read that data out and store it in RAM. So you start to get an idea as to why this is so fast. We’re not reading our cube data from disk anymore–we’re reading that data into RAM and just querying it right from memory.

On the Tabular side you normally send DAX queries, but you can also send MDX.

Transcript of Part 1 Video

[00:00:00.00]

So what is the Vertipaq (xVelocity) engine that we talk about? The Vertipaq (xVelocity) engine is the underlying data storage and query engine used for tabular data models, which includes Analysis Services running in Tabular mode as well as the PowerPivot for Excel and PowerPivot for SharePoint data models that we can query.

[00:00:26.13]

What Vertipaq is really known for is being blindingly fast. Compared to running queries in relational databases, or in many cases against multidimensional cubes, Vertipaq can outperform these other technologies by orders of magnitude. So let's talk about why it's so fast.

[00:00:44.29]

This diagram shows a SQL Server data warehouse on the bottom, and we're sourcing data into a couple of different models. On the left is the Vertipaq (xVelocity) engine behind a Tabular model, which could also be PowerPivot. On the right is a traditional multidimensional OLAP cube. When we put data into these models, normally we're using a cached mode. We can use ROLAP or DirectQuery, but the normal case is that we're using the cached data storage mode.

[00:01:17.03]

When we process these models we’re reading data into them. Traditionally on multidimensional cubes we read data from the data warehouse and put it into a multidimensional cube and store it on disk in that cube.

On the Tabular side, we read that data out and store it in RAM. So you start to get an idea as to why this is so fast. We’re not reading our cube data from disk anymore–we’re reading that data into RAM and just querying it right from memory.

[00:01:48.08]

So if we send a query to each of these models: normally on the Tabular side you send DAX queries, but you can also send MDX. This is just a simple MDX query, and we could send it to either of these models. But the process will be different when we do that.
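
The MDX text shown in the video isn't captured in these notes; a minimal sketch of a simple query of this kind, assuming a hypothetical [Sales] cube with a [Sales Amount] measure and a [Date] dimension, might look like this:

-- hypothetical cube, measure and dimension names
SELECT
    [Measures].[Sales Amount] ON COLUMNS,
    [Date].[Calendar Year].MEMBERS ON ROWS
FROM [Sales]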

[00:02:06.00]

When we send that query to the multidimensional cube, the cube data will need to be read from disk (at least initially). There will be some caching so the second time the same query is run it will be faster. But the initial query is going to read data from disk, cache it in RAM and sort it out.

This is quite a bit slower than if you were just storing the data in RAM as we do with the Vertipaq (xVelocity) engine and Tabular models. With the Tabular query–which is the same query–the data is already in RAM, so there isn’t that I/O latency while going out to the disk or SAN and pulling data from there.

[00:02:44.12]

We can just read the data in much quicker. RAM access can be 1 million times faster than disk access, so even though the Vertipaq (xVelocity) engine is doing quite a bit of “brute force” scanning, that scanning is a million times faster, so we get better performance.

[00:03:00.07]

Let’s look at an example of storing all this information in RAM and what impact that has on queries. I built a quick data model with Analysis Services in Tabular mode, and brought in about 215 million rows of sales data. So this gives us a relatively large data set to work with–more rows than will fit in RAM on a standard server using most technologies.

[00:03:26.14]

This will give us a good test. We have this loaded already. Let's look at a query against the original relational data source. This is an unindexed table; we haven't added indexes or done any other performance tuning. We're just going to scan through the 215 million rows from disk and summarize by some random attribute, in this case the sum of sales by time for a specific item.
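
The exact statement isn't reproduced in the transcript, but a sketch of that kind of SQL, using assumed table and column names (factSales, TimeKey, ItemKey, SalesAmount), could look like this:

-- Sum of sales by time for one specific item; with no useful index this is
-- a full scan of the ~215 million row fact table.
SELECT f.TimeKey,
       SUM(f.SalesAmount) AS TotalSales
FROM dbo.factSales AS f
WHERE f.ItemKey = 1234            -- hypothetical item key
GROUP BY f.TimeKey
ORDER BY f.TimeKey;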

[00:04:04.05]

I press the execute button and let it run. While this is running I'll take a peek at the database server and see what its performance is like. If I look at my overview here, my CPU is only running at about 5% utilization. This server has four CPUs and there isn't much CPU pressure at all.

If I look at the disk, I have about 200MB/sec being read, so I can see that SQL Server is very busy reading pages from the disk.

[00:04:38.01]

Our query is finished. Notice that it took 3 minutes 39 seconds to run. Considering it's a pretty big table scan with some summarizing, that's not terrible. Let's look at doing the same thing using our Vertipaq (xVelocity) based Tabular model.

[00:04:54.04]

This is a DAX query that will do the same thing as the SQL query we just ran. The syntax is quite different, but it asks the Tabular model to create the same result.
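
The DAX text isn't captured in the transcript either; a minimal sketch of an equivalent query, reusing the same assumed factSales column names as the SQL sketch above, might be:

EVALUATE
SUMMARIZE (
    FILTER ( factSales, factSales[ItemKey] = 1234 ),   -- same hypothetical item key
    factSales[TimeKey],
    "Total Sales", SUM ( factSales[SalesAmount] )
)
ORDER BY factSales[TimeKey]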

[00:05:21.21]

I’ll press the Execute button again. The query is completed, and it took 1 second instead of 3.5 minutes!

Transcript of Part 2 Video

[00:00:00.00]

In the last lesson we saw what kind of performance we can get by storing our data in RAM using Vertipaq (xVelocity) instead of storing it on disk. The next question we have is “but how did we get all that data into RAM in the first place?”

Let’s first look at how information is normally stored in the database on disk using SQL Server. The way that works is to take rows of data one row at a time and tart filling pages of disk storage with those rows.

Let's first look at how information is normally stored in the database on disk using SQL Server. The way that works is to take rows of data one row at a time and start filling pages of disk storage with those rows.

[00:00:34.24]

We fill the page with rows until the page is filled. When it’s full we create another page, until we have our entire table filled out into disk storage. When SQL Server needs that information back–for example it needs row three–it retrieves page #1 to load in row #3. This is done transparently to the user, but this is how data is stored.

[00:01:04.09]

The compression we can get is really limited to what can be compressed within the page. Even if there would be a benefit to compressing data across pages, there's really not much we can do about that, because of the way data is stored row by row within pages.

[00:01:17.15]

But if you think about it, we compress data by finding redundancies and factoring them out. If we have the same value within a column on many rows, we can compress that by not storing the value over and over. If we look at the factSales table that we were querying in the previous lesson, we can see that in all these columns we had a lot of redundancy. The StoreKey appears over and over again—the TimeKey as well. The ItemKey is less redundant, but we have many values in many columns that are stored over and over and over again.
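
As a quick way to see that redundancy, a sketch that counts the distinct values per key column (against the same hypothetical factSales columns) makes the point; far fewer distinct keys than rows is exactly what a column store can compress away:

SELECT COUNT(*)                  AS TotalRows,
       COUNT(DISTINCT StoreKey)  AS DistinctStores,
       COUNT(DISTINCT TimeKey)   AS DistinctDates,
       COUNT(DISTINCT ItemKey)   AS DistinctItems
FROM dbo.factSales;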

[00:02:00.24]

We can’t really take advantage of the redundancy when storing rows on pages like this. We’re going to need a lot more disk storage, and because these pages take up so much space we can’t efficiently store so many of these pages in RAM.

[00:02:17.23]

Let’s contrast that with the way Vertipaq (xVelocity) is storing data. Instead of a row store strategy, Vertipaq (xVelocity) uses a column store strategy. What do we mean by that?

In the previous slide we saw that pages of information were stored with many rows of data. What Vertipaq (xVelocity) does is create pages sort of like the ones we saw before, but it stores the data from a single column together on a page. So if you think about the columns that had the redundant values on rows, all of those values are going to be together on the same set of pages. So instantly we can get tremendous compression, because if the entire page, for example, was all NULL, we only need to store NULL once and then just indicate that the value is NULL for all rows on the page (or something like that).

[00:03:07.21]

If the value is a string that’s very long, we can store that string once and then make a footnote that says “by the way, all the rows on this page are the same string as the first one”. Now we can get 10X more rows on the page. We can get much better compression characteristics, and as the columns are loaded into the column store, this is what it begins to look like.

[00:03:36.13]

Column #1 is on a couple of pages in this diagram, and Column #2 is on a couple of pages. We're taking advantage of tremendous compression opportunities to store many, many rows on each columnar page. And since each of these pages has so many rows, when we store each page in RAM we're actually storing the values for many more rows in RAM at once compared to the row-oriented storage strategy.

[00:03:53.07]

That’s fundamentally why Vertipaq (xVelocity) is able to store so much data in memory. This diagram summarizes the tables we were just querying and shows the benefit of using Vertipaq (xVelocity) for the column storage rather than the traditional row storage we use with SQL Server.

In our 250 million row database, the fact table accounts for by far the most data storage. This is typical for a data warehouse. In this column of the diagram we have the storage size in MB for the factSales table. This is uncompressed SQL storage of about 30GB. If we were to compress that with SQL page compression, perhaps we could get that down to 20GB or so.
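
For reference, SQL Server can estimate and apply page compression itself; a sketch, assuming the fact table is dbo.factSales, would be:

-- Estimate how much PAGE compression would save on the (assumed) fact table
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'factSales',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- Apply PAGE compression to the table
ALTER TABLE dbo.factSales REBUILD WITH (DATA_COMPRESSION = PAGE);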

[00:04:46.07]

But it would still be a substantial pile of data to store in RAM if that’s what we were trying to do— probably not that practical. It wouldn’t take many tables to saturate our server’s RAM if we started storing that much data in RAM and querying it from RAM.

[00:05:01.13]

With Vertipaq (xVelocity), because of the compression levels it gets–which exceed 10:1 in this case–the data size is only about 2.7GB. This server has 32GB of RAM, so a 2.7GB memory footprint is well worth the performance improvement we get.
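
If you want to check the in-memory footprint yourself, a Tabular instance exposes DMVs that can be queried from an MDX query window in SSMS; one relevant rowset (shown here as a sketch, not part of the video) returns the size of each column segment:

SELECT * FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS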

[00:05:28.09]

So that’s the reason that Vertipaq (xVelocity) works the way it does. That’s the reason we can store so much in RAM. And because we can query from RAM we get blindingly fast query performance.

Now, certainly there are other optimizations that make Vertipaq (xVelocity) fast, but fundamentally this is the biggest portion of why we get the performance we do from Vertipaq (xVelocity).

SQL Performance: Avoiding Parameter Sniffing in SQL Server

References: http://www.sql-server-performance.com/2013/avoiding-parameter-sniffing-in-sql-server/

Avoiding Parameter Sniffing in SQL Server

By Dinesh Asanka

Parameter sniffing is when SQL Server compiles a stored procedure's execution plan using the first parameter value it is executed with, and then uses that plan for subsequent executions regardless of the parameter values.

First let's look at a worked example. Assume we have a table as below.

CREATE TABLE TblData
(
    ID             INT IDENTITY PRIMARY KEY,
    Name           CHAR(100),
    Loc            CHAR(5),
    CreateDateTime DATETIME DEFAULT GETDATE(),
    Status         CHAR(2) DEFAULT 'AC'
)

CREATE INDEX idx_Loc ON TblData (Loc)
 

The table is populated with sample data as below:

INSERT INTO TblData (Name, Loc)
VALUES ('Test1', 'LocA')
GO 100000

INSERT INTO TblData (Name, Loc)
VALUES ('Test1', 'LocB')
GO 10
 

Let us run the following two queries and analyze the relevant query plans.

 SELECT * FROM tblData
WHERE Loc = 'LocA'
 

It is quite obvious that, since we are selecting more than 90% of the data, it is better to use a Clustered Index Scan rather than the non-clustered index.

SELECT * FROM tblData
WHERE Loc = 'LocB'
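
To see the difference between the two plans without opening the graphical plans, you can compare the I/O of the two statements; a sketch (the row counts reflect the GO 100000 / GO 10 inserts above):

SET STATISTICS IO ON;

SELECT * FROM tblData WHERE Loc = 'LocA';  -- ~100,000 rows: clustered index scan
SELECT * FROM tblData WHERE Loc = 'LocB';  -- ~10 rows: seek on idx_Loc plus key lookups

SET STATISTICS IO OFF;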
 

Now let us create a stored procedure with a Loc parameter.

CREATE PROC GetDataForLoc
    (@Loc AS CHAR(5))
AS
BEGIN
    SET NOCOUNT ON
    SELECT * FROM tblData
    WHERE Loc = @Loc
END
 

Now let us execute this with one of the parameters.

EXEC GetDataForLoc 'LocA'

And the execution plan is a Clustered Index Scan.

This is expected. Next, execute the procedure with the other parameter.

EXEC GetDataForLoc 'LocB'

Now, this should use the non-clustered index, as we saw when we executed the query directly. However, since the plan was cached by the first execution, subsequent executions will use that same execution plan.
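
One way to confirm that the plan really is being reused is to look at the use count of the cached plan; a sketch using the standard plan-cache DMVs:

-- usecounts increases on every execution that reuses the cached plan
SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Proc'
  AND st.text LIKE '%GetDataForLoc%';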

This behavior is called parameter sniffing, because SQL Server compiles a stored procedure's execution plan with the first parameter value that has been used.

Solutions

1. Clear the procedure cache

First of all, let us check the existing query plan. For this you can use the following query.

SELECT OBJECT_NAME(s.object_id) AS SP_Name,
       eqp.query_plan
FROM sys.dm_exec_procedure_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) eqp
WHERE DB_NAME(s.database_id) = 'ParamSnif'
 

By running the following script, we can clear the cached plan for the procedure. If you run the previous script again, no records will be returned, which means there is no cached plan for the relevant procedure, so the next execution will create a new query plan.

EXEC sp_recompile 'GetDataForLoc'

2. Create the procedure WITH RECOMPILE

You can add a WITH RECOMPILE option when creating a stored procedure as shown below. This will cause the stored procedure to be recompiled every time it is executed.

CREATE PROC [dbo].[GetDataForLoc]
    (@Loc AS CHAR(5)) WITH RECOMPILE
AS
BEGIN
    SET NOCOUNT ON
    SELECT * FROM tblData
    WHERE Loc = @Loc
END
 

After this you can execute the stored procedure with two different parameters.

EXEC [dbo].[GetDataForLoc] @Loc = N'LocB'
EXEC [dbo].[GetDataForLoc] @Loc = N'LocA'
 

Unlike last time, these two executions will have two different execution plans. If you examine the plan cache (using the earlier query against sys.dm_exec_procedure_stats), you will see that no query plan has been saved for the procedure.

3. Execute WITH RECOMPILE

If you add WITH RECOMPILE to the execute statement, SQL Server will recompile the procedure before running it and will not store the query plan.

EXEC [dbo].[GetDataForLoc] @Loc = N'LocA' WITH RECOMPILE

4. Create the procedure with the OPTIMIZE FOR option

This is a query hint which allows you to specify the parameter value that should be used when compiling the query, regardless of the actual value that users are passing into the stored procedure.

CREATE PROC [dbo].[GetDataForLoc]
    (@Loc AS CHAR(5))
AS
BEGIN
    SET NOCOUNT ON
    SELECT * FROM tblData
    WHERE Loc = @Loc
    OPTION (OPTIMIZE FOR (@Loc = 'LocA'))
END
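
With this version of the procedure, whichever value arrives first no longer matters: both of the calls below compile against the 'LocA' estimate, so the plan is predictable (though not necessarily ideal for 'LocB'). For example:

EXEC [dbo].[GetDataForLoc] @Loc = 'LocB'
EXEC [dbo].[GetDataForLoc] @Loc = 'LocA'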
 


SSAS: Analysis Services Storage Modes

References: http://www.msbiconcepts.com/2010/09/analysis-services-storage-modes.html

Thursday, September 30, 2010

Analysis Services Storage Modes

In SQL Server 2008 and SQL Server 2005 there are three storage modes available: Multidimensional OLAP (MOLAP), Relational OLAP (ROLAP) and Hybrid OLAP (HOLAP). I have tried to compare the merits and demerits of each storage option in this post. Alongside the storage options, SQL Server Analysis Services provides one more useful piece of functionality, "proactive caching", which I will cover in my next blog.

Now the question arises: when is each storage mode useful?

There are a few factors that play an important role in deciding on a storage mode:

  1. Query Response time
  2. Latency
  3. Data Volume (Analysis Server)
  4. Processing Time

Storage Mode | Query Response Time | Latency | Data Volume (Analysis Server) | Processing Time
MOLAP        | Low                 | High    | High                          | High
ROLAP        | High                | Low     | Low                           | Low
HOLAP        | Medium              | Medium  | Medium                        | High

1. MOLAP

In MOLAP, detailed data, aggregations and metadata are all stored on the Analysis Server. Analysis Services pulls the complete detailed data from the relational database onto the analysis server and then creates the aggregations. MOLAP keeps the data (detailed data and aggregations) on the same analysis server, so query response time is significantly better than with the other two approaches. MOLAP has high latency because it remains disconnected from the relational DB and only connects again to pull new or changed data. If the data volume is huge then latency will be higher; we can overcome this to a certain extent by using incremental processing.

2. ROLAP

Detailed data and aggregations reside in the relational database, and the metadata is on the Analysis Server. In ROLAP, the analysis server sends a request to the relational database for each query. Among all the storage modes, query performance will be lowest in ROLAP (in most cases) because each query needs to be answered by the relational DB, and extra time is needed to pass the result over the network to the OLAP server. Latency will be low in ROLAP as data and aggregations are in the relational database.

3. HOLAP

HOLAP is a middle approach, somewhere between MOLAP and ROLAP. Detailed data resides in the relational database, but aggregations and metadata reside on the analysis server. If a query hits the aggregations (i.e. it can be answered from the aggregations) then HOLAP query response time is similar to MOLAP, but if the query needs to be answered by the relational database then query response time will be similar to ROLAP. Latency is somewhere between ROLAP and MOLAP.

Conclusion

If the relational database size is a few terabytes (<=5TB), there is enough hardware (memory and CPU power), and the incremental processing frequency is in days, weeks or months, then prefer MOLAP. It provides the best query response time.

ROLAP and HOLAP are mostly useful when the required latency is quite low (in seconds, minutes or hours). Most projects go with MOLAP and only rarely with the ROLAP approach. I don't have any industry stats, but if I find something I will add the information to this post.

I have not seen any HOLAP implementation case studies or articles on the net so far. If you have come across a HOLAP implementation then please share the details.