SQL Partition: How to Partition an existing SQL Server Table

Reference: http://www.mssqltips.com/sqlservertip/2888/how-to-partition-an-existing-sql-server-table/?utm_source=dailynewsletter&utm_medium=email&utm_content=headline&utm_campaign=20130221

How to Partition an existing SQL Server Table

By: Ben Snaidero

Problem

I have read many articles on SQL Server partitioning and how to create a partitioned table, but I have an existing SQL Server database that has a few very large tables that could benefit from partitioning. What are the steps required to partition an already existing table? Check out this tip to learn more.

Solution

There are two different approaches we could use to accomplish this task. The first would be to create a brand new partitioned table (you can do this by following this tip) and then simply copy the data from your existing table into the new table and do a table rename. Alternatively, as I will outline below, we can partition the table in place simply by rebuilding or creating a clustered index on the table.

Sample SQL Server Table and Data to Partition

--Table/Index creation
CREATE TABLE [dbo].[TABLE1] 
([pkcol] [int] NOT NULL,
 [datacol1] [int] NULL,
 [datacol2] [int] NULL,
 [datacol3] [varchar](50) NULL,
 [partitioncol] datetime)
GO
ALTER TABLE dbo.TABLE1 ADD CONSTRAINT PK_TABLE1 PRIMARY KEY CLUSTERED (pkcol) 
GO
CREATE NONCLUSTERED INDEX IX_TABLE1_col2col3 ON dbo.TABLE1 (datacol1,datacol2)
 WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
 ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) 
 ON [PRIMARY]
GO
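
The rest of the tip walks through the in-place approach. As a minimal, hedged sketch of the idea (the partition function, scheme and index names below are illustrative, and the boundary dates are placeholders), the table can be moved onto a partition scheme by dropping the clustered primary key and creating a clustered index on the partitioning column:

--Illustrative partition function and scheme (boundary dates are placeholders)
CREATE PARTITION FUNCTION pf_Table1 (datetime)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2013-01-01')
GO
CREATE PARTITION SCHEME ps_Table1
AS PARTITION pf_Table1 ALL TO ([PRIMARY])
GO
--Drop the clustered primary key, which leaves the table as a heap
ALTER TABLE dbo.TABLE1 DROP CONSTRAINT PK_TABLE1
GO
--Creating a clustered index on the partition scheme partitions the table in place
CREATE CLUSTERED INDEX IX_TABLE1_partitioncol
 ON dbo.TABLE1 (partitioncol)
 ON ps_Table1 (partitioncol)
GO
--Re-add the primary key as a nonclustered constraint; placing it on [PRIMARY]
--keeps it non-aligned, which is simple but prevents partition switching
ALTER TABLE dbo.TABLE1 ADD CONSTRAINT PK_TABLE1
 PRIMARY KEY NONCLUSTERED (pkcol) ON [PRIMARY]
GO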

SQL Performance: Introduction to Filtered Index – Improve performance with Filtered Index

Reference: http://blog.sqlauthority.com/2008/09/01/sql-server-2008-introduction-to-filtered-index-improve-performance-with-filtered-index/

SQL SERVER – 2008 – Introduction to Filtered Index – Improve performance with Filtered Index

September 1, 2008 by pinaldave

Filtered Index is a new feature in SQL Server 2008. A filtered index indexes only a portion of the rows in a table by applying a WHERE filter to the index, which improves query performance and reduces index maintenance and storage costs compared with full-table indexes.

In other words, an index created with a WHERE clause is a filtered index.

For example, suppose we want to retrieve the employees whose Title is “Marketing Manager”. We can create an index on EmployeeID filtered to rows where Title is “Marketing Manager”, and then write a SQL statement to retrieve those employees.

CREATE NONCLUSTERED INDEX NCI_Department
ON HumanResources.Employee(EmployeeID)
WHERE Title= 'Marketing Manager'

Points to remember when creating a Filtered Index:

– They can be created only as nonclustered indexes.
– They can be used with views only if the views are persisted views.
– They cannot be created on full-text indexes.

Let us write a simple SELECT statement against the table on which we created the filtered index.

SELECT he.EmployeeID,he.LoginID,he.Title
FROM HumanResources.Employee he
WHERE he.Title = 'Marketing Manager'

Now we will look at the execution plans and compare the performance before and after the filtered index was created on the Employee table.

As we can see, in the first case 100% of the index scan is done on the clustered index, taking 24% of the total cost of execution. Once the filtered index is created on the Employee table, the work is split 50% on the clustered index and 50% on the nonclustered filtered index, which retrieves the data faster, taking 20% of the total cost of execution compared to 24% with no filtered index.
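
To put numbers on that comparison yourself, one hedged approach is to capture the logical reads for the same query before and after the filtered index is created, for example with SET STATISTICS IO:

--Run once before and once after creating NCI_Department, then compare the reads
SET STATISTICS IO ON
GO
SELECT he.EmployeeID,he.LoginID,he.Title
FROM HumanResources.Employee he
WHERE he.Title = 'Marketing Manager'
GO
SET STATISTICS IO OFF
GO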

If we have a table with thousands of records and our query is only concerned with a very few of those rows, we should use a filtered index.

Conclusion:

A filtered index is an optimized nonclustered index and one of the great performance improvements in SQL Server 2008, reducing both index storage costs and maintenance costs.

SQL : Stored Procedures and SQL Injection

http://www.sqlservercentral.com/articles/Security/96328/

Stored Procedures and SQL Injection

By Brian Kelley, 2013/02/18

The fact of the matter is that SQL injection is still a problem at the end of 2012. Web sites are still being hacked by using SQL injection on a regular basis. The usual response from Microsoft .NET developers is that they can parameterize in their code, which solves things. That’s very true. That’s effective. However, the problem is with a series of unspoken assumptions.

The first unspoken assumption is that only developers who know what they are doing with respect to defensive coding will ever touch the web application in question. The second unspoken assumption is that should another application hit against the same database (or set of databases), you’ll have a similar level of competence. A final unspoken assumption is that the application will be written in a Microsoft .NET language.

That’s an awful lot of assumptions and as a security professional, I am uneasy when I see a list of assumptions like that. Anyone who has been a developer for a reasonable amount of time eventually runs across a case where a previous developer did a shoddy job, either due to being rushed or due to a lack of ability/skill/experience. Since experienced developers have all run across this situation, those assumptions I mentioned earlier mean we can be in real trouble from the SQL Server side. That’s why I still like stored procedures for accessing data in SQL Server.

The Disclaimer:

I realize that there are those who are hard core with respect to not using stored procedures. Questions about performance, ease of use, extensibility, platform agnostic coding, etc., should all factor into how you build anything. Looking at any one piece without the rest of the considerations often leads to poor architecture decisions. This isn’t some polemic defense for stored procedures. Rather, I’m merely trying to show how stored procedures can be used to make SQL injection attacks more difficult to execute successfully. Whether or not you use stored procedures in your solutions should be weighed with all of the other factors considered.

Information Disclosure in the Database:

Prior to SQL Server 2005, if you queried the sysobjects table in a database, you saw all the objects. Even if you didn’t have access to an object, you still saw it. If you’re still using SQL Server 2000, this article doesn’t really apply to you, and let me strongly encourage you to consider upgrading if you aren’t already pushing the subject within your organization.

As of SQL Server 2005, when you query sysobjects or sys.objects or any of the other views that reveal metadata about the database, you only see information on the objects you have permissions to access. Therefore, if you don’t have SELECT, INSERT, UPDATE, or DELETE permissions against a table, you can’t learn of its existence. If you don’t have EXECUTE rights against a stored procedure, you don’t know it’s there. The reason I’m making this point is that it’s nearly impossible to attack something you don’t know of within SQL Server. This isn’t physical warfare where you can serendipitously hit an enemy formation with a stray bomb or missile.

Demonstration:

We can talk about this information disclosure protection or we can see it in action. Let’s do the latter. First, we want to create a sample database along with a database-only user (a user without a login). This user is what we will test with to demonstrate how SQL Server protects information.

CREATE DATABASE WebTest;
GO 
USE WebTest;
GO 
CREATE USER WebUser WITHOUT LOGIN;
GO 

Now that we have that done, let’s create some objects to work with. We’ll create a schema, Internet, because later I’m going to give the user EXECUTE rights against that schema. We’ll also create a table.

CREATE SCHEMA Internet AUTHORIZATION dbo;
GO 
CREATE TABLE Internet.ExampleTable (
SomeID INT IDENTITY(1,1) NOT NULL,
SomeColumn CHAR(5) NOT NULL,
CONSTRAINT PK_ExampleTable PRIMARY KEY (SomeID)
);
GO

With that done, let’s see the difference in what we can see with a super user account and what the limited account WebUser can see.
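
The article then compares the two views of the metadata; as a hedged sketch of that comparison, the database-only user can be impersonated with EXECUTE AS so both queries can be run from the same session:

--As dbo (or another highly privileged account), the user objects are visible
SELECT name, type_desc FROM sys.objects WHERE is_ms_shipped = 0;
GO
--Impersonate the limited user; with no permissions granted yet,
--the same query should return no rows for our objects
EXECUTE AS USER = 'WebUser';
SELECT name, type_desc FROM sys.objects WHERE is_ms_shipped = 0;
REVERT;
GO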

SQL Performance: SQL Server Performance Tuning and Monitoring

http://www.mssqltips.com/sqlservertutorial/276/sql-server-performance-tuning-and-monitoring/

SQL Server Performance Tuning and Monitoring

(Introduction)

Overview

SQL Server is a great platform to get your database application up and running fast. The graphical interface of SQL Server Management Studio allows you to create tables, insert data, develop stored procedures, etc… in no time at all. Initially your application runs great in your production, test and development environments, but as use of the application increases and the size of your database increases you may start to notice some performance degradation or worse yet, user complaints.

This is where performance monitoring and tuning come into play. Usually the first signs of performance issues surface from user complaints. A screen that used to load immediately now takes several seconds. Or a report that used to take a few minutes to run now takes an hour. As I mentioned these issues usually arise from user complaints, but with a few steps and techniques you can monitor these issues and tune accordingly, so that your database applications are always running at peak performance.

In this tutorial we will cover some of the common issues with performance such as:

  • deadlocks
  • blocking
  • missing and unused indexes
  • I/O bottlenecks
  • poor query plans
  • statistics
  • wait stats
  • fragmentation

We will look at basic techniques all DBAs and Developers should be aware of to make sure their database applications are running at peak performance.

http://www.mssqltips.com/sqlservertutorial/282/performance-related-tools/

Performance Related Tools

Overview

In order to monitor and improve performance for your SQL Server environment you need to know what tools are available and how to use each of these tools.

In this section we will look at the following tools to give you an introduction as to what they are used for and how you can use them to collect performance related data.

  • Dynamic Management Views (DMVs) and System Catalog Views
  • Profiler and Server Side Traces
  • Windows Performance Monitor
  • Built in performance reports in SSMS
  • Query Plans
  • Database Tuning Advisor

http://www.mssqltips.com/sqlservertutorial/273/dynamic-management-views/

Dynamic Management Views

(Tools)

Overview

With the introduction of SQL Server 2005, Microsoft introduced Dynamic Management Views (DMVs) which allow you to get better insight into what is happening in SQL Server. Without these new tools a lot of the information was unavailable or very difficult to obtain.

DMVs are a great tool to help troubleshoot performance related issues and once you understand their power they will become a staple for your Database Administration.

Explanation

The DMVs were introduced in SQL Server 2005 and with each new release Microsoft has been adding additional DMVs to help troubleshoot issues. DMVs actually come in two flavors: DMVs (dynamic management views) and DMFs (dynamic management functions), and together they are sometimes classified as DMOs (dynamic management objects). The DMVs act just like any other view, in that you can select data from them, while the DMFs require values to be passed to the function, just like any other function.
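
For example, a DMV is queried like a view, while a DMF is called with parameters like a table-valued function (the column lists below are trimmed for readability):

-- A DMV: current requests on the instance
SELECT session_id, status, command, wait_type
FROM sys.dm_exec_requests;

-- A DMF: file statistics for the current database (NULL returns all files)
SELECT database_id, file_id, num_of_reads, num_of_writes
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL);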

The DMVs are broken down into categories that follow their naming convention, such as sys.dm_exec_* (execution related), sys.dm_db_* (database related), sys.dm_io_* (I/O related) and sys.dm_os_* (operating system related).

Several of the more useful DMVs that you should familiarize yourself with are covered in the sections that follow.


http://www.mssqltips.com/sqlservertutorial/264/performance-issues/

Performance Issues

Overview

There are several factors that can degrade SQL Server performance, and in this section we will investigate some of the common areas that can affect performance. We will look at some of the tools that you can use to identify issues as well as review some possible remedies to fix these performance issues.

We will cover the following topics:

  • Blocking
  • Deadlocks
  • I/O
  • CPU
  • Memory
  • Role of statistics
  • Query Tuning: Bookmark Lookups
  • Query Tuning: Index Scans

SQL Performance: Investigating I/O bottlenecks

http://www.mssqltips.com/sqlservertutorial/254/investigating-io-bottlenecks/

Investigating I/O bottlenecks

(Performance Issues)

Overview

SQL Server is usually a high I/O activity process; in most cases the database is larger than the amount of memory installed on the server, so SQL Server has to pull data from disk to satisfy queries. In addition, since the data in databases is constantly changing, these changes need to be written to disk. Another process that can consume a lot of I/O is the TempDB database. TempDB is a temporary working area for SQL Server to do such things as sorting and grouping, and it also resides on disk, so depending on how many temporary objects are created this database could be busier than your user databases.

Since I/O is such an important part of SQL Server performance, you need to make sure your disk subsystem is not the bottleneck. In the old days this was much easier to do, since most servers had locally attached storage. These days most SQL Servers use SAN or NAS storage and, to further complicate things, more and more SQL Servers are running in virtualized environments.

Explanation

There are several different methods that can be used to track I/O performance, but as mentioned above, with SAN/NAS storage and virtualized SQL Server environments this is getting harder and harder to track, and the rules for what should be measured to determine whether there is an I/O bottleneck have changed. The advantage is that there are several tools available at both the storage level and the virtualization level to aid in performance analysis, but we will not cover those here.

There are basically two options for monitoring I/O bottlenecks: SQL Server DMVs and Performance Monitor counters. There are other tools as well, but these two options exist in every SQL Server environment.

DMV – sys.dm_io_virtual_file_stats

This DMV will give you cumulative file stats for each database and each database file including both the data and log files. Based on this data you can determine which file is the busiest from a read and/or write perspective.

The output also includes I/O stall information for reads, writes and total. The I/O stall is the total time, in milliseconds, that users waited for I/O to be completed on the file. By looking at the I/O stall information you can see how much time was spent waiting for I/O to complete, and therefore how long users were waiting.

The data that is returned from this DMV is cumulative, which means that each time you restart SQL Server the counters are reset. Since the data is cumulative, you can run the query once and then run it again at some point in the future and compare the deltas for the two time periods. If the I/O stalls are high compared to the length of that time period, then you may have an I/O bottleneck.

SELECT 
cast(DB_Name(a.database_id) as varchar) as Database_name,
b.physical_name, * 
FROM 
sys.dm_io_virtual_file_stats(null, null) a 
INNER JOIN sys.master_files b ON a.database_id = b.database_id and a.file_id = b.file_id
ORDER BY Database_Name

Here is partial output from the above command.
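
Building on the query above, a hedged follow-up is to turn the cumulative stall counters into an average latency per read and per write for each file (NULLIF guards against files that have not yet been read or written):

SELECT 
 cast(DB_Name(a.database_id) as varchar) as Database_Name,
 b.physical_name,
 a.io_stall_read_ms / NULLIF(a.num_of_reads, 0) as avg_read_stall_ms,
 a.io_stall_write_ms / NULLIF(a.num_of_writes, 0) as avg_write_stall_ms
FROM 
 sys.dm_io_virtual_file_stats(null, null) a 
 INNER JOIN sys.master_files b ON a.database_id = b.database_id and a.file_id = b.file_id
ORDER BY avg_read_stall_ms DESC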

Performance Monitor

Performance Monitor is a Windows tool that lets you capture statistics about SQL Server, memory usage, I/O usage, etc. This tool can be run interactively using the GUI, or you can set it up to collect information behind the scenes which can be reviewed at a later time. This tool is found in the Control Panel under Administrative Tools.

There are several counters related to I/O and they are located under Physical Disk and Logical Disk. The Physical Disk performance object consists of counters that monitor hard or fixed disk drives on a computer. The Logical Disk performance object consists of counters that monitor logical partitions of hard or fixed disk drives. For the most part they both contain the same counters, and in most cases you will probably use the Physical Disk counters. Here is a partial list of the available counters.

Now that storage could be either local, SAN, NAS, etc., these two counters are helpful for seeing whether there is a bottleneck:

  • Avg. Disk sec/Read is the average time, in seconds, of a read of data from the disk.
  • Avg. Disk sec/Write is the average time, in seconds, of a write of data to the disk.

The recommendation is that the values for both of these counters be less than 20ms. When you capture this data the values will be displayed as 0.000, so a value of 0.050 equals 50ms.

Resource Monitor

Another tool that you can use is the Resource Monitor. This can be launched from Task Manager or from the Control Panel.

Below you can see the Disk tab that shows current processes using disk, the active disk files and storage at the logical and physical level. The Response Time (ms) is helpful to see how long it is taking to service the I/O request.

Additional Information

I/O issues may not always be a problem with your disk subsystem. Just because you see a slowdown or I/O waits occurring does not necessarily mean the disk is at fault; there may be other issues that you need to consider, such as missing indexes, poorly written queries, fragmentation or out-of-date statistics. We will cover these topics as well in this tutorial.


SQL Performance: Discovering Unused Indexes

http://www.mssqltips.com/sqlservertutorial/256/discovering-unused-indexes/

Discovering Unused Indexes

(Performance Issues)

Overview

To ensure that data access can be as fast as possible, SQL Server, like other relational database systems, utilizes indexing to find data quickly. SQL Server has different types of indexes that can be created, such as clustered indexes, non-clustered indexes, XML indexes and Full Text indexes.

The benefit of having more indexes is that SQL Server can access the data quickly if an appropriate index exists. The downside to having too many indexes is that SQL Server has to maintain all of them, which can slow things down, and indexes also require additional storage. So as you can see, indexing can both help and hurt performance.

In this section we will focus on how to identify indexes that exist, but are not being used and therefore can be dropped to improve performance and decrease storage requirements.

Explanation

When SQL Server 2005 was introduced it added Dynamic Management Views (DMVs) that allow you to get additional insight as to what is going on within SQL Server. One of these areas is the ability to see how indexes are being used. There are two DMVs that we will discuss. Note that these views store cumulative data, so when SQL Server is restarted the counters are reset to zero; be aware of this when monitoring your index usage.

DMV – sys.dm_db_index_operational_stats

This DMV allows you to see insert, update and delete information for various aspects of an index. Basically this shows how much effort was used in maintaining the index based on data changes.

If you query this DMV and return all columns, the output may be confusing. So the query below focuses on a few key columns. To learn more about the output for all columns you can check out Books Online.

SELECT OBJECT_NAME(A.[OBJECT_ID]) AS [OBJECT NAME], 
 I.[NAME] AS [INDEX NAME], 
 A.LEAF_INSERT_COUNT, 
 A.LEAF_UPDATE_COUNT, 
 A.LEAF_DELETE_COUNT 
FROM SYS.DM_DB_INDEX_OPERATIONAL_STATS (db_id(),NULL,NULL,NULL ) A 
 INNER JOIN SYS.INDEXES AS I 
 ON I.[OBJECT_ID] = A.[OBJECT_ID] 
 AND I.INDEX_ID = A.INDEX_ID 
WHERE OBJECTPROPERTY(A.[OBJECT_ID],'IsUserTable') = 1

Below we can see the number of Inserts, Updates and Deletes that occurred for each index, so this shows how much work SQL Server had to do to maintain the index.

DMV – sys.dm_db_index_usage_stats

This DMV shows you how many times the index was used for user queries. Again there are several other columns that are returned if you query all columns and you can refer to Books Online for more information.

SELECT OBJECT_NAME(S.[OBJECT_ID]) AS [OBJECT NAME], 
 I.[NAME] AS [INDEX NAME], 
 USER_SEEKS, 
 USER_SCANS, 
 USER_LOOKUPS, 
 USER_UPDATES 
FROM SYS.DM_DB_INDEX_USAGE_STATS AS S 
 INNER JOIN SYS.INDEXES AS I ON I.[OBJECT_ID] = S.[OBJECT_ID] AND I.INDEX_ID = S.INDEX_ID 
WHERE OBJECTPROPERTY(S.[OBJECT_ID],'IsUserTable') = 1
 AND S.database_id = DB_ID()

Here we can see seeks, scans, lookups and updates.

  • The seeks refer to how many times an index seek occurred for that index. A seek is the fastest way to access the data, so this is good.
  • The scans refer to how many times an index scan occurred for that index. A scan is when multiple rows of data had to be searched to find the data. Scans are something you want to try to avoid.
  • The lookups refer to how many times the query required data to be pulled from the clustered index or the heap (when the table does not have a clustered index). Lookups are also something you want to try to avoid.
  • The updates refer to how many times the index was updated due to data changes, which should correspond to the first query above.

Identifying Unused Indexes

So based on the output above you should focus on the output from the second query. If you see indexes where there are no seeks, scans or lookups, but there are updates, this means that SQL Server has not used the index to satisfy a query but still needs to maintain it. Remember that the data from these DMVs is reset when SQL Server is restarted, so make sure you have collected data for a long enough period of time to determine which indexes may be good candidates to be dropped.
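
As a hedged sketch of that check, the second query can be filtered down to nonclustered indexes that have been maintained (updates) but never read since the last restart:

SELECT OBJECT_NAME(S.[OBJECT_ID]) AS [OBJECT NAME], 
 I.[NAME] AS [INDEX NAME], 
 USER_UPDATES 
FROM SYS.DM_DB_INDEX_USAGE_STATS AS S 
 INNER JOIN SYS.INDEXES AS I ON I.[OBJECT_ID] = S.[OBJECT_ID] AND I.INDEX_ID = S.INDEX_ID 
WHERE OBJECTPROPERTY(S.[OBJECT_ID],'IsUserTable') = 1
 AND S.database_id = DB_ID()
 AND I.TYPE_DESC = 'NONCLUSTERED'
 AND USER_SEEKS + USER_SCANS + USER_LOOKUPS = 0
 AND USER_UPDATES > 0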


SQL Performance: Eliminating bookmark (key/rid) lookups

http://www.mssqltips.com/sqlservertutorial/258/eliminating-bookmark-keyrid-lookups/

Eliminating bookmark (key/rid) lookups

(Performance Issues)

Overview

In the index scan and table scan section we were able to eliminate the scan and replace it with an index seek, but this also introduced a Key Lookup, which is something else you may want to eliminate to improve performance.

A key lookup occurs when data is found in a non-clustered index, but additional data is needed from the clustered index to satisfy the query and therefore a lookup occurs. If the table does not have a clustered index then a RID Lookup occurs instead.

In this section we will look at how to find Key/RID Lookups and ways to eliminate them.

Explanation

The reason you would want to eliminate Key/RID Lookups is because they require an additional operation to find the data and may also require additional I/O. I/O is one of the biggest performance hits on a server and any way you can eliminate or reduce I/O is a performance gain.

So let’s take a look at an example query and the query plan. Before we do this we want to first add the nonclustered index on LastName.

USE [AdventureWorks]
GO
CREATE NONCLUSTERED INDEX [IX_LastName]
ON [Person].[Contact] ([LastName])
GO

Now we can use Ctrl+M to turn on the actual execution plan and run the select.

SELECT * FROM Person.Contact WHERE LastName = 'Russell'

If we look at the execution plan we can see that we have an Index Seek using the new index, but we also have a Key Lookup on the clustered index. The reason for this is that the nonclustered index only contains the LastName column, but since we are doing a SELECT * the query has to get the other columns from the clustered index, and therefore we have a Key Lookup. The other operator we have is the Nested Loops operator, which joins the results from the Index Seek and the Key Lookup.

So if we change the query as follows and run it again, you can see that the Key Lookup disappears, because the index contains all of the columns the query needs.

SELECT LastName FROM Person.Contact WHERE LastName = 'Russell'

Here we can see that we no longer have a Key Lookup and we also no longer have the Nested Loops operator.

If we run both of these queries at the same time in one batch we can see the improvement by removing these two operators.

SELECT * FROM Person.Contact WHERE LastName = 'Russell'
SELECT LastName FROM Person.Contact WHERE LastName = 'Russell'

Below we can see that the first statement takes 99% of the batch and the second statement takes 1%, so this is a big improvement.

This should make sense: since the index includes LastName, and that is the only column used in both the SELECT list and the WHERE clause, the index can handle the entire query. Another thing to be aware of is that if the table has a clustered index, we can include the clustered index column or columns as well without incurring a Key Lookup.

The Person.Contact table has a clustered index on ContactID, so if we include this column in the query we can still do just an Index Seek.

SELECT ContactID, LastName FROM Person.Contact WHERE LastName = 'Russell'

Here we can see that we only need to do an Index Seek to include both of these columns.

So that’s great if that is all you need, but what if you need to include other columns, such as FirstName? If we change the query as follows, then the Key Lookup comes back again.

SELECT FirstName, LastName FROM Person.Contact WHERE LastName = 'Russell'

Luckily there are a few options to handle this.

Creating a Covering Index

A covering index does what its name implies: it covers the query by including all of the columns that are needed. So if our need is to always include FirstName and LastName, we can modify our index as follows so that it includes both LastName and FirstName.

DROP INDEX [IX_LastName] ON [Person].[Contact]
GO
CREATE NONCLUSTERED INDEX [IX_LastName]
ON [Person].[Contact] ([LastName], [FirstName])
GO

And if we look at the execution plan we can see that we eliminated the Key Lookup once again.

Creating an Index with Included Columns

Another option is to use the included columns feature of an index. This allows you to include additional columns so they are stored with the index, but are not part of the index tree. This lets you take advantage of the features of a covering index while reducing the storage needed within the index tree. Another benefit is that you can include data types that cannot be part of the index key of a covering index.

The syntax for the index with included columns is as follows:

DROP INDEX [IX_LastName] ON [Person].[Contact]
GO
CREATE NONCLUSTERED INDEX [IX_LastName]
ON [Person].[Contact] ([LastName]) 
INCLUDE ([FirstName])
GO

Here we can see that the execution plan is the same for both options.
