SQL Server RANK, ROW_NUMBER, and Partition

Reference: http://mitchelsellers.com/blogs/2008/11/01/sql-server-rank--row_number--and-partition.aspx

Posted by Mitchel on Saturday, November 01, 2008

Finally getting the opportunity to get back to SQL Server 2005 development, and having come to the conclusion that NOT all of my projects have to support SQL Server 2000, I started looking at all of those "fun" new items that we all really wanted. This article walks through a scenario that demonstrates how to use Common Table Expressions, RANK(), and PARTITION BY to get results for a very common data scenario. We will present this information with an introduction to the problem, scripts to set up the problem in a test environment, and lastly the implementation, with a review/summary at the end.

The Problem

Given an input source of data that looks similar to the table below, it is easy to see how this could be real data. Take, for example, a log file where multiple error types can occur, each with different messages.

RecordId | TypeId | ActivityDate | MessageText
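
The full walkthrough from the article is not reproduced here, but the approach it describes can be sketched as follows. The column names come from the sample table above; the table name MessageLog and the query itself are an illustrative reconstruction, not the article's own code:

```sql
-- Return the most recent message per error type: partition the log
-- by TypeId, number the rows newest-first within each partition,
-- and keep only the first row of each partition.
WITH RankedMessages AS
(
    SELECT RecordId, TypeId, ActivityDate, MessageText,
           ROW_NUMBER() OVER (PARTITION BY TypeId
                              ORDER BY ActivityDate DESC) AS RowNum
    FROM MessageLog
)
SELECT RecordId, TypeId, ActivityDate, MessageText
FROM RankedMessages
WHERE RowNum = 1;
```

Using ROW_NUMBER() here (rather than RANK()) guarantees exactly one row per TypeId even when two messages share the same ActivityDate.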


SQL: RANK() VS DENSE_RANK() With an Example using SQL Server 2005/2008

Reference: http://www.sqlservercurry.com/2009/04/rank-vs-denserank-with-example-using.html

Friday, April 03, 2009 Posted by Suprotim Agarwal
Labels: Transact SQL, T-SQL

The RANK() function in SQL Server returns the position of a value within the partition of a result set, with gaps in the ranking where there are ties.

The DENSE_RANK() function in SQL Server returns the position of a value within the partition of a result set, leaving no gaps in the ranking where there are ties.

Let us understand this difference with an example and then observe the results while using these two functions:

We will run two queries, one using RANK() and the other using DENSE_RANK() and observe the difference in the results. We will be using the ORDERS table of the NORTHWIND database to demonstrate the difference. The query will fetch the list of Customers ordered by the highest number of orders each has placed.

Using the RANK() function

SELECT RANK() OVER (ORDER BY TotCnt DESC) AS TopCustomers, CustomerID, TotCnt
FROM (SELECT CustomerID, COUNT(*) AS TotCnt
      FROM Orders GROUP BY CustomerID) AS Cust

OUTPUT

As shown in the results above, when using the RANK() function, if two or more rows tie for a rank, each tied row receives the same rank, but gaps are left in the ranking after the tie. For example, customers 'FOLKO' and 'HUNGO' have the same number of orders (19), so they are both ranked 4. The customers with the next highest order count (HILAA, BERGS, RATTC) are ranked 6 instead of 5, because two rows are ranked 4 above them.

Using the DENSE_RANK() function

SELECT DENSE_RANK() OVER (ORDER BY TotCnt DESC) AS TopCustomers, CustomerID, TotCnt
FROM (SELECT CustomerID, COUNT(*) AS TotCnt
      FROM Orders GROUP BY CustomerID) AS Cust

OUTPUT

As shown in the results above, when using the DENSE_RANK() function, if two or more rows tie for a rank in the same partition, each tied row receives the same rank, but no gaps are left in the ranking after the tie. Customers 'FOLKO' and 'HUNGO' have the same number of orders (19), so they are both ranked 4. The customers with the next highest order count (HILAA, BERGS, RATTC) are ranked 5. This differs from the RANK() function, where the customers with the next highest number of orders were ranked 6.

Well, I hope that after seeing these examples you understand the difference between RANK() and DENSE_RANK() and know when to use each.
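
The two queries above can also be combined into a single SELECT so the two numbering schemes can be compared row by row. This combined form is a sketch, not part of the original article, but it uses only the same Orders table and window functions:

```sql
-- Rank Northwind customers by order count with both functions at
-- once; the two ranking columns diverge only after the first tie.
SELECT CustomerID, TotCnt,
       RANK()       OVER (ORDER BY TotCnt DESC) AS RankValue,
       DENSE_RANK() OVER (ORDER BY TotCnt DESC) AS DenseRankValue
FROM (SELECT CustomerID, COUNT(*) AS TotCnt
      FROM Orders GROUP BY CustomerID) AS Cust
ORDER BY TotCnt DESC;
```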

SQL 2008: Change Tracking Overview

http://msdn.microsoft.com/en-us/library/bb933875(v=sql.105).aspx -- Overview of change tracking

http://msdn.microsoft.com/en-us/library/bb964713(v=sql.105).aspx -- Configuring change tracking at the database and table level

http://msdn.microsoft.com/en-us/library/bb934145(v=sql.105).aspx -- Querying change tracking information using the CHANGETABLE function

ALTER DATABASE Test
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 1 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE IsolationTests
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);

SELECT * FROM sys.change_tracking_databases;

SELECT * FROM sys.change_tracking_tables;

SELECT OBJECT_NAME(2137058649);  -- resolve the object_id reported by sys.change_tracking_tables

-- SELECT * INTO IsolationTestBackup FROM IsolationTests

SELECT * FROM IsolationTests;

SELECT * FROM IsolationTestBackup;

UPDATE IsolationTests
SET Col1 = 4 WHERE Id = 2;

-- DELETE FROM IsolationTestBackup WHERE Id = 8 OR Id = 9

SELECT e.Id, e.Col1, e.Col2, e.Col3,
       c.SYS_CHANGE_VERSION, c.SYS_CHANGE_CONTEXT
FROM IsolationTests AS e
CROSS APPLY CHANGETABLE
    (VERSION IsolationTests, (Id), (e.Id)) AS c;

DECLARE @last_sync_version bigint;
-- SET @last_sync_version = <value obtained from query>;

SELECT Id,
       SYS_CHANGE_VERSION, SYS_CHANGE_OPERATION,
       SYS_CHANGE_COLUMNS, SYS_CHANGE_CONTEXT
FROM CHANGETABLE (CHANGES IsolationTests, @last_sync_version) AS C;
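
The commented-out SET above needs a baseline version number. SQL Server exposes this through the CHANGE_TRACKING_CURRENT_VERSION() function; the sync pattern sketched below is my addition rather than part of the linked articles:

```sql
-- Capture the database's current change tracking version; store
-- this value and pass it as @last_sync_version on the next
-- synchronization to fetch only the changes made since this point.
DECLARE @current_version bigint;
SET @current_version = CHANGE_TRACKING_CURRENT_VERSION();
SELECT @current_version AS baseline_version;
```

Before reusing a stored version, compare it against CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('IsolationTests')); if it is older, the retention cleanup has discarded the data and a full reload is required.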

SQL: Deleting Data in SQL Server with TRUNCATE vs DELETE commands

http://www.mssqltips.com/sqlservertip/1080/deleting-data-in-sql-server-with-truncate-vs-delete-commands/?utm_source=dailynewsletter&utm_medium=email&utm_content=headline&utm_campaign=201319

Problem
There are two main keywords used for deleting data from a table: TRUNCATE and DELETE. Although each achieves the same result, the methods employed for each vastly differ. There are advantages, limitations, and consequences of each that you should consider when deciding which method to use.

Solution
Deleting Data Using TRUNCATE TABLE

TRUNCATE TABLE is a statement that quickly deletes all records in a table by deallocating the data pages used by the table. This reduces the resource overhead of logging the deletions, as well as the number of locks acquired. The operation is minimally logged: only the page deallocations are recorded in the transaction log, so individual deleted rows cannot be recovered from the log (although a TRUNCATE issued inside an explicit transaction can still be rolled back). You cannot specify a WHERE clause in a TRUNCATE TABLE statement; it is all or nothing. The advantages of TRUNCATE TABLE are that, in addition to removing all rows from the table, it resets the IDENTITY value back to the SEED, and the deallocated pages are returned to the system for use in other areas.

In addition, TRUNCATE TABLE statements cannot be used for tables involved in replication or log shipping, since both depend on the transaction log to keep remote databases consistent.
TRUNCATE TABLE also cannot be used when a foreign key references the table to be truncated. Related to this, TRUNCATE statements do not fire triggers, which could result in inconsistent data because ON DELETE/UPDATE triggers would not run. If all rows need to be deleted from a table that is referenced by a foreign key, you must drop the foreign key constraint and recreate it after the truncate. If a TRUNCATE TABLE statement is issued against a table that has foreign key references, an error is returned.
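
As a quick illustration of the statement's shape (the table name here is hypothetical, not from the tip):

```sql
-- Remove every row and reset any IDENTITY column back to its seed.
-- No WHERE clause is allowed; the statement is all or nothing.
TRUNCATE TABLE dbo.StagingImport;
```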

Deleting Data Using DELETE FROM Statement

DELETE statements remove rows one at a time, logging each row in the transaction log and maintaining log sequence number (LSN) information. Although this consumes more database resources and locks, these transactions can be rolled back if necessary. You can also specify a WHERE clause to narrow down the rows to be deleted. When you delete a large number of rows using a DELETE FROM statement, the table may hang on to the emptied pages, requiring a manual release using DBCC SHRINKDATABASE (db_name).
When large tables require that all records be deleted and TRUNCATE TABLE cannot be used, the following statements can be used to achieve the same result as TRUNCATE TABLE:

  • DELETE from "table_name"
  • DBCC CHECKIDENT("table_name", RESEED, "reseed_value")
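
For example, against a hypothetical dbo.Orders table whose IDENTITY column was seeded at 0, the two statements above would look like this:

```sql
-- Delete every row (fully logged, fires triggers, honors FKs) ...
DELETE FROM dbo.Orders;

-- ... then reset the IDENTITY value so the next insert starts at 1.
DBCC CHECKIDENT ('dbo.Orders', RESEED, 0);
```

Unlike TRUNCATE TABLE, this sequence works even when the table is referenced by foreign keys or participates in replication, at the cost of fully logging every deleted row.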

Where does Outlook 2007 store opened attachments temporarily?

http://superuser.com/questions/159025/where-does-outlook-2007-store-opened-attachments-temporarily

I had the exact same problem.

Like BBlake said, if all else fails, you can check in the registry to see where the temporary files are stored:

HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Outlook\Security\OutlookSecureTempFolder

It will point to a folder in c:…\Temporary Internet Files…

If you look in Explorer and try to navigate to that folder, you won’t see it. Instead, copy the entire path into Explorer’s address bar. It will get there (at least it did for me). Not sure why it is hidden; I have my folder options set to show all hidden files, and it still doesn’t show up.

SQL: How To Build a Cube Without a Data Source Using SQL Server Analysis Services

http://www.mssqltips.com/sqlservertip/1529/how-to-build-a-cube-without-a-data-source-using-sql-server-analysis-services/

By: Ray Barley

Problem
We are new to SSAS and are struggling with how to get started. We don’t yet have a relational data warehouse that we can use to populate a cube. When you go to build a cube there is an option to build the cube without a data source. Can you provide us with the details on how to go about doing this?

Solution
There are two ways to build a cube using Business Intelligence Developer Studio (BIDS). If you have existing relational data source(s) you can create a Data Source View (DSV) then proceed to build the cube using the DSV. If you do not have a relational data source, you can build a cube with BIDS by choosing to build a cube without a data source. In this case you focus on designing the dimensions and measures for your cube, then allow BIDS to generate the relational schema that you need to populate the cube.

In this tip we will walk through the steps to build a cube from scratch without using an existing relational data source. You will find that this approach is often convenient: to build a cube you really want dimension and fact tables in a relational data warehouse as your starting point, and starting instead from the operational database that supports your business isn’t a great idea, since an operational schema is designed to facilitate transaction processing rather than querying. By designing the cube first and generating the warehouse schema from it, you can jump-start your effort rather than designing the data warehouse and only then creating a cube from it.

Background

Before we create the cube, let’s provide a high level overview of the key concepts:

  • Dimension – a table that represents a real world entity; e.g. Customer, Product, Region, State, Calendar
  • Fact – a table that contains foreign keys to dimension tables and the numeric values (i.e. measures) that we want to summarize, average, etc; e.g. Orders
  • Dimensional Model – a group of dimension and fact tables designed for analyzing a business process

We join fact tables to dimension tables to slice and dice the data, typically filtering on columns in the dimension tables such as a date range, product category, customer type, etc. In SSAS a fact table is called a measure group.
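
The dimension/fact relationship described above can be sketched in T-SQL. These table and column names are illustrative only; they are not the schema that BIDS will generate in the later steps:

```sql
-- A minimal dimensional model: one dimension table plus a fact
-- table whose foreign key lets us slice the measure by customer.
CREATE TABLE dbo.dim_Customer
(
    CustomerKey  int IDENTITY(1,1) PRIMARY KEY,
    CustomerName nvarchar(100) NOT NULL
);

CREATE TABLE dbo.fact_Order
(
    OrderKey    int IDENTITY(1,1) PRIMARY KEY,
    CustomerKey int NOT NULL
        REFERENCES dbo.dim_Customer (CustomerKey),
    OrderAmount money NOT NULL   -- the measure to summarize
);
```

A query that joins fact_Order to dim_Customer and groups by CustomerName is exactly the "slice and dice" the paragraph above describes; the cube precomputes and serves those aggregations.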

Building the Cube

BIDS is the integrated development environment that comes with SQL Server 2005. Launch it from the Microsoft SQL Server 2005 program group. Let’s walk through the steps to build a cube from scratch.

Step 1: Create a new project by clicking File, New, Project from the menu then fill in the dialog as follows:

Step 2: Launch the cube wizard; right click Cubes in the Solution Explorer, then select New Cube from the context menu. Click the radio button to build the cube without a data source:

Step 3: Define the measures. Measures are the numeric values you want to analyze. Note that a Measure Group is represented as a single table in the relational schema that populates the cube. You can have multiple measure groups; we will just use one.

Step 4: Define the cube dimensions. Click the checkbox next to Time to add a dimension that will contain dates and various attributes such as year, quarter and month. I like to call this dimension Calendar. Add other dimensions such as Customer and Product.

Step 5: Define time periods. The Calendar dimension will be populated automatically; select the time periods that you want to use to analyze your data. The date range chosen matches the order date range in the AdventureWorksDW database, which we will use later to populate the cube with some sample data. This dimension is also used to perform detailed analysis using all sorts of criteria such as previous year, previous quarter, same period previous year, etc. For additional details see our earlier tip Using the Built-In Time Dimension and Time Intelligence in SQL Server Analysis Server.

Step 6: Specify additional calendars. We are skipping this step for our example. If you want to perform analysis based on a fiscal year calendar or some other calendar, you would set that up here.

Step 7: Specify the relationships between the measure group and dimensions; in our case the single measure group uses each dimension. In the generated relational schema, the fact_Order table will have a foreign key to each of the dimension tables checked.

Step 8: Complete the wizard. Enter the Cube name and click the Generate schema now checkbox; this will immediately launch the Schema Generation Wizard to walk through generating the relational schema that will be used to populate the cube. You could leave the Generate schema now checkbox unchecked and add some attributes to the dimensions first; you can still generate the relational schema later from the Database menu in the SSAS project.

Step 9: Create the Data Source View (DSV). The cube is populated from a DSV which is a logical view on top of one or more data sources. Click the New button (to the right of the Data Source drop down, not shown) to define a data source. The data source is the database where the generated relational schema will be added.

Step 10: Fill in the Connection Manager dialog; the database you enter or select must already exist.

Step 11: Choose how SSAS will connect to the data source. SSAS needs to query the data source to populate the cube.

Step 12: Provide a name for the new data source.

Step 13: Specify the schema generation options (just accept the defaults). Select Populate from the Populate time table(s) drop down list to automatically populate the Calendar dimension based on the options specified in Step 5 above.

Step 14: Specify the naming conventions (just accept the defaults).

At this point you will see the following Data Source View in BIDS that was generated based on the options selected in the preceding steps:

Step 15: Populate the data source with sample data from the AdventureWorksDW database that comes with SQL Server 2005. The purpose of this step is to put some data in the data source so we can use it to populate our cube. Run the script below:

USE MSSQLTipsDW
GO

SQL: Troubleshooting Memory Pressure

Reference: http://www.mssqltips.com/sqlservertip/2827/troubleshooting-sql-server-resourcesemaphore-waittype-memory-issues/?utm_source=dailynewsletter&utm_medium=email&utm_content=headline&utm_campaign=201313

SELECT * FROM sys.dm_exec_requests
-- Check the wait type; a wait type of RESOURCE_SEMAPHORE indicates memory pressure.

SELECT * FROM sys.dm_exec_query_resource_semaphores
-- waiter_count is the number of queries waiting in the queue for memory to be granted.

SELECT * FROM sys.dm_exec_query_memory_grants
-- A NULL grant_time with a NULL granted_memory_kb means the query is still waiting
-- for memory to be granted. Take note of the plan_handle and sql_handle values.

SELECT * FROM sys.dm_exec_sql_text(0x030005008A543939E047AF0040A100000100000000000000)
-- Pass the sql_handle to get the text of the query waiting for memory.

SELECT * FROM sys.dm_exec_query_plan(0x050005008A54393940419C24020000000000000000000000)
-- Pass the plan_handle to get the execution plan for the query waiting for memory.
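
The handle-passing steps above can be combined into a single query using CROSS APPLY, so the handles do not have to be copied by hand. This combined form is a sketch rather than part of the referenced tip:

```sql
-- List sessions currently waiting on a memory grant, together with
-- their statement text, resolved inline via CROSS APPLY.
SELECT mg.session_id,
       mg.requested_memory_kb,
       mg.wait_time_ms,
       st.text AS statement_text
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS st
WHERE mg.grant_time IS NULL;   -- still waiting for the grant
```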