Blog Home  Home Feed your aggregator (RSS 2.0)  
Scott Klueppel's Blog - SQL
.NET Discourse
 
# Saturday, September 06, 2008

I really need to read up on new features when a major release comes out. Just a few weeks ago I learned of a great "new" SQL 2005 function... ROW_NUMBER(). Just in time since SQL 2008 is already out.

For me, this function means a lot less temp tables. I would typically create a temp table with an ID INT IDENTITY(1,1) column to create an DisplayOrder, BatchID, etc. used to group or join on later. Books Online describes the function as "Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition." The syntax is simple, and looks like:

ROW_NUMBER() OVER (ORDER BY ID DESC)

For this example, the data I want to bring back with a DisplayOrder column looks like:

pers_subs_data

Without ROW_NUMBER(), using a table variable with an identity column:

DECLARE @Subs TABLE (DisplayOrder INT IDENTITY(1,1), [Address] VARCHAR(100), Operation VARCHAR(50), [Contract] VARCHAR(50))

INSERT INTO @Subs ([Address], Operation, [Contract])
SELECT [Address]
    , Operation
    , [Contract]
FROM PersistentSubscribers
WHERE Operation = 'OnEvent2'
ORDER BY ID DESC

SELECT * FROM @Subs

With ROW_NUMBER(), look how beautiful:

SELECT DisplayOrder = ROW_NUMBER() OVER (ORDER BY ID DESC)
    , [Address]
    , Operation
    , [Contract]
FROM PersistentSubscribers
WHERE Operation = 'OnEvent2'

The results from both methods looks like:

 

row_number_results

Saturday, September 06, 2008 10:33:25 PM (Eastern Standard Time, UTC-05:00)  #    Comments   SQL  | 
# Saturday, June 21, 2008

A few months after SQL 2005 was released and hit the productions servers, some people started experiencing some odd behavior in their stored procedures. Simple stored procedures that normally return in 0 seconds would take upwards of a minute to return. Even more strange was the fact that the same query, outside of a stored procedure, would still return in 0 seconds.

It never affected me personally... until today. Three years late to the party. It's funny how much more interested I am in the causes and solutions for this apparent problem when it affects me. "Parameter Sniffing" is the term Microsoft uses to describe the feature that causes this odd behavior. While it appeared as an issue when I encountered it today, I found that the feature is not only well-intentioned but quite useful.

The execution plan is generated and cached the first time your stored procedure is called. When the execution plan is being created, SQL Server reads the input parameters and uses them to optimize the execution plan for those parameters. This is called "parameter sniffing." If the input parameters used in the first call to the stored procedure are atypical for the overall use of the stored procedure, a less than ideal execution plan will be cached for all subsequent calls.

Simply dropping and recompiling the stored procedure does not seem to affect the cached execution plan. Updating statistics on the tables used in the stored procedure will cause the execution plan to be regenerated on the next call of the stored procedure. However, if the same or similar atypical parameters are used on the first execution of the stored procedure, an equally sub-optimal execution plan will be cached.

You can turn off parameter sniffing. This is accomplished by assigning the input parameter values to local variables inside the stored procedure and then using the local variables within the stored procedure. When the execution plan is created, SQL Server will look at the table statistics to optimize the query for the "average" use. It does this by looking at the tables used in the query and analyzing row counts, etc. to find a reasonable plan that will likely suit a majority of situations.

My stored procedure was bringing back multiple resultsets to be used to create a hierarchical structure in code. It works essentially like the following:

CREATE PROCEDURE [dbo].[usp_Order_GetOrderDetails]
(
   @StartOrderId INT,
   @EndOrderId INT
)
AS
BEGIN

   SELECT *
   FROM Order
   WHERE OrderId BETWEEN @StartOrderId AND @EndOrderId
 
   SELECT *
   FROM OrderLineItem
   WHERE OrderId BETWEEN @StartOrderId AND @EndOrderId
END

I was testing the stored procedure for full day using the same ID for @StartOrderId and @EndOrderId. Since the intended use of this stored procedure is almost always @EndOrderId = @StartOrderId + 1000, this makes a big difference when calculating the estimate number of rows returned. I forced SQL Server to assume that my execution plan should be based on an ID range of 1 instead of 1000. Turning off parameter sniffing lessens these effects.

To turn off parameter sniffing, it would look like this:

CREATE PROCEDURE [dbo].[usp_Order_GetOrderDetails]
(
   @StartOrderId INT,
   @EndOrderId INT
)
AS
BEGIN
   DECLARE @Start INT
   DECLARE @End INT
   SET @Start = @StartOrderId
   SET @End = @EndOrderId
 

   SELECT *
   FROM Order
   WHERE OrderId BETWEEN @Start AND @End
 
   SELECT *
   FROM OrderLineItem
   WHERE OrderId BETWEEN @Start AND @End
END

This immediately improved the performance of my stored procedure. The time to complete reduced from ~2 minutes to ~2 seconds for my typical 1000 ID range (I know 2 seconds is a lot, but these tables have millions and millions of rows). But only one piece of code in the application calls this stored procedure, and 99 out of 100 times it will have a range of 1000 IDs. Why would I want SQL Server to guess how many Orders I will typically bring back when I know the exact number?

I should have the optimal execution plan if I update statistics on Order and OrderLineItem, and then call usp_Order_GetOrderDetails 1, 1000 after I compile this stored procedure. This sounds like a lot of work to me, and I did not notice any performance boost by doing this. I chose to leave parameter sniffing off.

The only drawbacks to turning off parameter sniffing is the weird looking SQL and the inevitable questions during code review about the crazy input parameter to variable mapping. But when you school the doubters on the causes and effects of parameter sniffing, it will put another notch in your guru stick.

From what I have read, this was not a new feature in SQL 2005. I can't, however, find any mention of it in SQL 2000 books online, and this feature never showed its face in SQL 2000.

Saturday, June 21, 2008 9:24:02 PM (Eastern Standard Time, UTC-05:00)  #    Comments   Database | SQL  | 
Copyright © 2008 Scott Klueppel. All rights reserved.