Friday, February 24, 2012

A question about execution plans

Hello, I'm using SQL 2000 with the latest updates.
I have a large table Call_Record (5 million rows) that has three indexes:
1. A clustered index on account_no ASC, date_start DESC
2. A non-clustered index on date_end DESC
3. A non-clustered index on call_record_id ASC
I'm querying it for failed calls in the last hour:
SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name,
D.description AS disconnect_reason
FROM dbo.Call_Record c
INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code
WHERE (c.date_end >= (GetUtcDate() - (1.0/24.0)) )
AND D.is_failure = 1
GROUP BY C.master_id_carrier, C.location_name, D.description
The problem is that the execution plan shows that it is using index 1 to
perform this query, whereas index 2 is clearly the best choice. If I
replace the call to GetUtcDate() with a literal date constant like so:
SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name,
D.description AS disconnect_reason
FROM dbo.Call_Record c
INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code
WHERE (c.date_end >= '2005/11/23')
AND D.is_failure = 1
GROUP BY C.master_id_carrier, C.location_name, D.description
then it does use index (2) as expected, and executes in a fraction of
the time. My question is, why does it pick the "incorrect" index for
the first query, and is there any way to force it to pick index (2)?
MikeMike
I tried to rewrite a little bit your SELECT
DBCC FREEPROCCACHE
GO
SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name,
D.description AS disconnect_reason
FROM dbo.Call_Record c
INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code
WHERE c.date_end >=dateadd(hour,-1,GetUtcDate()) and c.date_end <
dateadd(day,+1,GetUtcDate()) --replace with the date that is relevant for
the searching
(GetUtcDate() - (1.0/24.0)) )
AND D.is_failure = 1
GROUP BY C.master_id_carrier, C.location_name, D.description
Do you see now any changes in the execution plan , I'd put the CI on
date_end column since your criteria is based on range date seraching and CI
is probably a good choice for it, but you'll have to test it.
"Mike Chamberlain" <none@.hotmail.com> wrote in message
news:%23ibexQI8FHA.2676@.TK2MSFTNGP15.phx.gbl...
> Hello, I'm using SQL 2000 with the latest updates.
> I have a large table Call_Record (5 million rows) that has three indexes:
> 1. A clustered index on account_no ASC, date_start DESC
> 2. A non-clustered index on date_end DESC
> 3. A non-clustered index on call_record_id ASC
> I'm querying it for failed calls in the last hour:
> SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name,
> D.description AS disconnect_reason
> FROM dbo.Call_Record c
> INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code
> WHERE (c.date_end >= (GetUtcDate() - (1.0/24.0)) )
> AND D.is_failure = 1
> GROUP BY C.master_id_carrier, C.location_name, D.description
> The problem is that the execution plan shows that it is using index 1 to
> perform this query, whereas index 2 is clearly the best choice. If I
> replace the call to GetUtcDate() with a literal date constant like so:
> SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name,
> D.description AS disconnect_reason
> FROM dbo.Call_Record c
> INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code
> WHERE (c.date_end >= '2005/11/23')
> AND D.is_failure = 1
> GROUP BY C.master_id_carrier, C.location_name, D.description
> then it does use index (2) as expected, and executes in a fraction of the
> time. My question is, why does it pick the "incorrect" index for the
> first query, and is there any way to force it to pick index (2)?
> Mike|||Mike Chamberlain (none@.hotmail.com) writes:
> I have a large table Call_Record (5 million rows) that has three indexes:
> 1. A clustered index on account_no ASC, date_start DESC
> 2. A non-clustered index on date_end DESC
> 3. A non-clustered index on call_record_id ASC
> I'm querying it for failed calls in the last hour:
> SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name,
> D.description AS disconnect_reason
> FROM dbo.Call_Record c
> INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code
> WHERE (c.date_end >= (GetUtcDate() - (1.0/24.0)) )
> AND D.is_failure = 1
> GROUP BY C.master_id_carrier, C.location_name, D.description
> The problem is that the execution plan shows that it is using index 1 to
> perform this query, whereas index 2 is clearly the best choice. If I
> replace the call to GetUtcDate() with a literal date constant like so:
> SELECT COUNT(*) AS failure_count, C.master_id_carrier, C.location_name,
> D.description AS disconnect_reason
> FROM dbo.Call_Record c
> INNER JOIN dbo.Disconnect_Code D ON C.disconnect_code = D.code
> WHERE (c.date_end >= '2005/11/23')
> AND D.is_failure = 1
> GROUP BY C.master_id_carrier, C.location_name, D.description
> then it does use index (2) as expected, and executes in a fraction of
> the time. My question is, why does it pick the "incorrect" index for
> the first query, and is there any way to force it to pick index (2)?
When making the choice between scanning a clustered index, or using a
non-clustered index + bookmark lookup, the optimizer always have a
delicate choice. If the condition on the column in the NC-index hits
few rows is small, the NC index is good. But if the condition hits many
rows, the NC index is a lot worse than the table scan, as SQL Server
would have to access many data pages more than once.
To determine which to use, SQL Server makes estimates from statistics
saved for the table. When you put in a date literal, SQL Server can see
that the query will only hit a small number of rows, and thus the index
is good.
But for the first query, the problem is that getutcdate() is a non-
deterministic function, and thus will return different values each
time. I guess, therefore, the optimizer does not care about the
expression, but uses the clustered index instead. Since you have a
condition with >= there could potentially be many rows that are
hit in the condition.
In a situation like this an index hint may be a good idea:
FROM dbo.Call_Record c WITH (INDEX = DateEnd_ix)
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

No comments:

Post a Comment