Tuesday, March 6, 2012

A Range Query Optimization

Hi,
Need help in optimizing a query in SQL Server.
Following is the problem statement.
There are two tables;
1st table (t1) has a KEY ( char(8) ) column, with a clustered index.
this is not the primary key. The table can have billions of records;
in test environment, we are having 3,000,000 records
2nd table (t2) has two columns a from_Range and to_Range, both
char(8). this table has lesser number of records, in thousands.
Clustered index is on the primary key.
However there is no relation whatsoever between the KEY and the
from/to range.
We need to find matching records where Key is found between the from
and to range :
select t1.id, t2.id from t1, t2
where t1.KEY between t2.from_range and t2.to_range
( The ids form part of primary keys in both tables. )
The plan shows a loop, with t1 using clustered index of KEY and t2
using clustered index of the primary key.
This query is taking around 14 seconds on SQL server 2000 in win2kpro
with P4 and 512 MB RAM.
Is there any way this can be reducd to a subsecond performance ? This
query forms the core of most of the processing, and any reduction here
will have recursive effect all over.
Thanks in advance,
roy.
aroy (anindya.roy@.rave-tech.com) writes:
> There are two tables;
> 1st table (t1) has a KEY ( char(8) ) column, with a clustered index.
> this is not the primary key. The table can have billions of records;
> in test environment, we are having 3,000,000 records
> 2nd table (t2) has two columns a from_Range and to_Range, both
> char(8). this table has lesser number of records, in thousands.
> Clustered index is on the primary key.
> However there is no relation whatsoever between the KEY and the
> from/to range.
> We need to find matching records where Key is found between the from
> and to range :
> select t1.id, t2.id from t1, t2
> where t1.KEY between t2.from_range and t2.to_range
In general, it it best to post the CREATE TABLE and CREATE INDEX statements
for the table, as the narrative easily can be misunderstood.
But it sounds like a problem that Mischa Sandberg ran into recently,
and you can review that thread starting on
http://groups.google.com/groups?hl=s...396%40edtnps84
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
|||On 29 Jul 2004 00:43:27 -0700, aroy wrote:
(snip)
See my reply in comp.databases.ms-sqlserver.
Best, Hugo
(Remove _NO_ and _SPAM_ to get my e-mail address)
|||Funny enough, but this is just what I've been working on,
for lookup of about 50M rows in a table of 10M ranges.
(discussion in another thread, in m.p.s.programming).
(As an aside, "KEY" is a lousy column name; you'll be hearing from J.C.)
Given:
T1(id INT, "Key" CHAR(8), ...)
T2(id INT, from_Range CHAR(8), to_Range CHAR(8))
Gert-Jan Strik came up with the single-query answer
(paraphrased from the query framed for my problem)
SELECT T1.id, T2.id
FROM T1
JOIN T2
ON T2.from_Range= (
SELECT MAX(from_Range) FROM T2
WHERE T1.Key BETWEEN T2 .from_Range AND T2 .to_Range
)
"aroy" <anindya.roy@.rave-tech.com> wrote in message
news:be87bcc0.0407282343.1bbda35f@.posting.google.c om...
> Hi,
> Need help in optimizing a query in SQL Server.
> Following is the problem statement.
> There are two tables;
> 1st table (t1) has a KEY ( char(8) ) column, with a clustered index.
> this is not the primary key. The table can have billions of records;
> in test environment, we are having 3,000,000 records
> 2nd table (t2) has two columns a from_Range and to_Range, both
> char(8). this table has lesser number of records, in thousands.
> Clustered index is on the primary key.
> However there is no relation whatsoever between the KEY and the
> from/to range.
> We need to find matching records where Key is found between the from
> and to range :
> select t1.id, t2.id from t1, t2
> where t1.KEY between t2.from_range and t2.to_range
> ( The ids form part of primary keys in both tables. )
> The plan shows a loop, with t1 using clustered index of KEY and t2
> using clustered index of the primary key.
> This query is taking around 14 seconds on SQL server 2000 in win2kpro
> with P4 and 512 MB RAM.
> Is there any way this can be reducd to a subsecond performance ? This
> query forms the core of most of the processing, and any reduction here
> will have recursive effect all over.
> Thanks in advance,
> roy.

No comments:

Post a Comment