Thursday, March 22, 2012

A Very Strange Problem

I have a simple query which ran fine for the last 12 months. All of sudden, it's dragging and couldn't finish in 6 hours. I tried to trace down where the problem resided but it's such a simple query there is nothing to break down. By chance, I commented out all the column names and replace with ' SELECT * ', the query finished in 14 seconds. Once I replace the * with column names, it ran over 20 minutes and I had to cancel it because it doesn't seem to be returning any results. Any help would be appreciated. Thanks.DBCC Checktable|||Thanks. I ran the check and there was no erros. Any more that I can try?|||Have a look at the estmated execution plan of both queries. Most likely there is a difference. You may need to either update statistics on some of the underlying tables, or clear the procedure cache.|||The two execution plans are the same. I cleared the procedure cache, but the problem continues. Any more ideas?|||When was the last time the box was bounced?|||If it's so simple, why don't you post the code?|||Are you certain it is the same execution plan? I can not think how the two execution times (< 1 minute vs > 20 minutes) can be reconciled with the same plan. Did you comment out a group by clause, or a bunch of aggregates? Maybe user defined function calls?|||I found what's wrong but not quite sure how to resolve it at this moment. The database is actually replicated from the publisher. There is another subscriber(server) pulling in the exact same data from the publisher. When I ran the same query(with column names) it only took 20 seconds. Then I check the execution plan on that server, sure enough it's totally different. It's 85% bookmark, which means it full use the clustered index. Yet, on the one that runs forever, it is 58% scan. I check all the tables and indexes. There's no difference. I update the stats yet it is still running forever. Any one has any more suggestions? I did restart the box this morning, by the way.|||So on one server the query takes 20 seconds, while on the other, it takes more than 20 minutes? Are the subscriptions exactly the same? Or are both servers getting different slices of the data?|||Both subscriptions are from the same publication, and they are both full subscriptions without any filters at all. One is the main reporting server, the other is a backup right now. The configurations are almost exactly the same. Only the horsepower is different. One is an 8-way box with 20 GB memory, the other is a 4-way box with 8 GB memory. Both run Win 2003 and SQL 2000 sp3a with AWE on. Yet the same query was interpreted differently.|||Can take help of PROFILER to see the activity while running this query.
ANy differences between 2 servers settings such as db-option, recovery model etc.

No comments:

Post a Comment