Tuesday, March 27, 2012

Abnormal Timeout Issue on Production System

Hi all,
My production system is experiencing abnormal timeouts during posting of
transaction. The abnormal part is:
System usually works fine for about 6 ws; but one unlucky day, suddenly
for no apparent reason, system experiences massive timeout for about 2 - 3
hours; after that everything is fine again.
A background on the production system:
? It is a sales and stock control system. With a fair number (nothing
massive) of sales transactions every operation days
? Approx 50 client users
? Developed using VB .Net, VS 2003
? Running on .Net Framework 2.0
? Running on SQL Server database
? Supports multi-user, single database
It is also running on a pretty high-end server (from what I was told) with
spec:
? HP ML 530 G2 machine
? 2-processors, each 3 GHz, running Xeon HT
? 4 GB ECC RAM
? HDD running RAID 0 and RAID 1; Total disk space 200 GB, with 72 GB free
? SQL Server 2003 standard edition with sp3
? Windows 2003 Server standard edition
? Running other s/w, mainly Norton Anti-Virus 9.0 corp. version and Verita
s
9.1
USUALLY DURING THE FINE DAYS…
System works fine during both peak and off-peak hours with very rare
timeouts. Those timeouts are caused mainly by updating of "constant" tables,
such as stock when two customers are buying the same item at the same time.
That is acceptable as data integrity on those kinds of tables must be
maintained.
Database maintenance plan is in-placed to reorganized data and index pages,
scheduled to run once a w, during non-operation hours.
HDD defragmentation is scheduled to run twice a w, also during
non-operation hours.
The codes for the frequently used sales modules have been optimized to
ensure that transaction control is well handled, and unnecessary queries
removed.
HOWEVER WHEN THE UNLUCKY DAY OCCURRED…
Timeout occurs when user posts a sales transaction during the peak hours.
But the volume of transaction is similar to other peak hours during other
fine days!
I tried restarting the server and the client machines and applications but
the posting still timeout.
I even tried having only one user posting a transaction but that also timeou
t!
I checked the Profiler and it reveals that SQL Server is “stuck” while
attempting to insert some rows into a few tables at the start of the
transaction. These tables have small number of rows, from 10K to 200K each.
I
repeated the same test for only one user, and SQL Server remains “stuck”
as
well. BUT during fine days, SQL Server never has problem with those tables.
It can even insert data quickly into big tables with 2M+ rows!
I also checked the SQL EM Current Activities, and can find no tables being
blocked when only one user is posting. Tables are only blocked when more tha
n
one user is posting, and that is understandable coz SQL Server is “stuck
on
the 1st user’s transaction, thus blocking everyone else.
The SQL Server just become spastic, obsessed with the 1st (or only) user
transaction, never wanting to finish it off that I know it can.
And the weird thing is after 2-3 hours, everything goes back to normal, with
SQL Server at its best again.
PUZZLED…
The server specs is top-end; The program should be well optimized as far as
the frequently used sales modules are concerned; Maintenance Plans and
defragmentation scheduled tasks are in place; Heavy jobs such as Veritas
backup are only done during non-operation hours; It works fine for about
six-w, then it gets “stuck” on a few small tables, only to “unstuck
” after
2-3 hours break; The weirdest of all, it even timeouts when only one user
(having all the server and database resources to him /herself) is posting a
sales transaction!
Can anyone please kindly advice what are the possible causes and resolutions
for this anomalies be? Can the hardware spec affect it, even though it is
supposed to be high-end? Could other applications such as Veritas co-exist
with the SQL Server? Is SQL Server running any system jobs, e.g. Ghost
Cleanup, that can slow down its performance?
TQ in advance.Hi
The fact that it is stuck sounds like blocking issues.
If the application does not finish it's transaction on a specific set of
row(s) in a table, and another one comes in to work on the same row(s), the
2nd one has to wait.
When this happens, run sp_who2 and see what is blocking what.
Look up "blocks, avoiding" in books online.
Regards
--
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
IM: mike@.epprecht.net
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/
"HardKhor" <HardKhor@.discussions.microsoft.com> wrote in message
news:B1960834-C93C-458E-BC88-792045D88F2C@.microsoft.com...
> Hi all,
> My production system is experiencing abnormal timeouts during posting of
> transaction. The abnormal part is:
> System usually works fine for about 6 ws; but one unlucky day, suddenly
> for no apparent reason, system experiences massive timeout for about 2 - 3
> hours; after that everything is fine again.
>
> A background on the production system:
> . It is a sales and stock control system. With a fair number (nothing
> massive) of sales transactions every operation days
> . Approx 50 client users
> . Developed using VB .Net, VS 2003
> . Running on .Net Framework 2.0
> . Running on SQL Server database
> . Supports multi-user, single database
> It is also running on a pretty high-end server (from what I was told) with
> spec:
> . HP ML 530 G2 machine
> . 2-processors, each 3 GHz, running Xeon HT
> . 4 GB ECC RAM
> . HDD running RAID 0 and RAID 1; Total disk space 200 GB, with 72 GB free
> . SQL Server 2003 standard edition with sp3
> . Windows 2003 Server standard edition
> . Running other s/w, mainly Norton Anti-Virus 9.0 corp. version and
> Veritas
> 9.1
>
> USUALLY DURING THE FINE DAYS.
> System works fine during both peak and off-peak hours with very rare
> timeouts. Those timeouts are caused mainly by updating of "constant"
> tables,
> such as stock when two customers are buying the same item at the same
> time.
> That is acceptable as data integrity on those kinds of tables must be
> maintained.
> Database maintenance plan is in-placed to reorganized data and index
> pages,
> scheduled to run once a w, during non-operation hours.
> HDD defragmentation is scheduled to run twice a w, also during
> non-operation hours.
> The codes for the frequently used sales modules have been optimized to
> ensure that transaction control is well handled, and unnecessary queries
> removed.
>
> HOWEVER WHEN THE UNLUCKY DAY OCCURRED.
> Timeout occurs when user posts a sales transaction during the peak hours.
> But the volume of transaction is similar to other peak hours during other
> fine days!
> I tried restarting the server and the client machines and applications but
> the posting still timeout.
> I even tried having only one user posting a transaction but that also
> timeout!
> I checked the Profiler and it reveals that SQL Server is "stuck" while
> attempting to insert some rows into a few tables at the start of the
> transaction. These tables have small number of rows, from 10K to 200K
> each. I
> repeated the same test for only one user, and SQL Server remains "stuck"
> as
> well. BUT during fine days, SQL Server never has problem with those
> tables.
> It can even insert data quickly into big tables with 2M+ rows!
> I also checked the SQL EM Current Activities, and can find no tables being
> blocked when only one user is posting. Tables are only blocked when more
> than
> one user is posting, and that is understandable coz SQL Server is "stuck"
> on
> the 1st user's transaction, thus blocking everyone else.
> The SQL Server just become spastic, obsessed with the 1st (or only) user
> transaction, never wanting to finish it off that I know it can.
> And the weird thing is after 2-3 hours, everything goes back to normal,
> with
> SQL Server at its best again.
>
> PUZZLED.
> The server specs is top-end; The program should be well optimized as far
> as
> the frequently used sales modules are concerned; Maintenance Plans and
> defragmentation scheduled tasks are in place; Heavy jobs such as Veritas
> backup are only done during non-operation hours; It works fine for about
> six-w, then it gets "stuck" on a few small tables, only to "unstuck"
> after
> 2-3 hours break; The weirdest of all, it even timeouts when only one user
> (having all the server and database resources to him /herself) is posting
> a
> sales transaction!
> Can anyone please kindly advice what are the possible causes and
> resolutions
> for this anomalies be? Can the hardware spec affect it, even though it is
> supposed to be high-end? Could other applications such as Veritas co-exist
> with the SQL Server? Is SQL Server running any system jobs, e.g. Ghost
> Cleanup, that can slow down its performance?
> TQ in advance.|||Hi,
TQ for reply.
I don't think it is a blocking issue coz EM Current Activities shows no
blocks, no locks and whatsoever, and the codes have been optimized to handle
transaction properly. And as mentioned, weird things like why everything
works fine for about six ws, but fails for 3 hours; and even
one-and-only-one user gets timeout during posting. The system just degraded
for no apparent reason.
TQ for the sp_who2 thou. Will try it out (fingers crossed) in abt six ws
time! ;)
"Mike Epprecht (SQL MVP)" wrote:

> Hi
> The fact that it is stuck sounds like blocking issues.
> If the application does not finish it's transaction on a specific set of
> row(s) in a table, and another one comes in to work on the same row(s), th
e
> 2nd one has to wait.
> When this happens, run sp_who2 and see what is blocking what.
> Look up "blocks, avoiding" in books online.
> Regards
> --
> Mike Epprecht, Microsoft SQL Server MVP
> Zurich, Switzerland
> IM: mike@.epprecht.net
> MVP Program: http://www.microsoft.com/mvp
> Blog: http://www.msmvps.com/epprecht/
> "HardKhor" <HardKhor@.discussions.microsoft.com> wrote in message
> news:B1960834-C93C-458E-BC88-792045D88F2C@.microsoft.com...
>
>

No comments:

Post a Comment