Thursday, February 16, 2012

A problem in memory when I run sp with a large number of records !

I will describe my problem and I hope that you will help me to find the way to solve it !

My task is how to find the "similiar" words in our database ! Now, I have to use my software to find the similiar name. For example :

The name : "Donaldson Filtration Slovensko s.r." and the name : "PENTA Slovensko Donaldsin spol. sro" are "similiar". In that case : the word "Donaldson" and "Donaldsin" can be called similiar because they are diffrence not more one character in the same positon in the string.

I used stored procedure to sovle this problem because it is a larg number of records in each table (for over 1 million records for each table).

The way I had used first is create a temperary table to contain

CREATE TABLE #SeperateString1
(
String1 nvarchar(100)
)

CREATE TABLE #SeperateString2
(
String2 nvarchar(100)
)

So that, I will analyze those string into 2 tables by detecting the blank character between 2 words !

Then, I compare values between 2 tables,

_

DECLARE ST1_Cursor CURSOR FOR
SELECT String1 FROM #SeperateString1
OPEN ST1_Cursor

FETCH NEXT FROM ST1_Cursor INTO @.String1
WHILE @.@.FETCH_STATUS = 0
Begin

DECLARE ST2_Cursor CURSOR FOR
SELECT String2 FROM #SeperateString2
OPEN ST2_Cursor

FETCH NEXT FROM ST2_Cursor INTO @.String2
WHILE @.@.FETCH_STATUS = 0
Begin

Exec CompareString @.String1, @.String2, @.Result = @.KetQua Output

If @.KetQua = 1
Begin
Exec SetCol1 @.ID, @.Rescol1 = @.Rescol1 Output
Exec SetCol2 @.IDCI, @.Rescol2 = @.Rescol2 Output
Select @.ResCol3 = 'Name: '+ Char(13) + Char(10) + @.String1 + ' <==> ' + @.String2
Select @.ResCol4 = 0

Select @.ResCol5 = GetDate()

If (Not Exists(Select * From Result Where IAdata like @.ResCol1 And DataFound like @.ResCol2 And Reason like @.Rescol3)) And (Not Exists(Select * From Dictionary Where Value = @.String1)) And (Not Exists(Select * From Dictionary Where Value = @.String2))
Begin
INSERT INTO Result (IAData, DataFound, Reason, Decide, DateNow) VALUES (@.ResCol1, @.ResCol2, @.ResCol3, @.ResCol4, @.ResCol5)
End

End
FETCH NEXT FROM ST2_Cursor INTO @.String2
End

CLOSE ST2_Cursor
DEALLOCATE ST2_Cursor

FETCH NEXT FROM ST1_Cursor INTO @.String1
End


CLOSE ST1_Cursor
DEALLOCATE ST1_Cursor

__
Results will be add to "Result" table

After that, I will deallocate my temporary tables which I have created.

_
drop table #SeperateString1
drop table #SeperateString2
_

This SP will be run in about 3 hours !
Problems occour is memory ! For about an hour to run, my memory I detected in "Task Manger" grown up to 200MB --> 300 MB (Amazing and suprised !) and my computer "run" slowlier than before ! I don't know why ! Can you help me to explain.

For me, the way to explain in this case is Microsoft SQL Server 2000, he was not able to free memory after each session, it was still in my RAM (and even over my RAM, because there is no room to contain them)! So that, it was grown up in an amazing way !

My solution in this case is use an array of string to compare (not simulate a virtual array)! But, in data type of SP not have "array" !

What can I do now ? Please help me to find the way to solve this problem, thanks for all !

I think the fuzzy lookup and fuzzy grouping transforms in Integration Services work very well for solving problems like the one you described. If you need more details, here is the Integration Services dedicated forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=80&SiteID=1

Hope this helps

|||

Thanx to Bogdan Crivat on your suggest,

I tried to find Integration Services from your link of forum and found out that I can use this thing :

go

sp_configure 'show advanced options', 1

go

reconfigure

go

sp_configure 'max server memory', 400

go

reconfigure

go

sp_configure 'min server memory', 100

go

reconfigure

go

But, It is does not apply to maximize my memory, it is over 400 when I run my program !

Note : I use Microsoft SQL Server 2000 in Windows XP SP2.

Regards,

Tran Quang Phuong.

No comments:

Post a Comment