Reputation: 21
I want to update my table with values generated out of the same table.
My goal is to search all rows with proftxt like _NS and _WP and with the same ao, sum them up,
divide the value through the number of _H,_G,_L-Elements of that ao and add this value to the _H, _G and _L objects of that ao.
It is possible that there are only _NS and _WP rows for an ao. Than the routine should jump over this ao.
Example:
My data looks like:
an, ao, proftxt, value, year
101 , 1, 'e_NSe', 5, 2006
102 , 1, 'e_Ha', 1, 2006
103 , 1, 'w_NSr', 4, 2006
104 , 2, 'w_NSr', 2, 2006
105 , 2, 'x_H05r', 4, 2006
106 , 2, 'w_Gr', 2, 2006
107 , 2, 'a_WPr', 4, 2006
108 , 3, 'a_WPr', 4, 2006
My data should be like:
an, ao, proftxt, value, year
102 , 1, 'e_Ha', 10 2006
103 , 2, 'x_H05r', 7, 2006
103 , 2, 'w_Gr', 5, 2006
108 , 3, 'a_WPr', 4, 2006
My routine works for a small amount of testdata.
The update function ends, while working on the real database, after 13 hours successful.
But it only edited 5000 out of 210000 rows.
DECLARE @ENDYEAR INT
DECLARE @AO BIGINT
DECLARE @YEAR INT
DECLARE @ELEMENTS INT
--Parameter festlegen
SET @YEAR = 2006
SET @ENDYEAR = 2013 --Endyear+1
SET @AO = 2
WHILE(@YEAR<@ENDYEAR)
BEGIN
WHILE (@AO >1) --Do as long as Cursor is inside table
BEGIN
SET @AO = (SELECT TOP 1 ao FROM tbl_slp -- Search ao with _WP _NS
WHERE (proftxt LIKE '%[_]WP%'
OR proftxt LIKE '%[_]NS%')
AND year = @YEAR
AND ao > @AO );
SET @ELEMENTS = (SELECT COUNT(proftxt) --Count Number of _H, _G, _L elements
FROM tbl_SLP
WHERE ao = @AO AND year = @YEAR AND
(proftxt LIKE '%[_]H%' OR proftxt = NULL
OR proftxt LIKE '%[_]G%'
OR proftxt LIKE '%[_]L%'))
IF (@ELEMENTS != 0)
BEGIN
UPDATE tbl_SLP --Update _H, _G, _L rows
SET value = value + (SELECT SUM(CONVERT(float, value))
FROM tbl_SLP
WHERE (proftxt LIKE '%[_]WP%'
OR proftxt LIKE '%[_]NS%')
AND year = @YEAR
AND ao = @AO)
/@ELEMENTS
WHERE ao = @AO AND year = @YEAR
DELETE FROM tbl_SLP --delete_WP _NS rows
WHERE ao= @AO
AND year = @YEAR
AND (proftxt LIKE '%[_]WP%' OR proftxt LIKE '%[_]NS%')
END
SET @AO = @AO +1
END
SET @YEAR = @YEAR +1
END
I know that the routine is super slow, but what can I do?
Upvotes: 1
Views: 416
Reputation: 21
I combined both (really helpful!) answers. I added, as criticalfix told me, a coloum proftype to set an index on the table:
ALTER TABLE
ADD proftype CHAR(1)
GO
UPDATE tbl_SLPverrechnetWPNSP
SET proftype = 'W'
WHERE proftxt LIKE '%[_]WP%'
UPDATE tbl_SLP
SET proftype = 'N'
WHERE proftxt LIKE '%[_]NS%'
UPDATE tbl_SLP
SET proftype = 'H'
WHERE proftxt LIKE '%[_]H%'
OR proftxt IS NULL
UPDATE tbl_SLP
SET proftype = 'G'
WHERE proftxt LIKE '%[_]G%'
UPDATE tbl_SLP
SET proftype = 'L'
WHERE proftxt LIKE '%[_]L%'
--set index on proftype
CREATE NONCLUSTERED INDEX [IX_PROFTYPE] ON [dbo].[tbl_SLP] (proftype ASC) ON [PRIMARY]
GO
Next I used the code from bob to edit my table.
SET XACT_ABORT ON
SET NOCOUNT ON
BEGIN TRANSACTION
-- Create a temp table with each ao-year's sums and counts (sums of N and W record values and counts of H, G, and L records)
SELECT T.ao, T.year, SUM(CONVERT(float, T.value)) AS SumVals, (SELECT COUNT(*)
FROM tbl_slp A
WHERE A.ao = T.ao
AND A.year = T.year
AND (A.proftype ='G' OR A.proftype = 'H' OR A.proftype = 'L' ))
AS CountOther
INTO #temp1
FROM tbl_slp T
WHERE (T.proftype = 'W' OR T.proftype = 'N')
GROUP BY T.ao, T.year
-- Add "sum/count" for each ao-year to the H, G, and L records for that year
UPDATE A
SET value = value + CONVERT(FLOAT, T.SumVals) / T.CountOther
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftype = 'H' OR A.proftype = 'G' OR A.proftype LIKE 'L')
-- Now that we've distributed the W and N values, delete those records
DELETE A
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftype = 'W' OR A.proftype = 'N')
AND T.CountOther > 0
DROP TABLE #temp1
COMMIT TRANSACTION
Thank you so much for the help! The routine ran only 3,5 minutes!!!
Upvotes: 1
Reputation: 2870
First, I see a couple of NULL-related problems. For instance, your inner loop is apparently waiting for @AO to become NULL before it will finish:
WHILE (@AO >1)
This will work when you set @AO to something that isn't there, but it is hard to read, and you probably want to write more explicit logic.
Next, this condition will always be false:
OR proftxt = NULL
The NULL value is not equal to itself. To test for this condition you would have to write:
OR proftxt IS NULL
Also, any NULL values will be omitted from your COUNT(proftxt). Try running the following sample query. It returns 1, along with the message "Warning: Null value is eliminated by an aggregate or other SET operation".
SELECT COUNT(fieldname) FROM (SELECT 1 AS fieldname UNION SELECT NULL AS fieldname) AS tablename
Finally, indexing the proftxt column won't fix your performance problems, because a LIKE condition with a leading wildcard can't use the index. You can think of an index like a telephone book that is alphabetized by last name. If you are looking for LastName LIKE '%mann', the index won't help you. You will still have to read through every entry in the telephone book to find all the last names ending in "mann". In database terms, that is called a "table scan", and is slow.
I would add a new column, which you could call proftxttype.
UPDATE tbl_SLP
SET proftxttype = 1
WHERE proftxt LIKE '%[_]WP%'
OR proftxt LIKE '%[_]NS%'
UPDATE tbl_SLP
SET proftxttype = 2
WHERE proftxt LIKE '%[_]H%'
OR proftxt LIKE '%[_]G%'
OR proftxt LIKE '%[_]L%'
OR proftxt IS NULL
Then index this column:
CREATE NONCLUSTERED INDEX [IX_PROFTXTTYPE] ON [dbo].[TBL_SLP] (PROFTXTTYPE ASC) ON [PRIMARY]
Now rewrite your update in terms of proftxttype. Of course, whenever you insert or update proftxt, you will also have to update proftxttype. That is inavoidable, but SQL Server will take care of keeping the index up to date, so you don't have to worry about the index.
I know this sounds like a lot of work, but the core of your problem is that you're scanning through the entire table every time you want to find a proftxt value with a leading wildcard.
Upvotes: 1
Reputation: 452
SQL is designed for set-based operations, not procedural flow-of-control style logic like your routine. Here's a set-based way of doing it, which I'm guessing will be much faster than the procedural way:
SET XACT_ABORT ON
SET NOCOUNT ON
BEGIN TRANSACTION
-- Create a temp table with each ao-year's sums and counts (sums of _NS and _WP record values and counts of _H, _G, and _L records)
SELECT T.ao, T.year, SUM(T.value) AS SumVals, (SELECT COUNT(*) FROM tbl_slp A WHERE A.ao = T.ao AND A.year = T.year AND (A.proftxt = NULL OR A.proftxt LIKE '%[_]H%' OR A.proftxt LIKE '%[_]G%' OR A.proftxt LIKE '%[_]L%')) AS CountOther
INTO #temp1
FROM tbl_slp T
WHERE (T.proftxt LIKE '%[_]WP%' OR T.proftxt LIKE '%[_]NS%')
GROUP BY T.ao, T.year
-- Add "sum/count" for each ao-year to the _H, _G, and _L records for that year
UPDATE A
SET value = value + CONVERT(FLOAT, T.SumVals) / T.CountOther
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftxt = NULL OR A.proftxt LIKE '%[_]H%' OR A.proftxt LIKE '%[_]G%' OR A.proftxt LIKE '%[_]L%')
-- Now that we've distributed the _WP and _NS values, delete those records
DELETE A
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftxt LIKE '%[_]WP%' OR A.proftxt LIKE '%[_]NS%')
AND T.CountOther > 0
COMMIT TRANSACTION
For the sample set you gave, this produces the exact same results (except for the an
column which I assume was a typo).
Full disclosure, this takes longer on the sample set than your routine does (17 ms compared to your 3 ms), but it should scale up to large data a whole lot better. I put it in a transaction for correctness but I'm not sure what your exact use case is, so that may be a disadvantage of my way since it will lock the pages (and may escalate to the whole table) for the entire time. Your routine didn't have any transactions, though, which could lead to bad data so if you keep your way make sure to put each update-delete pair in its own transaction.
Also, if you don't have an index on proftxt
, add one! This will make a huge difference for both solutions.
Good luck. Here's the SQL Fiddle I used.
Upvotes: 5