Pritam Das
Pritam Das

Reputation: 11

Slowness in update query using inner join

I am using the below query to update one column based on the conditions it is specified. I am using "inner join" but it is taking more than 15 seconds to run the query even if it has to update no records(0 records).

UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
FROM CONFIGURATION_LIST
INNER JOIN (SELECT DISTINCT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
            FROM CONFIGURATION_LIST
            WHERE
                PLANT = '0067'
                AND APPLIED_SERIAL_NUMBER IS NOT NULL 
                AND APPLIED_SERIAL_NUMBER !=''
                AND DUPLICATE_SERIAL_NUM = 1
            GROUP BY 
                APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER
            HAVING
                COUNT(*) = 1) T2 ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER
                                 AND T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE 
    CONFIGURATION_LIST.PLANT = '0067'
    AND DUPLICATE_SERIAL_NUM = 1

The index is there with APPLIED_SERIAL_NUMBER and APPLIED_MAT_CODE and fragmentation is also fine.

Could you please help me on the above query performance.

Upvotes: 1

Views: 50

Answers (3)

James Z
James Z

Reputation: 12317

Looks like you're checking the table for rows that exist in the same table with the same values, and if not, update the duplicate column to zero. If your table has a unique key (identity field or composite key), you could do something like this:

UPDATE C
SET C.DUPLICATE_SERIAL_NUM = 0
FROM 
  CONFIGURATION_LIST C
where 
  not exists (
    select
      1 
    FROM
      CONFIGURATION_LIST C2
    where
      C2.APPLIED_SERIAL_NUMBER = C.APPLIED_SERIAL_NUMBER and
      C2.APPLIED_MAT_CODE = C.APPLIED_MAT_CODE and
      C2.UNIQUE_KEY_HERE != C.UNIQUE_KEY_HERE
  ) and 
  C.PLANT = '0067' and
  C.DUPLICATE_SERIAL_NUM = 1

Upvotes: 1

ikrabbe
ikrabbe

Reputation: 1929

I will try with a select first:

select APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, count(*) as n
from CONFIGURATION_LIST cl
where
    cl.PLANT='0067' and
    cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
    cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;

How many rows do you get with this and how long does it take?

If you remove your DUPLICATE_SERIAL_NUM column from your table it might be very simple. The DUPLICATE_SERIAL_NUM suggests that you are searching for duplicates. As you count your rows you could introduce a simple table that contains the counts:

create table CLCOUNT ( N int unsigned, C int /* or what APPLIED_MAT_CODE is */, S int /* or what APPLIED_SERIAL_NUMBER is */, PLANT char(20) /* or what PLANT is */, index unique (C,S,PLANT), index(PLANT,N));
insert into CLCOUNT select count(*), cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER, cl.PLANT
from CONFIGURATION_LIST cl
where
    cl.PLANT='0067' and
    cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
    cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;

How long does this take?

Now you can simply select * from CLCOUNT where PLANT='0067' and N=1;

This is all far from being perfect. But you should be able to analyze (EXPLAIN SELECT ...) your queries and find why it takes so long.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269873

First, you don't need the DISTINCT when using GROUP BY. SQL Server probably ignores it, but it is a bad idea anyway:

UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
    FROM CONFIGURATION_LIST INNER JOIN
         (SELECT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
          FROM CONFIGURATION_LIST cl
          WHERE cl.PLANT = '0067' AND
                cl.APPLIED_SERIAL_NUMBER IS NOT NULL AND
                cl.APPLIED_SERIAL_NUMBER <> ''
                cl.DUPLICATE_SERIAL_NUM = 1
          GROUP BY cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER
          HAVING COUNT(*) = 1
         ) T2
         ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER AND
            T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE CONFIGURATION_LIST.PLANT = '0067' AND
      DUPLICATE_SERIAL_NUM = 1;

For this query, you want the following index: CONFIGURATION_LIST(PLANT, DUPLICATE_SERIAL_NUM, APPLIED_SERIAL_NUMBER, APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER).

The HAVING COUNT(*) = 1 suggests that you might really want NOT EXISTS (which would normally be faster). But you don't really explain what the query is supposed to be doing, you only say that this code is slow.

Upvotes: 1

Related Questions