Andre
Andre

Reputation: 1902

SQL Server query execution very slow when comparing Primary Keys

I have a SQL Server 2008 R2 database table with 12k address records where I am trying to filter out duplicate phone numbers and flag them using the following query

SELECT a1.Id, a2.Id  
  FROM Addresses a1 
    INNER JOIN  Addresses a2 ON a1.PhoneNumber = a2.PhoneNumber
  WHERE a1.Id < a2.Id

Note: I realize that there is another way to solve this problem by using EXISTS, but this is not part of the discussion.

The table has a Primary Key on the ID field with a Clustered Index, the fragmentation level is 0 and the phone number field is not null and has about 130 duplicates out of the 12k records. To make sure it is not a server or database instance issue I ran it on 4 different systems.

Execution of the query takes several minutes, sometimes several hours. After trying almost everything as one of my last steps I removed the Primary Key and ran the query without it and voila it executed in under 1 second. I added the Primary Key back and it still ran in under one second.

Does anybody have an idea what is causing this problem?

Is it possible that the primary key gets somehow corrupted?

EDIT: My apologies I had a couple of typos in the Sql Query

Upvotes: 1

Views: 1179

Answers (2)

gbn
gbn

Reputation: 432331

Out of data statistics. Dropping and recreating the PK will give up fresh statistics.

Too late now, but I'd have suggest running sp_updatestats to see what happened.

If you backup and restore a database onto different systems, statistics follow the data

I'd suspect a different plan too after non-indexed (I guess) columns PhoneNumber and CCAPhoneN

Upvotes: 3

JNK
JNK

Reputation: 65177

I'm guessing there are no indexes on PhoneNumber or PhoneNo.

You are joining on these fields, but if they aren't indexed it's forcing TWO table scans, one for each instance of the table in the query, then probably doing a hash match to find matching records.

Next step - get an exec plan and see what the pain points are.

Then, add indexes to those fields (assuming you see a Clustered Index Scan) and see if that fixes it.

I think your other issue is a red herring. The PK likely has nothing to do with it, but you may have gotten page caching (did you drop the buffers and clear the cache between runs?) which made the later runs faster.

Upvotes: 2

Related Questions