Mark Bowytz
Mark Bowytz

Reputation: 1362

SQL Server DELETE and SELECT Behave Differently with Same WHERE Clause

I have a table which is populated by a daily scheduled job that deletes the last 7 days of data and then repopulates with the 7 most recent days worth of data from another source (mainframe).

Recently, users reported a number of duplicates going back to the beginning of October 2011. ...in the magnitude of hundreds of thousand of rows.

I noticed strange behavior with the delete that runs for each job:

DELETE FROM fm104d 
 WHERE location = '18'
   AND (CONVERT(datetime,CASE WHEN ISDATE(pull_date)=0 THEN '19000101' 
                 ELSE pull_date END)) >  DATEADD(day, -7, getdate())

The above returns "(0 row(s) affected)".

When I run the above after replacing the DELETE with a SELECT *, I get 32,000+ rows in return.

Why would the SELECT and DELETE behave differently?

UPDATE

Here is the Actual Execution Plan:

http://pastie.org/2869202

Upvotes: 18

Views: 824

Answers (6)

Mark Bowytz
Mark Bowytz

Reputation: 1362

You won't believe this. I didn't in fact as it makes almost no logical sense, but in the end, the solution that worked...was to add an index.

Credit for this goes to my local DBA "Did think about adding an index? I just did to test and sure enough it works".

Here's the index as added:

CREATE  INDEX ixDBO_fir104d__SOURCE_LOCATION__Include
ON [dbo].[fir104d] ([SOURCE_LOCATION])
INCLUDE ([Transaction_Date],[PULL_DATE])
GO

I let the job run as scheduled and, sure enough, all is as it was.

My guess is that there is something in the explain plan to say it wasn't using an index / wrong index, but my developer mind can't make much sense of that level of detail.

Thanks to everybody for the time and effort you've all spent.

UPDATE

Received news from a different dev that the data in this table additionally corrupted to the point where it took "several hours of DBA involvement to resolve" along with the dev having to perform some other data fixes (read:data file reloads).

At the end of the day, while adding the index was probably a good thing considering the way the scheduled job runs, apparently, there was even more to the story!

Upvotes: 1

Matthew
Matthew

Reputation: 10444

It looks to me like you never want to delete when pull_date is not a date.
Try eliminating the explicit string replacements... perhaps there is a parsing different between the SELECT and DELETE

DELETE 
FROM 
    fm104d 
WHERE 
    [location] = '18' --NOTE if this is an int, then just try with 18, no dits
    AND (
         CASE ISDATE([pull_date])
         WHEN 1 THEN
             CAST([pull_date] AS DATETIME)
         ELSE
             NULL
         END  >  DATEADD(DAY, -7, GETDATE())
        ) 

EDIT: Note that this doesn't exactly match your SQL because, in yours, if you time-travel back to January First, 1900 it will delete your row regardless.... I presumed this was not actually your intention.

Upvotes: 0

RodMeans
RodMeans

Reputation: 49

How about trying this, see if you can evaluate your pull_date column first and then delete the records.

DELETE FROM fm104d
WHERE Location = 18 
   AND Pull_date IN
   (
      SELECT CONVERT (DATETIME,
         CASE 
            WHEN ISDATE(pull_Date) = 0
               THEN '19000101'
            ELSE pull_date
         END) AS pull_date
      FROM fm104d 
      WHERE pull_date > DATEADD(DAY, -7, GETDATE())
   )

Upvotes: 0

AaronLS
AaronLS

Reputation: 38365

For your select, add ISDATE(pull_date) to the select list to determine what part of the case statement these are affecting. Look at the pull_date as well and see if there's a pattern to the format of the string common among these offenders that refuse to be deleted.

This might have some relation to the determinism of Convert and IsDate:

"ISDATE is deterministic only if you use it with the CONVERT function, if the CONVERT style parameter is specified, and style is not equal to 0, 100, 9, or 109."

See the couple of examples here where convert is nested inside isdate:

http://www.sqlmonster.com/Uwe/Forum.aspx/sql-server-programming/181/CAST-CONVERT-nondeterministic

So try adjusting your where clause and see if that helps. Also note that "The return value of ISDATE may be affected by LANGUAGE and DATEFORMAT settings." So maybe something on your server has changed in these regards. Why it'd affect the delete but not the select is still strange.

Upvotes: 0

Andomar
Andomar

Reputation: 238096

One possible explanation might be that there are two tables, each in a different schema. Perhaps if you have select rights on both schema's but delete rights on only one, SQL Server might choose a different table for delete.

To verify this, prefix your table with the schema name (the default schema is dbo)

FROM schema1.fm104d

(Not tested, just a thought, no access to a SQL Server installation atm.)

Upvotes: 0

Royi Namir
Royi Namir

Reputation: 148524

Try this :

DELETE FROM fm104d  where fm104d.id in 
(
select  id from fm104d 
 WHERE location = '18'
   AND (CONVERT(datetime,CASE WHEN ISDATE(pull_date)=0 THEN '19000101' 
                 ELSE pull_date END)) >  DATEADD(day, -7, getdate())
)aaa

and give response if it deletes

p.s. : this is not the solution but will lead to decision.

Upvotes: 0

Related Questions