Reputation: 24572
I am looking at an application and I found this SQL:
DELETE FROM Phrase
WHERE Modified < (SELECT Modified FROM PhraseSource WHERE Id = Phrase.PhraseId)
The intention of the SQL is to delete rows from Phrase where there are more recent rows in the PhraseSource table.
Now I know the tables Phrase and PhraseSource have the same columns and Modified holds the number of seconds since 1970 but I cannot understand how/why this works or what it is doing. When I look at it then it seems like on the left of the < it is just one column and on the right side of the > it would be many rows. Does it even make any sense?
The two tables are identical and have the following structure
Id - GUID primary key
...
...
...
Modified int
the ...
columns are about ten columns containing text and numeric data. The PhraseSource
table may or may not contain more recent rows with a higher number in the Modified column and different text and numeric data.
Upvotes: 0
Views: 130
Reputation: 39092
The SELECT
statement in parenthesis is a sub-query or nested query.
What happens is that for each row, the Modified
column value is compared with the result of the sub-query (which is run once for each of the rows in the Phrase table).
The sub-query has a WHERE
statement, so it finds a row that has the same ID
as the row from Phrase
table that we are currently evaluating and returns the Modified
value (which is for a sigle row, actually a single scalar value).
The two Modified
values are compared and in case the Phrase
's row has been modified before the row in PhraseSource
, it is deleted.
As you can see this approach is not efficient, because it requires the database to run a separate query for each of the rows in the Phrase
table (although I imagine that some databases might be smart enough to optimize this a little bit).
The more efficient solution would be to use INNER JOIN
:
DELETE p FROM Phrase p
INNER JOIN PhraseSource ps
ON p.PhraseId=ps.Id
WHERE p.Modified < ps.Modified
This should do the exact same thing as your query, but using efficient JOIN
mechanism. INNER JOIN
uses the ON
statement to choose how to "match" rows in two different tables (which is done very efficiently by the DB) and then again compares the Modified
values of matching rows.
Upvotes: 5