AbdelElrafa
AbdelElrafa

Reputation: 891

Limit rows checked with a not in query - SQL

So I have 2 tables. TableA that has 350K rows and TableB has only the active data that TableA is supposed to have. So I need to clear out the old data in TableA.

This query would be great if Table A wasn't so large.

DELETE FROM TableA where MLS NOT IN (SELECT LIST_3 FROM TableB);

So what I need to do is limit the number of rows checked in TableA. Instead of it checking 350K records from TableA I want it to check 100 for example.

I can't figure out how... This doesn't work because it only limits the number of rows deleted and not checked. It may check 500 rows and delete 100. I want it to check 100 and delete however many aren't in TableB.

DELETE FROM TableA where MLS NOT IN (SELECT LIST_3 FROM TableB) LIMIT 100;

Upvotes: 0

Views: 58

Answers (2)

Hunter McMillen
Hunter McMillen

Reputation: 61512

A left join on indexed columns should perform a delete fairly fast:

delete a
from TableA a 
left join TableB b on a.MLS = b.LIST_3
where b.LIST_3 is null;

SqlFiddle: http://sqlfiddle.com/#!9/2e1e0/1

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269753

If TableA has a primary key, you could do this as:

DELETE a
    FROM TableA a JOIN
         (SELECT a2.id
          FROM TableA a2
          LIMIT 100  -- should have an order by with limit
         ) a2
         ON a.id = a2.id
    WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.List_3 = a.MLS);

Notes:

  • The join using the primary key should be quite fast, even with the subquery.
  • I changed the NOT IN to NOT EXISTS. The latter has more intuitive behavior when there are NULL values.
  • You should always use ORDER BY with LIMIT, but your original question didn't specify a way of choosing which 100 rows to look for.

Upvotes: 1

Related Questions