Martin
Martin

Reputation: 195

sqlite query - select all older than X days, not Y newest

my sqlite table has following format (all not null and not unique INTEGER for example):

time        type data
1436268660  0    ...
1436268661  1    ...
1436268662  0    ...
1436268666  2    ...
1436268668  1    ...

Sometimes I need to delete all rows of each type which are older than some time but I need to leave 5 newest from each type even if they are older than that specific time. In other words, to leave 5 newest rows of each type and also all newer than specified time (if there is more than 5) and to delete the rest.

So, if the specified time is X and type 0 has 20 rows newer than X, nothing is done for type 0 (all are new enough). Also if the specified time is X and type 0 has 5 rows all older than X, nothing is done (there is not more than 5 of them). But if there is for example 7 entries and at least 2 of them are older than X, then those 2 oldest are deleted.

What I have so far is this query. But it is not correct. It just deletes all rows older than X when there is more than 5 of that type. If they are all older than X nothing is left.

DELETE FROM table WHERE rowid IN 
  (SELECT table.rowid FROM table JOIN
    (SELECT type FROM table GROUP BY type HAVING COUNT(*) > 5) 
  USING (type) WHERE TIME < 14362685399);

As you can see the situation is little bit more complicated as "type" described above by me is in reality unique combination of multiple columns (you can replace by type1,type2,type3), but I guess it is not so important for the solution. Thank you for any help.

time        type0 type1 type2 data
1436268660  0     0     0     ...
1436268661  1     1     1     ...
1436268662  0     0     0     ...
1436268666  2     2     2     ...
1436268668  1     1     1     ...

Edit: Basically I need to delete all rows NOT in: (newer than X) UNION (5 latest entries for each type). I just don't know how to create result with "5 latest entries for each type".

Upvotes: 1

Views: 433

Answers (1)

bsa
bsa

Reputation: 2801

Try this. I've renamed your table to t.

DELETE FROM t WHERE rowid IN(
  SELECT a.rowid FROM t a
  WHERE time < 14362685399
  AND a.rowid NOT IN (
    SELECT b.rowid FROM t b
    WHERE a.type = b.type
    ORDER BY b.time DESC
    LIMIT 5
  )
);

Note that this may not be very efficient on large data, due to the correlated subquery which will be evaluated each time it is required (possibly once per distinct type in your table, or maybe even once per row in the table, depending on how the query is executed).

As an aside, in a SQL variant that supports it, this would probably be better achieved with a window function. For example, in Postgres.

Upvotes: 2

Related Questions