Reputation: 1
This is my query, I plan to run it in batches of perhaps 5000 hence the rownum < 5000
delete my_table
where rownum < 5000
and type = 'Happy'
and id not in
( select max_id
from ( select max(log_id) max_id
, object_id
, type
from my_table
where type = 'Happy'
group
by id
, type
)
)
I want to delete happy records but keeping the maximum log id, per object id
I hope that makes sense.
Should I be using some sort of join to improve performance?
Upvotes: 0
Views: 1407
Reputation: 52386
You could simplify the query to:
delete my_table
where rownum < 5000
and type = 'Happy'
and id not in (select max(log_id) max_id
from my_table
where type = 'Happy'
group by object_id, type)
Upvotes: 0
Reputation: 15493
Since you only care to delete ANY 5000 log entries for type = 'Happy', as long as its not the most recent for any object_id, then you can do something like this:
delete
from my_table
where log_id in (
select log_id from (
select log_id,
row_number() over (partition by object_id order by log_id desc) rnk
from my_table
where typ = 'Happy'
and rownum <= 5000
)
where rnk > 1
)
This is different from what you have because in your approach, you still need to calculate the max(id) per object across the entire table, which isn't necessary (and log tables can get very large). You just need to make sure you're not deleting the "newest" row (per object) of the 5000 batch rows. Personally, I prefer to setup log tables using partitions, but not everyone has this option.
Hope that helps.
Upvotes: 0
Reputation: 1271121
I think this might run faster as a correlated subquery:
Delete
from my_table
where type = 'Happy' and
exists (select 1
from my_table t2
where t2.object_id = my_table.object_id and
t2.type = my_table.type and
t2.id > my_table.id
);
Then, an index on my_table(object_id, type, id)
might also help this query.
Upvotes: 3