Superole
Superole

Reputation: 1371

Oracle listagg groups of x number of rows

I am migrating some data, but before I do I want to generate a rollback script.

The important columns in the table might look like this:

   id    code 
----- -------
 1234  121212
 1345  434343
 2345  121212
    ...

I am changing all codes 121212 and 151515 into 252525, but I need to be able to change them back if something goes wrong somewhere else.

So I started out using listagg to create comma separated lists of ids that I could use in the where-clause of an update statement. Something along these lines:

select '... set code='||code||' where id in ('
|| listagg(id,',') within group (order by id)
|| ');' as RB_STMT
from mytable
where code in (121212,151515)
group by code;

This works perfectly for small sets, like in my dev-environment, but in production i hit a limit of string lengths (ORA-01489).

So what I would like to be able to do is to group by x number of ids in addition to code, such that the result for for x=5 would be:

 RB_STMT
------------
...set code=121212 where id in (1234,1235,1236,1237,1238);
...set code=121212 where id in (1239,2111,2112,2123,2124);
...set code=121212 where id in (2125,2126,2136);
...set code=151515 where id in (1456,2345,2468,2469,2470);
etc.

(I would use a much larger x in reality of course; adapted to the length of the ids)

I tried grouping by floor(id/5) and that fixes the string length issue, but because the ids are not sequential, or evenly spaced, I end up with a lot of results with only a few ids, and that is not ideal either.

Upvotes: 0

Views: 1754

Answers (1)

Alex Poole
Alex Poole

Reputation: 191265

You could use a subquery (inline view or CTE) to allocate each ID to a bucket, and include that in your group-by clause:

select '... set code='||code||' where id in ('
|| listagg(id,',') within group (order by id)
|| ');' as RB_STMT
from (
  select id, code, ceil(row_number() over (partition by code order by id)/5) as bucket
  from mytable
)
where code in (121212,151515)
group by code, bucket;

RB_STMT                                                                         
--------------------------------------------------------------------------------
... set code=121212 where id in (1234,1235,1236,1237,1238);
... set code=121212 where id in (1239,2111,2112,2123,2124);
... set code=121212 where id in (2125,2126,2136);
... set code=151515 where id in (1456,2345,2468,2469,2470);

The subquery on its own would produce output like:

        ID       CODE     BUCKET
---------- ---------- ----------
      1234     121212          1
      1235     121212          1
      1236     121212          1
      1237     121212          1
      1238     121212          1
      1239     121212          2
      1456     151515          1
      2111     121212          2
      2112     121212          2
      2123     121212          2
      2124     121212          2
      2125     121212          3
      2126     121212          3
      2136     121212          3
      2345     151515          1
      2468     151515          1
      2469     151515          1
      2470     151515          1

The first five IDs for code 121212 all have bucket 1; the next five all have bucket 2; the last four have bucket 3. The buckets for 151515 start again at 1. Grouping by both code and bucket gives you a short list to aggregate into the final string.

To have more per line, just change the 5 to something higher.


Assuming you won't know there's an issue until after you've committed your update, so rollback isn't an option - so you're creating a backout strategy rather than a 'rollback' script - other options include: export the table and truncate/import if you need to revert it; or create a backup copy of the table and either revert to the copy or use it to do the update; or probably some others I haven't thought of.

Upvotes: 3

Related Questions