tonyf
tonyf

Reputation: 35557

How to process a column that holds a comma-separated or range string values in Oracle

Using Oracle 12c DB, I have the following table data example that I need assistance with using SQL and PL/SQL.

Table data is as follows:

Table Name: my_data

ID      ITEM        ITEM_LOC
------- ----------- ----------------
1       Item-1      0,1
2       Item-2      0,1,2,3,4,7
3       Item-3      0-48
4       Item-4      0,1,2,3,4,5,6,7,8
5       Item-5      1-33
6       Item-6      0,1
7       Item-7      0,1,5,8

Using the data above within the my_data table, what is the best way to process this ITEM_LOC as I need to use the values in this column as an individual value, i.e:

0,1 means the SQL needs to return either 0 or 1 or

range values, i.e:

0-48 means the SQL needs to return a value between 0 and 48.

The returned values for both scenarios should commence from lowest to highest and can't be re-used once processed.

Based on the above, it would be great to have a function that takes the ID and returns an individual value from ITEM_LOC that hasn't been used, based on my description above. This could be a comma-separated string value or a range string value.

Desired result for ID = 2 could be 7. For this ID = 2, ITEM_LOC = 7 could not be used again.

Desired result for ID = 5 could be 31. For this ID = 5, ITEM_LOC = 31 could not be used again.

For the ITEM_LOC data that could not be used again, against that ID, I am looking at holding another table to hold this or perhaps separate all data into separate rows with a new column called VALUE_USED.

Upvotes: 0

Views: 425

Answers (2)

Stew Ashton
Stew Ashton

Reputation: 1529

As others have said, it's a bad idea to store data in this way. You very likely could have input like this, and you likely could need to display the data like this, but you don't have to store the data the way it is input or displayed.

I'm going to store the data as individual LOC elements based on the input. I assume the data contains only integers separated by commas, or pairs of integers separated by a hyphen. Whitespace is ignored. The comma-separated list does not have to be in any order. In pairs, if the left integer is greater than the right integer I return no LOC element.

create table t as 
with input(id, item, item_loc) as (
  select 1, 'Item-1', ' 0,1' from dual union all
  select 2, 'Item-2', '0,1,2,3,4,7' from dual union all
  select 3, 'Item-3', '0-48' from dual union all
  select 4, 'Item-4', '0,1,2,3,4,5,6,7,8' from dual union all
  select 5, 'Item-5', '1-33' from dual union all
  select 6, 'Item-6', '0,1' from dual union all
  select 7, 'Item-7', '0,1,5,8,7 - 11' from dual
)
select distinct id, item, loc from input, xmltable(
 'let $item := if (contains($X,",")) then ora:tokenize($X,"\,") else $X
  for $i in $item
    let $j := if (contains($i,"-")) then ora:tokenize($i,"\-") else $i
    for $k in xs:int($j[1]) to xs:int($j[count($j)])
    return $k'
  passing item_loc as X
  columns loc number path '.'
);

Now to "use" an element I just delete it from the table:

delete from t where rowid = (
  select min(rowid) keep (dense_rank first order by loc)
  from t
  where id = 7
);

To return the data in the same format it was input, use MATCH_RECOGNIZE:

select id, item, listagg(item_loc, ',') within group(order by first_loc) item_loc
from t
match_recognize(
  partition by id, item order by loc
  measures a.loc first_loc,
    a.loc || case count(*) when 1 then null else '-'||b.loc end item_loc
  pattern (a b*)
  define b as loc = prev(loc) + 1
)
group by id, item;

ID  ITEM    ITEM_LOC
1   Item-1  0-1
2   Item-2  0-4,7
3   Item-3  0-48
4   Item-4  0-8
5   Item-5  1-33
6   Item-6  0-1
7   Item-7  1,5,7-11

Note that the output here will not be exactly like the input, because any consecutive integers will be compressed into a pair.

Upvotes: 0

Littlefoot
Littlefoot

Reputation: 142733

This query shows how to extract list of ITEM_LOC values based on whether they are comma-separated (which means "take exactly those values") or dash-separated (which means "find all values between starting and end point"). I modified your sample data a little bit (didn't feel like displaying ~50 values if 5 of them do the job).

  • lines #1 - 6 represent sample data.
  • the first select (lines #7 - 15) splits comma-separated values into rows
  • the second select (lines #17 - 26) uses a hierarchical query which adds 1 to the starting value, up to item's end value.

SQL> with my_data (id, item, item_loc) as
  2    (select 2, 'Item-2', '0,2,4,7' from dual union all
  3     select 7, 'Item-7', '0,1,5'   from dual union all
  4     select 3, 'Item-3', '0-4'     from dual union all
  5     select 8, 'Item-8', '5-8'     from dual
  6    )
  7  select id,
  8    item,
  9    regexp_substr(item_loc, '[^,]+', 1, column_value) loc
 10  from my_data
 11    cross join table(cast(multiset
 12      (select level from dual
 13       connect by level <= regexp_count(item_loc, ',') + 1
 14      ) as sys.odcinumberlist))
 15  where instr(item_loc, '-') = 0
 16  union all
 17  select id,
 18    item,
 19    to_char(to_number(regexp_substr(item_loc, '^\d+')) + column_value - 1) loc
 20  from my_data
 21    cross join table(cast(multiset
 22      (select level from dual
 23       connect by level <= to_number(regexp_substr(item_loc, '\d+$')) -
 24                           to_number(regexp_substr(item_loc, '^\d+')) + 1
 25      ) as sys.odcinumberlist))
 26  where instr(item_loc, '-') > 0
 27  order by id, item, loc;

        ID ITEM   LOC
---------- ------ ----------------------------------------
         2 Item-2 0
         2 Item-2 2
         2 Item-2 4
         2 Item-2 7
         3 Item-3 0
         3 Item-3 1
         3 Item-3 2
         3 Item-3 3
         3 Item-3 4
         7 Item-7 0
         7 Item-7 1
         7 Item-7 5
         8 Item-8 5
         8 Item-8 6
         8 Item-8 7
         8 Item-8 8

16 rows selected.

SQL>

I don't know what you meant by saying that "item_loc could not be used again". Used where? If you use the above query in, for example, cursor FOR loop, then yes - those values would be used only once as every loop iteration fetches next item_loc value.

Upvotes: 2

Related Questions