ziggy
ziggy

Reputation: 15876

Finding out the highest number in a comma separated string using Oracle SQL

I have a table with two columns:

OLD_REVISIONS   |NEW_REVISIONS
-----------------------------------
1,25,26,24      |1,26,24,25
1,56,55,54      |1,55,54
1               |1
1,2             |1
1,96,95,94      |1,96,94,95
1               |1
1               |1
1               |1
1               |1
1,2             |1,2
1               |1
1               |1
1               |1
1               |1

I would like to find all the instances where the highest revision in the OLD_REVISIONS column is lower than than the highest revision in NEW_REVISIONS

The following would fit that criteria

OLD_REVISIONS   |NEW_REVISIONS
-----------------------------------
1,2             |1
1,56,55,54      |1,55,54

Also, MAX function expects a single number.

Is there another way i can achieve the above? I am looking for a pure SQL option so i can print out the results (or a PL/SQL option that can print out the results)

Edit

Apologies for not mentioning this but for the NEW_REVISIONS i do actually have the data in a table where each revision is in a separate row:

"DOCNUMBER" "REVISIONNUMBER"
67          1
67          24
67          25
67          26
75          1
75          54
75          55
75          56
78          1
79          1
79          2
83          1
83          96
83          94

Just to give some content, a few weeks ago i suspected that there are revisions disappearing. To investigate this, i decided to take a count of all revisions for all documents and take a snapshot to compare later to see if revisions are indeed missing.

The snapshot that i took contained the following columns:

docnumber, count, revisions

The revisions were stored in a comma separated list using the listagg function.

The trouble i have now is the on live table, new revisions have been added so when i compare the main table and the snapshot using a MINUS i get a difference because of the new revisions in the main table.

Even though in the actual table the revisions are individual rows, in the snapshot table i dont have the individual rows.

I am thinking the only way to recreate the snapshot in the same format and compare them find out if maximum revision in the main table is lower than the max revision in the snapshot table (hence why im trying to find out how to find out the max in a comma separated string)

Upvotes: 3

Views: 3254

Answers (7)

David דודו Markovitz
David דודו Markovitz

Reputation: 44921

Enjoy.

select    xmlcast(xmlquery(('max((' || OLD_REVISIONS || '))') RETURNING CONTENT) as int) as OLD_REVISIONS_max
         ,xmlcast(xmlquery(('max((' || NEW_REVISIONS || '))') RETURNING CONTENT) as int) as NEW_REVISIONS_max

from      t
;

Upvotes: 4

Marmite Bomber
Marmite Bomber

Reputation: 21063

You may write a PL/SQL function parsing the string and returning the maximal number

select  max_num( '1,26,24,25') max_num from dual;
   MAX_NUM
----------
        26

The query ist than very simple:

select OLD_REVISIONS  NEW_REVISIONS
from revs
where max_num(OLD_REVISIONS) < max_num(NEW_REVISIONS);

A prototyp function without validation and error handling

create or replace function max_num(str_in VARCHAR2) return NUMBER as 
i number;
x varchar2(1);
n number := 0;
max_n number := 0;
pow number := 0;
begin
 for i in 0.. length(str_in)-1 loop
  x := substr(str_in,length(str_in)-i,1);
  if x = ',' then 
    -- check max number
    if n > max_n then 
       max_n := n;
    end if;   
    -- reset
    n := 0;
    pow := 0;
  else
    n := n + to_number(x)*power(10,pow);
    pow := pow +1;
  end if;
 end loop;
 return(max_n);
end;
/

Upvotes: 0

user5683823
user5683823

Reputation:

Assuming your base table has an id column (versions of what?) - here is a solution based on splitting the rows.

Edit: If you like this solution, check out vkp's solution, which is better than mine. I explain why his solution is better in a Comment to his Answer.

with
     t ( id, old_revisions, new_revisions ) as (
       select 101, '1,25,26,24', '1,26,24,25' from dual union all
       select 102, '1,56,55,54', '1,55,54'    from dual union all
       select 103, '1'         , '1'          from dual union all
       select 104, '1,2'       , '1'          from dual union all
       select 105, '1,96,95,94', '1,96,94,95' from dual union all
       select 106, '1'         , '1'          from dual union all
       select 107, '1'         , '1'          from dual union all
       select 108, '1'         , '1'          from dual union all
       select 109, '1'         , '1'          from dual union all
       select 110, '1,2'       , '1,2'        from dual union all
       select 111, '1'         , '1'          from dual union all
       select 112, '1'         , '1'          from dual union all
       select 113, '1'         , '1'          from dual union all
       select 114, '1'         , '1'          from dual
       )
--   END of TEST DATA; the actual solution (SQL query) begins below.
select id, old_revisions, new_revisions
from (
    select id, old_revisions, new_revisions, 'old' as flag,
           to_number(regexp_substr(old_revisions, '\d+', 1, level)) as rev_no
      from t
      connect by level <= regexp_count(old_revisions, ',') + 1
         and  prior id = id
          and prior sys_guid() is not null
    union all
    select id, old_revisions, new_revisions, 'new' as flag,
           to_number(regexp_substr(new_revisions, '\d+', 1, level)) as rev_no
      from t
      connect by level <= regexp_count(new_revisions, ',') + 1
         and  prior id = id
          and prior sys_guid() is not null
     )
group by id, old_revisions, new_revisions
having max(case when flag = 'old' then rev_no end) !=
       max(case when flag = 'new' then rev_no end)
order by id           --   ORDER BY is optional
;


 ID OLD_REVISION NEW_REVISION
--- ------------ ------------
102 1,56,55,54   1,55,54   
104 1,2          1         

Upvotes: 2

Kacper
Kacper

Reputation: 4818

Comments say normalise data. I agree but also I understand it may be not possible. I would try something like query below:

select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select regexp_substr(v1,'[^,]+', 1, level) val, rowid r from tab1
  connect by regexp_substr(v1, '[^,]+', 1, level) is not null
  ) group by r) t1
  inner join (
select max(val) val2, r from (
select regexp_substr(v2,'[^,]+', 1, level) val, rowid r from tab1
  connect by regexp_substr(v2, '[^,]+', 1, level) is not null
  ) group by r) t2
  on (t1.r = t2.r);

Tested on:

create table tab1 (v1 varchar2(100), v2 varchar2(100));
insert into tab1 values ('1,3,5','1,4,7');
insert into tab1 values ('1,3,5','1,2,9');
insert into tab1 values ('1,3,5','1,3,5');
insert into tab1 values ('1,3,5','1,4');

and seems to work fine. I left rowid for reference. I guess you have some id in table.

After your edit I would change query to:

select greatest(val1, val2), t1.r from (
select max(val) val1, r from (
select regexp_substr(v1,'[^,]+', 1, level) val, DOCNUMBER r from tab1
  connect by regexp_substr(v1, '[^,]+', 1, level) is not null
  ) group by DOCNUMBER) t1
  inner join (
select max(DOCNUMBER) val2, DOCNUMBER r from NEW_REVISIONS) t2
  on (t1.r = t2.r);

Upvotes: 1

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

One way to do is to split the columns on comma separation using regexp_substr and checking if the max and min values are different.

Sample Demo

with rownums as (select t.*,row_number() over(order by old_revisions) rn from t)
select old_revisions,new_revisions 
from rownums 
where rn in (select rn
             from rownums
             group by rn
             connect by regexp_substr(old_revisions, '[^,]+', 1, level) is not null 
             or regexp_substr(new_revisions, '[^,]+', 1, level) is not null
             having max(cast(regexp_substr(old_revisions,'[^,]+', 1, level) as int)) 
              <> max(cast(regexp_substr(new_revisions,'[^,]+', 1, level) as int))
    )

Upvotes: 1

Aleksej
Aleksej

Reputation: 22949

This could be a way:

select 
  OLD_REVISIONS,
  NEW_REVISIONS
from 
  REVISIONS t,
  table(cast(multiset(
                        select level
                        from dual
                        connect by  level <= length (regexp_replace(t.OLD_REVISIONS, '[^,]+'))  + 1
                      ) as sys.OdciNumberList
             )
       ) levels_old,
  table(cast(multiset(
                        select level
                        from dual
                        connect by  level <= length (regexp_replace(t.NEW_REVISIONS, '[^,]+'))  + 1
                     )as sys.OdciNumberList
            )
       ) levels_new
group by t.ROWID,
  OLD_REVISIONS,
  NEW_REVISIONS
having max(to_number(trim(regexp_substr(t.OLD_REVISIONS, '[^,]+', 1, levels_old.column_value)))) >
       max(to_number(trim(regexp_substr(t.new_REVISIONS, '[^,]+', 1, levels_new.column_value))))

This uses a double string split to pick the values from every field, and then simply finds the rows where the max values among the two collections match your requirement. You should edit this by adding some unique key in the GROUP BYclause, or a rowid if you don't have any unique key on your table.

Upvotes: 1

hemalp108
hemalp108

Reputation: 1249

You can compare every value by putting together the revisions in the same order using listagg function.

SELECT listagg(o,',') WITHIN GROUP (ORDER BY o) old_revisions,
       listagg(n,',')  WITHIN GROUP (ORDER BY n) new_revisions
FROM (
     SELECT DISTINCT  rowid r,
            regexp_substr(old_revisions, '[^,]+', 1, LEVEL) o,
            regexp_substr(new_revisions, '[^,]+', 1, LEVEL) n
     FROM   table
     WHERE  regexp_substr(old_revisions, '[^,]+', 1, LEVEL) IS NOT NULL
     CONNECT BY LEVEL<=(SELECT greatest(MAX(regexp_count(old_revisions,',')),MAX(regexp_count(new_revisions,',')))+1 c FROM table)
     )
GROUP BY r
HAVING listagg(o,',') WITHIN GROUP (ORDER BY o)<>listagg(n,',') WITHIN GROUP (ORDER BY n); 

Upvotes: 1

Related Questions