ajmalmhd04
ajmalmhd04

Reputation: 2602

Oracle regular expression split string from last occurence

I am still learning regexp in oracle I am stuck in middle, below is my sample code:

with t(val)
as
(
  --format: xyz_year_month_date
  select 'my_new_table_2015_06_31' from dual
  union all
  select 'my_new_table_temp_2016_06_31' from dual
 )
 select reverse(regexp_substr(reverse(val),'[^_]+',1,4)) col4,
 reverse(regexp_substr(reverse(val),'[^_]+',1,3)) col3,
 reverse(regexp_substr(reverse(val),'[^_]+',1,2)) col2,
 reverse(regexp_substr(reverse(val),'[^_]+',1,1)) col1
 from t;

Output:
COL4 COL3 COL2 COL1
table 2015 06 31
temp 2016 06 31

Expected output:
COL4 COL3 COL2 COL1
my_new_table 2015 06 31
my_new_table_temp 2016 06 31

Thanks in advance.

Upvotes: 0

Views: 2130

Answers (2)

collapsar
collapsar

Reputation: 17258

Adjust your col4 regex to match the remainder of the string. The last parameter indicates the capture group from the regex match, the second but last parameter is only syntactically required (it holds match parameters).

The outer substr call on cols 1-3 gets rid of the underscore that is part of the matches.

with t(val)
as
(
  --format: xyz_year_month_date
  select 'my_new_table_2015_06_31' from dual
  union all
  select 'my_new_table_temp_2016_06_31' from dual
 )
 select reverse(regexp_substr(reverse(val),'([^_]+_){3}(.*)',1,1,'',2)) col4,
 substr(reverse(regexp_substr(reverse(val),'[^_]+_',1,3)), 2) col3,
 substr(reverse(regexp_substr(reverse(val),'[^_]+_',1,2)), 2) col2,
 substr(reverse(regexp_substr(reverse(val),'[^_]+_',1,1)), 2) col1
 from t;

Upvotes: 0

MT0
MT0

Reputation: 168623

You can do it without the double reverse by extracting different capture groups (surrounded in round () brackets):

WITH t ( VAL ) AS (
  SELECT 'my_new_table_2015_06_31' FROM DUAL UNION ALL
  SELECT 'my_new_table_temp_2016_06_31' FROM DUAL
)
SELECT REGEXP_SUBSTR( val, '^(.*)_([^_]+)_([^_]+)_([^_]+)$', 1, 1, NULL, 1 ) AS COL4,
       REGEXP_SUBSTR( val, '^(.*)_([^_]+)_([^_]+)_([^_]+)$', 1, 1, NULL, 2 ) AS COL3,
       REGEXP_SUBSTR( val, '^(.*)_([^_]+)_([^_]+)_([^_]+)$', 1, 1, NULL, 3 ) AS COL2,
       REGEXP_SUBSTR( val, '^(.*)_([^_]+)_([^_]+)_([^_]+)$', 1, 1, NULL, 4 ) AS COL1
FROM   t

You could even make the regular expression much simpler by just using:

'^(.+)_(.+)_(.+)_(.+)$'

The first .+ is greedy so it will match as much as possible until there is only enough of the string left for the minimum matches on the 2nd - 4th capturing groups.

However, you don't need regular expressions:

WITH t ( VAL ) AS (
  SELECT 'my_new_table_2015_06_31' FROM DUAL UNION ALL
  SELECT 'my_new_table_temp_2016_06_31' FROM DUAL
)
SELECT SUBSTR( val, 1,        pos1 - 1        ) AS col4,
       SUBSTR( val, pos1 + 1, pos2 - pos1 - 1 ) AS col3,
       SUBSTR( val, pos2 + 1, pos3 - pos2 - 1 ) AS col2,
       SUBSTR( val, pos3 + 1                  ) AS col1
FROM   (
  SELECT val,
         INSTR( val, '_', -1, 1 ) AS pos3,
         INSTR( val, '_', -1, 2 ) AS pos2,
         INSTR( val, '_', -1, 3 ) AS pos1
  FROM   t
);

Upvotes: 4

Related Questions