Reputation: 1250
I have an issue of regex_substr not honoring the null value.
select
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]+', 1, 1) AS phn_nbr,
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]+', 1, 2) AS phn_pos,
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]+', 1, 3) AS phn_typ,
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]+', 1, 4) AS phn_strt_dt,
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]+', 1, 5) AS phn_end_dt,
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]+', 1, 6) AS pub_indctr
from dual;
If the phn_end_dt is null and pub_indctr is not null, the values of pub_indctr are shifted to phn_end_dt.
Result:-
PHN_NBR PHN_POS PHN_TYP PHN_STRT_DT PHN_END_DT PUB_INDCTR
---------- ------- ------- ----------- ---------- ------------
2035197553 2 S 14-JUN-14 P
While it should be
PHN_NBR PHN_POS PHN_TYP PHN_STRT_DT PHN_END_DT PUB_INDCTR
---------- ------- ------- ----------- ---------- ------------
2035197553 2 S 14-JUN-14 P
Any suggestions ?
Upvotes: 3
Views: 6599
Reputation: 11
I have a generic use case where I don't know the exact columns coming in the string. I thus used below code which solved the purpose.
function substring_specific_occurence(p_string varchar2
,p_delimiter varchar2
,p_occurence number) return varchar2
is
l_output varchar2(2000);
g_miss_char varchar2(20) := 'fdkjkjhkuhhf7';
l_string varchar2(10000) := replace(p_string,p_delimiter||p_delimiter,''||p_delimiter||g_miss_char||p_delimiter||'' );
begin
while (l_string like '%'||p_delimiter||p_delimiter||'%' )
loop
l_string := replace(l_string,p_delimiter||p_delimiter,''||p_delimiter||g_miss_char||p_delimiter||'');
end loop;
select regexp_substr(l_string,'[^'||p_delimiter||']+',1,p_occurence)
into l_output
from dual;
return replace(l_output,g_miss_char);
end substring_specific_occurence;
Upvotes: 1
Reputation: 1250
Thanks for pointing me in the right direction, I have used this to solve the issue.
SELECT REGEXP_SUBSTR (val, '([^,]*),|$', 1, 1, NULL, 1) phn_nbr ,
REGEXP_SUBSTR (val, '([^,]*),|$', 1, 2, NULL, 1) phn_pos ,
REGEXP_SUBSTR (val, '([^,]*),|$', 1, 3, NULL, 1) phn_typ ,
REGEXP_SUBSTR (val, '([^,]*),|$', 1, 4, NULL, 1) phn_strt_dt ,
REGEXP_SUBSTR (val, '([^,]*),|$', 1, 5, NULL, 1) phn_end_dt ,
REGEXP_SUBSTR (val
|| ',', '([^,]*),|$', 1, 6, NULL, 1) pub_indctr
FROM
(SELECT '2035197553,2,S,14-JUN-14,,P' val FROM dual
);
Oracle Version:- Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
Upvotes: 1
Reputation: 10360
I'm afraid your accepted answer does not handle the case where you need the value after the null position (try to get the 6th field):
SQL> select REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]*', 1, 6) phn_end
_dt
2 from dual;
P
-
You need to do this instead I believe (works on 11g):
SQL> select REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '([^,]*)(,|$)', 1, 6,
NULL, 1) phn_end_dt
2 from dual;
P
-
P
I just discovered this after posting my own question: REGEX to select nth value from a list, allowing for nulls
Upvotes: 4
Reputation: 4660
The typical csv parsing approach is as follows:
WITH t(csv_str) AS
( SELECT '2035197553,2,S,14-JUN-14,,P' FROM dual
UNION ALL
SELECT '2035197553,2,S,14-JUN-14,,' FROM dual
)
SELECT LTRIM(REGEXP_SUBSTR (','
|| csv_str, ',[^,]*', 1, 1), ',') AS phn_nbr,
LTRIM(REGEXP_SUBSTR (','
|| csv_str, ',[^,]*', 1, 2), ',') AS phn_pos,
LTRIM(REGEXP_SUBSTR (','
|| csv_str, ',[^,]*', 1, 3), ',') AS phn_typ,
LTRIM(REGEXP_SUBSTR (','
|| csv_str, ',[^,]*', 1, 4), ',') AS phn_strt_dt,
LTRIM(REGEXP_SUBSTR (','
|| csv_str, ',[^,]*', 1, 5), ',') AS phn_end_dt,
LTRIM(REGEXP_SUBSTR (','
|| csv_str, ',[^,]*', 1, 6), ',') AS pub_indctr
FROM t
I like to place a comma preceeding my csv and then I would count the commas with the non-comma pattern.
Explanation of the search pattern
The search pattern looks for the nth substring (nth corresponds with the nth element in the csv) which has the following:
-The pattern begins with a ',
'
-Next, it is followed by the pattern, '[^,]
'. This is just a non-matching list expression. The caret, ^
, conveys that the characters following in the list should not be matched.
-This non-matching list of characters has the quantifier, *
, which means this can occur 0 or more times.
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Once a match is found, I would also use the LTRIM function to remove the comma after I used the reg expression.
What is nice about this approach is the occurrence of the search pattern will always correspond with the occurences of the comma.
Upvotes: 2
Reputation: 6476
You can solve your task like this:
with t(val) as (
select '2035197553,2,S,14-JUN-14,,P' from dual
), t1 (val) as (
select ',' || val || ',' from t
)
select substr(val, REGEXP_INSTR(val, ',', 1, 1) + 1, REGEXP_INSTR(val, ',', 1, 1 + 1) - REGEXP_INSTR(val, ',', 1, 1) - 1) a
, substr(val, REGEXP_INSTR(val, ',', 1, 2) + 1, REGEXP_INSTR(val, ',', 1, 2 + 1) - REGEXP_INSTR(val, ',', 1, 2) - 1) b
, substr(val, REGEXP_INSTR(val, ',', 1, 3) + 1, REGEXP_INSTR(val, ',', 1, 3 + 1) - REGEXP_INSTR(val, ',', 1, 3) - 1) c
, substr(val, REGEXP_INSTR(val, ',', 1, 4) + 1, REGEXP_INSTR(val, ',', 1, 4 + 1) - REGEXP_INSTR(val, ',', 1, 4) - 1) d
, substr(val, REGEXP_INSTR(val, ',', 1, 5) + 1, REGEXP_INSTR(val, ',', 1, 5 + 1) - REGEXP_INSTR(val, ',', 1, 5) - 1) e
, substr(val, REGEXP_INSTR(val, ',', 1, 6) + 1, REGEXP_INSTR(val, ',', 1, 6 + 1) - REGEXP_INSTR(val, ',', 1, 6) - 1) f
from t1
A B C D E F
-------------------------------------
2035197553 2 S 14-JUN-14 - P
Upvotes: 2
Reputation: 174756
You need to change this line,
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]+', 1, 5) AS phn_end_dt,
to,
REGEXP_SUBSTR ('2035197553,2,S,14-JUN-14,,P', '[^,]*', 1, 5) AS phn_end_dt,
^
[^,]+
means it matches any character not of ,
one or more times. [^,]*
means it matches any character not of ,
zero or more times. So [^,]+
assumes that there must be a single character not of ,
would present. But really there isn't , by changing +
to *
makes the regex engine to match a empty character.
Upvotes: 1