Reputation: 748
I come here after checking this topic: Regular Expression that includes plus sign and decimal
my main problem is the following, I have wrong migrated data that needs fixing. for phone numbers we have some letters and special chars that need to be removed.
imagine for example this would be the data:
s#sdads+#09232342a
I want to get only the + and the numbers The + must be optional.
http://www.techonthenet.com/oracle/functions/regexp_substr.php https://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm
if I read the documentation from oracle this should be quite easy but I can't make it work, I must be missing something basic. any idea?
below what some examples of what I've tried:
select regexp_substr('s#sdads+#09232342a','\+*[0-9]+') from dual;
returns
09232342
I was expecting it to consider the + I also tried putting the + in the beginning of the test string and no difference. The same for changing the regexp to
[\+]*[0-9]+ or [+]*[0-9]+ or (\+)*(\d)+
and there was no difference in the result.
the part that gets me confused is that if I change the test string and the regexp this returns + and only plus
select regexp_substr('+s#sdads+#09232342a','?[\+]*') from dual;
but if I move the + out of the beginning is gone
also if I put the numbers there, only the numbers are considered:
select regexp_substr('+s#sdads+#09232342a','?[\+]*(\d)+') from dual;
this also didn't work:
select regexp_substr('+s#sdads+#09232342a','?[\+]{0,1}(\d)+') from dual;
Upvotes: 0
Views: 116
Reputation:
The problem is that in all your tries, you are looking for + followed IMMEDIATELY by digits. Something like this should work:
select regexp_substr('+s#sdads+#09232342a', '\+') ||
regexp_substr('+s#sdads+#09232342a', '\d+') from dual;
Upvotes: 1
Reputation: 168051
You have a #
character between the +
and the digits that you need to account for:
Query 1 - Replace all non-desired characters:
SELECT REGEXP_REPLACE( '+s#sdads+#09232342a', '[^+0-9]' )
FROM DUAL
Output:
++09232342
Query 2 - Match the entire string:
SELECT REGEXP_REPLACE( '+s#sdads+#09232342a', '^.*?(\+?)#(\d+).*$', '\1\2' )
FROM DUAL
Output:
+09232342
Query 3 - Match the sub-string:
SELECT REGEXP_SUBSTR( '+s#sdads+#09232342a', '\+?#\d+' )
FROM DUAL
(You could then replace the extraneous #
by wrapping that statement in another replace)
Output:
+#09232342
Upvotes: 1