Miguel Costa
Miguel Costa

Reputation: 748

oracle phone number format reg exp in wrong data

I come here after checking this topic: Regular Expression that includes plus sign and decimal

my main problem is the following, I have wrong migrated data that needs fixing. for phone numbers we have some letters and special chars that need to be removed.

imagine for example this would be the data:

s#sdads+#09232342a

I want to get only the + and the numbers The + must be optional.

http://www.techonthenet.com/oracle/functions/regexp_substr.php https://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm

if I read the documentation from oracle this should be quite easy but I can't make it work, I must be missing something basic. any idea?

below what some examples of what I've tried:

select regexp_substr('s#sdads+#09232342a','\+*[0-9]+') from dual;

returns

09232342

I was expecting it to consider the + I also tried putting the + in the beginning of the test string and no difference. The same for changing the regexp to

[\+]*[0-9]+  or [+]*[0-9]+ or (\+)*(\d)+

and there was no difference in the result.

the part that gets me confused is that if I change the test string and the regexp this returns + and only plus

select regexp_substr('+s#sdads+#09232342a','?[\+]*') from dual;

but if I move the + out of the beginning is gone

also if I put the numbers there, only the numbers are considered:

select regexp_substr('+s#sdads+#09232342a','?[\+]*(\d)+') from dual;

this also didn't work:

select regexp_substr('+s#sdads+#09232342a','?[\+]{0,1}(\d)+') from dual;

Upvotes: 0

Views: 116

Answers (2)

user5683823
user5683823

Reputation:

The problem is that in all your tries, you are looking for + followed IMMEDIATELY by digits. Something like this should work:

select regexp_substr('+s#sdads+#09232342a', '\+') || 
       regexp_substr('+s#sdads+#09232342a', '\d+') from dual;

Upvotes: 1

MT0
MT0

Reputation: 168051

You have a # character between the + and the digits that you need to account for:

Query 1 - Replace all non-desired characters:

SELECT REGEXP_REPLACE( '+s#sdads+#09232342a', '[^+0-9]' )
FROM   DUAL

Output:

++09232342

Query 2 - Match the entire string:

SELECT REGEXP_REPLACE( '+s#sdads+#09232342a', '^.*?(\+?)#(\d+).*$', '\1\2' )
FROM   DUAL

Output:

+09232342

Query 3 - Match the sub-string:

SELECT REGEXP_SUBSTR( '+s#sdads+#09232342a', '\+?#\d+' )
FROM   DUAL

(You could then replace the extraneous # by wrapping that statement in another replace)

Output:

+#09232342

Upvotes: 1

Related Questions