sameh.q
sameh.q

Reputation: 1709

How to check if a string matches multiple conditions in Oracle using regular expressions?

After struggling with regular expressions, I've came up with this pattern ^(ABC_)\w*(_USER[0-9]*)\w*(_MOD_)\w* that match this kind of word

If the string starts with ABC_ and contains _USER with any number following it, and also contains the word _MOD_ after that

Example of a matching strings:

ABC_sssss_USER0000000000_sssss_MOD_sssss

ABC_SCssB_USER0332_MOD_REG_SP

tested in this tool: http://www.regexpal.com/

but I cant get it work in oracle sql

Here is my testing code:

SELECT
    OBJECT_NAME,
    REGEXP_INSTR(OBJECT_NAME, '^(ABC_)\w*(_USER[0-9]*)\w*(_MOD_)\w*') AS IS_MATCH
FROM
    (

        SELECT 'ABC_SCssB_USER0332_MOD_REG_SP' OBJECT_NAME FROM DUAL UNION
        SELECT 'ABC_SCssB_USER0332_REG_SP' FROM DUAL UNION
        SELECT 'SCssB_USER0332_MOD_REG_SP' FROM DUAL UNION
        SELECT 'ABC_SCssB_MOD_REG_SP' FROM DUAL
    )

Result:

ABC_SCssB_MOD_REG_SP            0
ABC_SCssB_USER0332_MOD_REG_SP   0
ABC_SCssB_USER0332_REG_SP       0
SCssB_USER0332_MOD_REG_SP       0

Expected Result:

ABC_SCssB_MOD_REG_SP            0
ABC_SCssB_USER0332_MOD_REG_SP   1
ABC_SCssB_USER0332_REG_SP       0
SCssB_USER0332_MOD_REG_SP       0

How can I achieve that in oracle ?

Upvotes: 1

Views: 1506

Answers (2)

Tony Andrews
Tony Andrews

Reputation: 132580

If regular expressions are not mandated you could do this, assuming you need 1 or more digits after '_USER':

select
    object_name,
    case when translate(OBJECT_NAME, '#0123456789', ' ##########') 
              like 'ABC\_%\_USER#%\_MOD\_%' escape '\' 
         then 1 
         else 0 
         end as is_match
from
    (
        select 'ABC_SCssB_USER0332_MOD_REG_SP' object_name from dual union
        select 'ABC_SCssB_USER0332_REG_SP' from dual union
        select 'SCssB_USER0332_MOD_REG_SP' from dual union
        select 'ABC_SCssB_MOD_REG_SP' from dual
    );

This runs a bit quicker than the regexp version for me (on 12.1.0.1.0) - about 75% of the time taken by the regexp version.

If there can be 0 or more digits after '_USER' then this will do:

select
    object_name,
    case when OBJECT_NAME like 'ABC\_%\_USER%\_MOD\_%' escape '\'
         then 1
         else 0
         end as is_match
from
    (
        select 'ABC_SCssB_USER0332_MOD_REG_SP' object_name from dual union
        select 'ABC_SCssB_USER0332_REG_SP' from dual union
        select 'SCssB_USER0332_MOD_REG_SP' from dual union
        select 'ABC_SCssB_MOD_REG_SP' from dual
    );

Upvotes: 2

Timekiller
Timekiller

Reputation: 3126

Ok, so it turns out it will work if you change \w* to .*. It's still not clear what causes \w to fail, though.

I have once encountered non-latin ranges in character classes (like [A-z] but for Cyrillic, [А-я]) not working properly because of NLS_SORT settings. perhaps something similar is affecting \w?

@simsim, please post your exact database version and NLS settings, so that we could try to get to the root of the problem and make this question more useful to others.

EDIT:

The reason turns out to be much simpler - database version 10.1 is the culprit, regexp support was just added in 10g and \w is simply not supported in this version. My instance is 10.2, and "perl-influenced extensions" were only added in 10.2 - see this table for a full list of things that were added, and this link to see what's available in 10.1. Be aware that you also don't have support for non-greedy quantifiers (.*?, .+?) or similar character classes like \d.

Upvotes: 1

Related Questions