Z.Szymon
Z.Szymon

Reputation: 337

Oracle Database, extract string beeing between two other strings

I need a regexp that's combined with regexp_substr() would give me the word being between two other specified words.

Example:

source_string => 'First Middle Last'

    substring varchar2(100);
    substring := regexp_substr(source_string, 'First (.*) Last'); <=== 

this doesn't work :(.

dbms_output.put_line(substring) ===> output should be: 'Middle'

I know it looks simple and to be honest, at the beginning I thought the same. But now after spending about 3h for searching for a solution I give up...

Upvotes: 0

Views: 2565

Answers (3)

Barbaros &#214;zhan
Barbaros &#214;zhan

Reputation: 65408

(\S*) pattern might be used with regexp_replace and regexp_substr as in the following way to get the middle word :

with t(str) as
(
 select 'First Middle Last' from dual
)
select regexp_substr(trim(regexp_replace(str, '^(\S*)', '')),'(\S*)') 
    as "Result String"
  from t;

Result String
-------------
Middle    

in the first step First, and in the second one Last words are trimmed.

Or, More directly you can figure out by using regexp_replace as

with t(str) as
(
 select 'First Middle Last' from dual
)
select regexp_replace(str,'(.*) (.*) (.*)','\2') 
    as "Result String"
  from t;

Result String
-------------
Middle    

Upvotes: 0

Littlefoot
Littlefoot

Reputation: 143083

If source string always looks the same, i.e. consists of 3 elements (words), then such a simple regular expression does the job:

SQL> with t (str) as
  2    (select 'First Middle Last' from dual)
  3  select regexp_substr(str, '\w+', 1, 2) result from t;

RESULT
------
Middle

SQL>

Upvotes: 1

Ben
Ben

Reputation: 52913

It's not working because the literal strings 'First' and 'Last' are being looked for. Assuming that the strings don't all literally begin 'First' you need to find another way to represent them. You've already done this by representing 'Middle' as (.*)

The next point is that you need to extract a sub-expression (the part in parenthesis), this is the 6th parameter of REGEXP_SUBSTR().

If you put these together then the following gives you what you want:

regexp_substr(source_string, '.*\s(.*)\s.*', 1, 1, 'i', 1)

An example of it working:

SQL> select regexp_substr('first middle last', '.*\s(.*)\s.*', 1, 1, 'i', 1)
  2    from dual;

REGEXP
------
middle

You can also use an online regex tester to validate that 'middle' is the only captured group.

Depending on what your actual source strings look like you may not want to search for exactly spaces, but use \W (a non-word character) instead.

If you're expecting exactly three words I'd also anchor your expression to the start and end of the string: ^.*\s(.*)\s.*$

Upvotes: 2

Related Questions