Mrye
Mrye

Reputation: 727

Oracle SQL : Extract all matched string in each row

Suppose that I have this table MESSAGE with two columns ITEM and CONTENT

ITEM1 | Dear ${username}, you have changed your address to ${address}
ITEM2 | Hi ${username}, thank you for attending this event.

Currently, I want to display each word with this pattern ${variable}. The expected end result is

ITEM1 | ${username}, ${address}
ITEM2 | ${username}

Whats the correct Oracle SQL query to achieve this ? I have tried something like this but it just list out the content with matching regex.

SELECT ITEM, REGEXP_SUBSTR(CONTENT, '^.*\$\{.*\}.*$', 1, 1) FROM MESSAGE;

Upvotes: 0

Views: 1961

Answers (3)

user5683823
user5683823

Reputation:

with
  inputs ( str ) as (
    select 'Dear ${username}, you have changed your address to ${address}' 
    from   dual
  )
select rtrim(regexp_replace(str, '.*?(\${[^}]*}|$)', '\1, '), ', ') as vars 
from   inputs;

VARS                   
-----------------------
${username}, ${address}

Upvotes: 0

Kaushik Nayak
Kaushik Nayak

Reputation: 31666

You may use (\$\{.+?\}) - using ? makes it a non-greedy match of one or more characters between {}
() captures the group.

A connect by loop (with PRIOR and SYS_GUID()) with level is used to extract all possible matches in a line.

LISTAGG does the concatenation.

WITH message AS (
    SELECT
        'ITEM1' AS item,
        'Dear ${username}, you have changed your address to ${address}' AS content
    FROM
        dual
    UNION ALL
    SELECT
        'ITEM2',
        'Hi ${username}, thank you for attending this event.'
    FROM
        dual
) 
SELECT item,LISTAGG(content,',') WITHIN GROUP ( ORDER BY lvl )
FROM
(
SELECT
    item,
    regexp_substr(content,'(\$\{.+?\})',1,level,NULL,1) as content,
    level as lvl
  FROM
      message
CONNECT BY level <= regexp_count(content,'\$\{.+?\}')
           AND PRIOR item = item
           AND PRIOR sys_guid() IS NOT NULL
           ) GROUP BY item;

Demo

Upvotes: 3

Littlefoot
Littlefoot

Reputation: 142753

I'm not that good at regular expressions, see if this can help:

SQL> with message (item, content) as
  2    (select 'ITEM1', 'Dear ${username}, you have changed your address to ${address}' from dual union
  3     select 'ITEM2', 'Hi ${username}, thank you for attending this event.' from dual
  4    )
  5  select item,
  6    regexp_substr(content, '\$\{\w+}', 1, 1) the_first,
  7    regexp_substr(content, '\$\{\w+}', 1, 2) the_second
  8  from message;

ITEM  THE_FIRST            THE_SECOND
----- -------------------- --------------------
ITEM1 ${username}          ${address}
ITEM2 ${username}

SQL>

What does it do?

  • \$ is here to escape the dollar sign, because it is used as an anchor for the end of the string (so - escape it)
  • the same goes for \{ as curly brackets represent number of appearances (so - escape it)
  • \w+ takes the whole word
  • 1, 1 and 1, 2: starting from the first character, take the first (1, 1) or the second (1, 2) appearance of that expression

It is trivial to concatenate these two and separate them with a comma.

Upvotes: 0

Related Questions