Omari Victor Omosa
Omari Victor Omosa

Reputation: 2879

Regex capture first words before character subsequently - oracle

How do i capture the word before the = sign, there is another equal sign in the URL which i do not want to capture

 SELECT
  REGEXP_SUBSTR('TType=SEND|Status=OK|URL=min://j?_a=3&ver=1.1|day=3',
                '([^|]+)=', 1,1,NULL,1) "TType"    
  FROM DUAL;

From the above string TType=SEND|Status=OK|URL=min://j?_a=3&ver=1.1|day=3

I only want TType,Status,URL,day

Note: The string has a pipe delimiter | for the parameters

Upvotes: 1

Views: 83

Answers (2)

user5683823
user5683823

Reputation:

You could do something like this:

select  level as ord,
        regexp_substr('TType=SEND|Status=OK|URL=min://j?_a=3&ver=1.1|day=3',
                      '(\||^)([^|=]*)', 1,level,null,2) as token    
from    dual
connect by level <= 
        regexp_count('TType=SEND|Status=OK|URL=min://j?_a=3&ver=1.1|day=3',
                     '(\||^)([^|=]*)')
;

ORD TOKEN   
--- --------
  1 TType   
  2 Status  
  3 URL     
  4 day 

In a more complex situation (where you may have several input strings, distinguished by an ID of some sort), the query is a bit more complex, but the "regexp" part is the same.

Note that this query finds all the "attribute names", not just the first one. The regular expression searches for each "attribute-value pair" beginning (either the beginning of the string or a pipe character, (\||^) - note that the verbatim pipe character must be escaped), and from there, it takes the longest continuous substring of characters other than pipe character and equal sign ([^|=]*) and it returns the second capture group.

In particular, this means that if a "pair" doesn't have an equal sign, the whole substring ("between pipe characters") will be returned. For example: if the input is

'TType=SEND|Status=OK|blah blah|day=3'

then the third token returned by the query will be all of 'blah blah'. That can be changed as needed (or, alternatively, if you know that all such substings always contain an equal sign, the issue is moot).

Upvotes: 0

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 627535

You can use

SELECT
  TRIM(',' FROM REGEXP_REPLACE('TType=SEND|Status=OK|URL=min://j?_a=3&ver=1.1|day=3',
                '=[^|]+($|\|)', ',')) "TType"    
  FROM DUAL

The REGEXP_REPLACE will find and replace with a comma all occurrences of a =[^|]+($|\|) pattern that matches

  • = - a = char
  • [^|]+ - one or more chars other than |
  • ($|\|) - either end of string or |.

See an online demo.

Upvotes: 1

Related Questions