Sibusiso Maphumulo
Sibusiso Maphumulo

Reputation: 13

Using Oracle REGEXP_SUBSTR to extract uppercase data separated by underscores

sample column data:

Failure on table TOLL_USR_TRXN_HISTORY:
Failure on table DOCUMENT_IMAGES:
Error in CREATE_ACC_STATEMENT() [line 16]

I am looking for a way to extract only the uppercase words (table names) separated by underscores. I want the whole table name, the maximum is 3 underscores and the minimum is 1 underscore. I would like to ignore any capital letters that are initcap.

Upvotes: 1

Views: 1874

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1270853

You can just use regexp_substr():

select regexp_substr(str, '[A-Z_]{3,}', 1, 1, 'c')
from (select 'Failure on table TOLL_USR_TRXN_HISTORY' as str from dual) x;

The pattern says to find substrings with capital letters or underscores, at least 3 characters long. The 1, 1 means start from the first position and return the first match. The 'c' makes the search case-sensitive.

Upvotes: 1

kanagaraj
kanagaraj

Reputation: 442

Alternate way to get only table name from below error message , the below query will work only if table_name at end in the mentioned way

with t as( select 'Failure on table TOLL_USR_TRXN_HISTORY:' as data from dual)
SELECT RTRIM(substr(data,instr(data,' ',-1)+1),':') from t

New Query for all messages :

 select  replace (replace ( 'Failure on table TOLL_USR_TRXN_HISTORY:
Failure on table DOCUMENT_IMAGES:' , 'Failure on table', ' ' ),':',' ') from dual

Upvotes: 0

Barbaros Özhan
Barbaros Özhan

Reputation: 65408

You may use such a SQL Select statement for each substituted individual line ( Failure on table TOLL_USR_TRXN_HISTORY in the below case ) from your text :

select regexp_replace(q.word, '[^a-zA-Z0-9_]+', '') as word 
  from
(
  select substr(str,nvl(lag(spc) over (order by lvl),1)+1*sign(lvl-1),
         abs(decode(spc,0,length(str),spc)-nvl(lag(spc) over (order by lvl),1))) word, 
         nvl(lag(spc) over (order by lvl),1) lg
    from
  (
    with tab as
      ( select 'Failure on table TOLL_USR_TRXN_HISTORY' str from dual )
        select instr(str,' ',1,level) spc, str, level lvl
          from tab 
       connect by level <= 10
  )
) q
where lg > 0
    and upper(regexp_replace(q.word, '[^a-zA-Z0-9_]+', '')) 
      = regexp_replace(q.word, '[^a-zA-Z0-9_]+', '')
  and ( nvl(length(regexp_substr(q.word,'_',1,1)),0) 
      + nvl(length(regexp_substr(q.word,'_',1,2)),0) 
      + nvl(length(regexp_substr(q.word,'_',1,3)),0)) > 0
  and   nvl(length(regexp_substr(q.word,'_',1,4)),0)  = 0;    

Upvotes: 0

Related Questions