Gayan Dasanayake
Gayan Dasanayake

Reputation: 2021

How to match regular expression and return a substring of the match

I want to match by regular expression but get a substring of it.

For example, I would like to get the string literals in the input string without the double quotes

DECLARE
  TYPE STRING_TABLE_TYPE IS TABLE OF VARCHAR2(32000) INDEX BY BINARY_INTEGER;
  string_table_  STRING_TABLE_TYPE;
  input_string_  VARCHAR2(32000);
  regex_pattern_ VARCHAR2(2000);
BEGIN
  regex_pattern_ := '"[^\"]*"';
  input_string_  := '{"ID", type date},
                     {"Reporting Date", type date},
                     {"Reporting Month No", Int64.Type},
                     {"Reporting Quarter", Int64.Type}';

  -- Extract substrings        
  SELECT REGEXP_SUBSTR(input_string_, regex_pattern_, 1, LEVEL, 'inm') text
  BULK COLLECT INTO string_table_
  FROM DUAL
  CONNECT BY REGEXP_SUBSTR(input_string_, regex_pattern_, 1, LEVEL, 'inm') IS NOT NULL;

  -- Printout result
  IF string_table_(1) IS NOT NULL THEN
     FOR i_ IN 1..string_table_.COUNT LOOP
        dbms_output.put_line(string_table_(i_));
     END LOOP;
  ELSE
     dbms_output.put_line('No substrings found!');
  END IF;
END;

The output I get is:

"ID"
"Reporting Date"
"Reporting Month No"
"Reporting Quarter"

I want to know if there is a regular expression pattern so that I get this directly

ID
Reporting Date
Reporting Month No
Reporting Quarter

Upvotes: 1

Views: 245

Answers (2)

Gayan Dasanayake
Gayan Dasanayake

Reputation: 2021

It seems from 11g onwards there is a proper solution in REGEXP_SUBSTR.

We need to use subexpressions within the regex. aka capture groups.

For example, since we are only interested in the part within double quotes, that part needs to be enclosed within parentheses and made a subexpression.

regex_pattern_ := '"([^\"]*)"';

Then when calling we need to specify which subexpression we need as the last parameter.

REGEXP_SUBSTR(input_string_, regex_pattern_, 1, LEVEL, 'inm', 1)

Demo

Upvotes: 2

Nick
Nick

Reputation: 147266

Given Oracle's limited regex capability it's probably simplest to just TRIM the double quotes:

SELECT TRIM(BOTH '"' FROM REGEXP_SUBSTR(input_string_, regex_pattern_, 1, LEVEL, 'inm')) text
BULK COLLECT INTO string_table_
FROM DUAL
CONNECT BY TRIM(BOTH '"' FROM REGEXP_SUBSTR(input_string_, regex_pattern_, 1, LEVEL, 'inm')) IS NOT NULL;

Small demo on dbfiddle

Upvotes: 4

Related Questions