Eryk Janocha
Eryk Janocha

Reputation: 21

How to use REGEXP_SUBSTR properly?

Currently in my select statement I have id and value. The value is json which looks like this:

{"layerId":"nameOfLayer","layerParams":{some unnecessary data}

I would like to have in my select id and nameOfLayer so the output would be for example:

1, layerName
2, layerName2
etc.

The json looks always the same so the layerID is the first. Could you tell me how can I use REGEXP_SUBSTR properly in my select query which looks like this now?

select
  id,
  value
from
  ...
where
  table1.id = table2.bookmark_id
  and ...;

Upvotes: 2

Views: 540

Answers (3)

Popeye
Popeye

Reputation: 35920

I think you can use regexp_substr like this:

regexp_substr(str, '[^"]+',1,2) as layer_id,
regexp_substr(str, '[^"]+',1,4) as layername

Db<>fiddle demo

Cheers!!

Upvotes: 0

MT0
MT0

Reputation: 168470

Don't use regular expressions; use a JSON_TABLE or JSON_VALUE to parse JSON:

Oracle 18c Setup:

CREATE TABLE test_data (
  id    INTEGER,
  value VARCHAR2(4000)
);

INSERT INTO test_data ( id, value )
SELECT 1, '{"layerId":"nameOfLayer","layerParams":{"some":"unnecessary data"}}' FROM DUAL UNION ALL
SELECT 2, '{"layerParams":{"layerId":"NOT THIS ONE!"},"layerId":"nameOfLayer"}' FROM DUAL UNION ALL
SELECT 3, '{"layerId":"Name with \"Quotes\"","layerParams":{"layerId":"NOT THIS ONE!"}}' FROM DUAL;

Query 1:

SELECT t.id,
       j.layerId
FROM   test_data t
       CROSS JOIN
       JSON_TABLE(
         t.value,
         '$'
         COLUMNS (
           layerId VARCHAR2(50) PATH '$.layerId'
         )
       ) j

Query 2:

If you only want a single value you could, alternatively, use JSON_VALUE:

SELECT id,
       JSON_VALUE( value, '$.layerId' ) AS layerId
FROM   test_data

Output:

Both output:

ID | LAYERID           
-: | :-----------------
 1 | nameOfLayer       
 2 | nameOfLayer       
 3 | Name with "Quotes"

Query 3:

You can try regular expressions but they do not always work as expected:

SELECT id,
       REPLACE(
         REGEXP_SUBSTR( value, '[{,]"layerId":"((\\"|[^"])*)"', 1, 1, NULL, 1 ),
         '\"',
         '"'
       ) AS layerID
FROM   test_data

Output:

ID | LAYERID           
-: | :-----------------
 1 | nameOfLayer       
 2 | NOT THIS ONE!     
 3 | Name with "Quotes"

So if you can guarantee that no-one is going to put data into the database where the JSON is in a different order then this may work; however the JSON specification allows key-value pairs to be in any order so regular expressions are not a general solution that will parse every JSON string. You should be using a proper JSON parser and there are 3rd party solutions available for Oracle 11g or you can upgrade to Oracle 12c where there is a native solution.

db<>fiddle here

Upvotes: 0

GMB
GMB

Reputation: 222632

In Oracle 11g, you can extract the layerId using the following regular expression, where js is the name of your JSON column:

regexp_replace(js, '^.*"layerId":"([^"]+).*$', '\1')

This basically extracts the string between double quotes after "layerId":.

In more recent versions, you would add a check constraint on the table to ensure that the document is valid JSON, and then use the dot notation to access the object attribute as follows:

create table mytable (
    id int primary key,
    js varchar2(200),
    constraint ensure_js_is_json check (js is json)
);

insert into mytable values (1, '{"layerId":"nameOfLayer","layerParams":{} }');

select id, t.js.layerId from mytable t;

Demo on DB Fiddle:

ID | LAYERID    
-: | :----------
 1 | nameOfLayer

Upvotes: 1

Related Questions