kunal kavthekar
kunal kavthekar

Reputation: 27

format string in oracle SQL to replace placeholders with variable names given at the end of the string

Hello folks & fellow developers,

I am working on an Oracle DB and have some XML data stored in blob format, The XML has:

a) a formula (eg. %1 - %2 , or %2||'-'||%1)

b) a delimited list of variables that fit into the formula (eg. SALE_Q1| SALE_Q2 or YEAR| MONTH)

I have managed to pull this data into 2 different columns (can be merged into 1 column if needed too) and what I need to do is get the output column as a superimposition of the variables onto the placeholders.

(eg. SALE_Q1 - SALE_Q2 , or MONTH||'-'||YEAR)

there are also a few caveats to this like:

  1. I don't know how many variables will each formula have and,
  2. the variables will not always be used in the formula in the same order as the delimited list (see eg.2)

we can consider the data coming from a query like: SELECT formula || ', ' || columns_used FROM data_table; for the input string and the output i am currently getting is like:

%1||'-'||%2||%3, SITE_NO| SITE_NAME| COUNTRY
0.1 * %1, WIND_RES
%1, TOTAL
CASE WHEN LENGTH(%1) < 8 THEN NULL ELSE TO_DATE(%1,'yyyymmdd')END, MIN_DATE
%1(+)=%3 and %2(+)=%4, ABC| LMN| PQR| XYZ

I am pretty new to SQL and this requirement is going over my head, any help will be highly appreciated. I need an SQL solution as a PL/SQL solution is not feasible in my requirement (this script is going to pull data from one DB and feed it to another repo on a regular basis)

I have seen some articles on XML table, model, or recursive regexp but I am not sure how I can use these. I have also seen this question on StackOverflow but my requirement is a little different from that and is a bit trickier as well. getting the formula string and the variables into the same string for processing like that question is also a plausible route. Any solution that can be written in an SQL query will be very helpful.

some more examples for your consideration:

"%1||'-'||%2||%3, SITE_NO| SITE_NAME| COUNTRY" => "SITE_NO||'-'||SITE_NAME||COUNTRY"

"0.1 * %1, WIND_RES" => "0.1 * WIND_RES"

"%1, TOTAL" => "TOTAL"

"CASE WHEN LENGTH(%1) < 8 THEN NULL ELSE TO_DATE(%1,'yyyymmdd')END, MIN_DATE" => "CASE WHEN LENGTH(MIN_DATE) < 8 THEN NULL ELSE TO_DATE(MIN_DATE,'yyyymmdd')END"

"%1(+)=%3 and %2(+)=%4, ABC| LMN| PQR| XYZ" => "ABC(+)=PQR and LMN(+)=XYZ"

Upvotes: 0

Views: 1448

Answers (1)

MT0
MT0

Reputation: 167981

I am assuming that the last comma is the delimiter between the template string and the delimited terms. You can use a recursive sub-query factoring clause and simple string functions:

WITH terms ( value, terms, num_terms ) AS (
  SELECT SUBSTR( value, 1, INSTR( value, ', ', -1 ) - 1 ),
         SUBSTR( value, INSTR( value, ', ', -1 ) + 2 ),
         REGEXP_COUNT(
           SUBSTR( value, INSTR( value, ', ', -1 ) + 2 ),
           '.+?(\| |$)'
         )
  FROM   table_name
),
term_bounds ( rn, value, terms, start_pos, lvl ) AS (
  SELECT ROWNUM,
         REPLACE(
           value,
           '%' || num_terms,
           CASE num_terms
           WHEN 1
           THEN terms
           ELSE SUBSTR( terms, INSTR( terms, '| ', 1, num_terms - 1 ) + 2 )
           END
         ),
         terms,
         CASE
         WHEN num_terms > 1
         THEN INSTR( terms, '| ', 1, num_terms - 1 )
         ELSE 1
         END,
         num_terms
  FROM   terms
UNION ALL
  SELECT rn,
         REPLACE(
           value,
           '%' || (lvl - 1),
           CASE lvl - 1
           WHEN 1
           THEN SUBSTR( terms, 1, start_pos - 1 )
           ELSE SUBSTR(
                  terms,
                  INSTR( terms, '| ', 1, lvl - 2 ) + 2,
                  start_pos - INSTR( terms, '| ', 1, lvl - 2 ) - 2
                )
           END
         ),
         terms,
         CASE
         WHEN lvl > 2
         THEN INSTR( terms, '| ', 1, lvl - 2 )
         ELSE 1
         END,
         lvl - 1
  FROM   term_bounds
  WHERE  lvl > 1
)
SEARCH DEPTH FIRST BY rn SET rn_order
SELECT value
FROM   term_bounds
WHERE  lvl = 1;

Which, for the sample data:

CREATE TABLE table_name ( value ) AS
SELECT '%1||'-'||%2||%3, SITE_NO| SITE_NAME| COUNTRY' FROM DUAL UNION ALL
SELECT '0.1 * %1, WIND_RES' FROM DUAL UNION ALL
SELECT '%1, TOTAL' FROM DUAL UNION ALL
SELECT 'CASE WHEN LENGTH(%1) < 8 THEN NULL ELSE TO_DATE(%1,'yyyymmdd')END, MIN_DATE' FROM DUAL UNION ALL
SELECT '%1(+)=%3 and %2(+)=%4, ABC| LMN| PQR| XYZ' FROM DUAL UNION ALL
SELECT '%1, %2, %3, %4, %5, %6, %7, %8, %9, %10, %11, ONE| TWO| THREE| FOUR| FIVE| SIX| SEVEN| EIGHT| NINE| TEN| ELEVEN' FROM DUAL UNION ALL
SELECT '%%%%%%%7, HELLO| 1| 2| 3| 4| 5| 6' FROM DUAL

Outputs:

| VALUE                                                                                      |
| :----------------------------------------------------------------------------------------- |
| SITE_NO||'-'||SITE_NAME||COUNTRY                                                 |
| 0.1 * WIND_RES                                                                             |
| TOTAL                                                                                      |
| CASE WHEN LENGTH(MIN_DATE) < 8 THEN NULL ELSE TO_DATE(MIN_DATE,'yyyymmdd')END |
| ABC(+)=PQR and LMN(+)=XYZ                                                                  |
| ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, ELEVEN                          |
| HELLO                                                                                      |

db<>fiddle here


Or as a PL/SQL function:

CREATE FUNCTION substitute_values(
  i_value          IN VARCHAR2,
  i_terms          IN VARCHAR2,
  i_term_delimiter IN VARCHAR2 DEFAULT '| '
) RETURN VARCHAR2 DETERMINISTIC
IS
  TYPE term_list_type IS TABLE OF VARCHAR2(200);
  v_start  PLS_INTEGER := 1;
  v_end    PLS_INTEGER;   
  v_index  PLS_INTEGER;
  v_terms  term_list_type := term_list_type();
  v_output VARCHAR2(4000) := i_value;
BEGIN
  LOOP
    v_end := INSTR(i_terms, i_term_delimiter, v_start);
    v_terms.EXTEND;
    IF v_end > 0 THEN
      v_terms(v_terms.COUNT) := SUBSTR(i_terms, v_start, v_end - v_start);
    ELSE
      v_terms(v_terms.COUNT) := SUBSTR(i_terms, v_start);
      EXIT;
    END IF;
    v_start := v_end + LENGTH(i_term_delimiter);
  END LOOP;
  
  LOOP
    v_index  := TO_NUMBER(REGEXP_SUBSTR(v_output, '^(.*?)%(\d+)', 1, 1, NULL, 2));
    IF v_index IS NULL THEN
      RETURN v_output;
    ELSIF v_index > v_terms.COUNT THEN
      RETURN NULL;
    END IF;
    v_output := REGEXP_REPLACE(v_output, '^(.*?)%(\d+)', '\1' || v_terms(v_index));
  END LOOP;
END;
/

Then:

SELECT SUBSTITUTE_VALUES(
         SUBSTR(value, 1, INSTR(value, ', ', -1) - 1),
         SUBSTR(value, INSTR(value, ', ', -1) + 2)
       ) AS value
FROM   table_name

Outputs:

VALUE
SITE_NO||'-'||SITE_NAME||COUNTRY
0.1 * WIND_RES
TOTAL
CASE WHEN LENGTH(MIN_DATE) < 8 THEN NULL ELSE TO_DATE(MIN_DATE,'yyyymmdd')END
ABC(+)=PQR and LMN(+)=XYZ
ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, ELEVEN
HELLO

db<>fiddle here

Upvotes: 1

Related Questions