Reputation: 27
Hello folks & fellow developers,
I am working on an Oracle DB and have some XML data stored in blob format, The XML has:
a) a formula (eg. %1 - %2
, or %2||'-'||%1
)
b) a delimited list of variables that fit into the formula (eg. SALE_Q1| SALE_Q2
or YEAR| MONTH
)
I have managed to pull this data into 2 different columns (can be merged into 1 column if needed too) and what I need to do is get the output column as a superimposition of the variables onto the placeholders.
(eg. SALE_Q1 - SALE_Q2 , or MONTH||'-'||YEAR
)
there are also a few caveats to this like:
we can consider the data coming from a query like: SELECT formula || ', ' || columns_used FROM data_table;
for the input string
and the output i am currently getting is like:
%1||'-'||%2||%3, SITE_NO| SITE_NAME| COUNTRY
0.1 * %1, WIND_RES
%1, TOTAL
CASE WHEN LENGTH(%1) < 8 THEN NULL ELSE TO_DATE(%1,'yyyymmdd')END, MIN_DATE
%1(+)=%3 and %2(+)=%4, ABC| LMN| PQR| XYZ
I am pretty new to SQL and this requirement is going over my head, any help will be highly appreciated. I need an SQL solution as a PL/SQL solution is not feasible in my requirement (this script is going to pull data from one DB and feed it to another repo on a regular basis)
I have seen some articles on XML table, model, or recursive regexp but I am not sure how I can use these. I have also seen this question on StackOverflow but my requirement is a little different from that and is a bit trickier as well. getting the formula string and the variables into the same string for processing like that question is also a plausible route. Any solution that can be written in an SQL query will be very helpful.
some more examples for your consideration:
"%1||'-'||%2||%3, SITE_NO| SITE_NAME| COUNTRY" => "SITE_NO||'-'||SITE_NAME||COUNTRY"
"0.1 * %1, WIND_RES" => "0.1 * WIND_RES"
"%1, TOTAL" => "TOTAL"
"CASE WHEN LENGTH(%1) < 8 THEN NULL ELSE TO_DATE(%1,'yyyymmdd')END, MIN_DATE" => "CASE WHEN LENGTH(MIN_DATE) < 8 THEN NULL ELSE TO_DATE(MIN_DATE,'yyyymmdd')END"
"%1(+)=%3 and %2(+)=%4, ABC| LMN| PQR| XYZ" => "ABC(+)=PQR and LMN(+)=XYZ"
Upvotes: 0
Views: 1448
Reputation: 167981
I am assuming that the last comma is the delimiter between the template string and the delimited terms. You can use a recursive sub-query factoring clause and simple string functions:
WITH terms ( value, terms, num_terms ) AS (
SELECT SUBSTR( value, 1, INSTR( value, ', ', -1 ) - 1 ),
SUBSTR( value, INSTR( value, ', ', -1 ) + 2 ),
REGEXP_COUNT(
SUBSTR( value, INSTR( value, ', ', -1 ) + 2 ),
'.+?(\| |$)'
)
FROM table_name
),
term_bounds ( rn, value, terms, start_pos, lvl ) AS (
SELECT ROWNUM,
REPLACE(
value,
'%' || num_terms,
CASE num_terms
WHEN 1
THEN terms
ELSE SUBSTR( terms, INSTR( terms, '| ', 1, num_terms - 1 ) + 2 )
END
),
terms,
CASE
WHEN num_terms > 1
THEN INSTR( terms, '| ', 1, num_terms - 1 )
ELSE 1
END,
num_terms
FROM terms
UNION ALL
SELECT rn,
REPLACE(
value,
'%' || (lvl - 1),
CASE lvl - 1
WHEN 1
THEN SUBSTR( terms, 1, start_pos - 1 )
ELSE SUBSTR(
terms,
INSTR( terms, '| ', 1, lvl - 2 ) + 2,
start_pos - INSTR( terms, '| ', 1, lvl - 2 ) - 2
)
END
),
terms,
CASE
WHEN lvl > 2
THEN INSTR( terms, '| ', 1, lvl - 2 )
ELSE 1
END,
lvl - 1
FROM term_bounds
WHERE lvl > 1
)
SEARCH DEPTH FIRST BY rn SET rn_order
SELECT value
FROM term_bounds
WHERE lvl = 1;
Which, for the sample data:
CREATE TABLE table_name ( value ) AS
SELECT '%1||'-'||%2||%3, SITE_NO| SITE_NAME| COUNTRY' FROM DUAL UNION ALL
SELECT '0.1 * %1, WIND_RES' FROM DUAL UNION ALL
SELECT '%1, TOTAL' FROM DUAL UNION ALL
SELECT 'CASE WHEN LENGTH(%1) < 8 THEN NULL ELSE TO_DATE(%1,'yyyymmdd')END, MIN_DATE' FROM DUAL UNION ALL
SELECT '%1(+)=%3 and %2(+)=%4, ABC| LMN| PQR| XYZ' FROM DUAL UNION ALL
SELECT '%1, %2, %3, %4, %5, %6, %7, %8, %9, %10, %11, ONE| TWO| THREE| FOUR| FIVE| SIX| SEVEN| EIGHT| NINE| TEN| ELEVEN' FROM DUAL UNION ALL
SELECT '%%%%%%%7, HELLO| 1| 2| 3| 4| 5| 6' FROM DUAL
Outputs:
| VALUE | | :----------------------------------------------------------------------------------------- | | SITE_NO||'-'||SITE_NAME||COUNTRY | | 0.1 * WIND_RES | | TOTAL | | CASE WHEN LENGTH(MIN_DATE) < 8 THEN NULL ELSE TO_DATE(MIN_DATE,'yyyymmdd')END | | ABC(+)=PQR and LMN(+)=XYZ | | ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, ELEVEN | | HELLO |
db<>fiddle here
Or as a PL/SQL function:
CREATE FUNCTION substitute_values(
i_value IN VARCHAR2,
i_terms IN VARCHAR2,
i_term_delimiter IN VARCHAR2 DEFAULT '| '
) RETURN VARCHAR2 DETERMINISTIC
IS
TYPE term_list_type IS TABLE OF VARCHAR2(200);
v_start PLS_INTEGER := 1;
v_end PLS_INTEGER;
v_index PLS_INTEGER;
v_terms term_list_type := term_list_type();
v_output VARCHAR2(4000) := i_value;
BEGIN
LOOP
v_end := INSTR(i_terms, i_term_delimiter, v_start);
v_terms.EXTEND;
IF v_end > 0 THEN
v_terms(v_terms.COUNT) := SUBSTR(i_terms, v_start, v_end - v_start);
ELSE
v_terms(v_terms.COUNT) := SUBSTR(i_terms, v_start);
EXIT;
END IF;
v_start := v_end + LENGTH(i_term_delimiter);
END LOOP;
LOOP
v_index := TO_NUMBER(REGEXP_SUBSTR(v_output, '^(.*?)%(\d+)', 1, 1, NULL, 2));
IF v_index IS NULL THEN
RETURN v_output;
ELSIF v_index > v_terms.COUNT THEN
RETURN NULL;
END IF;
v_output := REGEXP_REPLACE(v_output, '^(.*?)%(\d+)', '\1' || v_terms(v_index));
END LOOP;
END;
/
Then:
SELECT SUBSTITUTE_VALUES(
SUBSTR(value, 1, INSTR(value, ', ', -1) - 1),
SUBSTR(value, INSTR(value, ', ', -1) + 2)
) AS value
FROM table_name
Outputs:
VALUE SITE_NO||'-'||SITE_NAME||COUNTRY 0.1 * WIND_RES TOTAL CASE WHEN LENGTH(MIN_DATE) < 8 THEN NULL ELSE TO_DATE(MIN_DATE,'yyyymmdd')END ABC(+)=PQR and LMN(+)=XYZ ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, ELEVEN HELLO
db<>fiddle here
Upvotes: 1