Reputation: 13
I'm trying to datamine a table. The table contains a column where xml-messages are stored, using the CLOB datatype. The XML-message has the following structure:
<t:MsgNameXML typeInfo="source" xsi:schemaLocation="http://example/location FileName.xsd" xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace" >
<t:info1>data1</t:info1>
<t:info2>data2</t:info2>
<t:Section>
<t:SubSection>
<t:SubSubSection>
<t:variable1>data3</t:variable1>
<t:variable2>data4</t:variable2>
</t:SubSubSection>
</t:SubSection>
</t:Section>
<t:Section>
<t:SubSection>
<t:SubSubSection>
<t:variable1>data5</t:variable1>
<t:variable2>data6</t:variable2>
</t:SubSubSection>
</t:SubSection>
</t:Section>
</t:MsgNameXML>
So, at least for me, the main difficulties are the namespaces and the fact that there a multiple nodes.
I hope to get the following output (for the record containing the above XML):
INFO1 INFO2 VARIABLE1 VARIABLE2
-------------------------------------
data1 data2 data3 data4
data1 data2 data5 data6
I've constructed the following query to try to extract "variable1" (Oracle database, so PL SQL, I guess):
select
extractvalue(xmltype(CLOB_COLUMN_NAME),
'/t:MsgNameXML/t:Section/t:SubSection/t:SubSubSection/t:variable1',
'xmlns:t="xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
from
TABLE_NAME,
TABLE(XMLSEQUENCE(EXTRACT(CLOB_COLUMN_NAME, '/t:MsgNameXML/t:Section/t:SubSection/t:SubSubSection/t:variable1')))
This unfortunately produces the following error message:
ORA-00932: inconsistent datatypes: expected - got -
00932. 00000 - "inconsistent datatypes: expected %s got %s"
*Cause:
*Action:
Error at Line: 9 Column: 33
(Line 9, column 33, refers to where CLOB_COLUMN_NAME is located after the EXTRACT-function in the from-clause).
Does anyone know what I'm doing wrong, and how I could correct this?
In addition, I'd need to expand this query to extract more values from the XML-message, rather than just the "variable1" (see desired result above). Any help on that would be much appreciated as well.
Thank you for your help,
Tom
Upvotes: 1
Views: 4763
Reputation: 191275
The immediate cause of your ORA-00932 is that you're trying to extract directly from a CLOB; you're converting on one place already, but this:
TABLE(XMLSEQUENCE(EXTRACT(CLOB_COLUMN_NAME, ... )))
... would need to be:
TABLE(XMLSEQUENCE(EXTRACT(XMLType(CLOB_COLUMN_NAME), ... )))
But the namespace still isn't recognised in that clause, and you'd get ORA-31011 if you only change that. You can get something out by including the namespace in the EXTRACT
call as well, and correcting the typo in that:
select extractvalue(column_value, '/t:variable1',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
as variable1
from
TABLE_NAME,
TABLE(XMLSEQUENCE(EXTRACT(XMLType(CLOB_COLUMN_NAME),
'/t:MsgNameXML/t:Section/t:SubSection/t:SubSubSection/t:variable1',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')));
VARIABLE1
----------
data3
data5
Or both variables at once by shifting the extract back a level:
select extractvalue(column_value, 't:SubSubSection/t:variable1',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
as variable1,
extractvalue(column_value, 't:SubSubSection/t:variable2',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
as variable2
from
TABLE_NAME,
TABLE(XMLSEQUENCE(EXTRACT(XMLType(CLOB_COLUMN_NAME),
'/t:MsgNameXML/t:Section/t:SubSection/t:SubSubSection',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')));
VARIABLE1 VARIABLE2
---------- ----------
data3 data4
data5 data6
Including the info values as well is going to make it more complicated and repetitive:
select extractvalue(XMLType(CLOB_COLUMN_NAME), 't:MsgNameXML/t:info1',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
as info1,
extractvalue(XMLType(CLOB_COLUMN_NAME), 't:MsgNameXML/t:info2',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
as info2,
extractvalue(column_value, 't:SubSubSection/t:variable1',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
as variable1,
extractvalue(column_value, 't:SubSubSection/t:variable2',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')
as variable2
from
TABLE_NAME,
TABLE(XMLSEQUENCE(EXTRACT(XMLType(CLOB_COLUMN_NAME),
'/t:MsgNameXML/t:Section/t:SubSection/t:SubSubSection',
'xmlns:t="http://url/of/custom/namespace" xmlns:"http://url/of/default/namespace"')));
INFO1 INFO2 VARIABLE1 VARIABLE2
---------- ---------- ---------- ----------
data1 data2 data3 data4
data1 data2 data5 data6
SQL Fiddle of this approach.
You can use XQuery and XMLTable instead of the older extract syntax:
select info1, info2, variable1, variable2
from table_name,
xmltable('declare namespace t = "http://url/of/custom/namespace"; (: :)
for $i in /t:MsgNameXML
let $info1 := $i/t:info1,
$info2 := $i/t:info2
for $j in $i/t:Section/t:SubSection/t:SubSubSection
let $variable1 := $j/t:variable1,
$variable2 := $j/t:variable2
return <tmp>
<info1>{$info1}</info1>
<info2>{$info2}</info2>
<variable1>{$variable1}</variable1>
<variable2>{$variable2}</variable2>
</tmp>'
passing xmltype(CLOB_COLUMN_NAME)
columns
info1 char(10) path '/tmp/info1',
info2 char(10) path '/tmp/info2',
variable1 char(10) path '/tmp/variable1',
variable2 char(10) path '/tmp/variable2'
);
INFO1 INFO2 VARIABLE1 VARIABLE2
---------- ---------- ---------- ----------
data1 data2 data3 data4
data1 data2 data5 data6
I had to tweak the CLOB value a little to make it valid; changing xsi=
to xmlns:xsi=
and setting a dummy name for the default namespace, xmlns:x="http://url/of/default/namespace"
. I think that was all...
XQuery is a "functional language based on sequences", and this uses "FLOWR" expressions. Specifically two for
loops to iterate over the structure, and let
expressions to assign the namespace variables to local variables which are then returned in a simpler XML structure, from which the columns can be extracted via their new paths. (You don't seem to be able to use namespaces directly in the path
strings).
It should be fairly easy to expand this to get more values out of your message - more let
expressions, and more matching path
extractions.
But I'm really just scratching the surface of this stuff myself, so there may be simpler/better/faster ways to do this; hopefully a starting point for an XQuery approach, anyway.
Upvotes: 1