Siqueira
Siqueira

Reputation: 443

How to use dbms_xmldom in Oracle PLSQL in order to get a good function performance?

I am using Oracle version 12 Release 1

I have been trying to code a function that calculates a sort of distance among objects stored as XML.

For doing so, I have maden the following...

First, register the XML schema.

BEGIN
-- Register the schema
DBMS_XMLSCHEMA.registerSchema('http://www.example.com/fvInteger.xsd',
'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="FeatureVector">
<xs:complexType>
<xs:sequence>
<xs:element name="feature" type="xs:integer" minOccurs="5" maxOccurs="999"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>',
   TRUE, TRUE, FALSE);
END;
/

Then create the table

CREATE TABLE cophirfvXML_int (
   id    NUMBER,
   complex_obj  XMLTYPE)
   XMLTYPE complex_obj STORE AS OBJECT RELATIONAL
      XMLSCHEMA "http://www.example.com/fvInteger.xsd"
      ELEMENT "FeatureVector";

Inserted the following data into the above table

ID=1
<FeatureVector><feature>85</feature><feature>-41</feature><feature>29</feature><feature>26</feature><feature>-29</feature><feature>1</feature><feature>-29</feature><feature>-8</feature><feature>15</feature><feature>6</feature><feature>-17</feature><feature>6</feature><feature>-27</feature><feature>8</feature><feature>-12</feature><feature>5</feature></FeatureVector>

ID=2
<FeatureVector><feature>98</feature><feature>77</feature><feature>-127</feature><feature>27</feature><feature>-30</feature><feature>-13</feature><feature>-1</feature><feature>14</feature><feature>-31</feature><feature>-56</feature><feature>-10</feature><feature>6</feature><feature>-10</feature><feature>-12</feature><feature>5</feature><feature>19</feature></

FeatureVector>

... and so on (I am working with 200.000 objects).

Finally, the function >>

create or replace FUNCTION myDistance(
  innXML XMLType,
  outXML XMLType
) RETURN number
IS
  total NUMBER := 0;
  docInn xmldom.DOMDocument;
  docOut xmldom.DOMDocument;
  nInn xmldom.DOMNode;
  nOut xmldom.DOMNode;
  nlInn xmldom.DOMNodeList;
  nlOut xmldom.DOMNodeList;
  len number;
BEGIN
--Converte os atributos xmltype para DOMDocuments.
docInn := dbms_xmldom.newDOMDocument(innXML);
docOut := dbms_xmldom.newDOMDocument(outXML);

nlInn := xmldom.getElementsByTagName(docInn, '*');
nlOut := xmldom.getElementsByTagName(docOut, '*');
len := xmldom.getLength(nlInn);

for i in 1..len-1 loop
    nInn := xmldom.item(nlInn, i);
        nOut := xmldom.item(nlOut, i);
    total := total + ABS(xmldom.getNodeValue(DBMS_XMLDOM.getFirstChild(nInn)) - xmldom.getNodeValue(DBMS_XMLDOM.getFirstChild(nOut)));
end loop;

RETURN total;
END;
/

The function has a terrible performance. It uses too much memory and is much slower than the expected (mainly due to the fact that the object-relational storage is been used).

I even get the error:

ORA-00039: erro durante ac?o periodica ORA-04036: Memoria PGA usada pela instancia excede PGA_AGGREGATE_LIMIT ORA-06512: em "XDB.DBMS_XMLDOM", line 5027 ORA-06512: em "XDB.DBMS_XMLDOM", line 5052 ORA-06512: em "HIGIIA.XML_MANHATTAN_DISTANCE", line 19

Also, I tried this different solution bellow using a query, the performance is not good as well, though.

SELECT SUM( ABS(oFV.feature - iFV.feature) )
  INTO   total
  FROM   XMLTABLE(
           '//FeatureVector/feature'
           PASSING outXML
           COLUMNS rn       FOR ORDINALITY,
                   feature  NUMBER  PATH '.'
         ) oFV
         INNER JOIN
         XMLTABLE(
           '//FeatureVector/feature'
           PASSING innXML
           COLUMNS rn       FOR ORDINALITY,
                   feature  NUMBER  PATH '.'
         ) ifv
         ON ( oFV.rn = iFV.rn );

What can I make to improve its performance?

I do need to improve the performance, not solve the ORA-00039 error increasing the PGA Aggregate Limit.

Hope someone can help! Thanks in advance!!

Upvotes: 1

Views: 2229

Answers (1)

theoretical
theoretical

Reputation: 420

Couple of things - I'm guessing your out of memory is due to not calling DBMS_XMLDOM.freeDocument(docXXX);

But as to the issue of performance - without specifics on time, it's difficult to say if there's an issue with the code or if that's essentially expected overhead associated with parsing out the related xmltype field values. My immediate impression is, precalculate and store the "distance" values (when the xml is inserted/updated/deleted). That way you can just query the data via straight sql without all the parsing overhead on the read side. You can store the calculated values in the xml or in a relational table if you want to keep your xml schema clean.

Upvotes: 1

Related Questions