VijayD

Reputation: 820

How to read the namespaces from a tag using XdmNode?

I want to read all the namespaces declared on one tag of a document that I already hold as a net.sf.saxon.s9api.XdmNode. I can read them using the code below, but it parses the file a second time, which hurts performance, so I would rather read the namespaces from the existing, already-parsed document.

input.xml

<?xml version="1.0" encoding="utf-8"?>
<?taxonomy-version 2.2.3.0?> <?taxonomy-set-overall-version 2.6.0.0?>
<!--(C) EBA-->
<link:linkbase xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:link="http://www.xbrl.org/2003/linkbase" xmlns:gen="http://xbrl.org/2008/generic" xmlns:label="http://xbrl.org/2008/label" xmlns:formula="http://xbrl.org/2008/formula" xmlns:df="http://xbrl.org/2008/filter/dimension" xmlns:table="http://xbrl.org/2014/table" xmlns:model="http://www.eurofiling.info/xbrl/ext/model" xmlns:eba_dim="http://www.eba.europa.eu/xbrl/crr/dict/dim" xmlns:eba_PL="http://www.eba.europa.eu/xbrl/crr/dict/dom/PL" xmlns:eba_met="http://www.eba.europa.eu/xbrl/crr/dict/met" xmlns:eba_BA="http://www.eba.europa.eu/xbrl/crr/dict/dom/BA" xmlns:eba_MC="http://www.eba.europa.eu/xbrl/crr/dict/dom/MC" xmlns:eba_IM="http://www.eba.europa.eu/xbrl/crr/dict/dom/IM" xmlns:eba_AP="http://www.eba.europa.eu/xbrl/crr/dict/dom/AP" xmlns:eba_TR="http://www.eba.europa.eu/xbrl/crr/dict/dom/TR" xmlns:eba_EC="http://www.eba.europa.eu/xbrl/crr/dict/dom/EC" xmlns:eba_CT="http://www.eba.europa.eu/xbrl/crr/dict/dom/CT" xmlns:eba_GA="http://www.eba.europa.eu/xbrl/crr/dict/dom/GA" xsi:schemaLocation="http://www.xbrl.org/2003/linkbase http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd http://xbrl.org/2008/generic http://www.xbrl.org/2008/generic-link.xsd http://xbrl.org/2008/label http://www.xbrl.org/2008/generic-label.xsd http://xbrl.org/2008/formula http://www.xbrl.org/2008/formula.xsd http://xbrl.org/2008/filter/dimension http://www.xbrl.org/2008/dimension-filter.xsd http://xbrl.org/2014/table http://www.xbrl.org/2014/table.xsd http://www.eurofiling.info/xbrl/ext/model http://www.eurofiling.info/eu/fr/xbrl/ext/model.xsd">
<link:arcroleRef arcroleURI="http://xbrl.org/arcrole/2014/aspect-node-filter" xlink:type="simple" xlink:href="http://www.xbrl.org/2014/table.xsd#aspect-node-filter" />
<link:arcroleRef arcroleURI="http://xbrl.org/arcrole/2014/breakdown-tree" xlink:type="simple" xlink:href="http://www.xbrl.org/2014/table.xsd#breakdown-tree" />
<link:arcroleRef arcroleURI="http://xbrl.org/arcrole/2014/definition-node-subtree" xlink:type="simple" xlink:href="http://www.xbrl.org/2014/table.xsd#definition-node-subtree" />
<link:arcroleRef arcroleURI="http://xbrl.org/arcrole/2014/table-breakdown" xlink:type="simple" xlink:href="http://www.xbrl.org/2014/table.xsd#table-breakdown" />
<link:roleRef roleURI="http://www.eba.europa.eu/xbrl/crr/role/dict/dom/GA/GA5_1" xlink:type="simple" xlink:href="../../../../../../dict/dom/ga/hier.xsd#eba_GA5_1" />
<link:roleRef roleURI="http://www.eba.europa.eu/xbrl/crr/role/fws/COREP/its-2016-03/2016-11-15/tab/C_09.01.a" xlink:type="simple" xlink:href="c_09.01.a.xsd#role" />
</link:linkbase>

From the above file, I want to read all the "xmlns" declarations on the link:linkbase tag.

The code snippet below works as expected but performs poorly.

Code

private List<Namespace> getNameSpaceListFromFile() throws ValidationException {
    List<Namespace> nsList = new ArrayList<Namespace>();

    try {
        if(inputFile!=null){
            BufferedReader bufferedReader = new BufferedReader(new FileReader(inputFile)); //I18NOK:IOE
            String line;
            StringBuilder stringBuilder = new StringBuilder();

            while((line=bufferedReader.readLine())!= null){
                stringBuilder.append(line.trim());
            }
            XMLStreamReader reader =  XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(stringBuilder.toString().trim().replaceFirst("^([\\W]+)<","<"))); /*I18NOK:LSM*/ //removing byte order markers by using "^([\\W]+)<","<" 

            while (reader.hasNext()) {
                int event = reader.next();
                if (XMLStreamConstants.START_ELEMENT == event) {
                    if (reader.getNamespaceCount() > 0) {
                        for (int nsIndex = 0; nsIndex < reader.getNamespaceCount(); nsIndex++) {
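                            // getNamespacePrefix returns null for the default namespace, so guard before trimming
                            String prefix = reader.getNamespacePrefix(nsIndex) == null ? "" : reader.getNamespacePrefix(nsIndex).trim();
                            String uri = reader.getNamespaceURI(nsIndex).trim();
                            System.out.println(prefix + "\t\t:\t\t" + uri);
                            nsList.add(new Namespace(prefix, uri));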
                        }
                    }
                } 
            }
            bufferedReader.close();
        }
        if(nsList.isEmpty()){
            return new NamespaceLoader(context).getNsListFromProperties();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return nsList;
}

I searched for a working solution but could not find one.

Iterator<XdmItem> itemList = document.axisIterator(Axis.CHILD);
while (itemList.hasNext()) {
    XdmItem item = itemList.next();
    System.err.println(item.getStringValue());
}

With the above code, I get the full "link" tag as an XdmItem, but I could not find a way to read the linkbase tag and fetch its namespaces.

Any help will be appreciated. Let me know if more information is needed.

Upvotes: 0

Views: 480

Answers (1)

Michael Kay

Reputation: 163262

If I understand correctly, you already have the document held as an instance of XdmNode. If that's the case then you can use the s9api interface to execute the XPath expression

/*/namespace::*

and this will return an XdmValue containing the list of namespace nodes on the outermost element. You can then just do

for (XdmItem item : result) {
   XdmNode ns = (XdmNode)item;
   String prefix = ns.getNodeName()==null ? "" : ns.getNodeName().getLocalName();
   String uri = ns.getStringValue();
   ...
}

If you prefer you can achieve the same effect by using XdmNode.axisIterator(Axis.CHILD) on the document node to find the outermost element, then XdmNode.axisIterator(Axis.NAMESPACE) to find the namespace nodes.
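
The axis-iterator alternative might look roughly like the sketch below. The class name RootNamespaces, the helper namespacesOfRoot, and the two-namespace sample string are made up for illustration, and the iterators are typed as Iterator<? extends XdmItem> on the assumption that this compiles across Saxon releases whose axisIterator return type differs. Note that the namespace axis also yields the implicit "xml" prefix, which you may want to filter out.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.stream.StreamSource;

import net.sf.saxon.s9api.Axis;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.XdmItem;
import net.sf.saxon.s9api.XdmNode;
import net.sf.saxon.s9api.XdmNodeKind;

public class RootNamespaces {

    /** Collects prefix -> URI for the namespaces in scope on the outermost element. */
    static Map<String, String> namespacesOfRoot(XdmNode document) {
        Map<String, String> result = new LinkedHashMap<>();
        Iterator<? extends XdmItem> children = document.axisIterator(Axis.CHILD);
        while (children.hasNext()) {
            XdmNode child = (XdmNode) children.next();
            if (child.getNodeKind() != XdmNodeKind.ELEMENT) {
                continue; // skip comments and processing instructions before the root element
            }
            Iterator<? extends XdmItem> nsNodes = child.axisIterator(Axis.NAMESPACE);
            while (nsNodes.hasNext()) {
                XdmNode ns = (XdmNode) nsNodes.next();
                // The default namespace has no name, so map it to the empty prefix
                String prefix = ns.getNodeName() == null ? "" : ns.getNodeName().getLocalName();
                result.put(prefix, ns.getStringValue());
            }
            break; // only the outermost element is of interest
        }
        return result;
    }

    public static void main(String[] args) throws SaxonApiException {
        // Shortened stand-in for input.xml (hypothetical sample)
        String xml = "<link:linkbase xmlns:link=\"http://www.xbrl.org/2003/linkbase\""
                + " xmlns:xlink=\"http://www.w3.org/1999/xlink\"/>";
        Processor processor = new Processor(false);
        XdmNode doc = processor.newDocumentBuilder()
                .build(new StreamSource(new StringReader(xml)));
        namespacesOfRoot(doc).forEach((prefix, uri) -> System.out.println(prefix + "\t" + uri));
    }
}
```

In your case you would pass the XdmNode you already hold to namespacesOfRoot instead of building a new document, so the file is not parsed a second time.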

Upvotes: 1
