Chetya

Reputation: 1317

XPath to select nodes in different namespaces

I need help coming up with a proper XPath expression to extract values from an XML document.

I can get the values using JAXB. However, I need XPath because I have decision-table-style mapping rules that I want to externalise; with JAXB these would turn into a lot of nested if/else statements, which I want to avoid.

I have an XML file that is built from at least 4 schemas. The root schema has an element declared with xs:any at a particular point. At that location, XML based on a different schema is injected, and that in turn has a similar xs:any where yet another XML fragment is injected to build the final XML that I work with.

This is the actual XML structure that I'm dealing with (I have intentionally modified the values). The two Document nodes in the XML below are based on different schemas.

<?xml version="1.0"?>
<env:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:env="CDTS-SUBMIT">
  <env:Body>
    <cdtPrefix>
      <cdtprVersion>01</cdtprVersion>
      <cdtprOperation>SUBMIT</cdtprOperation>
      <cdtprFunction>GCAMS1O</cdtprFunction>
      <cdtprDirectionFlag>O</cdtprDirectionFlag>
    </cdtPrefix>
    <cdtDataDescription>
      <cdtddVersion>01</cdtddVersion>
      <cdtddFirmId>ABC</cdtddFirmId>
      <cdtddBusinessDataFormat>GCAMS1O-XML</cdtddBusinessDataFormat>
      <cdtddReferenceNum>123</cdtddReferenceNum>
      <cdtddTrackingNum>123</cdtddTrackingNum>
      <cdtddDestination>AQ</cdtddDestination>
      <cdtddSeqNum>0000000</cdtddSeqNum>
      <cdtddCycleNum>00</cdtddCycleNum>
      <cdtddBusinessDate>00000000</cdtddBusinessDate>
    </cdtDataDescription>
    <cdtBusinessData>
      <AppHdr xmlns="urn:iso:std:iso:20022:tech:xsd:head.001.001.01" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <Fr>
          <FIId>
            <FinInstnId>
              <BICFI>ABC</BICFI>
            </FinInstnId>
          </FIId>
        </Fr>
        <To>
          <FIId>
            <FinInstnId>
              <BICFI>ABC   </BICFI>
            </FinInstnId>
          </FIId>
        </To>
        <BizMsgIdr>ABC</BizMsgIdr>
        <MsgDefIdr>seev.031.002.05</MsgDefIdr>
        <BizSvc>CSD</BizSvc>
        <CreDt>9999-99-99T00:02:17Z</CreDt>
      </AppHdr>
      <Document xmlns="urn:swift:xsd:seev.031.002.05" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <CorpActnNtfctn>
          <NtfctnGnlInf>
            <NtfctnTp>REPL</NtfctnTp>
            <PrcgSts>
              <Cd>
                <EvtCmpltnsSts>COMP</EvtCmpltnsSts>
                <EvtConfSts>CONF</EvtConfSts>
              </Cd>
            </PrcgSts>
          </NtfctnGnlInf>
          <PrvsNtfctnId>
            <Id>00000000</Id>
          </PrvsNtfctnId>
          <EvtsLkg>
            <EvtId>
              <LkdOffclCorpActnEvtId>US8</LkdOffclCorpActnEvtId>
            </EvtId>
            <LkgTp>
              <Cd>INFO</Cd>
            </LkgTp>
          </EvtsLkg>
          <CorpActnGnlInf>
            <CorpActnEvtId>000</CorpActnEvtId>
            <OffclCorpActnEvtId>US7</OffclCorpActnEvtId>
            <EvtPrcgTp>
              <Cd>DISN</Cd>
            </EvtPrcgTp>
            <EvtTp>
              <Cd>INTR</Cd>
            </EvtTp>
            <MndtryVlntryEvtTp>
              <Cd>CHOS</Cd>
            </MndtryVlntryEvtTp>
            <UndrlygScty>
              <FinInstrmId>
                <OthrId>
                  <Id>J54675AA1</Id>
                  <Tp>
                    <Cd>CUSP</Cd>
                  </Tp>
                </OthrId>
                <Desc>JASDFKASDFADSFAFADSF</Desc>
              </FinInstrmId>
              <ClssfctnTp>
                <ClssfctnFinInstrm>DBXXXX</ClssfctnFinInstrm>
              </ClssfctnTp>
            </UndrlygScty>
          </CorpActnGnlInf>
          <AcctDtls>
            <ForAllAccts>
              <IdCd>GENR</IdCd>
            </ForAllAccts>
          </AcctDtls>
          <CorpActnDtls>
            <DtDtls>
              <RcrdDt>
                <Dt>
                  <Dt>0000-04-03</Dt>
                </Dt>
              </RcrdDt>
            </DtDtls>
            <RateAndAmtDtls>
              <Intrst>
                <Rate>0</Rate>
              </Intrst>
            </RateAndAmtDtls>
            <IntrstAcrdNbOfDays>0</IntrstAcrdNbOfDays>
          </CorpActnDtls>
          <CorpActnOptnDtls>
            <OptnNb>001</OptnNb>
            <OptnTp>
              <Cd>CASH</Cd>
            </OptnTp>
            <DfltPrcgOrStgInstr>
              <DfltOptnInd>true</DfltOptnInd>
            </DfltPrcgOrStgInstr>
            <DtDtls>
              <RspnDdln>
                <Dt>
                  <DtTm>0000-04-10T20:00:00-04:00</DtTm>
                </Dt>
              </RspnDdln>
            </DtDtls>
            <PrdDtls>
              <ActnPrd>
                <Prd>
                  <StartDt>
                    <Dt>
                      <DtTm>0000-04-06T00:00:00-04:00</DtTm>
                    </Dt>
                  </StartDt>
                  <EndDt>
                    <NotSpcfdDt>UKWN</NotSpcfdDt>
                  </EndDt>
                </Prd>
              </ActnPrd>
            </PrdDtls>
            <CshMvmntDtls>
              <CdtDbtInd>CRDT</CdtDbtInd>
              <IncmTp>
                <Id>0004</Id>
                <Issr>IRSX</Issr>
              </IncmTp>
              <DtDtls>
                <PmtDt>
                  <Dt>
                    <Dt>0000-04-18</Dt>
                  </Dt>
                </PmtDt>
              </DtDtls>
              <RateAndAmtDtls>
                <IntrstRateUsdForPmt>
                  <RateTpAndAmtAndRateSts>
                    <RateTp>
                      <Cd>SCHD</Cd>
                    </RateTp>
                    <Amt Ccy="USD">21.17125</Amt>
                  </RateTpAndAmtAndRateSts>
                </IntrstRateUsdForPmt>
                <WhldgOfLclTax>
                  <Rate>15.315</Rate>
                </WhldgOfLclTax>
              </RateAndAmtDtls>
            </CshMvmntDtls>
          </CorpActnOptnDtls>
          <CorpActnOptnDtls>
            <OptnNb>002</OptnNb>
            <OptnTp>
              <Cd>CASH</Cd>
            </OptnTp>
            <OptnFeatrs>
              <Cd>ASVO</Cd>
            </OptnFeatrs>
            <DfltPrcgOrStgInstr>
              <DfltOptnInd>false</DfltOptnInd>
            </DfltPrcgOrStgInstr>
            <DtDtls>
              <RspnDdln>
                <Dt>
                  <DtTm>0000-04-10T20:00:00-04:00</DtTm>
                </Dt>
              </RspnDdln>
            </DtDtls>
            <PrdDtls>
              <ActnPrd>
                <Prd>
                  <StartDt>
                    <Dt>
                      <DtTm>0000-04-06T00:00:00-04:00</DtTm>
                    </Dt>
                  </StartDt>
                  <EndDt>
                    <NotSpcfdDt>UKWN</NotSpcfdDt>
                  </EndDt>
                </Prd>
              </ActnPrd>
            </PrdDtls>
            <CshMvmntDtls>
              <CdtDbtInd>CRDT</CdtDbtInd>
              <IncmTp>
                <Id>0004</Id>
                <Issr>IRSX</Issr>
              </IncmTp>
              <DtDtls>
                <PmtDt>
                  <Dt>
                    <Dt>0000-04-18</Dt>
                  </Dt>
                </PmtDt>
              </DtDtls>
              <RateAndAmtDtls>
                <IntrstRateUsdForPmt>
                  <RateTpAndAmtAndRateSts>
                    <RateTp>
                      <Cd>SCHD</Cd>
                    </RateTp>
                    <Amt Ccy="USD">25</Amt>
                  </RateTpAndAmtAndRateSts>
                </IntrstRateUsdForPmt>
                <WhldgOfLclTax>
                  <Rate>0</Rate>
                </WhldgOfLclTax>
              </RateAndAmtDtls>
            </CshMvmntDtls>
          </CorpActnOptnDtls>
          <AddtlInf>
            <AddtlTxt>
              <UpdDt>0000-04-04</UpdDt>
              <AddtlInf> adfafadfasdfasdfasdfsdafadfdsafdf</AddtlInf>
            </AddtlTxt>
          </AddtlInf>
          <Regar>
            <NmAndAdr>
              <Nm>Not Available</Nm>
            </NmAndAdr>
          </Regar>
          <SplmtryData>
            <Envlp>
              <Document xmlns="urn:swift:xsd:supl.001.001.05" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                <DTCCCANOCSDDataSD1>
                  <NtfctnGnlInf>
                    <PlcAndNm>/Document/CorpActnNtfctn/NtfctnGnlInf</PlcAndNm>
                    <CretDtAndTm>0000-04-24T11:34:09</CretDtAndTm>
                    <UpdDtAndTm>0000-04-24T20:02:16</UpdDtAndTm>
                  </NtfctnGnlInf>
                  <CorpActnGnlInf>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnGnlInf</PlcAndNm>
                    <EDSMsggElgbltyFlg>true</EDSMsggElgbltyFlg>
                    <DTCFCPElctnFlg>false</DTCFCPElctnFlg>
                    <AsstSvcrPrcgFlg>true</AsstSvcrPrcgFlg>
                  </CorpActnGnlInf>
                  <UndrlygScty>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnGnlInf/UndrlygScty</PlcAndNm>
                    <CtryOfListg>DE</CtryOfListg>
                    <IncmSrcCtry>JP</IncmSrcCtry>
                    <DTCAsstClss>CRPB</DTCAsstClss>
                    <DTCAsstTp>S500</DTCAsstTp>
                  </UndrlygScty>
                  <CorpActnDtls>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnDtls</PlcAndNm>
                    <CutOffDays>0</CutOffDays>
                    <EDSMsggCtryCd>JP</EDSMsggCtryCd>
                    <RDPRefNb>yyyyyyJ54675xxxxxxxxxxxxxxxxxxxx</RDPRefNb>
                  </CorpActnDtls>
                  <CorpActnDtDtls>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnDtls/DtDtls</PlcAndNm>
                    <DTCPosCaptrDt>0000-04-03</DTCPosCaptrDt>
                  </CorpActnDtDtls>
                  <OptnDtls>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnOptnDtls[1]</PlcAndNm>
                    <XtndedOptnFeatrs>FORU</XtndedOptnFeatrs>
                    <DfltOptnFlg>true</DfltOptnFlg>
                    <RDPRefNb>yyyyyyJ54675xxxxxxxxxxxxxxxxxxxx</RDPRefNb>
                  </OptnDtls>
                  <OptnDtls>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnOptnDtls[2]</PlcAndNm>
                    <XtndedOptnFeatrs>FORX</XtndedOptnFeatrs>
                    <RDPRefNb>yyyyyyJ54675xxxxxxxxxxxxxxxxxxxx</RDPRefNb>
                  </OptnDtls>
                  <CshMvmntDtls>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnOptnDtls[1]/CshMvmntDtls[1]</PlcAndNm>
                    <DTCPayMtd>1</DTCPayMtd>
                    <DTCPayOrdr>0</DTCPayOrdr>
                    <NRATaxRptblFlg>false</NRATaxRptblFlg>
                    <DclrdGrssRate>
                      <AmtPricPerFinInstrmQty>
                        <AmtPricTp>ACTU</AmtPricTp>
                        <PricVal Ccy="USD">25</PricVal>
                        <FinInstrmQty>
                          <FaceAmt>1000</FaceAmt>
                        </FinInstrmQty>
                      </AmtPricPerFinInstrmQty>
                    </DclrdGrssRate>
                    <RDPRefNb>yyyyyyJ54675xxxxxxxxxxxxxxxxxxxx</RDPRefNb>
                  </CshMvmntDtls>
                  <CshMvmntDtls>
                    <PlcAndNm>/Document/CorpActnNtfctn/CorpActnOptnDtls[2]/CshMvmntDtls[1]</PlcAndNm>
                    <DTCPayMtd>1</DTCPayMtd>
                    <DTCPayOrdr>0</DTCPayOrdr>
                    <NRATaxRptblFlg>false</NRATaxRptblFlg>
                    <RDPRefNb>yyyyyyJ54675xxxxxxxxxxxxxxxxxxxx</RDPRefNb>
                  </CshMvmntDtls>
                  <Agt>
                    <PlcAndNm>/Document/CorpActnNtfctn/Regar/NmAndAdr</PlcAndNm>
                    <AgtId>00009910</AgtId>
                  </Agt>
                </DTCCCANOCSDDataSD1>
              </Document>
            </Envlp>
          </SplmtryData>
        </CorpActnNtfctn>
      </Document>
    </cdtBusinessData>
  </env:Body>
</env:Envelope>

I have no problems extracting the first few elements, e.g. /env:Envelope/env:Body/cdtBusinessData

cdtBusinessData is the element in the main schema that takes an xs:any. The schema snippet is as follows:

                <xs:element name="cdtBusinessData" form="unqualified">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:any minOccurs="0"/>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>

It is precisely from this point on that my XPath queries don't work the way I expect them to.

That is, when I try /env:Envelope/env:Body/cdtBusinessData/Document, JXPath doesn't recognise it as a valid path. Different tools that generate an XPath for a selected node (such as XPather, FirePath, and XpathBuilder) give me different expressions, none of which work for me.

Could you please help me understand how to extract values from the two embedded nodes in the above XML?

I have struggled with this for quite some time and am finally reaching out for help here. I would appreciate help correcting this path: //env:Envelope/env:Body/cdtBusinessData/Document


UPDATE

This is what I came up with based on your suggestions. I'm using JXPath 1.3. What am I doing wrong here? The inline comments next to the System.out.println calls indicate what I get.

package com.testbed;

import java.io.ByteArrayInputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.commons.jxpath.JXPathContext;

import com.xyz.ib.pb.dtcc.util.FileUtils;

public class TestJXPathApproach {

    public static void main(String a[]) throws Exception {              
        String xmlMsg = FileUtils.readFileContents("C:\\dtcc-stuff\\SR\\1.xml");
        //xmlMsg = StringUtils.remove(xmlMsg, "<?xml version=\"1.0\"?>");
        TestJXPathApproach myTest = new TestJXPathApproach();
        myTest.testJxPathExpressions(xmlMsg);
    }

    private void testJxPathExpressions(String xmlMsg) {
        org.w3c.dom.Document doc = null;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            ByteArrayInputStream bais = new ByteArrayInputStream(xmlMsg.getBytes("UTF8"));
            doc = builder.parse(bais);
            bais.close();
            JXPathContext context = JXPathContext.newContext(doc);
            context.setLenient(true);
            context.registerNamespace("d", "urn:swift:xsd:seev.031.002.05");
            context.registerNamespace("dd", "urn:swift:xsd:supl.001.001.05");


            String cdtddTrackingNumVal = (String)context.getValue("/env:Envelope/env:Body/cdtDataDescription/cdtddTrackingNum");
            System.out.println("cdtddTrackingNumVal : "+cdtddTrackingNumVal); // prints the value correctly


            String cdVal = (String)context.getValue("/env:Envelope/env:Body/cdtBusinessData/d:Document/CorpActnNtfctn/CorpActnGnlInf/EvtTp/Cd");
            System.out.println("cdVal : "+cdVal);// prints null with namespace mapping specified

            cdVal = (String)context.getValue("/env:Envelope/env:Body/cdtBusinessData/Document/CorpActnNtfctn/CorpActnGnlInf/EvtTp/Cd");
            System.out.println("cdVal : "+cdVal);// prints null with no namespace mapping 

            cdVal = (String)context.getValue("/env:Envelope/env:Body/cdtBusinessData/*:Document/CorpActnNtfctn/CorpActnGnlInf/EvtTp/Cd");
            System.out.println("cdVal : "+cdVal);// prints null with wildcard namespace mapping 

            Object nodeObj  = context.selectSingleNode("/env:Envelope/env:Body/cdtBusinessData/d:Document/CorpActnNtfctn");
            System.out.println("nodeObj : "+nodeObj);// prints null


        }catch(Exception e) {
            e.printStackTrace();
        }
    }

}

Upvotes: 1

Views: 3984

Answers (2)

Charles Duffy

Reputation: 295242

Use a namespace wildcard for Document, if you want to be able to select either one:

/env:Envelope/env:Body/cdtBusinessData/*:Document

...or, to get both documents in one query:

//*:Document

See a full XQuery document which you can run yourself to see this working at https://gist.github.com/charles-dyfis-net/983d4054f4f9424a1698


Versions of the above compatible with XPath 1.0 (many thanks to @kjhughes):

/env:Envelope/env:Body/cdtBusinessData/*[local-name()='Document']

...or...

//*[local-name() = 'Document']
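
As a rough illustration of the XPath 1.0 form, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath API rather than JXPath (a swapped-in engine, chosen only so the example has no external dependencies). The class name, the helper method, and the trimmed-down sample XML are all hypothetical stand-ins for the message in the question. Note that the parser is made namespace-aware explicitly, because JAXP's DocumentBuilderFactory is not namespace-aware by default.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LocalNameDemo {

    // Hypothetical, trimmed-down stand-in for the message in the question:
    // one Document in the seev namespace wrapping one in the supl namespace.
    static final String XML =
        "<env:Envelope xmlns:env=\"CDTS-SUBMIT\"><env:Body><cdtBusinessData>"
      + "<Document xmlns=\"urn:swift:xsd:seev.031.002.05\"><CorpActnNtfctn>"
      + "<SplmtryData><Envlp>"
      + "<Document xmlns=\"urn:swift:xsd:supl.001.001.05\"/>"
      + "</Envlp></SplmtryData>"
      + "</CorpActnNtfctn></Document>"
      + "</cdtBusinessData></env:Body></env:Envelope>";

    // Counts every element whose local name is Document, whatever its namespace.
    static int countDocuments(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // off by default in JAXP
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                "//*[local-name() = 'Document']", doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countDocuments(XML));
    }
}
```

With the sample above, both the outer and the nested Document elements should be matched, since the predicate ignores the namespace entirely.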

Upvotes: 6

kjhughes

Reputation: 111491

If you've registered the following namespace prefixes on your JXPathContext instance (called context here),

context.registerNamespace("sw", "urn:swift:xsd:seev.031.002.05");
context.registerNamespace("env", "CDTS-SUBMIT");

then the following XPath,

/env:Envelope/env:Body/cdtBusinessData/sw:Document

will select the Document element in the urn:swift:xsd:seev.031.002.05 namespace successfully.
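
To make the prefix approach concrete, here is a minimal sketch using the JDK's javax.xml.xpath API with a NamespaceContext instead of JXPath (an assumption made only to keep the example dependency-free; the idea of binding prefixes to namespace URIs is the same). The class name, method name, and shortened sample XML are hypothetical. Two details matter: the DOM must be parsed namespace-aware, and because the elements below Document inherit its default namespace, each of those steps needs the sw: prefix as well.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PrefixedXPathDemo {

    // Prefixes are local to the XPath; only the namespace URIs must match the XML.
    static final Map<String, String> PREFIXES = Map.of(
        "env", "CDTS-SUBMIT",
        "sw", "urn:swift:xsd:seev.031.002.05");

    // Hypothetical, shortened stand-in for the message in the question.
    static final String XML =
        "<env:Envelope xmlns:env=\"CDTS-SUBMIT\"><env:Body><cdtBusinessData>"
      + "<Document xmlns=\"urn:swift:xsd:seev.031.002.05\"><CorpActnNtfctn>"
      + "<CorpActnGnlInf><EvtTp><Cd>INTR</Cd></EvtTp></CorpActnGnlInf>"
      + "</CorpActnNtfctn></Document>"
      + "</cdtBusinessData></env:Body></env:Envelope>";

    static String eventTypeCode(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // off by default in JAXP
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return PREFIXES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
                }
                // Reverse lookups are not needed for evaluating this expression.
                @Override public String getPrefix(String uri) { return null; }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            // cdtBusinessData is in no namespace; everything under Document is in sw.
            return xpath.evaluate(
                "/env:Envelope/env:Body/cdtBusinessData/sw:Document"
              + "/sw:CorpActnNtfctn/sw:CorpActnGnlInf/sw:EvtTp/sw:Cd", doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eventTypeCode(XML));
    }
}
```

If the sw: prefix is dropped from any step after sw:Document, that step looks for an element in no namespace and the expression selects nothing.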

Update

If you want to select a Document element in a different namespace, register a prefix for the new namespace similarly and use it instead in your XPath.

Using registered namespace prefixes is generally the preferred practice, but if you want to disregard the namespaces, in XPath 2.0 you can use the *:Document technique shown by Charles Duffy.

In XPath 1.0, the *: technique won't work, but you can instead test against the element's local name:

//*[local-name() = 'Document']

will select all Document elements regardless of namespace (and regardless of heritage).
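
If matching every Document is too broad, the local-name() test can be combined with namespace-uri() in plain XPath 1.0 to pick just one of them without registering any prefix. The sketch below is one possible way to do that with the JDK's javax.xml.xpath API (not JXPath); the class name, the helper method, and the reduced sample XML are hypothetical, and the namespace URI is concatenated into the expression only because it is a trusted constant here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceUriDemo {

    // Hypothetical, reduced stand-in for the message in the question.
    static final String XML =
        "<env:Envelope xmlns:env=\"CDTS-SUBMIT\"><env:Body><cdtBusinessData>"
      + "<Document xmlns=\"urn:swift:xsd:seev.031.002.05\"><CorpActnNtfctn>"
      + "<SplmtryData><Envlp>"
      + "<Document xmlns=\"urn:swift:xsd:supl.001.001.05\"><DTCCCANOCSDDataSD1/></Document>"
      + "</Envlp></SplmtryData>"
      + "</CorpActnNtfctn></Document>"
      + "</cdtBusinessData></env:Body></env:Envelope>";

    // Returns the local name of the first child element of the Document
    // that lives in the given namespace (empty string if there is none).
    static String firstChildOfDocument(String xml, String namespaceUri) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // off by default in JAXP
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // namespaceUri is assumed to be a trusted constant, not user input.
            String expr = "local-name(//*[local-name() = 'Document' and "
                        + "namespace-uri() = '" + namespaceUri + "']/*[1])";
            return xpath.evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstChildOfDocument(XML, "urn:swift:xsd:supl.001.001.05"));
        System.out.println(firstChildOfDocument(XML, "urn:swift:xsd:seev.031.002.05"));
    }
}
```

This keeps the expression prefix-free while still distinguishing the outer Document from the nested one, which can be handy when the mapping rules are stored externally and a prefix registry is not available.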

Upvotes: 1
