SiL3NC3
SiL3NC3

Reputation: 770

How to select XML nodes by position (Linq or XPATH)

have been trying around and does not get to any working result.

Given is an excel xml like this structure:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
...
 </DocumentProperties>
 <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
...
 </OfficeDocumentSettings>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
...
 </ExcelWorkbook>
 <Styles>
...
 </Styles>
 <Worksheet ss:Name="Report">
  <Table ss:ExpandedColumnCount="41" ss:ExpandedRowCount="4082" x:FullColumns="1"
   x:FullRows="1" ss:DefaultColumnWidth="60" ss:DefaultRowHeight="15">
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_1</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_2</Data></Cell>
    ...
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_40_Active</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_1</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_2</Data></Cell>
    ...
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_40_Active</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
...
  </WorksheetOptions>
 </Worksheet>
</Workbook>

The goal is to select only these rows, which contains "Cell_40_Active" within the 40th-Cell (ID) of the Row. Like: Cell[40].Data.InnerText = "Cell_40_Active" ...

        XmlDocument doc = new XmlDocument();
        doc.Load(file);
        XmlElement root = doc.DocumentElement;
        // does return all Row-elements >> working
        XmlNodeList nodes = root.GetElementsByTagName("Row");
        //does not return any element (0)
        XmlNodeList nodes = root.SelectNodes("/Worksheet/Row/Cell[40]='Cell_40_Active'");

How can this be done? Haven't found something similar ... Any hint? Thank you very much.

Upvotes: 1

Views: 788

Answers (2)

sowrd299
sowrd299

Reputation: 149

Perhaps you can take advantage of XmlNodeList's index function, say in the context of a Linq query, such as:

var result = from row in root.GetElementsByTagName("Row")
             where row.index(40).InnerText == "Cell_40_Active"
             //uses inner text to "skip over" the data tag.
             //this won't work if you have other child nodes with inner text.
             select row;

Upvotes: 0

wero
wero

Reputation: 32990

All elements are in namespace urn:schemas-microsoft-com:office:spreadsheet so you have to prepare for that:

var nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("x", "urn:schemas-microsoft-com:office:spreadsheet");
XmlNodeList nodes = root.SelectNodes("<xpath expr using x prefix>", nsmgr);

And given your description the XPath expression should then probably be (using the previously defined prefix x):

/x:Workbook/x:Worksheet/x:Table/x:Row[x:Cell[40]/x:Data='Cell_40_Active']

Upvotes: 1

Related Questions