Extracting the first table's first row

Question

I'm trying to extract the first table row (tr) of the first table (table) object in a parsed XML document.

I thought the following would do the trick:

//table[1]//tr[1]//text()

Yet it returns too many nodes. For example, on this page I want it to return:

Wikimedia Commons has media related to 
Public transport schedules

but it also returns the text of the following node, which is clearly not part of the first row:

Public transport

(only the text is returned, but I pasted the full node so that it is easier to find)

Ian Roberts · Accepted Answer

This is a subtlety of the way // is defined: it is shorthand for /descendant-or-self::node()/, so the [1] predicate applies to the child step that follows it. //table[1] therefore does not mean "the first table" but rather "every table that is the first table element in its respective parent". The same applies to the tr step - you'll get the first row in the thead and the first row in the tbody.

If you want the first row of the first table in the whole document you need to use parentheses:

(//table//tr)[1]

This says "find all rows in all tables, then from that list select just the first element in document order".
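
To see the difference in practice, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. The sample document and the names SAMPLE, row_text, per_parent and first_row are made up for illustration. ElementTree implements only a limited subset of XPath and, as far as I know, does not accept the parenthesized form, so the sketch emulates (//table//tr)[1] by collecting all rows and taking the first item of the list. With a full XPath 1.0 engine you would pass the parenthesized expression directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: two tables, each one the first (and only)
# table inside its own parent <div>.
SAMPLE = """
<html><body>
  <div><table>
    <tr><td>A1</td></tr>
    <tr><td>A2</td></tr>
  </table></div>
  <div><table>
    <tr><td>B1</td></tr>
  </table></div>
</body></html>
"""

root = ET.fromstring(SAMPLE.strip())


def row_text(tr):
    """Join the text found anywhere inside a row, ignoring whitespace."""
    return "".join(t.strip() for t in tr.itertext())


# Relative form of //table[1]//tr[1]: each [1] is evaluated per parent,
# so every table that is first under its parent contributes its first row.
per_parent = [row_text(tr) for tr in root.findall(".//table[1]//tr[1]")]

# Emulation of (//table//tr)[1]: gather all rows in all tables first,
# then keep only the first item of that list.
all_rows = root.findall(".//table//tr")
first_row = row_text(all_rows[0])

print(per_parent)  # ['A1', 'B1']
print(first_row)   # A1
```

The first query returns a row from both tables because each table is the first table child of its own div, which is the same over-matching described in the question.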
