Lin

Reputation: 663

How to find XPath for nodes with text including line breaks or HTML formatters

I am trying to locate specific node content in an HTML response. One particular node is hard to locate because its element content contains line breaks. I am testing on the xpathtester site, and my test XML is provided below.

    <html> 
      <table > 
        <tr > 
          <th colspan="3"> 
            <table  > 
              <tr  valign="bottom"> 
                <th   scope="col" align="left">Test
                  <br/> Item1</th>  
                <th   scope="col">:</th>  
                <th   scope="col" align="left">ABC123</th>  
                <th rowspan="7"> 
                  <img width="100" height="140" src="xyzcontenturl.jpg"/> 
                </th> 
              </tr>   
              <tr  valign="bottom"> 
                <th   scope="col" align="left">Test
                  <br/> Item2</th>  
                <th  scope="col" >:</th>  
                <th  scope="col" align="left" colspan="2" >DEF789</th> 
              </tr> 
            </table> 
          </th> 
        </tr>  
    </table>  
      <p> 
        <strong/> 
      </p> 
    </html>

The idea is to pick up the text of the third column header. I can use the condition `//th[contains(text(),"Test")]/following-sibling::th[2]/text()` to locate it (the value returned is ABC123).

The challenge comes when I try to locate the value based on a specific node, i.e. "Test Item1". Since the line break sits between the text "Test" and "Item1", I could not use the functions contains or starts-with.

How do I write the XPath so that I can pick up the th element with the value `"Test <br/> Item1"`?

Note: The XML provided is a sample illustrating the problem, so selecting the first or second table header (th element) by position won't help.

Upvotes: 3

Views: 3462

Answers (2)

har07

Reputation: 89285

Compare against normalize-space(), which collapses each run of whitespace, including newlines, into a single space (to be clear, the HTML <br/> itself contributes no text):

    //th[normalize-space()='Test Item1']/following-sibling::th[2]/text()

When called with no argument, the function takes the string-value of the context node (here, the concatenation of all text nodes within th), normalizes the whitespace in it, and returns the result. Quoting the specification:

The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space.
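To see why the comparison matches, here is a rough Python sketch that mimics what normalize-space() does to each th. It uses only the standard library's ElementTree, which does not evaluate normalize-space() or following-sibling itself, so both are emulated by hand. The helper names `normalize_space` and `value_for_label`, and the trimmed-down `SAMPLE` markup, are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed-down copy of the table from the question (assumption: same shape).
SAMPLE = """<table>
  <tr valign="bottom">
    <th scope="col" align="left">Test
      <br/> Item1</th>
    <th scope="col">:</th>
    <th scope="col" align="left">ABC123</th>
  </tr>
  <tr valign="bottom">
    <th scope="col" align="left">Test
      <br/> Item2</th>
    <th scope="col">:</th>
    <th scope="col" align="left" colspan="2">DEF789</th>
  </tr>
</table>"""


def normalize_space(element):
    """Emulate XPath normalize-space() on an element's string-value."""
    # String-value: every descendant text node, concatenated in document order.
    text = "".join(element.itertext())
    # XPath whitespace is space, tab, CR and LF; collapse runs, trim the ends.
    return " ".join(part for part in re.split(r"[ \t\r\n]+", text) if part)


def value_for_label(root, label):
    """Emulate //th[normalize-space()=label]/following-sibling::th[2]/text()."""
    for row in root.iter("tr"):
        cells = list(row)
        for index, cell in enumerate(cells):
            if cell.tag == "th" and normalize_space(cell) == label:
                following = [c for c in cells[index + 1:] if c.tag == "th"]
                if len(following) >= 2:
                    return following[1].text
    return None


root = ET.fromstring(SAMPLE)
print(value_for_label(root, "Test Item1"))
print(value_for_label(root, "Test Item2"))
```

The `<br/>` contributes no text of its own, so the newline and indentation around it collapse into the single space that the comparison string `'Test Item1'` expects.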

Upvotes: 3

William Walseth

Reputation: 2923

If you're using XPath in code, get the element and read its "InnerText" property. If you're calling it from XSL, use the text() function. What are you calling your XPath from?

Upvotes: 0
