Dee

Reputation: 21

XPath expression to extract addresses from this HTML

I need to extract the following three addresses separately (the lines that come before each phone number) from this hideous HTML, but I am absolutely stumped:

<div class='additional-locations collapsible'>
    <div class='row'>
        <div class='location'>
             CompanyName<br /> 123 Some Street<br />City Province PostalCode<br />Country<br /><strong>Phone:</strong>123 456 7890<br /><strong>Fax:</strong> 123 456 7890
            <br />
            <strong>County:</strong> County<br />
            <strong>Electoral District:</strong> 01<br />

            <hr />

            CompanyName<br /> 546 SomeOther Street<br />City Province PostalCode<br />Country<br /><strong>Phone:</strong>123 456 7890<br /><strong>Fax:</strong> 123 456 7890
            <br />
            <strong>County:</strong> County<br />
            <strong>Electoral District:</strong> 02<br />

            <hr />

            CompanyName<br /> 378 Another Street<br />City Province PostalCode<br />Country<br /><strong>Phone:</strong>123 456 7890<br /><strong>Fax:</strong> 123 456 7890
            <br />
            <strong>County:</strong> County<br />
            <strong>Electoral District:</strong> 03<br />
        </div>
    </div>
</div>

I thought I would query for

//div[contains(@class,'additional-practice-location')]//div[@class='practice-location']/text()[preceding::strong[contains(text(), 'Phone')][1]]

and then try to grab the text before each phone number, but I can't figure it out. Can anyone help?

Upvotes: 2

Views: 57

Answers (1)

Andersson

Reputation: 52665

Since you've added the xpath-2.0 tag, try the XPath 2.0 expression below to get the required data:

for $i in //div[@class='location']/text()[normalize-space()="CompanyName"]
    return $i/string-join(following-sibling::text()[position() < 4]/normalize-space(), ", ")

Output:

123 Some Street, City Province PostalCode, Country
546 SomeOther Street, City Province PostalCode, Country
378 Another Street, City Province PostalCode, Country
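
If no XPath 2.0 processor is available, the same idea (find each "CompanyName" text node, then join the next three text nodes) can be approximated in plain Python. This is only a minimal sketch using the standard library's html.parser, not a drop-in replacement for the expression above. The names LocationTextCollector, extract_addresses and SAMPLE are made up for illustration, and the code assumes every address block starts with the literal text CompanyName, as in the sample HTML. Unlike the XPath, it skips whitespace-only text nodes.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class LocationTextCollector(HTMLParser):
    """Collect text nodes that are direct children of <div class="location">."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0   # 0 = outside the div, 1 = directly inside it
        self.texts = []  # whitespace-normalized, non-empty text nodes

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif tag == "div" and dict(attrs).get("class") == "location":
            self.depth = 1

    def handle_startendtag(self, tag, attrs):
        pass  # self-closing tags such as <br /> never change the depth

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # same effect as normalize-space()
        if self.depth == 1 and text:
            self.texts.append(text)


def extract_addresses(html, marker="CompanyName"):
    """Return the three text nodes after each marker, joined with ', '."""
    parser = LocationTextCollector()
    parser.feed(html)
    parser.close()
    texts = parser.texts
    return [", ".join(texts[i + 1:i + 4])
            for i, text in enumerate(texts) if text == marker]


# Shortened version of the HTML from the question.
SAMPLE = """<div class='location'>
 CompanyName<br /> 123 Some Street<br />City Province PostalCode<br />Country<br /><strong>Phone:</strong>123 456 7890<br />
 <hr />
 CompanyName<br /> 546 SomeOther Street<br />City Province PostalCode<br />Country<br /><strong>Phone:</strong>123 456 7890<br />
</div>"""

for address in extract_addresses(SAMPLE):
    print(address)
```

The depth counter stands in for the child axis: only text sitting directly inside the div is kept, so the labels wrapped in strong elements are ignored, just as they are by following-sibling::text().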

Upvotes: 1
