Reputation: 361
I am trying to extract addresses using XPath from URLs like
https://www.americangemsociety.org/bradshaw-s-jewelers and
https://www.americangemsociety.org/fincher-ozment-jewelers.
However, the address is not at the same position on every page. On some pages it is in the fourth p element, on others in the second, and so on.
Is there an XPath that identifies the address by the strong element labelled "Address:" instead of by a fixed paragraph index?
Example of an address within the HTML:
<p><strong class="">Address:</strong> 4355 Montgomery Hwy, Ste 2, Dothan, Alabama 36303-1696</p>
Kindly advise
Thanks
Upvotes: 0
Views: 24
Reputation: 167516
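If you use
//p[strong[not(normalize-space(@class)) and . = 'Address:']]
then you select all p elements that contain a strong child element whose class attribute is empty or missing and whose content is exactly Address:, regardless of where the paragraph sits on the page.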
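As a rough illustration of the idea, here is a minimal Python sketch using only the standard library. It rests on several assumptions: the sample markup is hypothetical and well-formed (real pages would likely need an HTML parser), the function name extract_addresses is made up for this example, and because xml.etree.ElementTree supports only a subset of XPath, the predicate is simplified to [strong='Address:'] and leaves out the class check. The full expression above would need a complete XPath 1.0 engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample; on real pages the address paragraph
# can appear at any position among the p elements.
SAMPLE = """
<div>
  <p>Intro paragraph (hypothetical)</p>
  <p><strong class="">Phone:</strong> (hypothetical value)</p>
  <p><strong class="">Address:</strong> 4355 Montgomery Hwy, Ste 2, Dothan, Alabama 36303-1696</p>
</div>
"""


def extract_addresses(markup):
    """Return the text following each <strong>Address:</strong> label."""
    root = ET.fromstring(markup)
    addresses = []
    # Select p elements by their label, not by their position.
    for p in root.findall(".//p[strong='Address:']"):
        for strong in p.findall("strong"):
            if "".join(strong.itertext()) == "Address:":
                # The address is the text node right after the strong element.
                addresses.append((strong.tail or "").strip())
    return addresses


if __name__ == "__main__":
    for address in extract_addresses(SAMPLE):
        print(address)
```

The p element is matched by the label it contains, so the same expression should work whether the address is in the second or the fourth paragraph. The address itself is read from the text that follows the strong element.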
Upvotes: 1