London guy

Reputation: 28022

XPath ignoring closing tags

I have the following fragment in my XML file, which I am trying to parse using XPath. The XML file itself was obtained by converting a PDF document with a PDFtoHTML converter. As you can see, it has not added a closing tag for the <br> elements here. So when I run an XPath query to capture the text value of the <span> tag, it throws an error saying that <br> should be followed by a closing tag. How do I overcome this in XPath? However, when I open the file in a browser, everything renders fine.

<DIV style="position:absolute;top:222;left:143">
  <nobr>
    <span class="ft8">Dear Mr. AMIT KUMAR,
      <br>We are happy to enclose<br>31st March, 2011
    </span>
  </nobr>
</DIV>

Thanks,
Abhishek S

Upvotes: 1

Views: 746

Answers (1)

Paul Butcher

Reputation: 6956

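What you have posted is not XML. You cannot overcome this with XPath: XPath operates on a document that has already been parsed, and an XML parser rejects the unclosed <br> before any query runs. A browser renders the file fine because it uses a lenient HTML parser, where <br> needs no closing tag.

After generating the HTML, you could use HTML Tidy to turn it into well-formed XML (XHTML), or you could try a converter that converts PDF to well-formed XML directly.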
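As a rough illustration of the "make it well-formed first, then query" idea, here is a minimal sketch in Python. It does not use HTML Tidy; instead it swaps in the standard library's lenient `html.parser` to build an `ElementTree`, which then accepts a limited XPath subset. The `TreeBuilder` class, the `VOID` set, and the synthetic `root` wrapper element are names made up for this example, and the question does not say which language or XPath library is actually in use, so treat this as one possible approach under those assumptions.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never take a closing tag (a subset, assumed sufficient here).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class TreeBuilder(HTMLParser):
    """Builds an ElementTree from lenient HTML, treating void tags as self-closed."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")  # synthetic wrapper element
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        el = ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})
        if tag not in VOID:
            self.stack.append(el)  # only non-void elements can have children

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text after a child belongs to that child's tail
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


html = """<DIV style="position:absolute;top:222;left:143">
  <nobr>
    <span class="ft8">Dear Mr. AMIT KUMAR,
      <br>We are happy to enclose<br>31st March, 2011
    </span>
  </nobr>
</DIV>"""

builder = TreeBuilder()
builder.feed(html)
builder.close()

# html.parser lowercases tag names, so the query uses "span", not "SPAN".
span = builder.root.find(".//span[@class='ft8']")
text = " ".join(t.strip() for t in span.itertext() if t.strip())
print(text)
```

The text on either side of each `<br>` ends up as the element's `text` or as a `<br>` `tail`, so `itertext()` is used to collect all of it. For anything beyond a small fragment like this, a dedicated tool such as HTML Tidy is likely the more robust choice.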

Upvotes: 4
