Manish

Reputation: 3521

How to parse html embedded in xml in R?

I have an input file that has HTML tags embedded in XML, for example:

 <Root>
   <Section1>
   <p>some text</p>
   <br>
   <table>
       <th></th>
       <tr>
       <td></td> 
       </tr>    
   </table>
   </Section1>
  <Section2>
  <ol>
      <li>1</li>
      <li>2</li>
      <li>3</li>
  </ol>
  </Section2>
</Root>

Is there any way to parse HTML embedded in an XML document in R?

Upvotes: 1

Views: 213

Answers (1)

Spacedman

Reputation: 94192

If it's XHTML, then it is also valid XML, so you can use the standard XML parsers. You can find plenty about those elsewhere.

Note your <Section1> tag doesn't close properly. If this is a file you've pasted in, then there's a problem with it.
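
As a rough illustration of the "standard XML parser" route, here is a minimal sketch using the xml2 package (my choice for the example; the question does not name a package). It assumes the input is well-formed, so the sample's `<br>` is written as `<br/>` and the table is left out for brevity. The variable names `doc`, `items`, and `para` are only illustrative.

```r
library(xml2)

# Well-formed variant of the sample: <br> is self-closed as <br/> (an assumption,
# since a strict XML parser rejects an unclosed <br>).
doc <- read_xml("<Root>
  <Section1>
    <p>some text</p>
    <br/>
  </Section1>
  <Section2>
    <ol><li>1</li><li>2</li><li>3</li></ol>
  </Section2>
</Root>")

# The embedded HTML elements are ordinary XML nodes, so XPath reaches them
# through the wrapper elements like any other child.
items <- xml_text(xml_find_all(doc, "//Section2/ol/li"))
para  <- xml_text(xml_find_first(doc, "//Section1/p"))

print(items)
print(para)
```

If the embedded markup is not well-formed, `read_xml()` should stop with a parse error; in that case the input would need to be cleaned up first, or read with a more lenient HTML parser instead.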

Upvotes: 2
