I would like to convert an HTML product sheet into a DocBook XML document. The problem is splitting my HTML descriptions into DocBook <simplesect> sections.
I would like to transform this (HTML):
<div class="description">
<h3>Title 1</h3>
<p>Paragraph one</p>
<p>Paragraph two</p>
...
<figure>
...
</figure>
...
<p>Paragraph three</p>
<h3>Title 2</h3>
<p>Paragraph one</p>
...
<figure>
...
</figure>
<p>Paragraph two</p>
<p>Paragraph three</p>
...
</div>
to this (DocBook XML):
<section>
<title>My Main Title</title>
<simplesect>
<title>Title 1</title>
<para>Paragraph one</para>
<para>Paragraph two</para>
...
<mediaobject>
...
</mediaobject>
...
<para>Paragraph three</para>
</simplesect>
<simplesect>
<title>Title 2</title>
<para>Paragraph one</para>
...
<mediaobject>
...
</mediaobject>
<para>Paragraph two</para>
<para>Paragraph three</para>
</simplesect>
</section>
I've tried to select all the elements between two <h3> headings using the following-sibling axis and other methods, without success.
How can I find the right XPath expression?
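To make the intended grouping concrete, here is a minimal procedural sketch in Python (standard library only) of the result I'm after. It is not the XPath solution I'm looking for. It assumes the HTML fragment is well-formed XML, the function name html_to_docbook and the TAG_MAP table are made up for illustration, and the contents of <figure> are not converted because I elided them above.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from HTML tags to DocBook tags, for illustration only.
TAG_MAP = {"p": "para", "figure": "mediaobject"}

def html_to_docbook(html_text, main_title="My Main Title"):
    """Group the children of the description <div> into <simplesect>
    elements, starting a new one at every <h3>."""
    div = ET.fromstring(html_text)  # requires well-formed markup
    section = ET.Element("section")
    ET.SubElement(section, "title").text = main_title
    current = None  # the <simplesect> currently being filled
    for child in div:
        if child.tag == "h3":
            # Each heading opens a new <simplesect> and becomes its <title>.
            current = ET.SubElement(section, "simplesect")
            ET.SubElement(current, "title").text = child.text
        elif current is not None and child.tag in TAG_MAP:
            # Siblings after a heading belong to that heading's section.
            # Only the element's own text is copied; child elements
            # (e.g. the elided figure contents) are not converted here.
            converted = ET.SubElement(current, TAG_MAP[child.tag])
            converted.text = child.text
        # Elements before the first <h3>, or with unmapped tags, are skipped.
    return section

sample = """<div class="description">
<h3>Title 1</h3>
<p>Paragraph one</p>
<figure/>
<p>Paragraph two</p>
<h3>Title 2</h3>
<p>Paragraph one</p>
</div>"""

result = html_to_docbook(sample)
print(ET.tostring(result, encoding="unicode"))
```

In other words, every element that follows an <h3> belongs to that heading until the next <h3> appears. An XPath or XSLT answer would need to produce this same grouping.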