FlorisdG

Reputation: 774

Get HTML content between 2 elements

I need to build a PDF generator with TCPDF and PHP. I could just write the whole HTML to the PDF in one go, but that would look awful, so I need every product in the HTML to end up on its own page.

With the newer pages, it's pretty easy: use DOMDocument to find the <div> around each product, put those in an array and write each one to the PDF.
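
For reference, a minimal sketch of that approach. The wrapper class "product" and the sample markup are placeholders for illustration, not the real pages:

```php
<?php
// Hypothetical sample markup: the class name "product" is an assumption.
$html = '<div class="product"><h3>sample#1</h3><p>First</p></div>'
      . '<div class="product"><h3>sample#2</h3><p>Second</p></div>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$products = array();
foreach ($xpath->query('//div[@class="product"]') as $div) {
    // saveHTML($node) serialises only that node, wrapper <div> included.
    $products[] = $doc->saveHTML($div);
}

var_dump($products);
```

Each entry of $products can then be written to its own PDF page.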

Unfortunately, the pages are not all structured the same way, and some have no wrapping <div> at all. This page, for example:

'<h3>sample#1</h3>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>
<img>
<table>
</table>

<h3>sample#2</h3>
<p>Aenean commodo ligula eget dolor. Aenean massa.</p>
<img>
<table>
</table>

<h3>sample#3</h3>
<p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p>
<img>
<table>
</table>

<h3>sample#4</h3>
<p>Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem.</p>
<img>
<table>
</table>'

What I'm trying to get is something like this:

array (size=4)
0 => string "
<h3>sample#1</h3>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>
<img>
<table>
</table>"
1 => string "
<h3>sample#2</h3>
<p>Aenean commodo ligula eget dolor. Aenean massa.</p>
<img>
<table>
</table>"

etc.

I don't mind adding something to the server files if needed, but I would prefer not to.

Upvotes: 2

Views: 178

Answers (1)

swidmann

Reputation: 2792

If the pages really look like your example, you can try a simple preg_match_all(). If the structure of some pages differs from your example, you can adjust the regular expression. Here is a good site to test the function.

$html = '<h3>sample#1</h3>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>
<img>
<table>
</table>

<h3>sample#2</h3>
<p>Aenean commodo ligula eget dolor. Aenean massa.</p>
<img>
<table>
</table>

<h3>sample#3</h3>
<p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p>
<img>
<table>
</table>

<h3>sample#4</h3>
<p>Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem.</p>
<img>
<table>
</table>';


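$matches = array();
$elements = array();

// Lazily match from each <h3> to the nearest </table>; the "s" modifier lets "." span newlines.
preg_match_all( "#<h3>.*?</table>#s" , $html, $matches );

// $matches[0] holds the full matches; use them if at least one block was found.
if( count( $matches[0] ) > 0 ) {
    $elements = $matches[0];
}

echo "<pre>";
var_dump( $elements );
echo "</pre>";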

OUTPUT:

array(4) {
  [0]=>
  string(105) "<h3>sample#1</h3>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>
<img>
<table>
</table>"
  [1]=>
  string(95) "<h3>sample#2</h3>
<p>Aenean commodo ligula eget dolor. Aenean massa.</p>
<img>
<table>
</table>"
  [2]=>
  string(133) "<h3>sample#3</h3>
<p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p>
<img>
<table>
</table>"
  [3]=>
  string(116) "<h3>sample#4</h3>
<p>Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem.</p>
<img>
<table>
</table>"
}
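
If some pages do not close every product with a </table>, one possible adjustment is to split the HTML in front of each <h3> instead of matching up to the closing table tag. This is only a sketch: it assumes every product starts with an <h3>, and the shortened sample markup is made up for illustration.

```php
<?php
// Made-up sample: the first product has no <table>, the second one does.
$html = "<h3>sample#1</h3>\n<p>First product.</p>\n<img>\n\n"
      . "<h3>sample#2</h3>\n<p>Second product.</p>\n<table>\n</table>";

// The lookahead matches the empty position right before each "<h3", so the
// heading stays with the chunk that follows it. PREG_SPLIT_NO_EMPTY drops
// the empty piece in front of the first heading.
$chunks = preg_split( '#(?=<h3\b)#i', $html, -1, PREG_SPLIT_NO_EMPTY );

// Remove the blank lines left between products.
$elements = array_map( 'trim', $chunks );

var_dump( $elements );
```

Note that anything in front of the first <h3> would come back as a chunk of its own, so you may need to filter that out.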

Upvotes: 4
