usertest

Reputation: 27648

Split HTML files

How would I split an HTML-formatted file into several HTML files (each complete with HTML, HEAD and BODY tags) with PHP? I would put a placeholder tag (something like <div class='placeholder'></div>) at every place I want to cut.

Thanks.

Upvotes: 1

Views: 2674

Answers (2)

crissiant

Reputation: 1

The preg_match approach seems to work only for small files...

Anyway, to split an HTML file of this form:

(header...) <body><div class='container'> (intro...) 
<h3>Sect 1</h3> (section...) 
<h3>Sect 2</h3> (section...) 
(etc...) 
</div></body></html>

I do it this way:

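// Split at every <h3 opening; explode() removes the delimiter, so it is re-added below.
$splitContents = explode("<h3", $sourceHTML);
$i = 0;
$last = count($splitContents) - 1;
foreach ($splitContents as $chunk) {
    if ($i == 0) {
        // First chunk: keep everything before <body as the shared header.
        $beginning = explode("<body", $chunk);
        $top = $beginning[0];
        $html = $chunk;
    } else {
        // Later chunks: rebuild the opening markup in front of the section.
        $html = $top . "<body><div class='container'><h3" . $chunk;
    }
    // The last chunk already ends with the original closing tags.
    if ($i != $last) {
        $html .= "</div></body></html>";
    }
    // save html to file
    ++$i;
}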
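
The "save html to file" step is left open above. One possible way to fill it in is sketched below. The helper name writeParts, the output directory and the "part-NN.html" naming scheme are assumptions made for illustration only; they do not come from the original code.

```php
<?php
// Hypothetical helper (not part of the answer above): writes each HTML
// string in $parts to "$dir/$prefix-NN.html" and returns the paths written.
function writeParts(array $parts, string $dir, string $prefix = 'part'): array
{
    $paths = [];
    // array_values() guarantees the numbering starts at 0 and has no gaps.
    foreach (array_values($parts) as $i => $html) {
        $path = sprintf('%s/%s-%02d.html', rtrim($dir, '/'), $prefix, $i + 1);
        // file_put_contents() returns false when the write fails.
        if (file_put_contents($path, $html) === false) {
            throw new RuntimeException("Could not write $path");
        }
        $paths[] = $path;
    }
    return $paths;
}
```

With this in place, the loop could collect each $html into an array and call writeParts($pages, 'output') once at the end, assuming the output directory already exists.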

Upvotes: 0

sidereal

Reputation: 1120

$sourceHTML = file_get_contents('sourcefile');

$splitContents = explode("<div class='placeholder'></div>", $sourceHTML);

foreach ($splitContents as $html) {
    // save html to file
}

Edit: whoops. As user201140 correctly points out, I missed the fact that each html file has to be a valid document. Since it's not specified exactly what the head tag should contain, I'll assume that the head tag of the combined document should be replicated to each copy. In that case:

$sourceHTML = file_get_contents('sourcefile');
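// Capture everything up to <body ...>, the body contents, and everything from </body onward.
// preg_match() returns 0 on no match and false on error (e.g. when pcre.backtrack_limit is exceeded).
if (!preg_match("/(^.*<body.*?>)(.*)(<\/body.*$)/is", $sourceHTML, $matches)) {
    die('Could not locate the body element');
}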
$top = $matches[1];
$contents = $matches[2];
$bottom = $matches[3];
$splitContents = explode("<div class='placeholder'></div>", $contents);
foreach ($splitContents as $chunk) {
    $html = $top.$chunk.$bottom;
    // save html to file
}
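
For large inputs, the same top/contents/bottom split can probably be done with plain string offsets instead of a regular expression. The following is a minimal sketch under stated assumptions: the function name splitHtmlDocument is hypothetical, the document is assumed to contain exactly one body element, and the opening body tag is assumed not to contain a ">" inside an attribute value.

```php
<?php
// Hypothetical helper: splits $sourceHTML at each $placeholder and wraps
// every chunk in the original markup before and after the body contents.
function splitHtmlDocument(string $sourceHTML, string $placeholder): array
{
    // Find the start of the opening <body ...> tag (case-insensitive).
    $bodyOpen = stripos($sourceHTML, '<body');
    if ($bodyOpen === false) {
        return [];
    }
    // Find the ">" that ends the opening tag, and the last closing </body.
    $bodyOpenEnd = strpos($sourceHTML, '>', $bodyOpen);
    $bodyClose = strripos($sourceHTML, '</body');
    if ($bodyOpenEnd === false || $bodyClose === false || $bodyClose < $bodyOpenEnd) {
        return [];
    }

    $top      = substr($sourceHTML, 0, $bodyOpenEnd + 1);
    $contents = substr($sourceHTML, $bodyOpenEnd + 1, $bodyClose - $bodyOpenEnd - 1);
    $bottom   = substr($sourceHTML, $bodyClose);

    // Each piece becomes a complete document: shared top + chunk + shared bottom.
    $pages = [];
    foreach (explode($placeholder, $contents) as $chunk) {
        $pages[] = $top . $chunk . $bottom;
    }
    return $pages;
}
```

Calling splitHtmlDocument($sourceHTML, "<div class='placeholder'></div>") should return one complete document per piece. An empty array means the body element could not be located.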

Upvotes: 5
