Harikrishna

Reputation: 4305

Inserting Ending Tags for Missing Tags in HTML

How can I insert closing HTML tags where they are missing?

For example:

 <tr>
 <td>Index No.</td><td>Name</td>

 <tr>
 <td>1</td><td>Harikrishna</td>

Here two closing tags are missing, namely "</tr>". In this case, how can I find where the tags are missing and insert the appropriate closing tag, such as "</tr>", at those places?

Upvotes: 0

Views: 1453

Answers (3)

MicE

Reputation: 5128

I cannot comment on the above, so I'll note it here. You can also use HTML Tidy to clean HTML fragments. See examples here:
http://www.php.net/manual/en/tidy.examples.basic.php
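
For the fragment case, a minimal sketch might look like the following. It assumes the PHP tidy extension is installed; the helper name repair_fragment and the sample markup are made up for illustration. The show-body-only option asks Tidy to return only the repaired body content instead of a complete document.

```php
<?php
// Hypothetical helper: repair an HTML fragment with the tidy extension.
// Returns null when the extension is not available or the repair fails.
function repair_fragment(string $fragment): ?string
{
    if (!function_exists('tidy_repair_string')) {
        return null; // tidy extension not loaded
    }
    $config = [
        'show-body-only' => true, // emit only the body content, not a full document
        'indent'         => true, // pretty-print the repaired markup
    ];
    $result = tidy_repair_string($fragment, $config, 'utf8');
    return $result === false ? null : $result;
}

// Sample input: both rows are missing their </tr>.
$fragment = "<table>
<tr><td>Index No.</td><td>Name</td>
<tr><td>1</td><td>Harikrishna</td>
</table>";

$clean = repair_fragment($fragment);
echo $clean ?? "tidy extension not available\n";
```

If the extension is missing, the helper returns null rather than failing, so the caller can fall back to another approach.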

An alternative to HTML Tidy is to clean your output code with regular expressions; I provide an example below. However, please note that even though this might be faster in terms of processing, it is neither as universal nor as robust (maintenance-wise) as HTML Tidy.

Code

<?php

$html = "
<table>
<tr class=\"lorem\">
<td>Index No.</td>
<td>Name</td>

<tr>
<td>0</td>
<td>FooBaz</td>

<tr>
<td>1</td>
<td>Harikrishna</td>

<tr class=\"ipsum\">
<td>2</td>
<td>Foo</td>
</tr>

<tr>
<td>3</td>
<td>Bar</td>


</table>
";

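// regex building blocks
// an opening <tr> tag, with or without attributes
$start_cond = "<tr(?:\s[^>]*)?>";
// a row ends where the next <tr> or the closing </table> begins
$end_cond = "(?:{$start_cond}|<\/table>)";
// everything up to, but not including, the next end condition
$row_contents = "(?:(?!{$end_cond}).)*";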

// first remove all </tr> tags
$xhtml = preg_replace( "/<\/tr>/ism", "", $html );

// now re-add </tr> tags where appropriate
$xhtml = preg_replace( "/({$start_cond})({$row_contents})/ism", "$1$2</tr>\n", $xhtml );

// ignore: just for writing comparison output
echo "<h2>Before:</h2>"; show_count( $html );
echo "<h2>After</h2>"; show_count( $xhtml );

function cmp($patt,$html) {
    $count = preg_match_all( "/{$patt}/ism", $html, $matches);
    return htmlentities("\n{$count} x {$patt}");
}
function show_count($html) {
    echo "<pre>"
        . cmp("<tr(\s[^>]*)?>",$html)
        . cmp("<\/tr>",$html)
        . "</pre>";
}
?>

Output


Before:
5 x <tr(\s[^>]*)?>
1 x <\/tr>

After
5 x <tr(\s[^>]*)?>
5 x <\/tr>

Upvotes: 1

Amber

Reputation: 527183

You might take a look at HTML Tidy and see if it works for what you need.

Upvotes: 1

Darin Dimitrov

Reputation: 1039298

This seems like a very tough task if you want to handle all possible cases. HTML is not a regular language. IMHO you should try to solve the problem at its source: find out how you ended up with invalid HTML in the first place.
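
If the source really cannot be fixed, a parser-based repair tends to hold up better than pattern matching, precisely because HTML is not regular. A minimal sketch, assuming PHP's bundled DOM extension (libxml); the function name close_missing_tags and the sample table are illustrative only:

```php
<?php
// Hypothetical helper: let libxml's HTML parser rebuild the tree and
// serialize it back, which writes out the end tags the parser inferred.
function close_missing_tags(string $html): string
{
    $previous = libxml_use_internal_errors(true); // collect parse warnings quietly
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize only the children of <body>, skipping the doctype/html
    // wrapper that loadHTML() adds around the input.
    $out = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $node) {
            $out .= $doc->saveHTML($node);
        }
    }
    return $out;
}

// Sample input: both rows are missing their </tr>.
$broken = "<table>
<tr><td>Index No.</td><td>Name</td>
<tr><td>1</td><td>Harikrishna</td>
</table>";

echo close_missing_tags($broken), "\n";
```

The parser decides where each row ends, so there is no pattern to maintain, but it may also normalize other parts of the markup.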

Upvotes: 2
