Reputation: 91
I'm trying to fix an XML file with thousands of lines that produces this error:
Opening and ending tag mismatch error
I'm currently using SimpleXML to parse this file, so I need to fix the XML file before parsing it with this library.
This is the solution I'm trying right now, but it's not enough:
libxml_use_internal_errors(true);
$xml = @simplexml_load_file($temp_name);
$errors = libxml_get_errors();
foreach ($errors as $error) {
    if (strpos($error->message, 'Opening and ending tag mismatch') !== false) {
        $tag = trim(preg_replace('/Opening and ending tag mismatch: (.*) line.*/', '$1', $error->message));
        $lines = file($temp_name, FILE_IGNORE_NEW_LINES);
        $line = $error->line + 1;
        echo $line;
        echo "<br>";
        $lines[$line] = '</' . $tag . '>' . $lines[$line];
        file_put_contents($temp_name, implode("\n", $lines));
    }
}
Any idea?
Upvotes: 7
Views: 63436
Reputation: 49
I get this error mostly when I include a literal < in my content (between an opening and a closing tag such as, say, <t>):
<t>Mentioning an <element> surrounded by < and > in the content gives errors.</t>
Writing the angle brackets as entities avoids the problem:
<t>Mentioning an &lt;element&gt; surrounded by &lt; and &gt; in the content does NOT give errors.</t>
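If the stray angle brackets come from text that your own code places inside an element, escaping that text before building the XML is one way to avoid the mismatch. The snippet below is a minimal sketch, assuming the XML is assembled as a string in PHP; the sample text and the variable names are illustrative and not taken from the question.

```php
<?php
// Hypothetical text content that contains literal angle brackets.
$content = 'Mentioning an <element> surrounded by < and >';

// ENT_XML1 selects XML escaping rules; ENT_QUOTES also escapes both quote types.
$escaped = htmlspecialchars($content, ENT_XML1 | ENT_QUOTES, 'UTF-8');

// The element now contains only entities, so the parser sees no stray tags.
$xml = '<t>' . $escaped . '</t>';

libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml);

// On success, casting the element to string returns the original, unescaped text.
echo $doc === false ? "parse failed\n" : (string) $doc . "\n";
```

This only helps when you control how the file is generated; it does not repair a file that already contains unescaped brackets.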
Upvotes: 0
Reputation: 4636
I think there is a simple solution.
Please check your ending tag.
For example, this is a correct closing tag:
$xml.="</childelement>";
whereas this is a self-closing (empty) element, which does not close an element that is already open:
$xml.="<childelement/>";
Upvotes: -2
Reputation: 163262
First, if you've got corrupt data then fixing the program that generated it is usually more important than repairing the data.
If the only errors in the file are mismatched end tags, then presumably the repair strategy is to ignore what's in the end tag entirely, given that the name appearing in an XML end tag is redundant. You might find that an existing tool such as TagSoup or validator.nu handles this the way you want; or you might find that such a tool outputs XML which can be transformed into the form you want. That's a better prospect than writing your own parser for this non-XML grammar.
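The "ignore the name in the end tag" strategy can be sketched in PHP with a stack of open element names. This is a rough illustration rather than a substitute for a tool such as TagSoup: the function name `repairEndTags` is hypothetical, and the sketch assumes that wrong end-tag names are the only defect, that no comments, CDATA sections, or processing instructions contain tag-like text, and that no attribute value contains `>`.

```php
<?php
/**
 * Rewrite every end tag so that its name matches the most recently
 * opened element, ignoring whatever name the end tag actually carries.
 * Hypothetical helper; see the assumptions stated in the text above.
 */
function repairEndTags(string $xml): string
{
    $stack = []; // names of the elements that are currently open

    return preg_replace_callback(
        // Matches <name ...>, <name .../> and </name>; skips <?...?> and <!...>.
        '~<(/?)([A-Za-z_][\w:.-]*)([^>]*)>~',
        function (array $m) use (&$stack): string {
            [$whole, $slash, $name, $rest] = $m;

            if ($slash === '/') {
                // End tag: close whichever element is open, whatever it says.
                $open = array_pop($stack);
                return $open === null ? '' : "</$open>";
            }

            if (substr($rest, -1) !== '/') {
                // Start tag that is not self-closing: remember its name.
                $stack[] = $name;
            }

            return $whole;
        },
        $xml
    );
}

// Usage: the mismatched </nam> is rewritten to match the open <name>.
$broken = '<root><item><name>A</nam></item></root>';
$fixed  = repairEndTags($broken);

libxml_use_internal_errors(true);
var_dump(simplexml_load_string($fixed) !== false);
```

Because the sketch trusts the start tags completely, it cannot recover from a missing end tag; it only corrects end tags whose names are wrong.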
Upvotes: 2