Reputation: 103
I have this XML code:
<?php header("Content-Type: text/xml;charset=ISO-8859-7");?>
<pages>
<link>
<title>κεμενο</title>
<url>http://www.example.com</url>
</link>
</pages>
When the XML contains only Latin characters, the live search works fine, but when I change the characters from English to Greek I get this error message:
Warning: DOMDocument::load() [domdocument.load]: Input is not proper UTF-8, indicate encoding ! Bytes: 0xE1 0x3C 0x2F 0x74 in /Applications/XAMPP/
Here is the HTML code for the live search:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-7" />
<script>
function showResult(str)
{
if (str.length==0)
{
document.getElementById("livesearch").innerHTML="";
document.getElementById("livesearch").style.border="0px";
return;
}
if (window.XMLHttpRequest)
{// code for IE7+, Firefox, Chrome, Opera, Safari
xmlhttp=new XMLHttpRequest();
}
else
{// code for IE6, IE5
xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
xmlhttp.onreadystatechange=function()
{
if (xmlhttp.readyState==4 && xmlhttp.status==200)
{
document.getElementById("livesearch").innerHTML=xmlhttp.responseText;
document.getElementById("livesearch").style.border="1px solid #A5ACB2";
}
}
xmlhttp.open("GET","livesearch.php?q="+str,true);
xmlhttp.send();
}
</script>
</head>
<body>
<form>
<input type="text" size="30" onkeyup="showResult(this.value)">
<div id="livesearch"></div>
</form>
</body>
</html>
Upvotes: 1
Views: 8969
Reputation: 197554
You are using the method DOMDocument::load() to load an XML document from a file. That file uses the ISO-8859-7 encoding, but the XML does not signal this encoding in its XML declaration (by the way, the header() call does not signal the encoding to load()).
Therefore DOMDocument assumes the file is in UTF-8, and it then runs into an illegal byte sequence:
The octet "\xE1" is a UTF-8 lead byte announcing that the two following octets complete the encoding of one Unicode code point. However, the next two octets in your case are "\x3C\x2F", which are not valid continuation bytes.
See again the error message:
Warning: DOMDocument::load() [domdocument.load]: Input is not proper UTF-8, indicate encoding ! Bytes: 0xE1 0x3C 0x2F 0x74 in ...
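To make the byte-level reasoning concrete, here is a minimal sketch that inspects the four bytes from the warning. The two helper functions are hypothetical names made up for this illustration, and the lead-byte check is simplified (it ignores overlong and out-of-range forms):

```php
<?php
// Hypothetical helper: how many continuation bytes a UTF-8 lead byte
// announces, or -1 if the byte cannot start a sequence (simplified check).
function utf8_expected_continuations(int $byte): int
{
    if (($byte & 0x80) === 0x00) return 0; // 0xxxxxxx: ASCII
    if (($byte & 0xE0) === 0xC0) return 1; // 110xxxxx
    if (($byte & 0xF0) === 0xE0) return 2; // 1110xxxx
    if (($byte & 0xF8) === 0xF0) return 3; // 11110xxx
    return -1;                             // stray continuation or invalid byte
}

// Hypothetical helper: a continuation byte must look like 10xxxxxx (0x80..0xBF).
function utf8_is_continuation(int $byte): bool
{
    return ($byte & 0xC0) === 0x80;
}

// Bytes quoted in the warning: alpha, "<", "/", "t" when read as ISO-8859-7.
$bytes = [0xE1, 0x3C, 0x2F, 0x74];

$expected = utf8_expected_continuations($bytes[0]);
$valid = utf8_is_continuation($bytes[1]) && utf8_is_continuation($bytes[2]);

printf("lead byte 0x%02X expects %d continuation bytes\n", $bytes[0], $expected);
printf("next two bytes are valid continuations: %s\n", $valid ? 'yes' : 'no');
```

The lead byte promises two continuation bytes, but what follows is the plain ASCII of the closing tag, which is why the parser gives up at this point.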
This hints at two potential solutions:
The first option is to add an XML declaration at the top of the file, signalling the encoding used:
<?xml version="1.0" encoding="ISO-8859-7"?>
<pages>
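This file can then be loaded and re-encoded:
$doc = new DOMDocument();
$doc->load($path);        // $path points to the XML file
$doc->encoding = 'UTF-8'; // the document is serialized as UTF-8 from now on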
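As a self-contained illustration of this first option, the following sketch uses loadXML() on an in-memory string instead of load() on a file, so it runs without the original file. The sample document (a single Greek alpha stored as the ISO-8859-7 byte 0xE1) is an assumption made for the example, and it relies on the underlying libxml build supporting ISO-8859-7, which is typically the case:

```php
<?php
// Assumed sample document: one Greek alpha as the ISO-8859-7 byte 0xE1,
// with the encoding stated in the XML declaration.
$xml = '<?xml version="1.0" encoding="ISO-8859-7"?>' . "\n"
     . "<pages><link><title>\xE1</title></link></pages>";

$doc = new DOMDocument();
$doc->loadXML($xml); // parses a string; load() would read a file instead

// DOM hands text back as UTF-8, whatever the source encoding was.
$title = $doc->getElementsByTagName('title')->item(0)->textContent;

$doc->encoding = 'UTF-8'; // serialize as UTF-8 from now on
$out = $doc->saveXML();

echo bin2hex($title), "\n"; // hex dump of the UTF-8 bytes of the title
echo $out;
```

Once the declaration is present, the parser converts the text on its own, so no manual re-encoding step is needed before working with the nodes.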
The second alternative is to re-encode the string before you load it. However, you normally won't need to do that if you set the XML declaration, which I recommend.
Re-encoding a string (not a filename!) works as follows:
$xmlUTF_8 = iconv('ISO-8859-7', 'UTF-8', $xmlISO_8859_7);
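A slightly fuller sketch of this second option, under the assumption that the XML carries no declaration at all (as in the question). The sample string is made up for the example; with a real file, the input would come from something like file_get_contents($path):

```php
<?php
// Assumed sample input: ISO-8859-7 bytes with no XML declaration.
$xmlIso = "<pages><link><title>\xE1</title></link></pages>";

// Convert the raw bytes to UTF-8 before the parser ever sees them.
$xmlUtf8 = iconv('ISO-8859-7', 'UTF-8', $xmlIso);
if ($xmlUtf8 === false) {
    throw new RuntimeException('iconv could not convert the input');
}

$doc = new DOMDocument();
$loaded = $doc->loadXML($xmlUtf8); // no declaration, so UTF-8 is assumed
$title = $doc->getElementsByTagName('title')->item(0)->textContent;

echo bin2hex($title), "\n"; // hex dump of the UTF-8 bytes of the title
```

Note that if the string did contain a declaration naming ISO-8859-7, converting the bytes without also changing the declaration would leave the two out of sync, which is one more reason to prefer the declaration-based approach.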
Hope this helps. See also "How to keep the Chinese or other foreign language as they are instead of converting them into codes?" and the other questions linked there, which show the workarounds.
Upvotes: 2
Reputation: 146340
Input is not proper UTF-8, indicate encoding
... so I guess your question is how to indicate encoding in XML. Since it appears to be a static document:
<?xml version="1.0" encoding="ISO-8859-7"?>
<pages>
<link>
<title>κεμενο</title>
<url>http://www.example.com</url>
</link>
</pages>
Depending on your PHP settings, you may need to obfuscate the <? tag so it doesn't get interpreted as a PHP tag.
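One common way to do that, sketched here under the assumption that the XML file is served through PHP, is to emit the declaration from PHP code so that the opening of the declaration never appears as literal template text. The function name is hypothetical:

```php
<?php
// Hypothetical helper: builds the XML declaration as a PHP string, so PHP
// (with short open tags enabled) never sees it as the start of a code block.
function xml_declaration(string $encoding): string
{
    return '<' . '?xml version="1.0" encoding="' . $encoding . '"?' . '>';
}

$declaration = xml_declaration('ISO-8859-7');
echo $declaration, "\n";
```

This only matters when PHP executes the file; a file read directly from disk by DOMDocument::load() is not run through PHP, so a literal declaration is fine there.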
Upvotes: 2