Reputation: 3692
I have a PHP script which exports a CSV file. My users then edit the file in Excel, save it, and re-upload it.
If they type a euro symbol into a field, then when the file is uploaded, the euro symbol and everything after it are missing. I'm using the str_getcsv function.
If I try to convert the encoding (say to UTF-8), the euro symbol disappears, and I get a missing character marker (usually represented by a blank square or a question mark in a diamond).
How do I convert the encoding to UTF-8, but also keep the euro symbol (and other non-ASCII characters)?
Edit:
Here is my code:
/**
 * Decodes html entity encoded characters back to their original
 *
 * @access public
 * @param String The element of the array to process
 * @param Mixed The key of the current element of the array
 * @return void
 */
public function decodeArray(&$indexValue, $key)
{
    $indexValue = html_entity_decode($indexValue, ENT_NOQUOTES, 'Windows-1252');
}

/**
 * Parses the contents of a CSV file into a two dimensional array
 *
 * @access public
 * @param String The contents of the uploaded CSV file
 * @return Array Two dimensional-array.
 */
public function parseCsv($contents)
{
    $changes = array();
    $lines = split("[\n|\r]", $contents);
    foreach ($lines as $line) {
        $line = utf8_encode($line);
        $line = htmlentities($line, ENT_NOQUOTES);
        $lineValues = str_getcsv($line);
        array_walk($lineValues, 'decodeArray');
        $changes[] = $lineValues;
    }
    return $changes;
}
I have also tried the following instead of the utf8_encode function:
iconv("Windows-1252", "UTF-8//TRANSLIT", $line);
And also just:
$line = htmlentities($line, ENT_NOQUOTES, 'Windows-1252');
With the utf8_encode function, the offending character is removed from the string. With any other method, the character and everything after it are missing.
Example:
The field value : "Promo € Mobile"
is interpreted as : "Promo Mobile"
Upvotes: 0
Views: 1421
Reputation: 676
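Add the UTF-8 byte order mark (BOM) to the beginning of the CSV file when you export it, so that Excel opens it as UTF-8. The BOM is these three bytes (0xEF 0xBB 0xBF):
chr(239) . chr(187) . chr(191)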
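As a rough sketch of how those three bytes could be written on export and removed again on upload: the helper names buildCsvWithBom and stripBom are hypothetical, and the sketch assumes the exported data is already UTF-8. Whether Excel keeps the file in UTF-8 when the user saves it depends on the Excel version and the save format chosen, so treat this as a starting point rather than a guaranteed fix.

```php
<?php
// The UTF-8 byte order mark: the same bytes as chr(239) . chr(187) . chr(191).
const UTF8_BOM = "\xEF\xBB\xBF";

/**
 * Builds CSV text from rows of UTF-8 strings, prefixed with the BOM.
 * (Hypothetical helper, not part of the original script.)
 */
function buildCsvWithBom(array $rows)
{
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, UTF8_BOM);
    foreach ($rows as $row) {
        // Delimiter, enclosure and escape are passed explicitly.
        fputcsv($handle, $row, ',', '"', '\\');
    }
    rewind($handle);
    $csv = stream_get_contents($handle);
    fclose($handle);
    return $csv;
}

/**
 * Removes a leading BOM, if present, so it does not end up in the first field.
 * (Hypothetical helper, not part of the original script.)
 */
function stripBom($contents)
{
    if (strncmp($contents, UTF8_BOM, 3) === 0) {
        return substr($contents, 3);
    }
    return $contents;
}

// "\xE2\x82\xAC" is the euro sign encoded as UTF-8.
$csv = buildCsvWithBom(array(
    array('id', 'label'),
    array('1', "Promo \xE2\x82\xAC Mobile"),
));
$uploaded = stripBom($csv); // what parseCsv() would receive after upload
```

On upload, stripBom would be called on the file contents before they are split into lines and passed to str_getcsv.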
Upvotes: 0