Francis

Reputation: 343

PHP import CSV with umlauts

I have a website where users can upload a file. The file contains German umlauts (such as ö and ä) and is Latin-1 encoded. I need to convert it to UTF-8, because that is the charset my database uses.

I use the following code:

        $csvFile = fopen($_FILES['file']['tmp_name'], 'r');
        // parse data from the CSV file line by line (tab-separated)
        while (($line = fgetcsv($csvFile, 0, "\t")) !== FALSE) {
            $dbupload->query("INSERT INTO db (a, b, c)
                              VALUES ('".$line[0]."', '".$line[3]."', '".$line[1]."')");
        }

        // close the opened CSV file
        fclose($csvFile);

If I use this code to import a Latin-1 file, every line containing an umlaut is skipped.

What can I do?

PS: The file is passed directly from the frontend (the page the user sees) to the script that processes it.

Upvotes: 2

Views: 2304

Answers (2)

marv255

Reputation: 818

First of all, do not pass data from a user's upload directly into the database! Please use PDO prepared statements instead.
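
As a rough illustration of the prepared-statement advice, here is a minimal sketch. The helper name `importRows` is hypothetical, `$pdo` is assumed to be an already opened PDO connection, and the table `db` with columns `a`, `b`, `c` is taken from the question:

```php
<?php
/**
 * Insert parsed CSV rows with a prepared statement.
 * Hypothetical helper; table "db" and columns a, b, c come from the question.
 */
function importRows(PDO $pdo, iterable $rows): int
{
    // Prepare once, execute many times; values are bound, never concatenated into the SQL.
    $stmt = $pdo->prepare('INSERT INTO db (a, b, c) VALUES (?, ?, ?)');
    $count = 0;
    foreach ($rows as $line) {
        // Same column mapping as in the question: fields 0, 3 and 1.
        $stmt->execute([$line[0], $line[3], $line[1]]);
        $count++;
    }
    return $count;
}
```

Because the values are bound as parameters, quotes inside a field no longer need manual escaping.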

Also make sure you know which encoding the file actually uses. In my example I use ISO-8859-1, but I may have misunderstood you.
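
If the encoding of uploads is not guaranteed, one possible safeguard is to convert only values that are not already valid UTF-8. This is a sketch, not a reliable charset detector: the helper name `toUtf8` is hypothetical, it assumes the mbstring extension is available, and it assumes that anything that is not valid UTF-8 is ISO-8859-1:

```php
<?php
/**
 * Return $value as UTF-8.
 * Hypothetical helper: input that is not valid UTF-8 is ASSUMED to be
 * in $assumedSource (ISO-8859-1 by default) and converted with iconv.
 */
function toUtf8(string $value, string $assumedSource = 'ISO-8859-1'): string
{
    // mb_check_encoding() returns true when the bytes form valid UTF-8.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already UTF-8 (or plain ASCII), leave untouched
    }
    return iconv($assumedSource, 'UTF-8', $value);
}
```

A Latin-1 byte such as `\xF6` (ö) is not valid UTF-8 on its own, so it gets converted, while text that is already UTF-8 passes through unchanged.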

Here is part of my code that uses iconv:

$csvFile = fopen($_FILES['file']['tmp_name'], 'r');

//parse data from csv file line by line
while(($line = fgetcsv($csvFile, 0, "\t")) !== false) {
    foreach ($line as $key => $value) {
        $line[$key] = iconv('ISO-8859-1', 'UTF-8', $value); // make sure the source charset name matches your file
    }
    // do not pass data from a user upload directly to the database!
    // use PDO, or at the very least addslashes
    $dbupload->query("INSERT INTO db (a, b, c) VALUES ('".addslashes($line[0])."', '".addslashes($line[3])."', '".addslashes($line[1])."')");
}

//close opened csv file
fclose($csvFile);

Please see the iconv documentation for more examples.

Upvotes: 2

Juxhin Metaj

Reputation: 52

If you check this list here, you will see that the umlauts are included, which means they can be encoded and decoded. As already mentioned, saving the file in UTF-8 encoding will do the trick. The conversion does not happen automatically, which is why you have the problem.
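
If the file cannot be saved as UTF-8 before upload, the same conversion can be done on the server while reading. The following is a sketch under assumptions: the helper name `openAsUtf8` is hypothetical, the source charset is assumed to be ISO-8859-1, and the `convert.iconv.*` stream filter requires the iconv extension:

```php
<?php
/**
 * Open a file so that everything read from it is converted to UTF-8.
 * Hypothetical helper; the source charset is an assumption about the upload.
 */
function openAsUtf8(string $path, string $sourceCharset = 'ISO-8859-1')
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    // Built-in conversion filter: transcodes the bytes as they are read.
    stream_filter_append($handle, "convert.iconv.$sourceCharset/UTF-8", STREAM_FILTER_READ);
    return $handle;
}
```

The returned handle can then be passed to `fgetcsv()` in place of the plain `fopen()` handle, so every field arrives already converted.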

Upvotes: 0
