Tomas Diaz

Reputation: 95

Problems inserting strings in MySQL

I'll try to explain this as best I can; please excuse my English.

I have a script that dumps an entire database into an SQL file, and another script that splits the file into statements and executes them to drop, create, and insert the data. The problem is that some strings are truncated: only the part of the string before the first special character is inserted. For example:

For the string:

"Pantalon azul marino de Poliéster con cinta blanca bordada con el nombre de la institución en uno de sus costados."

only this is inserted:

 "Pantalon azul marino de Poli"

No error is thrown. This happens only with the script; when I run the queries manually or import the SQL file in phpMyAdmin, everything works. Everything is set to utf8, by the way.

I'm out of ideas; any help would be greatly appreciated.

    include('../core/connection.inc.php');
    $conn = dbConnect('admin');
    $conn->query("SET NAMES 'utf8'");
    $conn->set_charset("utf8");
    $type = 0;

    // Temporary variable, used to store the current query
    $templine = '';
    // Read in the entire file, one array element per line
    $lines = file('db-backup.sql');
    $correct = 0;
    $failed = 0;
    // Loop through each line
    foreach ($lines as $line) {
        // Skip comments and blank lines (file() keeps the trailing newline, so trim first)
        if (substr($line, 0, 2) == '--' || trim($line) == '') {
            continue;
        }
        // Add this line to the current segment
        $templine .= $line;
        // If it has a semicolon at the end, it's the end of the query
        if (substr(trim($line), -1, 1) == ';') {
            $templine = str_replace("latin1", "utf8", $templine);
            $templine = trim($templine);

            // Perform the query
            $result = $conn->query($templine);
            $errno = $conn->errno;
            $error = $conn->error;
            // query() returns false on failure; affected_rows is 0 for DROP and CREATE,
            // so it cannot be used to detect success
            if ($result !== false) {
                echo "OK: (".$templine.")<br/>";
                $correct++;
            } else {
                echo "Failed: (".$templine.")<br/>";
                echo "&nbsp;&nbsp; Errno: ".$errno." <br/>";
                echo "&nbsp;&nbsp; Error: ".$error." <br/>";
                $failed++;
            }
            $templine = '';
        }
    }

Upvotes: 3

Views: 168

Answers (2)

simpleigh

Reputation: 2894

I'm guessing that the dump file you're importing isn't UTF-8.

PHP passes the bytes from the file to MySQL without any conversion. Judging by the latin1-to-utf8 replacement you're making, the é character in your file is probably encoded as latin1, where it is the single byte 0xE9. On its own that byte isn't valid UTF-8, which encodes é as the two bytes 0xC3 0xA9. You've promised MySQL that you'll send valid UTF-8, so it stops reading the string when it reaches the invalid byte.
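
To make that concrete, here is a small sketch that needs no database. The helper names `isValidUtf8` and `validUtf8Prefix` are made up for this illustration, and the sample string is a shortened version of the one in the question. It only shows which part of the string is still valid UTF-8; it does not claim to reproduce MySQL's internals.

```php
<?php
// "é" as a single latin1 byte (0xE9) versus its two-byte UTF-8 form (0xC3 0xA9).
$latin1 = "Poli\xE9ster";
$utf8   = "Poli\xC3\xA9ster";

// preg_match() with the /u modifier returns false when the subject is not
// valid UTF-8, so an empty pattern works as a validity check.
function isValidUtf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// Hypothetical helper: the longest leading part of $s that is valid UTF-8,
// i.e. everything before the first invalid byte sequence.
function validUtf8Prefix(string $s): string
{
    for ($len = strlen($s); $len > 0; $len--) {
        $prefix = substr($s, 0, $len);
        if (isValidUtf8($prefix)) {
            return $prefix;
        }
    }
    return '';
}

echo validUtf8Prefix($latin1), PHP_EOL; // prints "Poli"
```

The surviving prefix stops right before the 0xE9 byte, which matches the truncated value described in the question.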

You might consider:

  • re-encoding the dump file as UTF-8
  • figuring out what encoding the dump file is in, and loading it into MySQL using that encoding
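
A minimal sketch of the first option, assuming the mbstring extension is available and assuming that anything that is not already valid UTF-8 is ISO-8859-1 (latin1). Both assumptions need checking against your actual dump; `toUtf8` and `reencodeDumpFile` are hypothetical names, not part of your script.

```php
<?php
// Return $bytes as UTF-8.
// ASSUMPTION: input that is not valid UTF-8 is encoded as $assumedSource.
function toUtf8(string $bytes, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        // Already UTF-8: converting again would double-encode the text.
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

// Read a dump file, re-encode it, and write the result to a new file.
function reencodeDumpFile(string $inPath, string $outPath): void
{
    $bytes = file_get_contents($inPath);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $inPath");
    }
    if (file_put_contents($outPath, toUtf8($bytes)) === false) {
        throw new RuntimeException("Cannot write $outPath");
    }
}
```

Writing to a separate output file keeps the original dump intact in case the guessed source encoding turns out to be wrong.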

Personally I think I'd approach the problem a different way:

  1. Load the dump file into MySQL using the command-line client, or something similar. You know this works.
  2. Alter the character set of each column after importing - you can use the data in information_schema to assemble ALTER TABLE statements and get MySQL to do the conversion properly.
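
Step 2 could be sketched roughly like this. The query reads the standard `information_schema.COLUMNS` table; `buildAlterStatements` is a hypothetical helper, and the generated `MODIFY` clauses carry over only the column type and nullability, so defaults and comments would need to be added by hand. Treat it as a starting point and review the statements before running them.

```php
<?php
// Lists the character columns of one schema that are not utf8 yet.
// Bind the schema (database) name to the placeholder when running it.
const NON_UTF8_COLUMNS_SQL = <<<'SQL'
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = ?
  AND CHARACTER_SET_NAME IS NOT NULL
  AND CHARACTER_SET_NAME <> 'utf8'
SQL;

// Hypothetical helper: turn the rows returned by the query above into
// ALTER TABLE statements that convert each column to $charset.
function buildAlterStatements(array $rows, string $charset = 'utf8'): array
{
    $statements = [];
    foreach ($rows as $row) {
        $nullability = $row['IS_NULLABLE'] === 'YES' ? 'NULL' : 'NOT NULL';
        $statements[] = sprintf(
            'ALTER TABLE `%s` MODIFY `%s` %s CHARACTER SET %s %s;',
            str_replace('`', '``', $row['TABLE_NAME']),  // escape backticks in identifiers
            str_replace('`', '``', $row['COLUMN_NAME']),
            $row['COLUMN_TYPE'],
            $charset,
            $nullability
        );
    }
    return $statements;
}

// Example with a made-up row (table and column names are illustrative only).
$rows = [[
    'TABLE_NAME'  => 'products',
    'COLUMN_NAME' => 'description',
    'COLUMN_TYPE' => 'text',
    'IS_NULLABLE' => 'NO',
]];
echo implode(PHP_EOL, buildAlterStatements($rows)), PHP_EOL;
```

Because the data was loaded in its real encoding in step 1, MySQL converts the stored text itself when each column's character set is changed.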

Upvotes: 2

dustfeather

Reputation: 86

I don't know what your table looks like, but I can give you this bit of extra advice just in case. Make sure your DB columns are set to UTF8:

    ALTER TABLE {table_name} CHANGE COLUMN {col_name} {col_name} TEXT CHARACTER SET 'utf8' COLLATE 'utf8_general_ci' NOT NULL;

Upvotes: 0
