I am trying to insert UTF-8 strings into PostgreSQL using DBD::Pg, but I always get double-encoded UTF-8 in the database. The characters are stored correctly only if the strings I insert are Latin-1.
How can I tell DBD::Pg not to re-encode strings that are already UTF-8?
Test script:
#!/usr/bin/perl
use strict;
my $PGDB = 'dbi:Pg:dbname=test;host=localhost';
my $PGDB_USER = 'username';
my $SQL_INSERT = 'INSERT INTO tmp1 (t1, t2) VALUES (?, ?)';
use DBI;
my $dbh = DBI->connect($PGDB, $PGDB_USER)
|| die "Couldn't connect to $PGDB as user $PGDB_USER: $DBI::errstr\n";
#$dbh->do("SET client_encoding TO UTF8");
my $sth = $dbh->prepare( $SQL_INSERT )
|| die "Can't prepare insert statement $SQL_INSERT: $DBI::errstr";
my $cp1252 = "\xe9 voil\xe0";
my $utf8 = "é voilà";
utf8::upgrade($utf8);
use utf8;
#use bytes;
my $text = 'sent utf8 w. utf8::upgrade';
$sth->execute($utf8, $text) or die $sth->errstr, "\n";
The resulting table after a few tests:
é voilà sent cp1252 as_is w. use bytes
é voilà sent utf8 as_is w. use bytes
é voilà sent utf8 as_is w. use utf8
é voilà sent cp1252 as_is w. use utf8
é voilà sent cp1252 as_is w. do(SET client_encoding TO UTF8)
é voilà sent utf8 as_is w. do(SET client_encoding TO UTF8)
é voilà sent utf8 as_is w. use utf8 + do(SET client_encoding TO UTF8)
é voilà sent utf8 w. utf8::upgrade + do(SET client_encoding TO UTF8)
é voilà sent utf8 w. utf8::upgrade
(This is on Ubuntu 16.04 with DBD::Pg version 3.5.3. I did not have this problem with the DBD::Pg version that was installed on Ubuntu 12.04.)
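For reference, the double encoding can be reproduced without a database. The sketch below uses only core Perl and the Encode module. It assumes, without having confirmed it for this setup, that the driver treats every bound value as a character string and encodes it as UTF-8 before sending it. The variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 bytes for "é voilà", written as escapes so this file's own
# encoding does not matter. This is what a string literal holds when
# `use utf8` is not in effect at the point where the literal appears.
my $bytes = "\xc3\xa9 voil\xc3\xa0";

# utf8::upgrade only changes Perl's internal storage. Each byte is still
# read as one Latin-1 character, so the string stays 9 characters long.
my $upgraded = $bytes;
utf8::upgrade($upgraded);

# Encoding that string as UTF-8 (the step assumed to happen in the driver)
# produces the double-encoded form.
my $double = encode('UTF-8', $upgraded);

# Decoding the bytes first gives a real 7-character string, and encoding
# that string gives back the original 9 bytes.
my $chars  = decode('UTF-8', $bytes);
my $single = encode('UTF-8', $chars);

printf "upgraded: %d chars, double: %d bytes\n", length($upgraded), length($double);
printf "decoded:  %d chars, single: %d bytes\n", length($chars), length($single);
```

If that assumption holds, two things may be worth trying: decoding input into character strings before calling execute (or placing `use utf8;` above the literal), or passing the `pg_enable_utf8 => 0` attribute that DBD::Pg documents for connect, so that the strings are treated as raw bytes. Neither has been verified against this particular setup.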