Reputation: 25117
How can I induce Term::ReadLine to set the UTF8 flag on the results from readline?
#!/usr/local/bin/perl
use warnings FATAL => qw(all);
use strict;
use 5.10.1;
use utf8;
use open qw( :encoding(UTF-8) :std );
use Term::ReadLine;
use Devel::Peek;
my $term = Term::ReadLine->new( 'test', *STDIN, *STDOUT );
$term->ornaments( 0 );
my $char;
$char = $term->readline( 'Enter char: ' );
Dump $char;
print 'Enter char: ';
$char = <>;
chomp $char;
Dump $char;
The output:
Enter char: ü
SV = PV(0x11ce4c0) at 0x1090078
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0x14552c0 "\374"\0
CUR = 1
LEN = 16
Enter char: ü
SV = PV(0x11ce4c0) at 0x1090078
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0x14552c0 "\303\274"\0 [UTF8 "\x{fc}"]
CUR = 2
LEN = 16
Comment:
When I am searching in a MySQL database (with mysql_enable_utf8 enabled):
my $stmt = "SELECT * FROM $table WHERE City REGEXP ?";
say $stmt;
# my $term = Term::ReadLine->new( 'table_watch', *STDIN, *STDOUT );
# $term->ornaments( 0 );
# my $arg = $term->readline( 'Enter argument: ' ); # ü -> doesn't find 'München'
print "Enter argument: ";
my $arg = <>; # ü -> finds 'München'
chomp $arg;
Upvotes: 1
Views: 148
Reputation: 386361
Why? Those two strings are equivalent: both hold the single character U+00FC. It's like 0 stored as an IV versus stored as a UV.
Well, it's possible that you have to deal with buggy XS code. If that's the case, utf8::upgrade($s) and utf8::downgrade($s) can be used to change how the string is stored in the scalar.
Unlike encoding and decoding, utf8::upgrade and utf8::downgrade don't change the string, just how it's stored.
$ perl -MDevel::Peek -E'
$_="\xFC";
utf8::downgrade($d=$_); Dump($d);
utf8::upgrade($u=$_); Dump($u);
say $d eq $u ?1:0;
'
SV = PV(0x86875c) at 0x4a9214
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x8699b4 "\374"\0
CUR = 1
LEN = 12
SV = PV(0x868784) at 0x4a8f44
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
PV = 0x869d14 "\303\274"\0 [UTF8 "\x{fc}"]
CUR = 2
LEN = 12
1
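Applied to the question's case, a minimal sketch might look like the following. The helper name upgraded is hypothetical (it is not part of Term::ReadLine), and the "\xFC" literal only stands in for what $term->readline returned in the question's first dump. The last part contrasts this with Encode::encode_utf8, which does produce a different string.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper (not part of Term::ReadLine): return a copy of the
# string that is stored in Perl's UTF8-flagged internal format.
sub upgraded {
    my ($s) = @_;          # work on a copy, leave the caller's scalar alone
    utf8::upgrade($s);     # changes only the storage format, not the string
    return $s;
}

# Stand-in for the readline result from the question:
# one character, U+00FC, stored without the UTF8 flag.
my $from_readline = "\xFC";

my $arg = upgraded($from_readline);

printf "flag before: %d, after: %d\n",
    ( utf8::is_utf8($from_readline) ? 1 : 0 ),
    ( utf8::is_utf8($arg)           ? 1 : 0 );
printf "same string: %s, length: %d\n",
    ( $arg eq $from_readline ? 'yes' : 'no' ), length $arg;

# Contrast: encoding yields a different string (two bytes, not one character).
my $bytes = encode_utf8($from_readline);
printf "encoded length: %d, equal to original: %s\n",
    length $bytes, ( $bytes eq $from_readline ? 'yes' : 'no' );
```

In the question's script, the equivalent step would be calling utf8::upgrade($arg) on the readline result before binding it to the statement. Whether that makes the REGEXP query match depends on the driver actually mishandling unflagged strings, which is an assumption here.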
Upvotes: 2