Reputation: 358
In my application, we are using Spreadsheet::Read to read an Excel file, perform some tasks on the rows, and finally add them to a database.
The file we are importing is an Excel file (.XLSX). It is a glossary that supports different user languages.
The problem I am facing is that some rows/columns contain cells with special characters, which are not being decoded correctly.
For example, if I have a Spanish Excel file:
Información de cuenta => Informaci\x{f3}n de cuenta
Página de consola de administración de curso => P\x{e1}gina de consola de administraci\x{f3}n de curso
Informaci\x{f3}n de cuenta is added to the database, and when it is fetched, extraneous characters are displayed in the UI.
I tried this solution, but it is not working. It is basically a hack of Spreadsheet::Read:
use Text::Iconv;
package Spreadsheet::XLSX;
sub new {
my $converter = Text::Iconv->new("ASCII","utf-8");
return __PACKAGE__->SUPER::new(@_, $converter);
}
Please suggest what is wrong, or whether there is a better solution.
Upvotes: 2
Views: 1621
Reputation: 39158
Spreadsheet::Read returns strings as octets encoded in Latin-1. To turn them into Perl characters, use the Encode module. Read the introduction to the topic of encoding in Perl.
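use strict;
use warnings;
use Encode qw(decode);
use Spreadsheet::Read qw(ReadData);
# Parse the workbook; $ref->[1] is the first sheet.
my $ref = ReadData('spanish.xls');
# Decode the Latin-1 octets of cell A1 into Perl characters.
my $characters = decode('Latin-1', $ref->[1]{A1});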
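Decoding is only half of the round trip: the characters have to be encoded again before they leave the program, otherwise the database or UI may still show extraneous characters. Here is a minimal sketch, assuming the database and the UI expect UTF-8 (the question does not say). The string literal is a hypothetical stand-in for a cell value, so the snippet runs without a spreadsheet, and 'ISO-8859-1' is the canonical name for Latin-1.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for a cell value handed back as Latin-1 octets
# (assumption: \xf3 is the single Latin-1 byte for "o with acute").
my $octets = "Informaci\xf3n de cuenta";

# Octets -> Perl characters.
my $characters = decode('ISO-8859-1', $octets);

# Perl characters -> UTF-8 octets, assuming the target column or page is UTF-8.
my $utf8 = encode('UTF-8', $characters);

printf "%d characters, %d UTF-8 bytes\n", length($characters), length($utf8);
```

If the database driver can do the encoding itself, pass it $characters instead of $utf8 so the text is not encoded twice.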
Upvotes: 3