Celdor

Reputation: 2607

SQLite converts all Unicode characters into ANSI

I have a slight problem with SQLite and its text encoding. I read in the documentation that SQLite handles UTF-8, and that the encoding can be set by executing PRAGMA encoding = "UTF-8";. The database needs to store Polish text and is going to be used with Qt later on.

I have a script with two commands: CREATE TABLE ... and INSERT INTO ... The file is encoded in UTF-8. When I run sqlite3 myname.db < the_file.sql from the command line, the database and the table are both created, but all Polish-specific characters such as ą, ć, ź, Ż are automatically converted into their plain ANSI equivalents: a, c, z, Z.

I thought the command line might be the problem, so I downloaded SQLite Manager 2009. When I copied and pasted the whole script into SQLite Manager to execute it, the effect was the same: the characters are converted automatically during the copy and paste.

Is SQLite limited to ANSI characters only?

Upvotes: 4

Views: 6904

Answers (2)

alt.126

Reputation: 1107

Try replacing your sqlite3.exe file. Sometimes it gets corrupted and then outputs only malformed characters (ANSI ones where they should be UTF-8).

Download a fresh copy of the latest version from: https://www.sqlite.org/download.html

Upvotes: -2

mvp

Reputation: 116187

If there is anything wrong in your setup, it is certainly NOT SQLite.

A few simple tests:

Linux:

$ cat > test.sql <<EOF
DROP TABLE IF EXISTS t;
CREATE TABLE t (str varchar(20));
INSERT INTO t (str) VALUES ('ą, ć, ź, Ż');
SELECT * FROM t;
EOF

$ file test.sql
test.sql: UTF-8 Unicode text

$ sqlite3 test.db < test.sql
ą, ć, ź, Ż

So it works just as the doctor ordered.
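To take the terminal out of the picture entirely, the stored bytes can be inspected directly. The following is only a minimal sketch, assuming Python 3 with its bundled sqlite3 module and an in-memory database instead of test.db; it inserts the same string and compares what SQLite reports through hex() with the UTF-8 encoding of the original text.

```python
import sqlite3

text = "ą, ć, ź, Ż"

# Assumption: an in-memory database stands in for test.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (str varchar(20))")
conn.execute("INSERT INTO t (str) VALUES (?)", (text,))

# PRAGMA encoding reports the text encoding used by the database.
encoding = conn.execute("PRAGMA encoding").fetchone()[0]

# hex(str) returns the stored bytes of the value as uppercase hex digits.
stored_text, stored_hex = conn.execute("SELECT str, hex(str) FROM t").fetchone()
conn.close()

print(encoding)
print(stored_hex)
print(stored_hex == text.encode("utf-8").hex().upper())
print(stored_text == text)
```

If the last two lines print True, the Polish characters were stored as UTF-8 and read back unchanged, so any substitution seen elsewhere happens before the text reaches SQLite or after it leaves.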

Windows:

Use the same test.sql as above. If you need to create it anew, copy and paste the following text:

DROP TABLE IF EXISTS t;
CREATE TABLE t (str varchar(20));
INSERT INTO t (str) VALUES ('ą, ć, ź, Ż');
SELECT * FROM t;

into Notepad++ and save it as a file with Encoding -> Encode in UTF-8 without BOM.

sqlite3 test.db < test.sql
─Е, ─З, ┼║, ┼╗

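This looks bad, but the culprit is the Windows console, not SQLite: the output is correct UTF-8 that the console displays using a different code page. Save the output to a file instead: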

sqlite3 test.db < test.sql > out.txt

Open out.txt in Notepad++ and it looks fine: ą, ć, ź, Ż
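The garbling itself can be reproduced without SQLite at all. This is a small sketch under one assumption: that the console was using OEM code page 866, which is inferred from the box-drawing and Cyrillic characters in the output above and is not something the console reported. It takes the UTF-8 bytes of the test string, decodes them the way such a console would, and then reverses the process.

```python
original = "ą, ć, ź, Ż"

# The bytes sqlite3 writes to standard output for a UTF-8 database.
utf8_bytes = original.encode("utf-8")

# Assumption: the console interprets those bytes as code page 866.
garbled = utf8_bytes.decode("cp866")

# Reversing the wrong decoding recovers the text, so no data was lost.
recovered = garbled.encode("cp866").decode("utf-8")

print(garbled)
print(recovered == original)
```

Each Polish letter is two bytes in UTF-8, so it shows up as two unrelated single-byte characters. Because the mapping is reversible, the data in the database is intact and only the display is wrong.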

EDIT: It works in the Windows console as well if you first switch it to the UTF-8 code page with chcp 65001:

chcp 65001
sqlite3 test.db < test.sql
ą, ć, ź, Ż

QED.

Upvotes: 4
