Jov

Reputation: 93

Postgres - ERROR: invalid byte sequence for encoding "UTF8": 0xca 0x2d

I'm trying to import a massive txt file into Postgres. When I ran the following command:

\COPY denton_2018_rawdata FROM 'C:\Users\testu\Downloads\denton_county\2018-website-all-property\2018-08-28_005183_APPRAISAL_INFO.txt' delimiter E'\x01'

I got the following error:

ERROR: invalid byte sequence for encoding "UTF8": 0xca 0x2d
CONTEXT: COPY denton_2018_rawdata, line 22769: "000000027205R 02018000000000000 ..."

So I tried the following command (adding ENCODING 'WINDOWS-1252'):

\COPY denton_2018_rawdata FROM 'C:\Users\testu\Downloads\denton_county\2018-website-all-property\2018-08-28_005183_APPRAISAL_INFO.txt' delimiter E'\x01' ENCODING 'WINDOWS-1252';

But I still got the same error. Could anyone help, please?

Upvotes: 0

Views: 1266

Answers (1)

Pavel Stehule

Reputation: 45910

PostgreSQL is very strict about UTF8 encoding. This is because of possible SQL injection attacks based on invalid UTF8 characters. First, you have to know what the source encoding is. Second, you should eliminate all broken characters before importing into Postgres.

There are some applications that can do this work, such as iconv.
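
If iconv is not at hand (for example on Windows), the same two steps can be sketched in Python. This is only a rough illustration, not a definitive fix: the function names are made up for this example, and cp1252 (Python's name for WINDOWS-1252) is just the guess from the question, not a confirmed source encoding. The first function lists the lines that are not valid UTF8 so you can look at the raw bytes; the second rewrites the file as UTF8, replacing anything that cannot be decoded.

```python
def find_invalid_utf8_lines(path, limit=10):
    """Return up to `limit` (line_number, raw_bytes) pairs for lines
    that cannot be decoded as UTF-8. Hypothetical helper for inspection."""
    bad = []
    with open(path, "rb") as f:
        for number, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                bad.append((number, raw))
                if len(bad) >= limit:
                    break
    return bad


def reencode_to_utf8(src_path, dst_path, source_encoding="cp1252"):
    """Copy src_path to dst_path, decoding with source_encoding and
    writing UTF-8. source_encoding is an assumption; verify it first.
    Bytes that are undefined in the source encoding become U+FFFD."""
    # newline="" keeps the original line endings untouched, and control
    # bytes such as the \x01 delimiter pass through unchanged.
    with open(src_path, "r", encoding=source_encoding,
              errors="replace", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

You would call find_invalid_utf8_lines on the original file to see what the offending bytes look like, then run reencode_to_utf8 and point \COPY at the converted file without an ENCODING option (assuming the client encoding is UTF8). If the replaced characters turn out to matter, the source encoding guess is probably wrong.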

Upvotes: 1
