How to parse CSV with pgloader or PostgreSQL copy when there's a \n (line break) inside double quotes?

Question

I have a CSV line like this:

"aaa"|"bbb"|"ccc"|"dddd
eeeee"

I want to import this data using pgloader (http://pgloader.io/) or PostgreSQL COPY (https://www.postgresql.org/docs/9.6/static/sql-copy.html). My issue is that, according to the CSV standard, a line break (\n) is allowed inside a quoted field value, but both pgloader and the COPY command treat it as the start of a new row of data instead of as one column with a line break inside.
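As a quick way to check that the file itself is well-formed, the sample can be run through a parser that follows the quoting rule described above. This is only a sketch using Python's standard csv module as a stand-in; it is not pgloader or COPY, and the helper name read_records is made up for illustration.

```python
import csv
import io

# Sample data copied from the question: one record, four fields,
# with a line break inside the last quoted field.
SAMPLE = '"aaa"|"bbb"|"ccc"|"dddd\neeeee"\n'


def read_records(text):
    """Parse pipe-delimited, double-quoted CSV text into a list of rows.

    read_records is a hypothetical helper name, not part of any tool
    mentioned in the question.
    """
    # newline='' keeps the line breaks untranslated so the csv module
    # can decide which ones sit inside a quoted field.
    stream = io.StringIO(text, newline='')
    return list(csv.reader(stream, delimiter='|', quotechar='"'))


rows = read_records(SAMPLE)
print(len(rows), rows[0])
```

If the parser reports a single row with four fields, the data is valid CSV and the line break belongs to the last column.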

My COPY command:

COPY schema.table (
    col_aaa,
    col_bbb,
    col_ccc,
    col_ddd
)
    FROM 'file.csv' WITH DELIMITER '|' ENCODING 'LATIN1' CSV;

My pgloader command:

LOAD CSV
FROM 'file.csv' 
INTO postgresql://user:password@host:5432/database?schema.table (col_aaa, col_bbb, col_ccc, col_ddd)

WITH   
skip header = 0,
fields optionally enclosed by '"',
fields escaped by double-quote,
fields terminated by '|'  

SET client_encoding to 'latin1'
BEFORE LOAD DO
    $$ TRUNCATE anac.aerodromos_csv RESTART IDENTITY; $$;

I've searched a lot in the PostgreSQL docs and also on Google.

The only thing I found was this: "Parsing CSV file with \n in double quoted fields", but awk is too slow for a file with more than 1 million lines.

Any hint about how to do it?

My preference is for pgloader, but I can accept using sed or perl to preprocess the file with a regex in a Linux shell script.

Any clues about how to do it?

Jonathas Pacífico · Accepted Answer

The problem was that at the end of line there was an . And to solve that I removed the only from the line break inside the data.

That way pgloader was able to do the job.

I did that with a Perl one-liner:

perl -0777 -pi -e 's/(?<="[^"|])*(?



Perl command line multi-line replace
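The idea in this answer is to change only the line breaks that fall inside the quoted data and leave the row terminators alone. A rough sketch of the same idea follows, swapping the Perl regex for Python's csv module so the quoting is handled by a parser. The function name flatten_line_breaks, the output file name, and the choice of a space as the replacement are all assumptions made for illustration. Unlike perl -pi, it writes a new file instead of editing in place.

```python
import csv


def flatten_line_breaks(src_path, dst_path, encoding="latin-1", replacement=" "):
    """Copy a pipe-delimited CSV, replacing line breaks inside field values.

    Hypothetical helper: the name, the default encoding (taken from the
    LATIN1 setting in the question) and the replacement character are
    illustrative choices, not something pgloader or COPY requires.
    """
    # newline="" leaves line endings untouched so the csv module can tell
    # a row terminator from a line break inside a quoted field.
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src, delimiter="|", quotechar='"')
        writer = csv.writer(dst, delimiter="|", quotechar='"',
                            quoting=csv.QUOTE_ALL, lineterminator="\n")
        # Rows are streamed one at a time, so memory use stays flat
        # regardless of how many lines the file has.
        for row in reader:
            writer.writerow([
                field.replace("\r\n", replacement)
                     .replace("\n", replacement)
                     .replace("\r", replacement)
                for field in row
            ])


# Example call (file names are placeholders):
# flatten_line_breaks("file.csv", "file_flat.csv")
```

After this pass every record sits on a single physical line, so a loader that splits on line breaks before parsing quotes should see the right number of rows. The trade-off is that the original line break inside the value is lost and every field comes out quoted.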
