Reputation: 10969
Example CSV line:
"2012","Test User","ABC","First","71.0","","","0","0","3","3","0","0","","0","","","","","0.1","","4.0","0.1","4.2","80.8","847"
Every value after "First" belongs to a numeric column. Lots of them are NULL, written as quoted empty strings ("").
Attempt at COPY:
copy mytable from 'myfile.csv' with csv header quote '"';
NOPE: ERROR: invalid input syntax for type numeric: ""
Well, yeah. It's a null value. Attempt 2 at COPY:
copy mytable from 'myfile.csv' with csv header quote '"' null '""';
NOPE: ERROR: CSV quote character must not appear in the NULL specification
What's a fella to do? Strip out all double quotes from the file before running COPY? Can do that, but I figured there's a proper solution to what must be an incredibly common problem.
Upvotes: 25
Views: 60180
Reputation: 95
This worked for me in Python 3.8.X
import csv
from io import StringIO

import psycopg2

# t_host, t_port, t_dbname, t_user and t_pw are your own connection settings.
db_conn = psycopg2.connect(host=t_host, port=t_port,
                           dbname=t_dbname, user=t_user, password=t_pw)
cur = db_conn.cursor()

csv.register_dialect('myDialect',
                     delimiter=',',
                     skipinitialspace=True,
                     quoting=csv.QUOTE_MINIMAL)

with open('files/emp.csv') as f:
    next(f)  # skip the header line
    reader = csv.reader(f, dialect='myDialect')
    # Re-write the rows into an in-memory buffer with minimal quoting.
    buffer = StringIO()
    writer = csv.writer(buffer, dialect='myDialect')
    writer.writerows(reader)

buffer.seek(0)  # rewind so copy_from reads from the start
cur.copy_from(buffer, 'personnes', sep=',', columns=('nom', 'prenom', 'telephone', 'email'))
db_conn.commit()
Upvotes: 0
Reputation: 161
COPY mytable from '/dir/myfile.csv' DELIMITER ',' NULL ''
WITH CSV HEADER FORCE QUOTE *;
Upvotes: 3
Reputation: 5058
As an alternative, using
sed 's/""//g' myfile.csv > myfile-formatted.csv
psql
# copy mytable from 'myfile-formatted.csv' with csv header;
works as well.
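One caveat with the sed pattern: it removes every "" in the file, including the doubled quotes that CSV uses to escape a quote inside a quoted field. A minimal Python sketch of the same clean-up that goes through a CSV parser instead is shown here. The function name unquote_empty_fields and the sample rows are made up for illustration. Note that after this rewrite an empty text field is also written unquoted, so COPY with its default CSV settings will load it as NULL too.

```python
import csv
import io


def unquote_empty_fields(src, dst):
    """Copy CSV rows from src to dst, writing empty fields without quotes."""
    reader = csv.reader(src)
    # QUOTE_MINIMAL quotes a field only when it contains the delimiter, a
    # quote character or a line break, so an empty string is written bare.
    writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# Usage with in-memory text; open real files with newline="" for the csv module.
sample = io.StringIO('"2012","Test User","","0.1"\n"2013","Say ""hi""","4.0",""\n')
cleaned = io.StringIO()
unquote_empty_fields(sample, cleaned)
print(cleaned.getvalue())
```

The cleaned file can then be loaded with the plain "copy mytable from ... with csv header" form, since an unquoted empty field is the default NULL marker in CSV mode.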
Upvotes: 7
Reputation: 59
I think all you need to do here is the following:
COPY mytable from '/dir/myfile.csv' DELIMITER ',' NULL '' WITH CSV HEADER QUOTE ;
Upvotes: 5
Reputation: 19471
While some database products treat an empty string as a NULL value, the SQL standard says that they are distinct, and PostgreSQL treats them as distinct.
It would be best if you could generate your CSV file with an unambiguous representation. While you could use sed or something to filter the file into a good format, the other option would be to COPY the data into a table where a text column could accept the empty strings, and then populate the target table. The NULLIF function may help with that: http://www.postgresql.org/docs/9.1/interactive/functions-conditional.html#FUNCTIONS-NULLIF -- it will return NULL if both arguments match and the first value if they don't. So, something like NULLIF(txtcol, '')::numeric might work for you.
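To make the staging-table idea concrete, here is a rough Python sketch that only builds the three SQL statements involved: create an all-text staging table, COPY the CSV into it, then insert into the real table with NULLIF casts. The helper name staging_load_sql, the staging table name, and the column names are all hypothetical, since the question does not show the table definition. The helper does no identifier or path escaping, so treat it as an illustration rather than something to feed untrusted input.

```python
def staging_load_sql(target, staging, csv_path, columns):
    """Build SQL for loading a CSV through an all-text staging table.

    columns is a list of (name, sql_type) pairs in CSV column order.
    """
    names = [name for name, _ in columns]
    # Every staging column is text, so quoted empty strings load without error.
    create = "CREATE TEMP TABLE {} ({});".format(
        staging, ", ".join(f"{name} text" for name in names))
    copy = f"COPY {staging} FROM '{csv_path}' WITH (FORMAT csv, HEADER);"
    # Text columns pass through; other types turn '' into NULL before the cast.
    exprs = [name if sql_type == "text" else f"NULLIF({name}, '')::{sql_type}"
             for name, sql_type in columns]
    insert = "INSERT INTO {} ({}) SELECT {} FROM {};".format(
        target, ", ".join(names), ", ".join(exprs), staging)
    return [create, copy, insert]


# Hypothetical column names; substitute the real ones from mytable.
for stmt in staging_load_sql(
        "mytable", "mytable_staging", "/dir/myfile.csv",
        [("season", "text"), ("player", "text"), ("score", "numeric")]):
    print(stmt)
```

The printed statements can be run in psql in order. The staging table is temporary, so it disappears at the end of the session.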
Upvotes: 13