Cameron
Cameron

Reputation: 31

Copying Data from .csv into Vertica Table

I am attempting to copy data from a flat file into a Vertica Table. My data is ENCLOSED BY '^' and is using a DELIMITER '|'. In case the markdown translates that funny, by data is enclosed by a carat, and delimited by a vertical pipe.

Here is the CREATE TABLE table statement that was used to create the table:

CREATE TABLE SCHEMA.TEST_TABLE ( UNIQUE_ID float , TEST_DT_TM timestamptz , TEST2_DT_TM timestamptz ) UNSEGMENTED ALL NODES;

And here is the copy statement that I am using:

COPY SCHEMA.TEST_TABLE ( UNIQUE_ID , TEST_DT_TM , TEST2_DT_TM ) from local 'file_name.csv' ENCLOSED BY '^' DELIMITER '|' SKIP 1 ABORT ON ERROR TRAILING NULLCOLS;

I'm getting the following error:

ERROR: COPY: Input record 5754 has been rejected (Invalid timestamptz format '' for column 4 (TEST2_DT_TM).Invalid input syntax for timestamptz: "")

Record 5754, in its raw format, looks something like this...

123|^04/09/17 12:23:33^|^^|

In the database this flat file is being extracted from, TEST2_DT_TM (the field writing out as ^^) is null. However, for some reason, Vertica isn't wanting to accept this into the timestamptz field I have defined for TEST2_DT_TM in the create table statement above. I am fairly certain this is the issue, because once I manually removed the two carets from the file, so that it just read.. || .. the record was then accepted.

I also tried adding in combinations of NULL AS '' OR NULL AS ' ' in my copy statement in the event that the field was being translated into an empty string instead of a true null, but that didn't seem to work either.

Does anybody have any explanation for this behavior? I've used this combination of enclosed by + delimiter combination when copying .csv's into tables several times before, without any issue.

I know there are probably alternatives/workarounds to get this into the table, but after looking into this for a while, I'm really interested in understanding the root cause.

Let me know if there is any other information I can provide to better assist in communicating my questions.

Upvotes: 3

Views: 3540

Answers (1)

marcothesane
marcothesane

Reputation: 6721

You're stumbling into the ever-repeating discussion: an empty string (^^ in your example using carets as string delimiters) is not a NULL value. And timestamps are not strings, so an empty string is indeed a wrong literal for a timestampz.

Try with no value at all for the null, and specify the NULL literal, by adding the NULL option to your COPY command.

Copy command:

COPY test_table (
  UNIQUE_ID
, TEST_DT_TM
, TEST2_DT_TM
)
FROM LOCAL 'copyemptytz.csv' 
ENCLOSED BY '^' 
DELIMITER '|' 
NULL '' 
SKIP 1 
ABORT ON ERROR 
TRAILING NULLCOLS;

Test data: (last line is your problem line)

unique_id|test_dt_tm|test2_dt_tm|
123|^04/09/17 12:23:33^|^04/09/17 12:23:33^|
123|^04/09/17 12:23:33^|^04/09/17 12:23:33^|
123|^04/09/17 12:23:33^|^04/09/17 12:23:33^|
123|^04/09/17 12:23:33^|^04/09/17 12:23:33^|
123|^04/09/17 12:23:33^|^04/09/17 12:23:33^|
123|^04/09/17 12:23:33^|^04/09/17 12:23:33^|
123|^04/09/17 12:23:33^|^04/09/17 12:23:33^|
123|^04/09/17 12:23:33^||

Upvotes: 1

Related Questions