r s
r s

Reputation: 115

Parse error with a simple CSV import into Clickhouse

I have an extremely simple CSV I'm trying to import into clickhouse with no success. The create table statement is:

CREATE TABLE staging.EloLBK
(
    `Month` DateTime64(3),
    `1958` Int32,
    `1959` Int32,
    `1960` Int32
)
ENGINE = MergeTree
PRIMARY KEY Month
ORDER BY Month
SETTINGS index_granularity = 8192

The CSV data looks like this:

"Month", "1958", "1959", "1960"
"JAN",  340,  360,  417
"FEB",  318,  342,  391
"MAR",  362,  406,  419
"APR",  348,  396,  461
"MAY",  363,  420,  472
"JUN",  435,  472,  535
"JUL",  491,  548,  622
"AUG",  505,  559,  606
"SEP",  404,  463,  508
"OCT",  359,  407,  461
"NOV",  310,  362,  390
"DEC",  337,  405,  432

My import statement is:

INSERT INTO EloLBK SELECT * FROM file('EloLBK/*.csv', 'CSVWithNames', '"Month" datetime64, "1958" integer, "1959" integer, "1960" integer')

The error coming back from clickhouse is:

Code: 27. DB::Exception: Cannot parse input: expected '"' before: '417\n"FEB",  318,  342,  391\n"MAR",  362,  406,  419\n"APR",  348,  396,  461\n"MAY",  363,  420,  472\n"JUN",  435,  472,  535\n"JUL",  491,  548,  622\n"AUG",  505,':
Row 1:
Column 0,   name: Month, type: DateTime64(3), parsed text: "<DOUBLE QUOTE>JAN<DOUBLE QUOTE>,  340,  360,  "ERROR
Code: 27. DB::ParsingException: Cannot parse input: expected '"' before: '417\n"FEB",  318,  342,  391\n"MAR",  362,  406,  419\n"APR",  348,  396,  461\n"MAY",  363,  420,  472\n"JUN",  435,  472,  535\n"JUL",  491,  548,  622\n"AUG",  505,'. (CANNOT_PARSE_INPUT_ASSERTION_FAILED) (version 21.12.1.8928 (official build))

: While executing CSVRowInputFormat: While executing File. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)

I'm not sure how to resolve this so any advice would be appreciated!

Upvotes: 2

Views: 2249

Answers (2)

Algunenano
Algunenano

Reputation: 134

You can tell ClickHouse to do a best effort guess by passing the option date_time_input_format='best_effort', for example:

INSERT INTO EloLBK SELECT * FROM file('EloLBK/*.csv', 'CSVWithNames', '"Month" datetime64, "1958" integer, "1959" integer, "1960" integer') settings date_time_input_format='best_effort';

Will lead to:

┌───────────────────Month─┬─1958─┬─1959─┬─1960─┐
│ 2000-01-01 00:00:00.000 │  340 │  360 │  417 │
│ 2000-02-01 00:00:00.000 │  318 │  342 │  391 │
│ 2000-03-01 00:00:00.000 │  362 │  406 │  419 │
│ 2000-04-01 00:00:00.000 │  348 │  396 │  461 │
│ 2000-05-01 00:00:00.000 │  363 │  420 │  472 │
│ 2000-06-01 00:00:00.000 │  435 │  472 │  535 │
│ 2000-07-01 00:00:00.000 │  491 │  548 │  622 │
│ 2000-08-01 00:00:00.000 │  505 │  559 │  606 │
│ 2000-09-01 00:00:00.000 │  404 │  463 │  508 │
│ 2000-10-01 00:00:00.000 │  359 │  407 │  461 │
│ 2000-11-01 00:00:00.000 │  310 │  362 │  390 │
│ 2000-12-01 00:00:00.000 │  337 │  405 │  432 │
└─────────────────────────┴──────┴──────┴──────┘

Upvotes: 2

r s
r s

Reputation: 115

Ok, after some more messing around, it appears the error message is just a bit misleading. The actual issue is that clickhouse is (understandably) unable to parse the month as a datetime.

The following CSV input works fine:

"Month","1958","1959","1960"
"1970-01-01T00:00:00","340","360","417"
"1970-01-02T00:00:00","318","342","391"

Upvotes: 1

Related Questions