Arthur J.
Arthur J.

Reputation: 11

How to import a CSV file with a column with YYYYMMDD to a DATE column in ClickHouse

I´d like to import a CSV file with a column with YYYYMMDD to a DATE column in ClickHouse. Sample of CSV content: 20191231. I realized that ClickHouse only import a DATE column in CSV if it in the format as is YYYY-MM-DD (only with "-" between the year, month and day). I can not change the format in the CSV file that is stored in a http server. Please let me know an easy way or command to import hundred CSV files with YYYYMMDD from a http server or where can I change the configuration in my clickhouse server to accept YYYYMMDD format instead of YYYY-MM-DD?

I already added a new column, imported all files and updated the table. See the query below. The field 'olddatefield' is a STRING and 'newdatefield' is a DATE field.

ALTER TABLE test ADD COLUMN newdatefield Date AFTER olddatefield;

ALTER TABLE test
UPDATE newdatefield = toDate(concat(substring(olddatefield, 1, 4), '-' , substring(olddatefield, 5, 2), '-' , substring(olddatefield, 7, 2)))
WHERE olddatefield = '20191231';

It worked very well, but there are thousands files with different DATES to import and I dont want edit thousands of 'alter table' queries.

Upvotes: 1

Views: 3093

Answers (1)

vladimir
vladimir

Reputation: 15208

It needs to rely on parseDateTimeBestEffort function:

SELECT toDate(parseDateTimeBestEffort('20191231')) AS date
/*
┌───────date─┐
│ 2019-12-31 │
└────────────┘
*/

And then using input-table function need to change a data upload way to fix related field(s):

cat data.csv \
  | clickhouse-client --query="INSERT INTO test SELECT toDate(parseDateTimeBestEffort(date)) AS date, id FROM input('date String, id Int32') FORMAT CSV";

data.csv

20191231, 1
20200101, 2

Upvotes: 3

Related Questions