Reputation: 11
I'd like to import a CSV file that has a column in YYYYMMDD format (sample CSV content: 20191231) into a DATE column in ClickHouse. I realized that ClickHouse only imports a DATE column from CSV if the value is in the YYYY-MM-DD format (with "-" between the year, month and day). I cannot change the format in the CSV files, which are stored on an HTTP server.
Is there an easy way or command to import hundreds of CSV files with YYYYMMDD dates from an HTTP server, or a configuration setting on my ClickHouse server to make it accept the YYYYMMDD format instead of YYYY-MM-DD?
As a workaround, I already added a new column, imported all the files, and updated the table. See the query below. The field 'olddatefield' is a String and 'newdatefield' is a Date field.
ALTER TABLE test ADD COLUMN newdatefield Date AFTER olddatefield;
ALTER TABLE test
UPDATE newdatefield = toDate(concat(substring(olddatefield, 1, 4), '-' , substring(olddatefield, 5, 2), '-' , substring(olddatefield, 7, 2)))
WHERE olddatefield = '20191231';
It worked very well, but there are thousands of files with different dates to import, and I don't want to edit thousands of 'ALTER TABLE' queries.
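For reference, dropping the per-date WHERE clause would convert every row in a single mutation; a sketch, assuming olddatefield always holds an 8-digit YYYYMMDD string:

```sql
-- One-shot variant of the per-date UPDATE above:
-- convert every non-empty olddatefield in a single mutation.
ALTER TABLE test
UPDATE newdatefield = toDate(concat(substring(olddatefield, 1, 4), '-',
                                    substring(olddatefield, 5, 2), '-',
                                    substring(olddatefield, 7, 2)))
WHERE olddatefield != '';
```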
Upvotes: 1
Views: 3093
Reputation: 15208
You can rely on the parseDateTimeBestEffort function:
SELECT toDate(parseDateTimeBestEffort('20191231')) AS date
/*
┌───────date─┐
│ 2019-12-31 │
└────────────┘
*/
Then, using the input table function, change the way data is uploaded so that the related field(s) are fixed on insert:
cat data.csv \
| clickhouse-client --query="INSERT INTO test SELECT toDate(parseDateTimeBestEffort(date)) AS date, id FROM input('date String, id Int32') FORMAT CSV";
data.csv
20191231, 1
20200101, 2
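Since the source files live on an HTTP server, the same conversion can also be done without downloading the files first, by letting ClickHouse fetch them via the url table function. A sketch, where the URL is a placeholder and the structure matches the sample data.csv above:

```sql
-- Hypothetical URL; 'date String, id Int32' describes the CSV columns.
INSERT INTO test
SELECT toDate(parseDateTimeBestEffort(date)) AS date, id
FROM url('http://example.com/data.csv', CSV, 'date String, id Int32');
```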
Upvotes: 3