Reputation: 123
I have a problem creating an external table in AWS Athena. I have over 1000 CSV files, all with a header and a footer, and I would like to create an Athena table to visualize and analyze all the data together.
I tried the following code, but the property that should remove the footer does not seem to work:
CREATE EXTERNAL TABLE test.multi_file_test(
`value1` string COMMENT '',
`value2` string COMMENT '',
`value3` string COMMENT '')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\;'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://s3_path'
TBLPROPERTIES (
'areColumnsQuoted'='false',
'skip.header.line.count'='1',
'skip.footer.line.count'='1') -- this property does not seem to work
I get this result:
value1 | value2 | value3 |
---|---|---|
from_file1 | A | 1 |
from_file1 | B | 1 |
footer_file1 | | |
from_file2 | A | 2 |
from_file2 | B | 2 |
footer_file2 | | |
from_file3 | A | 3 |
from_file3 | B | 3 |
footer_file3 | | |
But I need to get this result:
value1 | value2 | value3 |
---|---|---|
from_file1 | A | 1 |
from_file1 | B | 1 |
from_file2 | A | 2 |
from_file2 | B | 2 |
from_file3 | A | 3 |
from_file3 | B | 3 |
Any suggestion or solution would be appreciated.
Thank you all.
Upvotes: 2
Views: 426
Reputation: 5124
If you are using Athena engine version 1, this will not work: it is based on Presto 0.172, whereas support for the property 'skip.footer.line.count'='1'
was added in Presto 0.199. You have to switch to Athena engine version 2, which is based on Presto 0.217, for it to work properly.
I have tested this in version 2 and it works there. Refer to the Athena documentation for how to change engine versions.
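To confirm the result after switching engine versions, or as a stopgap if you cannot switch, something like the queries below may help. This is only a sketch: it assumes footer rows can be recognized because value2 and value3 are both empty or NULL, as in the sample output in the question. If real data rows can also have both columns empty, a different filter is needed.

```sql
-- Check: count rows that still look like footers.
-- Assumption: footer rows carry no value2/value3 (empty string or NULL),
-- as in the sample output from the question.
SELECT count(*) AS suspected_footer_rows
FROM test.multi_file_test
WHERE (value2 IS NULL OR value2 = '')
  AND (value3 IS NULL OR value3 = '');

-- Stopgap: filter the suspected footer rows out at query time.
SELECT value1, value2, value3
FROM test.multi_file_test
WHERE NOT ((value2 IS NULL OR value2 = '')
       AND (value3 IS NULL OR value3 = ''));
```

If the first query returns 0 on engine version 2, the table property is being applied and the filter in the second query is no longer needed.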
Upvotes: 2