Reputation: 983
I am trying to create an Athena
table from shell scrip. This is my code
aws athena start-query-execution \
--query-string "CREATE EXTERNAL TABLE <table name>( `user_id` string, `file_name` string, `file_type` string, `count` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '<s3 location>/"{$client_id}/ \
--work-group "primary" \
--query-execution-context Database=<dbname>\
--result-configuration "OutputLocation="<s3 location>"
When i am executing the query it shows below error
start.sh: 1: start.sh: user_id: not found
start.sh: 1: start.sh: file_name: not found
start.sh: 1: start.sh: file_type: not found
start.sh: 1: start.sh: count: not found
An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'
generating report
{
"QueryExecutionId": "4c5b47cc-a87b-4d9a-8048-537a12534b0c"
}
Above Query is working fine in Athena console
Edit 1
Removed backtick now the query like this
aws athena start-query-execution \
--query-string "CREATE EXTERNAL TABLE <table name>( user_id string, file_name string, file_type string, count string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '<s3 location>/{$client_id}/'" \
--work-group "primary" \
--query-execution-context Database=<dbname>\
--result-configuration "OutputLocation="<s3 location>"
Now am getting a new error
An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'
Edit 2 During the script execution back slash was removed ,that this the issue. while running am getting like this
SERDEPROPERTIES ( 'escapeChar'='\', 'quoteChar'='"', 'separatorChar'=',')
Actual expectation will be like this
SERDEPROPERTIES ( 'escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',')
Does any way to resolve this?
Upvotes: 0
Views: 2060
Reputation: 132862
In your shell, backticks are command substitutions – or in other words: if you put something in backticks it's executed as a command and whatever it prints out is put into the string. Your shell does this in double quoted strings, but most shells won't do it in single quoted strings.
I understand you're using double quotes since the SQL contains single quoted strings. You can either surround the SQL with single quotes and escape all the single quotes inside, or you can escape the backticks.
However, the backticks are optional in Athena DDL, so the easiest way forward is to just remove them. Unless one of your columns is called "date" or "table", or any other reserved word you don't need the backticks.
A completely different approach would be to use the Glue Data Catalog API, this is what Athena does behind the scenes. It's a bit verbose, but so is creating DDL statements.
Upvotes: 1