Reputation: 810
I am trying to use boto3
, v. 1.7.4, to interact with AWS Athena through the following script:
import boto3
import botocore
# Test access to the input bucket
bucket = boto3.resource('s3').Bucket('s3_input')
print(list(bucket.objects.all())
client = boto3.client('athena', region_name='us-east-1')
# Create a new database
db_query = 'CREATE DATABASE IF NOT EXISTS france;'
response = client.start_query_execution(
QueryString=db_query,
ResultConfiguration={'OutputLocation': 's3_output'})
# Create a new table
table_query = '''
CREATE EXTERNAL TABLE IF NOT EXISTS france.by_script (`content` string )
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('separatorChar' = ',')
LOCATION 's3_input';'''
response = client.start_query_execution(
QueryString=table_query,
ResultConfiguration={'s3_output'},
QueryExecutionContext={'Database': 'france'})
With the current permissions of my account, the test to read the content of s3_input
works well. I can also create the database through the db_query
but the table creation fails with the following error message:
Your query has the following errors:FAILED: Execution Error, return
code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:Got exception: java.io.IOException
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS
Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
AccessDenied; Request ID: [...]), S3 Extended Request ID: [...])
If I run the table_query
command from the console, console.aws.amazon.com/athena/home, using the same account, there is no problem and the table is properly created.
The permissions are
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": "s3:GetObject",
"Resource": "s3_input"
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"s3:HeadBucket"
],
"Resource": "*"
}
]
}
I would be happy to understand what I am missing here. Thanks in advance.
Upvotes: 3
Views: 4447
Reputation: 1977
I ran into the same problem as above, but in addition to the permissions mentioned by Flavien in the answer above my process (a Lambda function) needed to add also s3:PutObject and s3:AbortMultipartUpload.
Athena apparently creates objects named like folderName_$Folder$ in the SOURCE data folders, so it needs to have PutObject permission to that (not just read-only). Don't ask me why the AbortMultipartUpload is needed... but it comes straight from Athena docs at https://docs.aws.amazon.com/athena/latest/ug/access.html
The entire statement for your IAM policy looks like this:
{
"Effect": "Allow",
"Action": [
"s3:GetBucketLocation",
"s3:GetObject",
"s3:ListBucket",
"s3:ListBucketMultipartUploads",
"s3:ListMultipartUploadParts",
"s3:AbortMultipartUpload",
"s3:CreateBucket",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::your-source-data-bucket-name*"
]
}
Upvotes: 1
Reputation: 3098
Here is the way to create policy for the user who needs to run athena query from Boto3.
-- S3 files bucket: sqladmin-cloudtrail
-- S3 output bucket: aws-athena-query-results-XXXXXXXXXX-us-east-1
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": [
"arn:aws:s3:::aws-athena-query-results-XXXXXXXXXX-us-east-1",
"arn:aws:s3:::sqladmin-cloudtrail"
]
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::aws-athena-query-results-XXXXXXXXXXXXXXXX-us-east-1/*"
},
{
"Sid": "VisualEditor2",
"Effect": "Allow",
"Action": [
"s3:GetObjectAcl",
"s3:GetObject",
"s3:GetObjectTagging",
"s3:GetBucketPolicy"
],
"Resource": [
"arn:aws:s3:::sqladmin-cloudtrail",
"arn:aws:s3:::sqladmin-cloudtrail/*"
]
},
{
"Sid": "VisualEditor3",
"Effect": "Allow",
"Action": [
"athena:StartQueryExecution",
"athena:CreateNamedQuery",
"athena:RunQuery"
],
"Resource": "*"
}
]
}
Here is my blog I did for an automation: https://www.sqlgossip.com/automate-aws-athena-create-partition-on-daily-basis/
Upvotes: 2
Reputation: 810
It turns out that the following permissions make it work
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:Get*",
"s3:List*"
],
"Resource": "*"
}
]
}
Upvotes: 0