Garet Jax

Reputation: 1171

(403) when calling the HeadObject operation: Forbidden when accessing S3 from AWS Batch in python

I have created a Docker image based on amazonlinux. In it, I manually installed python3, pip, and the awscli, and I configured the AWS CLI with my access key and secret key. When I create a container from this image, I can interact with S3 without problems.

I then build a new image from my custom image above using a Dockerfile. In it, I install the Python modules needed for this task (boto3, numpy, pandas, scipy, and spacy) along with my custom Python code. This is the image I use for AWS Batch.

My custom Python code tries to download a file from S3 using:

import io

import boto3

# args holds the parsed command-line arguments for this script
jsonText = None
s3 = boto3.client('s3', region_name='us-east-1')
with io.BytesIO() as file_stream:
    # download_fileobj leaves the stream positioned at the end of the data
    s3.download_fileobj(args.input_transcript_s3_bucket, args.input_transcript_s3_key, file_stream)
    # Rewind the buffer, then decode its contents as UTF-8 text
    file_stream.seek(0)
    jsonText = io.TextIOWrapper(file_stream, encoding='utf-8').read()

When the python code gets triggered through AWS Batch, I get the following error:

botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Another post on Stack Overflow suggested adding the region to the S3 client creation call. As in the other poster's case, this didn't help me.
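One way to get more information about a 403 like this is to print which identity boto3 actually resolves inside the container; a minimal sketch (not part of my original code):

import boto3

# Print the account and ARN of whatever identity boto3 resolves;
# sts:GetCallerIdentity works with any valid credentials.
print(boto3.client('sts').get_caller_identity())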

I am thinking this is a policy issue. I have checked out the VPC Endpoint Policy and found it to be sufficient:

{
    "Statement": [
        {
            "Action": "*",
            "Effect": "Allow",
            "Resource": "*",
            "Principal": "*"
        }
    ]
}

I have created a custom Batch service role and attached both the AWSBatchServiceRole and AmazonS3FullAccess policies to it. I assigned this new service role to a brand-new compute environment, with no luck.
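For reference, attaching those two managed policies can also be scripted; a minimal boto3 sketch, where myBatchServiceRole is a placeholder for the custom role's name:

import boto3

iam = boto3.client('iam')

# 'myBatchServiceRole' is a placeholder for the custom service role name
for policy_arn in (
    'arn:aws:iam::aws:policy/service-role/AWSBatchServiceRole',
    'arn:aws:iam::aws:policy/AmazonS3FullAccess',
):
    iam.attach_role_policy(RoleName='myBatchServiceRole', PolicyArn=policy_arn)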

I am not sure what to do next or how to get more information. Any thoughts?

Thanks.

Upvotes: 1

Views: 1653

Answers (1)

Garet Jax

Reputation: 1171

The Docker container was being started with 'nobody' as the user, and the AWS configuration (specifically the .aws directory) was only readable by the root user.
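A quick way to confirm which credential source boto3 ends up using inside the container; a minimal check (a sketch, not part of my original code):

import boto3

# Report which source boto3 resolved credentials from, if any
creds = boto3.Session().get_credentials()
print(creds.method if creds else 'no credentials found')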

I changed the Docker container to copy the .aws directory from /root to the filesystem root and then made it accessible to 'nobody':

# Copy the credentials out of root's home directory
cp -r /root/.aws /
# Give 'nobody' ownership of the directory and its contents
chown nobody /.aws
chgrp nobody /.aws
cd /.aws
chown nobody *
chgrp nobody *

I then tested to make sure 'nobody' could access the credentials:

# Switch to the 'nobody' user with a usable shell
su -s /bin/bash nobody
# List buckets using the copied credentials
aws s3 ls

This solved my problem.

I am not very happy with this solution, since it exposes my AWS key and secret key to the 'nobody' user, but I find myself in a bit of a catch-22. I created this post to see if there are other options.

Upvotes: 1
