JamesMatson

Reputation: 2922

AWS Batch - How to access AWS Batch environment variables within python script running inside Docker container

I have a Docker container which executes a Python script as its ENTRYPOINT. This is the Dockerfile:

FROM python:3
ADD script.py /
EXPOSE 80
# uuid is part of the Python standard library, so only boto3 needs installing
RUN pip install boto3
ENTRYPOINT ["python", "./script.py"]

This is the Python script:

import boto3
import time
import uuid
import os

guid = uuid.uuid4()
timestr = time.strftime("%Y%m%d-%H%M%S")
job_index = os.environ['AWS_BATCH_JOB_ARRAY_INDEX']

# Write the array index into a uniquely named file
filename = 'latest_test_' + str(guid) + '_.txt'
with open(filename, 'a+') as f:
    f.write(job_index)

client = boto3.client(
    's3',
    # Hard coded strings as credentials, not recommended.
    aws_access_key_id='',
    aws_secret_access_key=''
)
client.upload_file(filename, 'api-dev-dpstorage-s3', 'docker_data/' + filename)
with open('response2.txt', 'a+') as f:
    f.write('all done')

The script is simply designed to create a file, write the job array index into it and push it to an S3 bucket. The job array index comes from one of the environment variables that AWS Batch predefines. I have uploaded the image to AWS ECR and set up an AWS Batch job, submitted as an array job with a size of 10. This should execute the job 10 times, and my expectation is that 10 files end up in S3, each containing the array index of the child job that wrote it.
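For illustration, submitting the array job with boto3 would look roughly like this (the queue and job definition names are placeholders, not my real ones):

import boto3

batch = boto3.client('batch')

# Submit one array job; Batch spawns 10 child jobs and sets
# AWS_BATCH_JOB_ARRAY_INDEX (0-9) in each child's environment.
batch.submit_job(
    jobName='latest-test-array',        # placeholder name
    jobQueue='my-job-queue',            # placeholder queue
    jobDefinition='my-job-definition',  # placeholder definition
    arrayProperties={'size': 10}
)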

If I don't include the environment variable and instead just hard code a value into the text file, the AWS Batch job works. If I include the call to os.environ to get the variable, the job fails with this AWS Batch error:

Status reason: Essential container in task exited

I'm assuming there is an issue with how I'm trying to obtain the environment variable. Does anyone know how to correctly reference either one of the built-in environment variables and/or a custom environment variable defined in the job?
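For what it's worth, a defensive read would at least stop the container dying on a KeyError when the variable is absent; AWS_BATCH_JOB_ARRAY_INDEX is only set in the child jobs of an array job:

import os

# AWS_BATCH_JOB_ARRAY_INDEX only exists in child jobs of an array job;
# fall back to '0' so a plain (non-array) job doesn't crash with KeyError.
job_index = os.environ.get('AWS_BATCH_JOB_ARRAY_INDEX', '0')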

Upvotes: 1

Views: 4591

Answers (1)

knh190

Reputation: 2882

AWS Batch lets you configure the container's environment through the environment parameter of the job definition, where you specify name/value pairs:

"environment" : [
    { "name" : "AWS_BATCH_JOB_ARRAY_INDEX", "value" : "string" }
]

This is turned into a docker env parameter:

$ docker run --env AWS_BATCH_JOB_ARRAY_INDEX=string $container $cmd

The variable can then be accessed with:

import os

job_id = os.environ['AWS_BATCH_JOB_ARRAY_INDEX']
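You can also inject custom variables per submission via containerOverrides when you submit the job; a rough sketch, with placeholder names:

import boto3

batch = boto3.client('batch')

# containerOverrides.environment adds per-submission variables on top of
# whatever the job definition and Batch itself (AWS_BATCH_*) already set.
batch.submit_job(
    jobName='env-demo',                 # placeholder
    jobQueue='my-job-queue',            # placeholder
    jobDefinition='my-job-definition',  # placeholder
    containerOverrides={
        'environment': [
            {'name': 'MY_CUSTOM_VAR', 'value': 'hello'}
        ]
    }
)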

But watch out if you are passing sensitive data this way: it is not wise to pass credentials in plain text. In that case, you may want to rely on the compute environment and its IAM role instead.
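For example, the S3 client from the question could be created without any hard-coded keys at all; boto3 then falls back to its default credential chain, which picks up the role credentials exposed to the container:

import boto3

# No explicit keys: boto3's default credential chain resolves the
# IAM role credentials that Batch makes available to the container.
client = boto3.client('s3')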

Upvotes: 5
