deedeeck28

Reputation: 367

AWS EMR Bootstrap action calling additional file

I would like to install additional Python libraries when setting up an AWS EMR cluster (release 6.0.0).

I know I can do this by creating a file called bootstrap.sh, uploading it to S3, and setting a bootstrap action that runs it when the cluster is set up. Contents of bootstrap.sh:

sudo pip3 install mlxtend imbalanced-learn etc etc...

However, I have a separate requirements.txt file that lists all the Python libraries I need.

If I put 'pip3 install -r requirements.txt' into bootstrap.sh, the script won't be able to find requirements.txt, since I am only allowed to specify one S3 file per bootstrap action.

Is there any way around this?

Upvotes: 0

Views: 787

Answers (1)

Ben

Reputation: 36

You can copy requirements.txt from your S3 bucket to a local directory on the EMR node and then run pip install against it, e.g.:

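#!/bin/bash

# Copy the requirements file from S3 into the node's current working directory
aws s3 cp s3://<my-bucket>/requirements.txt .
# Install the listed packages with pip3, the command used in the question
sudo pip3 install -r requirements.txt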
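
If you would rather not hard-code the bucket path, a bootstrap action can also be given arguments, so the S3 location of requirements.txt could be passed in. A minimal sketch, assuming the URI arrives as the first argument; the function name `install_requirements_from_s3` and the `/tmp` download location are hypothetical choices, not anything EMR requires:

```shell
#!/bin/bash
# Sketch of a bootstrap script that takes the S3 URI of requirements.txt
# as its first argument instead of hard-coding it.
set -eu

# install_requirements_from_s3 is a hypothetical helper name.
install_requirements_from_s3() {
    local s3_uri="$1"
    local local_copy
    # Assumption: /tmp is writable on the node; any local path would do.
    local_copy="/tmp/$(basename "$s3_uri")"

    # Download the file to the node, then install the packages it lists.
    aws s3 cp "$s3_uri" "$local_copy"
    sudo pip3 install -r "$local_copy"
}

# Run the install only when an argument was actually supplied.
if [ "$#" -gt 0 ]; then
    install_requirements_from_s3 "$1"
fi
```

With this version the S3 URI of requirements.txt would be supplied as the bootstrap action's argument when the cluster is created, so the same bootstrap.sh can serve several requirements files.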

Upvotes: 2
