keyptl

Reputation: 51

Extra files are not copied to job run directory

I am running a simple Python shell job that needs to read a config file stored in an S3 bucket folder. The Glue service role has read/write permission on the bucket's objects. I have set the --extra-files special parameter to the config file's S3 location.

When I run the job, I still get a FileNotFound exception. I also used os.listdir() to inspect the working directory and found that the config file is not there.

Any help is much appreciated. Thanks

import os
import yaml

print(os.listdir("."))

file_path = "config_aws.yaml"
with open(file_path, 'r') as configfile:
    config = yaml.load(configfile, Loader=yaml.FullLoader)

for section in config:
    print(section)

Upvotes: 5

Views: 2954

Answers (2)

Tim James

Reputation: 83

I know this question is over 3 years old and AWS Glue has moved on, but for Python shell Glue jobs you can currently find the location of any --extra-files by reading the EXTRA_FILES_DIR environment variable, e.g.:

import os
extra_files_dir = os.environ['EXTRA_FILES_DIR']

In my case, the files had been copied to /tmp/glue-python-libs-IbWD
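
To build the full path of a specific file from that variable, something like the following may work. It is a minimal sketch: the helper name extra_file_path and its env parameter are hypothetical (the parameter exists only so the function can be tried outside Glue), and it assumes the file keeps its original name inside the directory.

```python
import os


def extra_file_path(filename, env=None):
    """Return the path of an --extra-files file inside EXTRA_FILES_DIR.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to try outside a Glue job (hypothetical convenience parameter).
    """
    env = os.environ if env is None else env
    extra_files_dir = env.get("EXTRA_FILES_DIR")
    if not extra_files_dir:
        # Fail with a clear message instead of a bare KeyError.
        raise RuntimeError("EXTRA_FILES_DIR is not set in this environment")
    return os.path.join(extra_files_dir, filename)


# Demo with a stand-in environment; inside a Glue job, omit `env`.
demo_env = {"EXTRA_FILES_DIR": "/tmp/glue-python-libs-IbWD"}
print(extra_file_path("config_aws.yaml", env=demo_env))
```

Inside the job, the returned path can then be passed to open() in place of the bare file name used in the question.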

Hope this helps someone.

Upvotes: 2

matiasm

Reputation: 101

I'm facing the same issue. I found that the file is under a directory named glue-python-libs-....

So I had to do the following (a horrible solution, by the way):

import os
# Glue copies the extra files into a directory whose name starts with glue-python-libs-.
config_dir = [f for f in os.listdir("./") if f.startswith("glue-python-libs-")][0]
config_file = f"{config_dir}/config.json"
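
A slightly more defensive take on the same directory search might look like this. It is only a sketch: find_extra_file and its search_root parameter are hypothetical names, and it assumes the directory prefix is glue-python-libs- as observed above. Instead of an IndexError when nothing matches, it raises a FileNotFoundError that names the pattern it searched.

```python
import glob
import os


def find_extra_file(filename, search_root="."):
    """Locate `filename` inside a glue-python-libs-* directory under search_root."""
    pattern = os.path.join(search_root, "glue-python-libs-*", filename)
    # Sort so the result is deterministic if more than one directory matches.
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"{filename} not found; searched {pattern}")
    return matches[0]
```

For example, find_extra_file("config.json") searches the current directory, and search_root can point elsewhere if the files land outside the working directory.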

Upvotes: 7
