Reputation: 51
I have a simple Python shell job that needs to read a config file stored in an S3 bucket folder. The Glue service role has read/write permission on the bucket objects, and I have set the --extra-files special parameter to the config file's S3 location.
When I run the job, I still get a FileNotFoundError. I also used listdir() to inspect the working directory and noticed that the config file is missing.
Any help is much appreciated. Thanks.
import os
import yaml
print(os.listdir("."))
file_path = "config_aws.yaml"
with open(file_path, 'r') as configfile:
    config = yaml.load(configfile, Loader=yaml.FullLoader)
for section in config:
    print(section)
Upvotes: 5
Views: 2954
Reputation: 83
I know this question is over 3 years old and AWS Glue has moved on, but you can currently determine the location of any --extra-files files (for Python shell Glue jobs) by reading the OS environment variable EXTRA_FILES_DIR, e.g.
import os
extra_files_dir = os.environ['EXTRA_FILES_DIR']
In my case, the files had been copied to /tmp/glue-python-libs-IbWD
Hope this helps someone.
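A minimal sketch of how the lookup above could be wrapped so the same script also runs locally. The helper name resolve_extra_file is hypothetical, and falling back to the current directory when EXTRA_FILES_DIR is unset is my assumption, not documented Glue behaviour:

```python
import os


def resolve_extra_file(filename, env=None):
    """Return the path where a file passed via --extra-files is expected.

    Hypothetical helper: uses EXTRA_FILES_DIR when it is set, otherwise
    assumes the file sits in the current working directory.
    """
    env = os.environ if env is None else env
    base_dir = env.get("EXTRA_FILES_DIR")
    if base_dir:
        return os.path.join(base_dir, filename)
    # Assumption: no Glue-provided directory, so fall back to a relative path.
    return filename
```

With this in place, the open() call from the question would become open(resolve_extra_file("config_aws.yaml"), 'r').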
Upvotes: 2
Reputation: 101
I'm facing the same issue. I found that the file is placed in a directory named glue-python-libs-... inside the working directory.
So I had to do the following (a horrible solution, by the way):
import os
# Pick the first entry in the working directory whose name starts with the Glue prefix.
config_dir = [f for f in os.listdir("./") if f.startswith("glue-python-libs-")][0]
config_file = f"{config_dir}/config.json"
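The same search could be written with glob so that a missing file raises a descriptive error instead of an IndexError. This is only a sketch: find_in_glue_libs is a hypothetical helper, and it assumes the glue-python-libs- directory sits directly under base_dir, as observed above:

```python
import glob
import os


def find_in_glue_libs(filename, base_dir="."):
    """Return the first match for filename inside a glue-python-libs-* directory."""
    pattern = os.path.join(base_dir, "glue-python-libs-*", filename)
    # Sort so the result is deterministic if several directories match.
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"{filename} not found; searched {pattern}")
    return matches[0]
```

Calling find_in_glue_libs("config.json") would then replace the two lines above.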
Upvotes: 7