sale108

Reputation: 597

Assuming AWS role in smart_open (python) doesn't work

I'm using smart_open to stream item data from S3 for use in my code.

According to the docs, if I want to access the service with credentials other than those of the system user profile, this is how it's supposed to be done:

import json
import os

import boto3
import smart_open

session = boto3.Session(
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'])

path = "mybucket/file.json"
with smart_open.open(f"s3://{path}", mode='r', transport_params={'client': session.client('s3')}) as file:
    for line in file:
        print(json.loads(line))

But that just throws the following error:

unable to access bucket: 'mybucket' key: 'file.json' version: None error: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied

The user that owns these credentials has all the necessary permissions. I know that because when I set the system profile to that user, it worked fine.

Does anyone know how to solve this? Thanks

Upvotes: 0

Views: 1928

Answers (1)

sale108

Reputation: 597

EDIT: I'm using conda, and I installed the package with conda install. I don't know why, but the smart_open version it installed was 3.0.0.

The latest version at the time of writing is 5.0.0, so updating the package would presumably make the code work as described in the docs.

I'm not going to update, so I can't verify that. However, if you use the same package manager and/or version 3.0.0, this is the solution I found:

After digging into the library code, I found that the docs are misleading for this version. Instead of passing the actual client, you need to pass the session. The following works:

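import json
import os

import boto3
import smart_open

session = boto3.Session(
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'])

path = "mybucket/file.json"
# smart_open 3.0.0: pass the session itself, not a client
with smart_open.open(f"s3://{path}", mode='r', transport_params={'session': session}) as file:
    for line in file:
        print(json.loads(line))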
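
If the same code has to run against both old and new installs, one option is to pick the transport_params key from the installed version. This is only a rough sketch: the helper name s3_transport_params is made up for illustration, and the cutoff at major version 5 is an assumption based on the versions mentioned above (I only tested the 'session' form on 3.0.0).

```python
def s3_transport_params(session, smart_open_version):
    """Build transport_params for smart_open.open() on S3.

    session: a boto3.Session (anything with a .client() method works).
    smart_open_version: version string such as "3.0.0" or "5.0.0".
    """
    major = int(smart_open_version.split(".")[0])
    if major >= 5:
        # Assumed: 5.x and later take a ready-made S3 client, as the docs describe.
        return {"client": session.client("s3")}
    # Older releases (3.0.0 in my case) take the session itself.
    return {"session": session}
```

It would then be called as smart_open.open(f"s3://{path}", mode='r', transport_params=s3_transport_params(session, version)), where version is the installed smart_open version string, for example the one shown by conda list.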

Upvotes: 1
