Shark Deng

Reputation: 1078

How can SageMaker access S3 bucket data?

I was using pd.read_json('s3://example2020/kaggle.json') to read data from an S3 bucket, but it raised FileNotFoundError: example2020/kaggle.json.

What I tried:

[Region] The S3 bucket is in the Ohio region, while the SageMaker notebook instance is in Singapore. I'm not sure whether this matters. I created a new S3 bucket in the Singapore region, but I still could not access it and got the same FileNotFoundError.

[IAM Role] I checked the permissions of the IAM SageMaker execution role.

Upvotes: 1

Views: 3036

Answers (2)

Adam Ryason

Reputation: 588

While providing access to all of S3 for an IAM role solves your immediate access problem, it's not best practice, as it could lead to security vulnerabilities down the road.

A better solution would be to provide that role with a policy specific to the bucket it needs to work in. For your example, you'd add this inline policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::example2020",
                "arn:aws:s3:::example2020/*"
            ]
        }
    ]
}
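If you need the same policy for more than one bucket, here is a minimal Python sketch that builds the document above from a bucket name. The helper names (build_bucket_policy, attach_inline_policy) are made up for illustration. The attach step assumes boto3 is installed and that you run it with credentials allowed to edit the role, which the SageMaker execution role itself usually is not.

```python
import json


def build_bucket_policy(bucket_name: str) -> dict:
    """Return an IAM policy document scoped to a single S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                ],
                "Resource": [
                    # The bucket itself (s3:ListBucket applies here).
                    f"arn:aws:s3:::{bucket_name}",
                    # Every object in the bucket (Get/Put/Delete apply here).
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


def attach_inline_policy(role_name: str, policy_name: str, bucket_name: str) -> None:
    """Attach the policy to a role as an inline policy (hypothetical helper)."""
    # Imported here so build_bucket_policy works without boto3 installed.
    import boto3

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_bucket_policy(bucket_name)),
    )


if __name__ == "__main__":
    # Print the policy for the bucket from the question.
    print(json.dumps(build_bucket_policy("example2020"), indent=4))
```

The two Resource entries are both needed: s3:ListBucket is checked against the bucket ARN, while the object actions are checked against the ARN ending in /*.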

Upvotes: 0

Shark Deng

Reputation: 1078

The problem was still IAM permissions.

I created a new notebook instance and a new IAM role. During setup you are asked which S3 buckets the role should be able to access. I chose all S3 buckets, and that solved the problem.

[Solution] In the policy's Resource section, check whether the bucket name is general, that is, whether it covers your bucket rather than being restricted to some other specific bucket.

If you edited the old IAM role and it still does not work, you can create a new IAM role and attach it to the notebook instance.
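Since the FileNotFoundError here turned out to be a permission problem, it can help to ask S3 directly from the notebook to get a clearer error. Below is a rough sketch, assuming boto3 is available on the notebook instance. The function names are my own, and the error codes listed are the ones I would expect from a HeadObject call, not an exhaustive list.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_url(url: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def explain_error_code(code: str) -> str:
    """Map the error code of a failed HeadObject call to a likely cause."""
    if code in ("403", "AccessDenied", "Forbidden"):
        # Without s3:ListBucket, S3 also answers 403 for keys that do not exist.
        return "access denied: check the role's policy (or the key is missing)"
    if code in ("404", "NoSuchKey", "NoSuchBucket", "NotFound"):
        return "not found: check the bucket name and key"
    return f"unexpected error code: {code}"


def check_object(url: str) -> str:
    """Report which identity the notebook uses and whether it can see the object."""
    # Imported here so the helpers above work without boto3 installed.
    import boto3
    from botocore.exceptions import ClientError

    bucket, key = split_s3_url(url)
    # Shows the role the notebook is actually running as.
    print("Calling as:", boto3.client("sts").get_caller_identity()["Arn"])
    try:
        boto3.client("s3").head_object(Bucket=bucket, Key=key)
    except ClientError as err:
        return explain_error_code(err.response["Error"]["Code"])
    return "object is readable"
```

In the notebook you would call check_object('s3://example2020/kaggle.json'). If the printed ARN is not the role you edited, the notebook is still using the old role.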

Upvotes: 2
