Reputation: 1078
I was using pd.read_json('s3://example2020/kaggle.json')
to read data from an S3 bucket, but it threw FileNotFoundError: example2020/kaggle.json.
The methods I tried:
[Region] The S3 bucket is in the Ohio region while the SageMaker notebook instance is in Singapore; I am not sure whether this matters. I tried recreating the bucket in the Singapore region, but I still could not access it and got the same file-not-found error.
[IAM Role] I checked the permissions of the SageMaker execution IAM role.
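One way to narrow this down (a sketch, not from the original post) is to call S3 directly with boto3: s3fs, which pandas uses for s3:// paths, can surface a permission denial as FileNotFoundError, while head_object exposes the underlying HTTP status, so a 403 (permissions) can be told apart from a 404 (key really missing). The function names here are hypothetical; the bucket and key are the ones from the question.

```python
def classify_head_object_status(status_code: int) -> str:
    """Map the HTTP status from a failed head_object to a likely cause."""
    if status_code == 403:
        return "permission problem (IAM role or bucket policy)"
    if status_code == 404:
        return "no such key in the bucket"
    return f"unexpected HTTP status {status_code}"

def check_s3_object(bucket: str, key: str) -> str:
    """Probe the object directly; requires AWS credentials at runtime."""
    import boto3  # imported lazily so the classifier above needs no AWS deps
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return "object is reachable"
    except ClientError as e:
        status = e.response["ResponseMetadata"]["HTTPStatusCode"]
        return classify_head_object_status(status)

# From the notebook: check_s3_object("example2020", "kaggle.json")
```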
Upvotes: 1
Views: 3036
Reputation: 588
While providing access to all of S3 for an IAM role solves your immediate access problem, it's not best practice, as it could lead to security vulnerabilities down the road.
A better solution would be to provide that role with a policy specific to the bucket it needs to work in. For your example, you'd add this inline policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::example2020",
        "arn:aws:s3:::example2020/*"
      ]
    }
  ]
}
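If you prefer doing this programmatically rather than in the console, the same policy can be built and attached with boto3's put_role_policy. This is a sketch under assumptions: the role name and policy name are hypothetical, and the caller needs iam:PutRolePolicy permission.

```python
import json

def bucket_policy_document(bucket: str) -> str:
    """Build the scoped S3 inline policy from the answer for a given bucket."""
    doc = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }
    return json.dumps(doc)

def attach_bucket_policy(role_name: str, bucket: str) -> None:
    """Attach the inline policy to the execution role (runs against AWS)."""
    import boto3  # lazy import so the policy builder works without AWS deps
    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,              # e.g. the SageMaker execution role
        PolicyName=f"{bucket}-s3-access",  # hypothetical policy name
        PolicyDocument=bucket_policy_document(bucket),
    )
```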
Upvotes: 0
Reputation: 1078
The problem was indeed IAM permissions.
I created a new notebook instance with a new IAM role. During creation you are asked which S3 buckets the role may access; I chose access to any S3 bucket, and that solved the problem.
[Solution]
In the Resource section of the role's policy, check whether it is general enough to cover your bucket name.
If you edited the old IAM role and it still does not work, create a new IAM role and attach it to the notebook instance.
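The Resource check described above can be sketched as a small helper (hypothetical name, assuming the policy document is loaded as a Python dict): it reports whether any Allow statement's Resource covers the bucket, either via a wildcard or an ARN naming the bucket.

```python
def policy_covers_bucket(policy: dict, bucket: str) -> bool:
    """Return True if any Allow statement's Resource covers the bucket,
    either as a wildcard '*' or as an ARN naming the bucket."""
    wanted = {"*", f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"}
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        resources = stmt.get("Resource", [])
        if isinstance(resources, str):  # Resource may be a single string
            resources = [resources]
        if wanted & set(resources):
            return True
    return False
```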
Upvotes: 2