user2496965

Reputation: 65

Amazon S3 - Unable to create a datasource


I tried to create an Amazon Machine Learning datasource with boto, but it failed with an error.
Here's my code:

import boto

bucketname = 'mybucket'
filename = 'myfile.csv'
schema = 'myfile.csv.schema'
conn = boto.connect_s3()
datasource = 'my_datasource'

ml = boto.connect_machinelearning()

# create a data source
ds = ml.create_data_source_from_s3(
    data_source_id=datasource,
    data_spec={
        'DataLocationS3': 's3://' + bucketname + '/' + filename,
        'DataSchemaLocationS3': 's3://' + bucketname + '/' + schema},
    data_source_name=None,
    compute_statistics=True)

print(ml.get_data_source(datasource, verbose=None))

The get_data_source call returns this error:

Could not access 's3://mybucket/myfile.csv'. Either there is no file at that location, or the file is empty, or you have not granted us read permission.

I have checked, and I have FULL_CONTROL permissions. The bucket, the file, and the schema are all present and non-empty. How do I solve this?

Upvotes: 0

Views: 289

Answers (1)

garnaat

Reputation: 45876

You may have FULL_CONTROL over that S3 resource, but for this to work you also have to grant the Machine Learning service itself the appropriate access to that S3 resource.

I know link-only answers are frowned upon, but in this case I think it's best to link to the definitive documentation from the Machine Learning service, since the actual steps are complicated and could change in the future.
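To give a rough idea of what such a grant can look like, here is a minimal sketch. It assumes the access is given through an S3 bucket policy that names the service principal machinelearning.amazonaws.com and allows s3:GetObject and s3:ListBucket; check that against the current documentation before relying on it. The helper names build_ml_read_policy and apply_policy are made up for this example, and 'mybucket' is the bucket name from the question.

```python
import json


def build_ml_read_policy(bucket_name):
    """Build a bucket policy that lets the Machine Learning service read a bucket.

    Assumption: the service principal is machinelearning.amazonaws.com and it
    needs s3:GetObject on the objects plus s3:ListBucket on the bucket itself.
    """
    principal = {"Service": "machinelearning.amazonaws.com"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access to every object in the bucket.
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::%s/*" % bucket_name,
            },
            {
                # Permission to list the bucket's contents.
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": "arn:aws:s3:::%s" % bucket_name,
            },
        ],
    }


def apply_policy(bucket_name):
    """Attach the policy with boto. This REPLACES any existing bucket policy."""
    import boto  # imported here so the policy can be built without boto installed

    bucket = boto.connect_s3().get_bucket(bucket_name)
    bucket.set_policy(json.dumps(build_ml_read_policy(bucket_name)))


# Print the policy for inspection; call apply_policy('mybucket') to attach it.
policy_json = json.dumps(build_ml_read_policy("mybucket"), indent=2)
print(policy_json)
```

The policy can also be pasted into the bucket's permissions in the S3 console instead of calling apply_policy. To narrow the grant, restrict the object Resource to the prefix that holds the data file and its schema.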

Upvotes: 1
