Reputation: 11
I'm trying to execute a SageMaker processing job, but I get this error:
ClientError: Failed to download data. Cannot download s3://pocaaml/sagemaker/xsell_sc1_test/model/model_lgb.tar.gz, a previously downloaded file/folder clashes with it. Please check your s3 objects and ensure that there is no object that is both a folder as well as a file.
I do have model_lgb.tar.gz at that S3 path.
This is my code:
from time import gmtime, strftime

import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor

project_name = 'xsell_sc1_test'
s3_bucket = "pocaaml"
prefix = "sagemaker/" + project_name
account_id = "029294541817"
s3_bucket_base_uri = "{}{}".format("s3://", s3_bucket)
dev = "dev-{}".format(strftime("%y-%m-%d-%H-%M", gmtime()))

region = sagemaker.Session().boto_region_name
print("Using AWS Region: {}".format(region))

# Get a SageMaker-compatible role used by this Notebook Instance.
role = get_execution_role()

boto3.setup_default_session(region_name=region)
boto_session = boto3.Session(region_name=region)
s3_client = boto3.client("s3", region_name=region)
sagemaker_boto_client = boto_session.client("sagemaker")  # is this one needed?
sagemaker_session = sagemaker.session.Session(
    boto_session=boto_session, sagemaker_client=sagemaker_boto_client
)

sklearn_processor = SKLearnProcessor(
    framework_version="0.23-1", role=role, instance_type='ml.m5.4xlarge', instance_count=1
)

PREPROCESSING_SCRIPT_LOCATION = 'funciones_altas.py'

preprocessing_input_code = sagemaker_session.upload_data(
    PREPROCESSING_SCRIPT_LOCATION,
    bucket=s3_bucket,
    key_prefix="{}/{}".format(prefix, "code"),
)
preprocessing_input_data = "{}/{}/{}".format(s3_bucket_base_uri, prefix, "data")
preprocessing_input_model = "{}/{}/{}".format(s3_bucket_base_uri, prefix, "model")
preprocessing_output = "{}/{}/{}/{}/{}".format(s3_bucket_base_uri, prefix, dev, "preprocessing", "output")

processing_job_name = project_name.replace("_", "-") + "-preprocess-{}".format(
    strftime("%d-%H-%M-%S", gmtime())
)

sklearn_processor.run(
    code=preprocessing_input_code,
    job_name=processing_job_name,
    inputs=[
        ProcessingInput(input_name="data",
                        source=preprocessing_input_data,
                        destination="/opt/ml/processing/input/data"),
        ProcessingInput(input_name="model",
                        source=preprocessing_input_model,
                        destination="/opt/ml/processing/input/model"),
    ],
    outputs=[
        ProcessingOutput(output_name="output",
                         destination=preprocessing_output,
                         source="/opt/ml/processing/output"),
    ],
    wait=False,
)
preprocessing_job_description = sklearn_processor.jobs[-1].describe()
Also, funciones_altas.py uses ohe_altas.tar.gz, not model_lgb.tar.gz, which makes this error even stranger.
Can you help me?
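The error text says some S3 key exists both as an object and as a "folder" (a prefix of other keys), which SageMaker cannot materialize on the container's local filesystem. A minimal local sketch of that check follows; find_clashes is a hypothetical helper, and in practice you would feed it the keys returned by s3_client.list_objects_v2 under the "sagemaker/xsell_sc1_test" prefix:

```python
def find_clashes(keys):
    """Return names that appear both as an object and as a folder prefix."""
    key_set = set(keys)
    prefixes = set()
    for k in keys:
        parts = k.split("/")
        # Every proper ancestor of a key acts as a "folder".
        for depth in range(1, len(parts)):
            prefixes.add("/".join(parts[:depth]))
    return sorted(key_set & prefixes)

# Example: if a zero-byte object named '.../model' exists alongside the
# real file '.../model/model_lgb.tar.gz', the download clashes.
keys = [
    "sagemaker/xsell_sc1_test/model",
    "sagemaker/xsell_sc1_test/model/model_lgb.tar.gz",
]
print(find_clashes(keys))  # ['sagemaker/xsell_sc1_test/model']
```

Such phantom "folder objects" are often created by S3 console uploads; deleting the zero-byte object usually clears the clash.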
Upvotes: 0
Views: 1565
Reputation: 36
It looks like you are using the SageMaker-generated execution role, and the error is related to S3 permissions.
One thing you can do: create your own role and pass its ARN to the code that runs the processing job.
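A minimal sketch of that option. The account id and role name below are placeholders, not values from the question; the role needs S3 read/write permissions on the bucket and a trust policy allowing sagemaker.amazonaws.com to assume it:

```python
# Placeholder account id and role name -- substitute your own.
account_id = "123456789012"
role_name = "MySageMakerProcessingRole"
role_arn = "arn:aws:iam::{}:role/{}".format(account_id, role_name)
print(role_arn)  # arn:aws:iam::123456789012:role/MySageMakerProcessingRole

# The ARN is then passed in place of get_execution_role(), e.g.:
# sklearn_processor = SKLearnProcessor(
#     framework_version="0.23-1",
#     role=role_arn,
#     instance_type="ml.m5.4xlarge",
#     instance_count=1,
# )
```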
Upvotes: 1