Reputation: 11
I want to install additional packages that will be used in a processing step.
sklearn_processor = FrameworkProcessor(
    estimator_cls=SKLearn,
    framework_version='0.23-1',
    instance_type="ml.t3.medium",
    instance_count=1,
    base_job_name="sklearn-abalone-process",
    sagemaker_session=sagemaker_session,
    role=role
)
outputs = [
    ProcessingOutput(output_name="train", source="/opt/ml/processing/train"),
    ProcessingOutput(output_name="validation", source="/opt/ml/processing/validation"),
    ProcessingOutput(output_name="test", source="/opt/ml/processing/test")
]
step_process = ProcessingStep(
    name="Preprocess_Data",
    processor=sklearn_processor.run(
        outputs=outputs,
        code="pre-process.py",
        dependencies=["/home/sagemaker-user/dependencies/requirements.txt"]
    )
)
After running ProcessingStep, I get: ValueError: either step_args or processor need to be given, but not both.
The SageMaker SDK version is 2.197.0.
I went through the AWS documentation but had no luck.
Upvotes: 1
Views: 262
Reputation: 970
Try this. Pass the result of sklearn_processor.run(..) as step_args instead of processor:
step_args = sklearn_processor.run(outputs=outputs, code="pre-process.py", source_dir=BASE_DIR)
step_process = ProcessingStep(
    name="Preprocess_Data",
    step_args=step_args
)
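One thing the snippet above depends on: as far as I know, run(..) only returns step arguments when the processor was created with a PipelineSession. With a regular Session it starts the job and returns None, which would also explain the original ValueError. Below is a minimal sketch under that assumption. The function name build_processing_step and its parameters role and source_dir are placeholders of mine, not part of the SDK, and the imports sit inside the function only so the snippet loads on a machine without the SDK installed.

```python
def build_processing_step(role, source_dir):
    """Build the ProcessingStep using a PipelineSession (sketch).

    `role` is an IAM role ARN and `source_dir` is a local folder that
    holds pre-process.py (and optionally requirements.txt). Both are
    placeholders supplied by the caller.
    """
    # Imported lazily so this module can be loaded without the SDK.
    from sagemaker.processing import FrameworkProcessor, ProcessingOutput
    from sagemaker.sklearn.estimator import SKLearn
    from sagemaker.workflow.pipeline_context import PipelineSession
    from sagemaker.workflow.steps import ProcessingStep

    # Assumption: with a PipelineSession, run(...) does not start a job;
    # it returns the arguments that ProcessingStep expects as step_args.
    pipeline_session = PipelineSession()

    processor = FrameworkProcessor(
        estimator_cls=SKLearn,
        framework_version="0.23-1",
        instance_type="ml.t3.medium",
        instance_count=1,
        base_job_name="sklearn-abalone-process",
        sagemaker_session=pipeline_session,
        role=role,
    )

    outputs = [
        ProcessingOutput(output_name="train", source="/opt/ml/processing/train"),
        ProcessingOutput(output_name="validation", source="/opt/ml/processing/validation"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/test"),
    ]

    step_args = processor.run(
        code="pre-process.py",
        source_dir=source_dir,
        outputs=outputs,
    )

    # Pass step_args only; do not pass processor= as well.
    return ProcessingStep(name="Preprocess_Data", step_args=step_args)
```

Calling build_processing_step(role, BASE_DIR) needs AWS credentials and a region to be configured, since PipelineSession() creates a boto session.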
Adding a note:
You do not need to specify requirements.txt as dependencies in sklearn_processor.run(..). The FrameworkProcessor will upload it for you. Just set the source_dir in sklearn_processor.run(..). For more information check here:
https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_processing/scikit_learn_data_processing_and_model_evaluation/scikit_learn_data_processing_and_model_evaluation.html#Running-processing-jobs-with-FrameworkProcessor-to-include-custom-dependencies
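To make the source_dir layout concrete, here is a small local check you could run before calling sklearn_processor.run(..). It assumes that source_dir contains the entry script and a requirements.txt next to it. The helper describe_source_dir and the package names in the demo are made up for illustration, and nothing here talks to SageMaker.

```python
from pathlib import Path
import tempfile


def describe_source_dir(source_dir, entry_point="pre-process.py"):
    """Return (entry point found?, requirement lines) for a source_dir.

    Hypothetical helper: a local sanity check only.
    """
    root = Path(source_dir)
    has_entry_point = (root / entry_point).is_file()

    requirements = []
    req_file = root / "requirements.txt"
    if req_file.is_file():
        for line in req_file.read_text().splitlines():
            line = line.strip()
            # Skip blank lines and comments.
            if line and not line.startswith("#"):
                requirements.append(line)
    return has_entry_point, requirements


# Build a throwaway directory with the expected layout (made-up contents).
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "pre-process.py").write_text("print('preprocessing')\n")
(demo_dir / "requirements.txt").write_text("# extra packages\nxgboost\nshap\n")

found, packages = describe_source_dir(demo_dir)
print(found, packages)
```

If the check reports the script and the packages you expect, pass that same directory as source_dir and keep code="pre-process.py" relative to it.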
Upvotes: 0