Jayanth_

Reputation: 41

SageMaker: TypeError: Object of type Join is not JSON serializable

I'm trying to build a SageMaker pipeline for a computer vision model. The data consists of images stored in an S3 bucket. I did the preprocessing with ScriptProcessor and am now trying to build the estimator. Preprocessing works fine, but the estimator step fails with TypeError: Object of type Join is not JSON serializable.

from sagemaker.tensorflow import TensorFlow


output_config = preprocessing_job_description["ProcessingOutputConfig"]
for output in output_config["Outputs"]:
    if output["OutputName"] == "train_data":
        preprocessed_training_data = output["S3Output"]["S3Uri"]
    if output["OutputName"] == "valid_data":
        preprocessed_test_data = output["S3Output"]["S3Uri"]

s3_train = "s3://bucketname/image_data/train/"
s3_val = "s3://bucketname/image_data/val/"


tf_estimator = TensorFlow(entry_point="train.py",
                          sagemaker_session=sess,
                          role=role,
                          instance_count=1, 
                          instance_type="ml.m5.xlarge",
                          # output_path = "/opt/ml/processing/output",
                          model_dir="s3://bucketname/image_data/output",
                          py_version='py37',
                          framework_version='2.4', 
                          hyperparameters={'epochs': epochs,
                                           'learning_rate': learning_rate, 
                                           'train_batch_size': 64,
                                          },
                          metric_definitions=metrics_definitions,
                          script_mode=True,
                          max_run=7200  # max 2 hours * 60 minutes per hour * 60 seconds per minute
                         )

tf_estimator.fit({"train": preprocessed_training_data})

This gives me the following error:


TypeError Traceback (most recent call last) in 36 ) 37 ---> 38 tf_estimator.fit({"train": preprocessed_training_data}) 39 # tf_estimator.fit({"train": s3_train})

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline_context.py in wrapper(*args, **kwargs) 207 return self_instance.sagemaker_session.context 208 --> 209 return run_func(*args, **kwargs) 210 211 return wrapper

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in fit(self, inputs, wait, logs, job_name, experiment_config) 976 self._prepare_for_training(job_name=job_name) 977 --> 978 self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config) 979 self.jobs.append(self.latest_training_job) 980 if wait:

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in start_new(cls, estimator, inputs, experiment_config) 1806 train_args = cls._get_train_args(estimator, inputs, experiment_config) 1807 -> 1808 estimator.sagemaker_session.train(**train_args) 1809 1810 return cls(estimator.sagemaker_session, estimator._current_job_name)

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in train(self, input_mode, input_config, role, job_name, output_config, resource_config, vpc_config, hyperparameters, stop_condition, tags, metric_definitions, enable_network_isolation, image_uri, algorithm_arn, encrypt_inter_container_traffic, use_spot_instances, checkpoint_s3_uri, checkpoint_local_path, experiment_config, debugger_rule_configs, debugger_hook_config, tensorboard_output_config, enable_sagemaker_metrics, profiler_rule_configs, profiler_config, environment, retry_strategy) 592 encrypt_inter_container_traffic=encrypt_inter_container_traffic, 593 use_spot_instances=use_spot_instances, --> 594 checkpoint_s3_uri=checkpoint_s3_uri, 595 checkpoint_local_path=checkpoint_local_path, 596 experiment_config=experiment_config,

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in _intercept_create_request(self, request, create, func_name) 4201 """ 4202 region = self.boto_session.region_name -> 4203 sts_client = self.boto_session.client( 4204 "sts", region_name=region, endpoint_url=sts_regional_endpoint(region) 4205 )

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in submit(request) 589 enable_network_isolation=enable_network_isolation, 590 image_uri=image_uri, --> 591 algorithm_arn=algorithm_arn, 592 encrypt_inter_container_traffic=encrypt_inter_container_traffic, 593 use_spot_instances=use_spot_instances,

/opt/conda/lib/python3.7/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw) 236 check_circular=check_circular, allow_nan=allow_nan, indent=indent, 237 separators=separators, default=default, sort_keys=sort_keys, --> 238 **kw).encode(obj) 239 240

/opt/conda/lib/python3.7/json/encoder.py in encode(self, o) 199 chunks = self.iterencode(o, _one_shot=True) 200 if not isinstance(chunks, (list, tuple)): --> 201 chunks = list(chunks) 202 return ''.join(chunks) 203

/opt/conda/lib/python3.7/json/encoder.py in _iterencode(o, _current_indent_level) 429 yield from _iterencode_list(o, _current_indent_level) 430 elif isinstance(o, dict): --> 431 yield from _iterencode_dict(o, _current_indent_level) 432 else: 433 if markers is not None:

/opt/conda/lib/python3.7/json/encoder.py in _iterencode_dict(dct, _current_indent_level) 403 else: 404 chunks = _iterencode(value, _current_indent_level) --> 405 yield from chunks 406 if newline_indent is not None: 407 _current_indent_level -= 1

/opt/conda/lib/python3.7/json/encoder.py in _iterencode_dict(dct, _current_indent_level) 403 else: 404 chunks = _iterencode(value, _current_indent_level) --> 405 yield from chunks 406 if newline_indent is not None: 407 _current_indent_level -= 1

/opt/conda/lib/python3.7/json/encoder.py in _iterencode(o, _current_indent_level) 436 raise ValueError("Circular reference detected") 437 markers[markerid] = o --> 438 o = _default(o) 439 yield from _iterencode(o, _current_indent_level) 440 if markers is not None:

/opt/conda/lib/python3.7/json/encoder.py in default(self, o) 177 178 """ --> 179 raise TypeError(f'Object of type {o.__class__.__name__} ' 180 f'is not JSON serializable') 181

TypeError: Object of type Join is not JSON serializable

I have tried changing every argument I pass to the estimator, enabling and disabling each in turn. In case the line marked --> 594 checkpoint_s3_uri=checkpoint_s3_uri, is the origin, I have also tried setting checkpoint_s3_uri explicitly.
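Rather than toggling arguments one at a time, one way to narrow down the origin is to walk a nested dict and report every value that json.dumps rejects. The following is a minimal stdlib-only sketch: find_unserializable and Placeholder are hypothetical names introduced here, and the request dict is illustrative, not the request the SDK actually builds.

```python
import json


def find_unserializable(obj, path="request"):
    """Return (path, type name) pairs for values json.dumps cannot encode."""
    if isinstance(obj, dict):
        hits = []
        for key, value in obj.items():
            hits.extend(find_unserializable(value, f"{path}[{key!r}]"))
        return hits
    if isinstance(obj, (list, tuple)):
        hits = []
        for index, value in enumerate(obj):
            hits.extend(find_unserializable(value, f"{path}[{index}]"))
        return hits
    try:
        json.dumps(obj)  # leaf value: let the encoder decide
    except TypeError:
        return [(path, type(obj).__name__)]
    return []


class Placeholder:
    """Hypothetical stand-in for any object the JSON encoder does not know."""


# Illustrative request shape (an assumption, not captured from the SDK).
request = {
    "InputDataConfig": [{"S3Uri": Placeholder()}],
    "HyperParameters": {"epochs": "10"},
}
hits = find_unserializable(request)
print(hits)
```

Applied to the dicts passed to the estimator (for example the inputs given to fit() or the hyperparameters), a helper like this would point at the specific key holding a non-string object instead of leaving it to guesswork.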

No idea where I'm messing up. I'm using

sagemaker 2.94.0
Python3 Data Science kernel
boto3 '1.24.8'

Upvotes: 3

Views: 1459

Answers (2)

akshat garg

Reputation: 194

Functions like Join are part of SageMaker workflows (pipelines) and only work within them.

from sagemaker.workflow.functions import Join

In SageMaker, you differentiate between a step and a job by the session you define: with a regular SageMaker session it's a job, and with a pipeline session it's a pipeline step. You cannot use Join in the inputs to SageMaker jobs, so if you are using it in this training job, you need to remove it. Avoid using Join inside Join as well. Also, since it's a training job, the inputs should be passed as TrainingInput; I hope you are doing that as well.
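To illustrate where the error wording comes from, here is a stdlib-only sketch. The Join class below is a hypothetical stand-in written for this example, not the SDK's sagemaker.workflow.functions.Join; it only shows that the JSON encoder names the type of whatever object it cannot encode.

```python
import json


class Join:
    """Hypothetical stand-in for a pipeline placeholder; NOT the SDK class."""

    def __init__(self, on, values):
        self.on = on
        self.values = values


def try_serialize(request):
    """Return the JSON text, or the TypeError message if encoding fails."""
    try:
        return json.dumps(request)
    except TypeError as exc:
        return str(exc)


# A plain string URI encodes without trouble.
ok = try_serialize({"S3Uri": "s3://bucketname/image_data/train/"})

# A placeholder object in the same position makes the encoder give up.
bad = try_serialize({"S3Uri": Join(on="/", values=["s3://bucketname", "train"])})
print(bad)  # Object of type Join is not JSON serializable
```

A job request has to be concrete JSON at submission time, so any value that is still an unresolved placeholder object triggers this failure, whichever argument it arrived through.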

Upvotes: 0

maslick

Reputation: 3370

Perhaps try using a PipelineSession instead of a normal Session object:

from sagemaker.tensorflow import TensorFlow
from sagemaker.workflow.pipeline_context import PipelineSession

tf_estimator = TensorFlow(entry_point="train.py",
                          sagemaker_session=PipelineSession(),
                          # ... remaining arguments as in the question
                         )

https://github.com/aws/sagemaker-python-sdk/issues/3860

Upvotes: 1
