Reputation: 669
I'm using sagemaker batch transform, with json input files. see below for sample input/output files. i have custom inference code below, and i'm using json.dumps to return prediction, but it's not returning json. I tried to use => "DataProcessing": {"JoinSource": "string", }, to match input and output. but i'm getting error that "unable to marshall ..." . I think because , the output_fn is returning array of list or just list and not json , that is why it is unable to match input with output.any suggestions on how should i return the data?
infernce code
def model_fn(model_dir):
...
def input_fn(data, content_type):
...
def predict_fn(data, model):
...
def output_fn(prediction, accept):
if accept == "application/json":
return json.dumps(prediction), mimetype=accept)
raise RuntimeException("{} accept type is not supported by this script.".format(accept))
input file
{"data" : "input line one" }
{"data" : "input line two" }
....
output file
["output line one" ]
["output line two" ]
{
"BatchStrategy": SingleRecord,
"DataProcessing": {
"JoinSource": "string",
},
"MaxConcurrentTransforms": 3,
"MaxPayloadInMB": 6,
"ModelClientConfig": {
"InvocationsMaxRetries": 1,
"InvocationsTimeoutInSeconds": 3600
},
"ModelName": "some-model",
"TransformInput": {
"ContentType": "string",
"DataSource": {
"S3DataSource": {
"S3DataType": "string",
"S3Uri": "s3://bucket-sample"
}
},
"SplitType": "Line"
},
"TransformJobName": "transform-job"
}
Upvotes: 0
Views: 1382
Reputation: 1314
json.dumps
will not convert your array to a dict structure and serialize it to a JSON String.
What data type is prediction
? Have you tested making sure prediction
is a dict?
You can confirm the data type by adding print(type(prediction))
to see the data type in the CloudWatch Logs.
If prediction is a list
you can test the following:
def output_fn(prediction, accept):
if accept == "application/json":
my_dict = {'output': prediction}
return json.dumps(my_dict), mimetype=accept)
raise RuntimeException("{} accept type is not supported by this script.".format(accept))
DataProcessing
and JoinSource
are used to associate the data that is relevant to the prediction results in the output. It is not meant to be used to match the input and output format.
Upvotes: 1