Reputation: 560
I have a Glue job that outputs a .out file into S3. The format of this file is fine for training a TensorFlow model on SageMaker (using script mode), but I am struggling to parse this data when running a batch transform.
I'm using the input_handler and output_handler functions per the recommended inference.py scripting approach, but I'm not sure whether I should treat the .out file as application/json, text/csv, or something else entirely.
Example of the inference.py file: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker_batch_transform/tensorflow_cifar-10_with_inference_script/code/inference.py
Upvotes: 0
Views: 684
Reputation: 558
What the input_handler should do depends on the data format of the .out file.
Batch Transform takes the data in that .out file, puts it into the request payload of an HTTP request, and sends that request to the input_handler. For example, if your .out file is line-separated JSON, your input_handler should read the data from the request just like it would read the same data from a file.
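As a minimal sketch, assuming the .out file holds one JSON record per line (adjust the parsing if your file uses a different layout), the handler could look like this:

```python
import json

def input_handler(data, context):
    """Parse a line-separated JSON payload into a TF Serving predict request."""
    # data is a stream containing the raw request body
    lines = data.read().decode("utf-8").strip().split("\n")
    instances = [json.loads(line) for line in lines]
    # TF Serving's REST API expects {"instances": [...]}
    return json.dumps({"instances": instances})
```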
Batch can also split the data on a delimiter (for example, newlines) and send individual records or mini-batches to the model server, in which case your input_handler would handle those individual chunks or records; see the sketch below.
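Splitting is configured on the transform job rather than in the handler. Here is a hedged sketch using the SageMaker Python SDK's `split_type` parameter; the S3 paths, role, and framework version are placeholders for your own values:

```python
from sagemaker.tensorflow import TensorFlowModel

# model_data and role are placeholders -- substitute your artifact and IAM role
model = TensorFlowModel(
    model_data="s3://my-bucket/model.tar.gz",
    role="my-sagemaker-role",
    framework_version="2.8",
    entry_point="inference.py",
)

transformer = model.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# split_type="Line" tells Batch to break the .out file on newlines and send
# the resulting records to the container in smaller payloads
transformer.transform(
    data="s3://my-bucket/glue-output/data.out",
    content_type="application/json",
    split_type="Line",
)
```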
If you know the data format of your .out file, you can ignore the content type in the handler. The content type is a string that Batch Transform adds to requests so the model server can switch what it does based on the data format, but the meaning of that string (whether it's "application/json" or "application/foo") doesn't change the behavior of Batch or the model server.
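To illustrate that the string is just a label the handler may key on, here is a sketch of an input_handler that branches on `context.request_content_type`; the CSV branch and its record layout are assumptions for illustration:

```python
import json

def input_handler(data, context):
    """Switch parsing on the content type string set on the transform job."""
    body = data.read().decode("utf-8")
    if context.request_content_type == "text/csv":
        # hypothetical CSV layout: one record per line, comma-separated floats
        instances = [
            [float(v) for v in line.split(",")]
            for line in body.strip().split("\n")
        ]
    else:
        # treat anything else as line-separated JSON
        instances = [json.loads(line) for line in body.strip().split("\n")]
    return json.dumps({"instances": instances})
```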
Upvotes: 1