Francesco Pochetti

Reputation: 86

Processing job vs. training job in SageMaker

What is the difference between a processing job and a training job? I am running a training job and did not launch a processing job, so why does my account have a processing job running?

Upvotes: 1

Views: 2211

Answers (1)

Gili Nachum

Reputation: 5568

A training job focuses on the process of training a model. For example, it can use a PyTorch or TensorFlow container or one of the SageMaker built-in algorithms, and it can have a hyperparameter tuning job associated with it. It may include an MPI cluster for distributed training. Finally, it outputs a model artifact.

Processing jobs focus on pre- and post-processing, and have APIs and containers for ML data processing tools such as scikit-learn pipelines or Dask/Spark.

Why you see processing jobs: when profiling is enabled for a training job (it is enabled by default), a matching processing job is created to generate the profiler report. You can disable profiling by passing the disable_profiler=True parameter to the estimator object.
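
As a rough illustration of the disable_profiler setting mentioned above, here is a minimal sketch assuming the SageMaker Python SDK v2 and its generic Estimator class. The helper names (build_estimator_kwargs, make_estimator), the instance type, and the instance count are hypothetical choices for the example, not part of the original answer; the image URI and role are whatever you already use for your training job.

```python
from typing import Any, Dict


def build_estimator_kwargs(image_uri: str, role: str) -> Dict[str, Any]:
    """Collect Estimator arguments with the profiler switched off.

    instance_count and instance_type are placeholder assumptions;
    replace them with the values your training job actually needs.
    """
    return {
        "image_uri": image_uri,
        "role": role,
        "instance_count": 1,
        "instance_type": "ml.m5.xlarge",
        # With profiling disabled, no profiler-report processing job
        # should be launched alongside the training job.
        "disable_profiler": True,
    }


def make_estimator(image_uri: str, role: str):
    """Build a generic Estimator (assumes the SageMaker Python SDK v2 is installed)."""
    # Imported lazily so the helper above can be inspected without the SDK.
    from sagemaker.estimator import Estimator

    return Estimator(**build_estimator_kwargs(image_uri, role))
```

Calling make_estimator(image_uri, role).fit(...) would then start the training job as usual. Framework-specific estimators such as the PyTorch or TensorFlow ones should accept the same disable_profiler keyword, but check this against the SDK version you have installed.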

Generally speaking, training jobs have a richer API than processing jobs and allow more customization. The right choice depends on your specific use case.

Upvotes: 1
