How to handle entry points nested in folders with the Amazon SageMaker PyTorch estimator?

Question

I am attempting to run a training job on Amazon SageMaker using the PyTorch estimator class from the SageMaker Python SDK.

I have the following:

from sagemaker.pytorch import PyTorch

# 'dummy_role' and 'dummy_output_path' are placeholders for the real values.
estimator = PyTorch(entry_point='training_scripts/train_MSCOCO.py',
                    source_dir='./',
                    role='dummy_role',
                    train_instance_type='ml.p3.2xlarge',
                    train_instance_count=1,
                    framework_version='1.0.0',
                    output_path='dummy_output_path',
                    hyperparameters={'lr': 0.001,
                                     'batch_size': 32,
                                     'num_workers': 4,
                                     'description': description})

The role and output_path values are hidden for privacy.

I get the following error: "No module named training_scripts\train_MSCOCO".

When I run python -m training_scripts.train_MSCOCO, the script runs fine. However, when I pass entry_point='training_scripts.train_MSCOCO.py', the job does not run and I get: "No file named "training_scripts.train_MSCOCO.py" was found in directory "./"".

I am confused about how to run a nested training script from the top level of my repository in Amazon SageMaker, because the two steps seem to have conflicting path requirements: one expects Python module dot notation, the other standard file path slash notation.
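
To make the mismatch between the two notations concrete, here is a minimal sketch in plain Python (it does not use the SageMaker SDK). The helper names path_to_module and module_to_path are hypothetical and exist only for illustration; they show that the file path form and the module form of the same script differ only in the separator and the .py suffix, and that a backslash-separated path maps to the same module name.

```python
from pathlib import PurePosixPath


def path_to_module(script_path: str) -> str:
    """Convert a relative script path such as 'pkg/train.py' to 'pkg.train'."""
    # Treat Windows-style backslashes as path separators too.
    path = PurePosixPath(script_path.replace("\\", "/"))
    # Drop the '.py' suffix and join the remaining path parts with dots.
    return ".".join(path.with_suffix("").parts)


def module_to_path(module_name: str) -> str:
    """Convert a dotted module name such as 'pkg.train' to 'pkg/train.py'."""
    # Split on dots, rebuild a slash-separated path, and add the suffix back.
    return str(PurePosixPath(*module_name.split(".")).with_suffix(".py"))


# File path notation (what the file lookup expects):
print(module_to_path("training_scripts.train_MSCOCO"))
# Module notation (what `python -m` expects):
print(path_to_module("training_scripts/train_MSCOCO.py"))
```

The first call prints training_scripts/train_MSCOCO.py and the second prints training_scripts.train_MSCOCO, which are the two forms that appear in the error messages above.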
