Kendall Reid

Reputation: 75

How to handle entry points nested in folders with the Amazon SageMaker PyTorch estimator?

I am attempting to run a training job on Amazon SageMaker using the PyTorch estimator class from the SageMaker Python SDK.

I have the following:

estimator = PyTorch(entry_point='training_scripts/train_MSCOCO.py',
                    source_dir='./',
                    role=#dummy_role,
                    train_instance_type='ml.p3.2xlarge',
                    train_instance_count=1,
                    framework_version='1.0.0',
                    output_path=#dummy_output_path,
                    hyperparameters={'lr': 0.001,
                                     'batch_size': 32,
                                     'num_workers': 4,
                                     'description': description})

The role and output_path values are hidden for privacy.

I get the following error: "No module named training_scripts\train_MSCOCO".

When I run python -m training_scripts.train_MSCOCO, the script runs fine. However, when I pass entry_point='training_scripts.train_MSCOCO.py', the job fails with: "No file named "training_scripts.train_MSCOCO.py" was found in directory "./"".

I am confused about how to run a nested training script from the top level of my repository on SageMaker. The two steps seem to have conflicting path requirements: one expects Python module dot notation, the other a standard file path with slashes.

Upvotes: 0

Views: 540

Answers (1)

Julien Simon

Reputation: 2729

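Either one of these will work: pass the nested path as entry_point and leave out source_dir, or set source_dir to the folder that contains the script and pass only the file name as entry_point: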

estimator = PyTorch(entry_point='training_scripts/train_MSCOCO.py',
                    role=#dummy_role,
                    ...

estimator = PyTorch(entry_point='train_MSCOCO.py',
                    source_dir='training_scripts',
                    role=#dummy_role,
                    ...
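
If you want to derive the second form from a nested path instead of typing both values by hand, here is a minimal sketch using only the standard library. The helper name split_entry_point is hypothetical (it is not part of the SageMaker SDK), and it assumes the path is written with forward slashes, as in the question.

```python
from pathlib import PurePosixPath


def split_entry_point(script_path):
    """Split a forward-slash script path into (source_dir, entry_point).

    Hypothetical helper for illustration; not part of the SageMaker SDK.
    """
    path = PurePosixPath(script_path)
    # parent is the containing folder ('.' if there is none),
    # name is the bare file name of the script.
    return str(path.parent), path.name


source_dir, entry_point = split_entry_point('training_scripts/train_MSCOCO.py')
print(source_dir)
print(entry_point)
```

The two values can then be passed as source_dir=source_dir and entry_point=entry_point, with the remaining estimator arguments unchanged from the question.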

Upvotes: 2
