Beginner

Reputation: 749

Setting the same seed for torch, random, and numpy across all modules

I am trying to set the same seed across my whole project. Below are the settings in my main file, into which all the other modules are imported:

import os
import random

import numpy as np
import torch

seed = 42
os.environ['PYTHONHASHSEED'] = str(seed)
# Torch RNG (CPU and all CUDA devices)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
# NumPy and Python RNG
np.random.seed(seed)
random.seed(seed)

My project directory looks like this:

├── Combined_Files_without_label5.csv
├── __pycache__
│   ├── dataset.cpython-37.pyc
│   ├── datasets.cpython-37.pyc
│   └── testing.cpython-37.pyc
├── datasets.py
├── import_packages
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-37.pyc
│   │   ├── dataset.cpython-37.pyc
│   │   ├── dataset_class.cpython-37.pyc
│   │   ├── dataset_partition.cpython-37.pyc
│   │   └── visualising.cpython-37.pyc
│   ├── dataset_class.py
│   ├── dataset_partition.py
│   └── visualising.py
├── main.py

The problem is that I import a function from dataset_partition.py, and that function needs a seed value, e.g.:

    df_train, df_temp, y_train, y_temp = train_test_split(X,
                                                      y,
                                                      stratify=y,
                                                      test_size=(1.0 - frac_train), # noqa
                                                      random_state=seed) 

My questions are:
1) If I simply remove the random_state parameter from the call above, will it pick up the seed set in my main file? If not, how should I set it?
2) Will the other functions that require a seed, such as torch.manual_seed(seed) and torch.cuda.manual_seed(seed), behave the same way? If not, how do I resolve it?

Upvotes: 0

Views: 8476

Answers (1)

Szymon Maszke

Reputation: 24726

1) If I just remove the random_state parameter from the above statement, will it take the seed from my main file?

Yes. As the docs say for the default value (None):

Use the global random state instance from numpy.random. Calling the function multiple times will reuse the same instance, and will produce different results.

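I assume your seeding code runs (in __init__ or in your main file) before any function from your package is called. In that case the global numpy.random instance is already seeded by np.random.seed(seed) when train_test_split draws from it, and you are fine.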
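A minimal sketch of this behaviour, assuming scikit-learn and NumPy are installed. The toy arrays X and y and the helper split_without_random_state are made up for illustration and are not the asker's data or code. It only checks that reseeding the global NumPy RNG reproduces a split made without random_state:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 20 samples, 2 balanced classes.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)


def split_without_random_state():
    # random_state is omitted (None), so scikit-learn draws from
    # NumPy's global random state instead of a private generator.
    return train_test_split(X, y, stratify=y, test_size=0.25)


np.random.seed(42)
first = split_without_random_state()
# The global state has advanced here, so this split is typically different.
second = split_without_random_state()

np.random.seed(42)
# Reseeding the global state replays the same sequence of draws.
repeat = split_without_random_state()

same_after_reseed = np.array_equal(first[0], repeat[0])
print("first split reproduced after reseeding:", same_after_reseed)
```

Note that reproducibility here depends on call order: anything else that consumes the global NumPy RNG between seeding and the split changes the result. Passing random_state=seed explicitly avoids that coupling.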

2) Will the other functions that require a seed, like torch.manual_seed and torch.cuda.manual_seed(seed), behave in the same way?

Yes, those set the global seeds for Python and PyTorch, so every module that uses the global generators after that point sees the seeded state. You are fine here as well.
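One way to keep this in a single place is a small helper that main.py calls once before using anything else. This is only a sketch: seed_everything is a hypothetical name, and the try/except guards are there just so the snippet also runs where NumPy or PyTorch is not installed.

```python
import os
import random


def seed_everything(seed: int = 42) -> None:
    """Seed the global RNGs of Python, NumPy and PyTorch (hypothetical helper)."""
    # Only affects Python processes started after this point; hash
    # randomization of the already running interpreter is fixed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        # Silently ignored when CUDA is not available.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


# Seeding twice with the same value replays the same random sequence.
seed_everything(42)
first_draw = random.random()
seed_everything(42)
second_draw = random.random()
print(first_draw == second_draw)
```

Because these are process-wide global generators, modules such as dataset_partition.py do not need to seed again, as long as they do not draw random numbers at import time before the helper has been called.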

Upvotes: 2
