f10w

Reputation: 1586

Running a portion of Python code in parallel on two different GPUs

I have a PyTorch script similar to the following:

# Loading data
train_loader, test_loader = someDataLoaderFunction()

# Define the architecture
model = ResNet18()
model = model.cuda()  

# Get method from program argument
method = args.method

# Training
train(method, model, train_loader, test_loader)

In order to run the script with two different methods (method1 and method2), it suffices to run the following commands in two different terminals:

CUDA_VISIBLE_DEVICES=0 python program.py --method method1
CUDA_VISIBLE_DEVICES=1 python program.py --method method2

The problem is that the data loader function above contains some randomness, which means the two methods were applied to two different sets of training data. I would like them to train on the exact same set of data, so I modified the script as follows:

# Loading data
train_loader, test_loader = someDataLoaderFunction()

# Define the architecture
model = ResNet18()
model = model.cuda()  

## Run for the first method
method = 'method1'

# Training
train(method, model, train_loader, test_loader)

## Run for the second method
method = 'method2'

# Must re-initialize the network first
model = ResNet18()
model = model.cuda()

# Training
train(method, model, train_loader, test_loader)

Is it possible to make it run in parallel for each method? Thank you so much in advance for your help!

Upvotes: 1

Views: 1563

Answers (1)

Mo Hossny

Reputation: 742

I guess the easiest way would be to fix the seeds as below.

import numpy as np
import torch

myseed = args.seed  # assumes a --seed argument parsed via argparse
np.random.seed(myseed)
torch.manual_seed(myseed)
torch.cuda.manual_seed(myseed)
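
With the seed exposed as a program argument (the code above reads args.seed, so I am assuming a --seed flag parsed via argparse), the two runs can then be launched from two terminals exactly as before, just passing the same seed to both:

CUDA_VISIBLE_DEVICES=0 python program.py --method method1 --seed 42
CUDA_VISIBLE_DEVICES=1 python program.py --method method2 --seed 42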

Fixing the seeds this way should force the data loaders to get the same samples every time. The truly parallel way is to use multiprocessing or multithreading within a single script, but I hardly see it being worth the hassle for the problem you posted.
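
If you do want a single script to drive both GPUs at once, here is a minimal sketch using torch.multiprocessing, one process per GPU. ResNet18, someDataLoaderFunction, and train are the placeholders from your question, and the seed value is arbitrary:

import numpy as np
import torch
import torch.multiprocessing as mp

def run(method, device_id, seed):
    # Seed identically in each worker so both loaders see the same data
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    # Pin this process to its own GPU
    torch.cuda.set_device(device_id)
    train_loader, test_loader = someDataLoaderFunction()
    model = ResNet18().cuda()
    train(method, model, train_loader, test_loader)

if __name__ == '__main__':
    mp.set_start_method('spawn')  # CUDA requires the spawn start method
    procs = [mp.Process(target=run, args=(m, i, 42))
             for i, m in enumerate(['method1', 'method2'])]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

One caveat: if the loaders shuffle with multiple workers, it may also help to pass a seeded torch.Generator to each DataLoader so the shuffling order matches across the two processes.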

Upvotes: 1
