Reputation: 16607
When the batch size is 1 or 2 and we have 8 GPUs, how does torch.distributed.launch assign data to each GPU? I converted my model to torch.nn.parallel.DistributedDataParallel:
model = DistributedDataParallel(
    model,
    device_ids=[args.local_rank],
    output_device=args.local_rank,
    find_unused_parameters=True,
)
but the documentation states that DistributedDataParallel:
parallelizes the application of the given module by splitting the input across the specified devices by chunking in the batch dimension.
My question is: when the batch size is smaller than the number of GPUs, how does it deal with that?
Upvotes: 1
Views: 1240
Reputation: 4826
It doesn't split the batch at all. Unlike DataParallel, the batch size you set with DistributedDataParallel is per GPU. When you have 8 GPUs with a batch size of 1, you have an effective batch size of 8.
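As a minimal sketch (assuming the usual single-node torch.distributed.launch / torchrun setup with one process per GPU; the toy Linear model and TensorDataset here are just placeholders), each process builds its own DataLoader over its own shard of the data, so batch_size=1 means 1 sample per GPU per step and 8 samples per optimizer step in total:

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Launched as e.g.:
#   python -m torch.distributed.launch --nproc_per_node=8 train.py
# (or torchrun --nproc_per_node=8 train.py); each process drives one GPU.

def main():
    dist.init_process_group(backend="nccl")
    # Single-node assumption: derive the local GPU index from the global rank.
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)

    # Toy dataset and model purely for illustration.
    dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
    model = torch.nn.Linear(10, 1).cuda(local_rank)
    model = DistributedDataParallel(model, device_ids=[local_rank],
                                    output_device=local_rank)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # The sampler gives each rank a disjoint shard of the dataset;
    # batch_size=1 is per process, so 8 GPUs -> effective batch of 8.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=1, sampler=sampler)

    for x, y in loader:
        optimizer.zero_grad()
        out = model(x.cuda(local_rank))   # forward on this rank's own sample
        loss = torch.nn.functional.mse_loss(out, y.cuda(local_rank))
        loss.backward()                   # gradients are all-reduced across the 8 ranks
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

DDP never splits a single batch across devices (in the standard one-GPU-per-process setup); if you truly want a total batch of 1 across all GPUs, you would have to launch fewer processes.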
Upvotes: 1