Reputation: 1575
Goal: Train a model in a Distributed Data Parallel (DDP) setting using the PyTorch Lightning framework
Questions:
Training Data Partition: How is partitioning of the data across separate GPUs handled by PyTorch Lightning? Am I supposed to partition the data manually, or will PyTorch Lightning take care of that?
Loss Averaging: Do I have to aggregate the losses across GPUs myself, or will PyTorch Lightning do that automatically?
I have been going through the PyTorch Lightning code base looking for where the DDP synchronization is handled, but I am unable to find the exact code. I would appreciate a clarification on this. For reference, a minimal sketch of the kind of setup I mean is below.
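This is just an illustrative sketch, not my real code: the model, dataset, batch size, and device count are placeholders, and it assumes the newer `Trainer(accelerator=..., devices=..., strategy="ddp")` arguments. The two comments mark exactly where my two questions arise.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.layer(x), y)
        # Question 2: is this per-GPU loss averaged across processes
        # automatically, or do I need to all-reduce it myself
        # (e.g. via sync_dist=True or a manual all_reduce)?
        self.log("train_loss", loss, sync_dist=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# Question 1: this is a plain DataLoader with no DistributedSampler --
# does Lightning shard/partition this dataset across GPUs for me
# when strategy="ddp", or must I add the sampler myself?
dataset = TensorDataset(torch.randn(1024, 32), torch.randn(1024, 1))
train_loader = DataLoader(dataset, batch_size=64, shuffle=True)

trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp", max_epochs=1)
trainer.fit(LitModel(), train_loader)
```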
Upvotes: 1
Views: 705