mundus131

Reputation: 21

Delete model from GPU/CPU in Pytorch

I have a big issue with memory. I am developing a large GUI application for testing and optimizing neural networks. The main program shows the GUI, and training runs in a separate thread. In my app I need to train many models with different parameters, one after another. To do this I create a new model for each attempt. When one model has finished training, I want to delete it and train a new one, but I cannot free the old model. I am trying to do something like this:

del model
torch.cuda.empty_cache()

but the GPU memory doesn't change.

Then I tried this:

model.cpu()
del model

When I move the model to the CPU, the GPU memory is freed, but the CPU memory increases. With every training attempt, memory usage keeps growing. Only when I close my app and run it again is all the memory freed.

Is there a way to delete the model permanently from the GPU or CPU?

Edit: Code:

The thread where the training process takes place:

class uczeniegridsearcch(QObject):
     endofoneloop = pyqtSignal()
     endofonesample = pyqtSignal()
     finished = pyqtSignal()
     def __init__(self, train_loader, test_loader, epoch, optimizer, lenoftd, lossfun, numberofsamples, optimparams, listoflabels, model_name, num_of_class, pret):
          super(uczeniegridsearcch, self).__init__()
          self.train_loaderup = train_loader
          self.test_loaderup = test_loader
          self.epochup = epoch
          self.optimizername = optimizer
          self.lenofdt = lenoftd
          self.lossfun = lossfun
          self.numberofsamples = numberofsamples
          self.acc = 0
          self.train_loss = 0
          self.sendloss = 0
          self.optimparams = optimparams
          self.listoflabels = listoflabels
          self.sel_Net = model_name
          self.num_of_class = num_of_class
          self.sel_Pret = pret
          self.modelforsend = []
          

     def setuptrainmodel(self):

          if self.sel_Net == "AlexNet":
               model = models.alexnet(pretrained=self.sel_Pret)
               model.classifier[6] = torch.nn.Linear(4096, self.num_of_class)
          elif self.sel_Net == "ResNet50":
               model = models.resnet50(pretrained=self.sel_Pret)
               model.fc = torch.nn.Linear(model.fc.in_features, self.num_of_class)
          elif self.sel_Net == "VGG13":
               model = models.vgg13(pretrained=self.sel_Pret)
               model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, self.num_of_class)
          elif self.sel_Net == "DenseNet201":
               model = models.densenet201(pretrained=self.sel_Pret)
               model.classifier = torch.nn.Linear(model.classifier.in_features, self.num_of_class)

          elif self.sel_Net == "MNASnet":
               model = models.mnasnet1_0(pretrained=self.sel_Pret)
               model.classifier[1] = torch.nn.Linear(model.classifier[1].in_features, self.num_of_class)

          elif self.sel_Net == "ShuffleNet v2":
               model = models.shufflenet_v2_x1_0(pretrained=self.sel_Pret)
               model.fc = torch.nn.Linear(model.fc.in_features, self.num_of_class)

          elif self.sel_Net == "SqueezeNet":
               model = models.squeezenet1_0(pretrained=self.sel_Pret)
               model.classifier[1] = torch.nn.Conv2d(512, self.num_of_class, kernel_size=(1, 1), stride=(1, 1))
               model.num_classes = self.num_of_class

          elif self.sel_Net == "GoogleNet":
               model = models.googlenet(pretrained=self.sel_Pret)
               model.fc = torch.nn.Linear(model.fc.in_features, self.num_of_class)

          return model

     def train(self):

          for x in range(self.numberofsamples):

               torch.cuda.empty_cache()

               modelup = self.setuptrainmodel()

               device = torch.device('cuda')

               optimizerup = TableWidget.setupotimfun(self, modelup, self.optimizername, self.optimparams[(x, 0)],
                                                      self.optimparams[(x, 1)], self.optimparams[(x, 2)],
                                                      self.optimparams[(x, 3)],
                                                      self.optimparams[(x, 4)], self.optimparams[(x, 5)])

               modelup = modelup.to(device)

               best_accuracy = 0.0

               train_error_count = 0

               for epoch in range(self.epochup):

                    for images, labels in iter(self.train_loaderup):
                         images = images.to(device)
                         labels = labels.to(device)
                         optimizerup.zero_grad()
                         outputs = modelup(images)
                         loss = TableWidget.setuplossfun(self, lossfun=self.lossfun, outputs=outputs, labels=labels)
                         self.train_loss += loss
                         loss.backward()
                         optimizerup.step()
                         train_error_count += float(torch.sum(torch.abs(labels - outputs.argmax(1))))
                    self.train_loss /= len(self.train_loaderup)

                    test_error_count = 0.0

                    for images, labels in iter(self.test_loaderup):
                         images = images.to(device)
                         labels = labels.to(device)
                         outputs = modelup(images)
                         test_error_count += float(torch.sum(torch.abs(labels - outputs.argmax(1))))

                    test_accuracy = 1.0 - float(test_error_count) / float(self.lenofdt)

                    print('%s, %d,%d: %f %f' % ("Attempt no:", x+1, epoch, test_accuracy, self.train_loss), "Parameters: ", self.optimparams[x,:])

                    self.acc = test_accuracy
                    self.sendloss = self.train_loss.item()
                    self.endofoneloop.emit()

               self.endofonesample.emit()

               modelup.cpu()

               del modelup, optimizerup, device, test_accuracy, test_error_count, train_error_count, loss, labels, images, outputs
               torch.cuda.empty_cache()

          self.finished.emit()

How I start the thread in the main block:

              self.qtest = uczeniegridsearcch(self.train_loader,self.test_loader, int(self.InputEpoch.text()),
                                              self.sel_Optim,len(self.test_dataset), self.sel_Loss,
                                              int(self.numberofsamples.text()), self.params, self.listoflabels,
                                              self.sel_Net,len(self.sel_ImgClasses),self.sel_Pret)

              self.qtest.endofoneloop.connect(self.inkofprogress)
              self.qtest.endofonesample.connect(self.inksamples)
              self.qtest.finished.connect(self.prints)
              testtret = threading.Thread(target=self.qtest.train)
              testtret.start()

Upvotes: 1

Views: 3100

Answers (1)

Manik Tharaka

Reputation: 308

Assuming that the model creation code is run iteratively inside a loop, I suggest the following:

  1. Put the code for model creation, training, evaluation, and model deletion inside a separate function, and call that function from the loop body, as sketched below.
  2. Call gc.collect() after the function call.

The rationale for the first point is that model creation, deletion, and cache clearing then happen in a separate stack frame, so every local reference goes out of scope when the function returns, which lets the GPU memory actually be reclaimed.
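
A minimal sketch of that structure (the tiny nn.Linear model, the random data, and the loop sizes are placeholders rather than the asker's actual setup; the point is only that everything created for one attempt lives inside the helper and is dropped when it returns):

import gc

import torch
import torch.nn as nn

def train_one_sample(num_of_class):
    # Hypothetical helper: build, train, and evaluate one model entirely
    # inside this function, so every reference (model, optimizer, tensors)
    # goes out of scope when it returns.
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = nn.Linear(128, num_of_class).to(device)   # stand-in for setuptrainmodel()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(3):                                # stand-in for the epoch loop
        x = torch.randn(32, 128, device=device)
        y = torch.randint(0, num_of_class, (32,), device=device)
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()

    # Return plain Python numbers, not tensors, so no autograd graph is kept alive.
    return loss.item()

for attempt in range(5):                              # the grid-search loop
    last_loss = train_one_sample(num_of_class=10)
    gc.collect()                                      # point 2: collect the dropped references
    torch.cuda.empty_cache()                          # give cached blocks back to the driver

Note that accumulating loss tensors across iterations (rather than numbers obtained via .item()) keeps their graphs alive, so returning or storing only plain Python values from the helper matters for the memory to be released.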

Upvotes: 0
