Reputation: 4181
To free up memory, I was wondering if it is possible to delete intermediate tensors in the forward method of my model. Here's a minimal example scenario:
def forward(self, input):
    x1, x2 = input
    # each input goes through its own block of layers (placeholders)
    x1 = self.block1(x1)
    x2 = self.block2(x2)
    # concatenate the results along the channel dimension
    x_conc = torch.cat((x1, x2), dim=1)
    # further layers operate on the concatenated tensor
    x_conc = self.block3(x_conc)
    return x_conc
Basically, the model passes two tensors through two separate blocks and then concatenates the results; further operations are applied to the concatenated tensor. Will it affect the computation graph if I run del x1 and del x2 after creating x_conc?
Upvotes: 0
Views: 934
Reputation: 37741
PyTorch keeps references to x1 and x2 in the computation graph if they are needed for automatic differentiation later on. Deleting them with del only removes the Python names; the graph holds its own references, so it is unaffected and backward() still works. However, for the same reason, the underlying storage is not freed as long as the graph holds it.
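For illustration, here is a minimal sketch (assuming a CUDA device is available; exp is just one example of an op that saves its result for the backward pass):

import torch

a = torch.randn(1024, 1024, device="cuda", requires_grad=True)
x1 = a.exp()        # exp saves its result in the graph for backward
out = x1.sum()

before = torch.cuda.memory_allocated()
del x1              # removes only the Python name
print(torch.cuda.memory_allocated() == before)  # True: the graph still holds x1's storage
out.backward()      # still works; deleting x1 did not touch the graph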
Also, note that deleting tensors with the del statement does work, but you won't see a decrease in GPU memory as reported by the device (e.g. in nvidia-smi). Why? Because PyTorch's caching allocator keeps the freed memory in its own pool instead of returning it to the device. This is an optimization technique, and from the user's perspective the memory has been "freed": it is available for allocating new tensors.
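You can watch this directly with PyTorch's memory introspection functions; a minimal sketch for a CUDA device:

import torch

x = torch.randn(1024, 1024, device="cuda")  # plain tensor, no graph involved
print(torch.cuda.memory_allocated())  # bytes occupied by live tensors
print(torch.cuda.memory_reserved())   # bytes the caching allocator holds from the device

del x
print(torch.cuda.memory_allocated())  # drops: the tensor's storage is released
print(torch.cuda.memory_reserved())   # unchanged: kept in the pool, so nvidia-smi looks the same

torch.cuda.empty_cache() does hand the cached blocks back to the device, but that only changes what nvidia-smi reports; it does not increase the memory available for your own tensors.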
Hence, deleting tensors is not a recommended way to free up GPU memory.
Upvotes: 3