PinkBanter

Reputation: 1976

How to initialize an empty tensor with certain dimensions and append to it through a loop without CUDA out of memory?

I am trying to append tensors (t) generated in a for-loop to a list [T] that accumulates all of them. Next, the list [T] needs to be converted into a tensor and loaded onto the GPU.

        b_output = []

        for eachInputId, eachMask in zip(b_input_ids, b_input_mask):
            # unrolled into each individual document
            # print(eachInputId.size()) # individual document here

            outputs = model(eachInputId, 
                    token_type_ids=None, 
                    attention_mask=eachMask)


            # combine the [CLS] output layer to form the document
            doc_output = torch.mean(outputs[1], dim=0) # size = [1, ncol]

            b_output.append( doc_output )

        t_b_output = torch.tensor( b_output )

Another method I tried was initializing a tensor {T} with fixed dimensions and writing the tensors (t) into it from the for-loop.

        b_output = torch.zeros(batch_size, hidden_units)
        b_output = b_output.to(device) # cuda device; .to() is not in-place

        for index, (eachInputId, eachMask) in enumerate(zip(b_input_ids, b_input_mask)):
            # unrolled into each individual document
            # print(eachInputId.size()) # individual document here

            outputs = model(eachInputId, 
                    token_type_ids=None, 
                    attention_mask=eachMask)


            # combine the [CLS] output layer to form the document
            doc_output = torch.mean(outputs[1], dim=0) # size = [1, ncol]

            b_output[index] = doc_output

Either approach produces this error:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 11.17 GiB total capacity; 10.65 GiB already allocated; 2.81 MiB free; 10.86 GiB reserved in total by PyTorch)

I assume this happens because I append tensors (which are on the GPU) to a list (which is of course not on the GPU) and then try to convert the list into a tensor (that's not on the GPU).

What could be done to append those tensors to another tensor and then load that tensor onto the GPU for further processing?

I will be grateful for any hint or information.

Upvotes: 2

Views: 11844

Answers (1)

Noé Achache

Reputation: 345

Try using torch.cat instead of torch.tensor. You are currently trying to allocate memory for your new tensor while all the other tensors are still stored, which might be the cause of the out-of-memory error. Replace:

t_b_output = torch.tensor( b_output )

with:

t_b_output = torch.cat( b_output )
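A minimal, self-contained sketch of the difference (the shapes here are illustrative, not taken from the model above): `torch.cat` joins a Python list of tensors along an existing dimension without copying element by element the way `torch.tensor` does, and the result stays on whatever device the inputs are on:

```python
import torch

# Illustrative stand-ins for the per-document outputs: four tensors of
# shape [1, 8], as if hidden_units were 8 and the batch held 4 documents.
b_output = [torch.randn(1, 8) for _ in range(4)]

# torch.cat concatenates along dim 0 -> a single [4, 8] tensor.
t_b_output = torch.cat(b_output, dim=0)
print(t_b_output.shape)  # torch.Size([4, 8])
```

If each `doc_output` is 1-D (shape `[ncol]` rather than `[1, ncol]`), `torch.stack(b_output)` builds the same `[batch, ncol]` result by adding a new leading dimension. And if you do not need gradients through these outputs, detaching each one (`doc_output.detach()`) before appending keeps the autograd graph from every iteration from being held in memory, another common cause of OOM in loops like this.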

Hope this helps.

Upvotes: 2
