Nin

Reputation: 93

Placeholder storage has not been allocated on MPS device

I understand that I need to move both the input tensors and the model parameters to the mps device for PyTorch to use my Mac M1 GPU for training. I did exactly that, and I still get this error message:

Placeholder storage has not been allocated on MPS device! Here is a snippet of my code:

if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    print("MPS device not found")


model = LSTMModel(input_size=197,
                  hidden_size=HIDDEN_UNITS,
                  output_size=1,
                  layer_size=2)

# Transfer the model to the GPU
model = model.to(mps_device)


# iterate over the training data
for i in range(100000):
    # send the inputs/labels to the GPU
    next_batch = next(train_generate)
    inputs = torch.from_numpy(next_batch[0]).float().to(mps_device)
    labels = torch.from_numpy(next_batch[1][-1]).float().to(mps_device)

    with torch.set_grad_enabled(True):
        outputs = model(inputs)
        loss = loss_function(outputs, labels)

        # backward
        loss.backward()
        optimizer.step()

I followed the documentation on how to use the M1 GPU with PyTorch, but it didn't work.
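As an extra sanity check, printing the devices directly confirms whether the model and the batches really end up on mps:

# Sanity check: both the parameters and the current batch should report mps:0
print(next(model.parameters()).device)
print(inputs.device, labels.device)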

Upvotes: 4

Views: 6283

Answers (2)

Virendra

Reputation: 305

Using no_cuda=True in TrainingArguments solved it for me on an M1 Mac. Like so:

training_args = TrainingArguments(
        ...,
        no_cuda=True
    )
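As far as I can tell, this works because no_cuda=True makes the Hugging Face Trainer fall back to the CPU, so the model and the batches end up on the same device and the placeholder error goes away, at the cost of not using the M1 GPU at all.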

Upvotes: 7

Nin

Reputation: 93

I have figured out the problem. The model creates tensors inside forward() (the initial hidden and cell states) that are not registered parameters (weights), so model.to(mps_device) does not move them. I have to put those tensors on mps_device explicitly; calling model.to(mps_device) alone is not sufficient.

def forward(self, x):
    # The initial hidden state and cell state are created inside forward(),
    # so they must be placed on the MPS device explicitly
    c_0 = torch.zeros(self.layer_size, BATCH_SIZE, self.hidden_size).requires_grad_().to(mps_device)
    h_0 = torch.zeros(self.layer_size, BATCH_SIZE, self.hidden_size).requires_grad_().to(mps_device)

    # Iterate over all sequence elements across all sequences of the mini-batch
    out, (h_t, c_t) = self.lstm(x, (h_0.detach(), c_0.detach()))

    # Final output layer
    return self.sig(self.fc(out[-1]))
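Side note: the hard-coded mps_device can also be avoided by creating the initial states on whatever device (and dtype) the input already lives on. A minimal sketch, assuming the same LSTMModel attributes as above (layer_size, hidden_size, lstm, fc, sig) and the default batch_first=False layout implied by out[-1]:

def forward(self, x):
    # Create the initial hidden/cell states on the same device and dtype as
    # the input, so the model runs unchanged on CPU, CUDA, or MPS
    batch_size = x.size(1)  # dim 1 is the batch with batch_first=False
    h_0 = torch.zeros(self.layer_size, batch_size, self.hidden_size,
                      device=x.device, dtype=x.dtype)
    c_0 = torch.zeros(self.layer_size, batch_size, self.hidden_size,
                      device=x.device, dtype=x.dtype)

    out, (h_t, c_t) = self.lstm(x, (h_0, c_0))

    # Final output layer
    return self.sig(self.fc(out[-1]))

That way the forward pass follows the model wherever .to() puts it, instead of depending on a global mps_device.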

Upvotes: 3
