small_angel

Reputation: 81

Pytorch device and .to(device) method

I'm trying to learn RNN and Pytorch.

I saw some code for an RNN where, in the forward propagation method, they did a check like this:

def forward(self, inputs, hidden):
    if inputs.is_cuda:
        device = inputs.get_device()
    else:
        device = torch.device("cpu")
    embed_out = self.embeddings(inputs)
    logits = torch.zeros(self.seq_len, self.batch_size, self.vocab_size).to(device)

I think the point of the check is to see whether we can run the code on the faster GPU instead of the CPU? To understand the code a bit more, I tried the following:

ex = torch.zeros(3, 10, 5)
ex1 = torch.tensor(np.array([[0,0,0,1,0], [1,0,0,0,0], [0,1,0,0,0]]))

print(ex)
print("device is")
print(ex1.get_device())
print(ex.to(ex1.get_device()))

And the output was:

        ...
        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]])
device is
-1
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-b09342e2ba0f> in <module>()
     67 print("device is")
     68 print(ex1.get_device())
---> 69 print(ex.to(ex1.get_device()))

RuntimeError: Device index must not be negative

I don't understand what "device" is in this code, or what the .to(device) method does. Can you help me understand it?

Upvotes: 4

Views: 8825

Answers (1)

duburcqa

Reputation: 1131

This code is deprecated. Just do:

def forward(self, inputs, hidden):
    embed_out = self.embeddings(inputs)
    logits = torch.zeros((self.seq_len, self.batch_size, self.vocab_size), device=inputs.device)

Note that to(device) is cost-free if the tensor is already on the requested device. Also, do not use get_device(); use the device attribute instead. It works fine with both CPU and GPU out of the box (get_device() returns -1 for CPU tensors, which is why your experiment raised a RuntimeError).
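A minimal sketch of the difference (CPU-only, so it runs anywhere):

```python
import torch

x = torch.zeros(3, 5)     # created on the CPU by default
print(x.device)           # device(type='cpu')
print(x.get_device())     # -1 for CPU tensors, and passing -1 to .to() raises

# .to(x.device) is a no-op here, since y is already on the same device.
y = torch.ones(3, 5).to(x.device)
print(y.device)           # device(type='cpu')
```

On a machine with a GPU, `x.device` would report `cuda:0` for a CUDA tensor, and the same `.to(x.device)` call would move `y` there.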

Also, note that torch.tensor(np.array(...)) is bad practice for several reasons. First, to convert a NumPy array to a torch tensor, use either as_tensor or from_numpy. Then, you will get a tensor with the default NumPy dtype instead of the torch default. In this case they happen to match (int64), but for floats they would differ. Finally, torch.tensor can be initialized from a list, just like a NumPy array, so you can drop NumPy entirely and call torch directly.
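The dtype pitfall can be seen in a short comparison (the float case, where NumPy's default float64 differs from torch's default float32):

```python
import numpy as np
import torch

a = np.array([0.0, 1.0])

t1 = torch.tensor(a)            # copies the data; inherits NumPy's dtype
print(t1.dtype)                 # torch.float64

t2 = torch.from_numpy(a)        # shares memory with the NumPy array, no copy
print(t2.dtype)                 # torch.float64

t3 = torch.tensor([0.0, 1.0])   # plain list: torch's default dtype
print(t3.dtype)                 # torch.float32
```

So building the tensor from a list (or converting with an explicit dtype) avoids silently carrying float64 through a model that expects float32.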

Upvotes: 2
