Reputation: 1246
I have code written for a previous version of PyTorch and I get 2 warnings for the 3rd line of it:
import torch.nn.functional as F

def select_action(self, state):
    probabilities = F.softmax(self.model(Variable(state, volatile = True))*100) # T=100
    action = probs.multinomial(num_samples=1)
    return action.data[0,0]
UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
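Regarding the second warning, softmax needs to be told along which dimension it should normalize. A minimal sketch with a made-up 1x3 tensor of Q-values shows the difference:

import torch
import torch.nn.functional as F

q_values = torch.tensor([[1.0, 2.0, 3.0]])  # shape (1, 3): dim 0 is the batch, dim 1 the actions

# dim=1 normalizes across the actions of each row, so each row sums to 1
print(F.softmax(q_values, dim=1))  # tensor([[0.0900, 0.2447, 0.6652]])

# dim=0 would normalize across the batch instead; with batch size 1 every entry becomes 1.0
print(F.softmax(q_values, dim=0))  # tensor([[1., 1., 1.]])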
I found that:
Volatile is recommended for purely inference mode, when you’re sure you won’t be even calling .backward(). It’s more efficient than any other autograd setting - it will use the absolute minimal amount of memory to evaluate the model. volatile also determines that requires_grad is False.
Am I right that I should just remove it? And since I want to get probabilities, should I use dim=1? So the 3rd line of my code should look like:
probabilities = F.softmax(self.model(Variable(state))*100, dim=1) # T=100
state is created here:
def update(self, reward, new_signal):
    new_state = torch.Tensor(new_signal).float().unsqueeze(0)
    self.memory.push((self.last_state, new_state, torch.LongTensor([int(self.last_action)]), torch.Tensor([self.last_reward])))
    action = self.select_action(new_state)
    if len(self.memory.memory) > 100:
        batch_state, batch_next_state, batch_action, batch_reward = self.memory.sample(100)
        self.learn(batch_state, batch_next_state, batch_reward, batch_action)
    self.last_action = action
    self.last_state = new_state
    self.last_reward = reward
    self.reward_window.append(reward)
    if len(self.reward_window) > 1000:
        del self.reward_window[0]
    return action
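For reference, the unsqueeze(0) above is what gives state its leading batch dimension, which is why dim=1 is the dimension to softmax over. A small sketch with a hypothetical 5-value signal:

import torch

new_signal = [0.1, 0.2, 0.3, 0.4, 0.5]  # hypothetical sensor readings
new_state = torch.Tensor(new_signal).float().unsqueeze(0)

print(new_state.shape)  # torch.Size([1, 5]): dim 0 is the batch of size 1, dim 1 holds the features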
Upvotes: 2
Views: 918
Reputation: 421
I found the same source code, the "Self Driving Car" application, written in Python 2.7. I wasn't able to install pytorch/pytorch-cpu for Python 2.7 (CUDA driver issues...), so I had to fix the code to run under Python 3.*.
Here is what I changed to make it work (it includes the changes suggested above by other people): update the select_action and learn functions of the Dqn class like this:
def select_action(self, state):
    with torch.no_grad():
        probs = F.softmax(self.model(state) * 100, dim=1)  # T=100
        action = probs.multinomial(num_samples=1)
        return action.data[0, 0]
def learn(self, batch_state, batch_next_state, batch_reward, batch_action):
    outputs = self.model(batch_state).gather(1, batch_action.unsqueeze(1)).squeeze(1)
    next_outputs = self.model(batch_next_state).detach().max(1)[0]
    target = self.gamma * next_outputs + batch_reward
    td_loss = F.smooth_l1_loss(outputs, target)
    self.optimizer.zero_grad()
    td_loss.backward()
    self.optimizer.step()
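To make the gather and detach calls in learn concrete, here is a small sketch with made-up Q-values for a batch of two states and three actions:

import torch

q_values = torch.tensor([[1.0, 2.0, 3.0],
                         [4.0, 5.0, 6.0]])   # Q-values: 2 states x 3 actions
actions = torch.LongTensor([2, 0])           # actions actually taken in each state

# gather(1, ...) picks the Q-value of the taken action in each row
chosen = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)
print(chosen)  # tensor([3., 4.])

# max(1)[0] gives the best value per row; detach() keeps gradients from
# flowing through the target, as in learn() above
best_next = q_values.detach().max(1)[0]
print(best_next)  # tensor([3., 6.])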
Upvotes: 1
Reputation: 24701
You are right but not "fully" right.
In addition to the changes you mentioned, you should use torch.no_grad() as mentioned in the warning, like this:
def select_action(self, state):
    with torch.no_grad():
        probabilities = F.softmax(self.model(state)*100, dim=1) # T=100
        action = probabilities.multinomial(num_samples=1)
        return action.data[0,0]
This block turns off the autograd engine for the code within it (so you save memory, similarly to volatile).
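As a quick illustration (with a made-up tensor), nothing computed inside the block is recorded for backpropagation:

import torch

x = torch.ones(2, 2, requires_grad=True)

with torch.no_grad():
    y = (x * 2).sum()

print(y.requires_grad)  # False -- the operations were not recorded
# y.backward()          # would raise a RuntimeError: y does not require grad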
Also please notice that Variable is deprecated as well (check here) and state should simply be a torch.tensor created with requires_grad=True.
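A minimal sketch of the old and the current style side by side (the values are just for illustration):

import torch
from torch.autograd import Variable  # still importable, but deprecated

old = Variable(torch.Tensor([1.0, 2.0]), requires_grad=True)  # old style
new = torch.tensor([1.0, 2.0], requires_grad=True)            # current style

print(type(old), old.requires_grad)  # <class 'torch.Tensor'> True
print(type(new), new.requires_grad)  # <class 'torch.Tensor'> True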
BTW. You have probs and probabilities but I assume it's the same thing and merely a typo.
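For completeness, a rough way to exercise the corrected select_action with a stand-in model (the linear layer, its sizes, and the state below are made up for the demo):

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(5, 3)   # stand-in for self.model: 5 inputs, 3 possible actions
state = torch.rand(1, 5)  # one state with a batch dimension, like unsqueeze(0) produces

with torch.no_grad():
    probabilities = F.softmax(model(state) * 100, dim=1)  # T=100 sharpens the distribution
    action = probabilities.multinomial(num_samples=1)

print(action.data[0, 0])  # sampled action index: tensor(0), tensor(1) or tensor(2)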
Upvotes: 3