dablyo

Reputation: 39

about torch.nn.CrossEntropyLoss parameter shape

I'm learning PyTorch and, as an exercise, porting the ANPR project, which is based on TensorFlow (https://github.com/matthewearl/deep-anpr, http://matthewearl.github.io/2016/05/06/cnn-anpr/), to the PyTorch platform.

There is a problem. I'm using nn.CrossEntropyLoss() as the loss function:

criterion=nn.CrossEntropyLoss()

The output.data of the model is:

 1.00000e-02 *
  2.5552  2.7582  2.5368  ...  5.6184  1.2288 -0.0076
  0.7033  1.3167 -1.0966  ...  4.7249  1.3217  1.8367
  0.7592  1.4777  1.8095  ...  0.8733  1.2417  1.1521
  0.1040 -0.7054 -3.4862  ...  4.7703  2.9595  1.4263
 [torch.FloatTensor of size 4x253]

and targets.data is:

 1  0  0  ...  0  0  0
 1  0  0  ...  0  0  0
 1  0  0  ...  0  0  0
 1  0  0  ...  0  0  0
 [torch.DoubleTensor of size 4x253]

When I call:

loss=criterion(output,targets)

an error occurs:

TypeError: FloatClassNLLCriterion_updateOutput received an invalid combination of arguments - got (int, torch.FloatTensor, **torch.DoubleTensor**, torch.FloatTensor, bool, NoneType, torch.FloatTensor), but expected (int state, torch.FloatTensor input, **torch.LongTensor** target, torch.FloatTensor output, bool sizeAverage, [torch.FloatTensor weights or None], torch.FloatTensor total_weight)

'expected torch.LongTensor' ... 'got torch.DoubleTensor'. But if I convert the targets into a LongTensor:

torch.LongTensor(numpy.array(targets.data.numpy(), numpy.long))

and then call loss=criterion(output,targets), the error is:

RuntimeError: multi-target not supported at /data/users/soumith/miniconda2/conda-bld/pytorch-0.1.10_1488752595704/work/torch/lib/THNN/generic/ClassNLLCriterion.c:20

My previous exercise was MNIST, an example from PyTorch. I made a small modification (batch_size is 4); the loss function is:

loss = F.nll_loss(outputs, labels)

outputs.data:

 -2.3220 -2.1229 -2.3395 -2.3391 -2.5270 -2.3269 -2.1055 -2.2321 -2.4943 -2.2996
 -2.3653 -2.2034 -2.4437 -2.2708 -2.5114 -2.3286 -2.1921 -2.1771 -2.3343 -2.2533
 -2.2809 -2.2119 -2.3872 -2.2190 -2.4610 -2.2946 -2.2053 -2.3192 -2.3674 -2.3100
 -2.3715 -2.1455 -2.4199 -2.4177 -2.4565 -2.2812 -2.2467 -2.1144 -2.3321 -2.3009
 [torch.FloatTensor of size 4x10]

labels.data:

 8
 6
 0
 1
 [torch.LongTensor of size 4]

So the label for an input image must be a single element. In the example above there are 253 numbers per target, while in 'mnist' there is only one number; the shape of outputs differs from the shape of labels.

I reviewed the TensorFlow manual for tf.nn.softmax_cross_entropy_with_logits: 'Logits and labels must have the same shape [batch_size, num_classes] and the same dtype (either float32 or float64).'

Does PyTorch provide the same kind of function as TensorFlow?
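
For concreteness, here is a minimal sketch of the two target shapes I mean (the tensors are random placeholders, not the real model output):

import torch
import torch.nn as nn
from torch.autograd import Variable

criterion = nn.CrossEntropyLoss()
output = Variable(torch.randn(4, 253))                 # [batch_size, num_classes] scores

# mnist-style target: one class index per sample, shape [batch_size]
targets_idx = Variable(torch.LongTensor([0, 0, 0, 0]))
loss = criterion(output, targets_idx)                  # this form is accepted

# anpr-style target: one-hot rows of shape [batch_size, num_classes] (4x253),
# which is what raises "multi-target not supported"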

Many thanks.

Upvotes: 2

Views: 6647

Answers (2)

Oliver

Reputation: 150

CrossEntropyLoss is the PyTorch counterpart of tf.nn.softmax_cross_entropy_with_logits, but the target it expects is a vector of class indices of shape [batch_size], not a one-hot matrix. Use .view() to change the tensor shapes.

labels = labels.view(-1)
output = output.view(labels.size(0), -1)
loss = criterion(output, labels)

Calling .view(x, y, -1) makes the tensor use the remaining elements to fill the -1 dimension, and it raises an error if the element count does not divide evenly into the given dimensions.

labels.size(0) gives the size of the 0th dimension of the label tensor
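
A small sketch of what these calls do (the tensor here is just a zero-filled stand-in with the 4x253 shape from the question):

import torch

labels = torch.LongTensor(4, 253).zero_()   # stand-in tensor, shape 4x253
flat = labels.view(-1)                      # 1-D view with 4*253 = 1012 elements
print(flat.size())                          # torch.Size([1012])
print(labels.size(0))                       # 4, the size of the 0th (batch) dimension
# labels.view(5, -1) would raise an error: 1012 elements cannot be split into 5 equal rows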

Additional

To convert between tensor types you can call the corresponding type-conversion method on the tensor, for example `labels = labels.long()`.
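
A quick sketch of those conversion calls (the values are arbitrary):

import torch

t = torch.DoubleTensor([1, 0, 0])
t_long = t.long()                  # torch.LongTensor
t_float = t.float()                # torch.FloatTensor
t_same = t.type(torch.LongTensor)  # equivalent, via .type()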

Second Additional

If you unpack the data from a Variable, like output.data, you lose the gradient history for that output and will be unable to backprop when the time comes.
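
A rough sketch of the difference (names and shapes are made up for illustration):

import torch
from torch.autograd import Variable

output = Variable(torch.randn(4, 253), requires_grad=True)

loss_ok = output.sum()     # built from the Variable, so gradients can flow
loss_ok.backward()         # populates output.grad

detached = output.data     # plain FloatTensor, cut off from the autograd graph
# anything computed from `detached` cannot backprop into `output`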

Upvotes: 1

Roger Trullo

Reputation: 1584

You can convert the one-hot targets that you have into a categorical (class-index) representation. In the example that you provide, you would have 1 0 0 0 ... 0 if the class is 0, 0 1 0 0 ... if the class is 1, 0 0 1 0 0 0 ... if the class is 2, etc. One quick way I can think of is to first convert the target Tensor to a numpy array, then convert it from one-hot to a categorical array, and convert it back to a PyTorch Tensor. Something like this:

import numpy as np

targetnp = targets.numpy()            # if targets is a Variable, use targets.data.numpy()
idxs = np.where(targetnp > 0)[1]      # column index of the 1 in each one-hot row = class index
new_targets = torch.LongTensor(idxs)  # class indices, shape [batch_size]
loss = criterion(output, new_targets)
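
An alternative that stays inside torch (a sketch, assuming targets is the 4x253 one-hot tensor from the question): the index of the maximum entry in each row is the class index.

_, new_targets = targets.max(1)      # position of the 1 in each one-hot row
new_targets = new_targets.view(-1)   # flatten to shape [batch_size]
loss = criterion(output, new_targets)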

Upvotes: 4
