Reputation: 7638
What is the difference between torch.tensor and torch.Tensor? What was the reasoning for providing these two very similar and confusing alternatives?
Upvotes: 69
Views: 24717
Reputation: 1039
In PyTorch there are two ways to create tensors: torch.Tensor and torch.tensor. torch.Tensor creates tensors with the default data type, torch.float32.
Upvotes: 0
Reputation: 24169
In PyTorch, torch.Tensor is the main tensor class, so all tensors are just instances of torch.Tensor.
When you call torch.Tensor(), you will get an empty tensor without any data.
In contrast, torch.tensor is a function which returns a tensor. The documentation says:
torch.tensor(data, dtype=None, device=None, requires_grad=False) → Tensor
Constructs a tensor with data.
tensor_without_data = torch.Tensor()
But on the other hand:
tensor_without_data = torch.tensor()
will lead to an error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-ebc3ceaa76d2> in <module>()
----> 1 torch.tensor()
TypeError: tensor() missing 1 required positional arguments: "data"
Behaviour similar to torch.Tensor(), i.e. creating a tensor without data, can be achieved using:
torch.tensor(())
Output:
tensor([])
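To illustrate the optional arguments from the signature quoted above, here is a minimal sketch (nothing is assumed beyond the documented signature):
import torch
# dtype, device and requires_grad are the optional arguments from the signature
t = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float64, device="cpu", requires_grad=True)
print(t)  # tensor([1., 2., 3.], dtype=torch.float64, requires_grad=True)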
Upvotes: 63
Reputation: 240
In addition to the above answers, I noticed:
torch.Tensor() creates a tensor with the default data type, as defined by torch.get_default_dtype().
torch.tensor() will infer the data type from the data.
For example:
>>> torch.Tensor([1, 2, 3]).dtype
torch.float32
>>> torch.tensor([1, 2, 3]).dtype
torch.int64
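A quick sketch to make the difference visible: changing the default dtype moves torch.Tensor along with it, while torch.tensor keeps inferring from the data (note that set_default_dtype only accepts floating-point types):
import torch
torch.set_default_dtype(torch.float64)
print(torch.Tensor([1, 2, 3]).dtype)     # torch.float64 - follows the default
print(torch.tensor([1, 2, 3]).dtype)     # torch.int64   - inferred from the integers
print(torch.tensor([1., 2., 3.]).dtype)  # torch.float64 - floats use the default float dtype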
Upvotes: 17
Reputation: 46449
torch.Tensor is a favorite method for creating parameters (for instance in nn.Linear and nn._ConvNd).
Why? Because it is very fast. It is even a bit faster than torch.empty():
import torch
torch.set_default_dtype(torch.float32) # default
%timeit torch.empty(1000,1000)
%timeit torch.Tensor(1000,1000)
%timeit torch.ones(1000,1000)
%timeit torch.tensor([[1]*1000]*1000)
Out:
68.4 µs ± 789 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
67.9 µs ± 349 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
1.26 ms ± 8.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
36.1 ms ± 610 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
torch.Tensor() and torch.empty() are very similar: both return a tensor filled with uninitialized data.
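A small sketch of that equivalence; since the memory is uninitialized, the actual values are garbage and will differ between runs:
import torch
a = torch.Tensor(2, 3)   # integer arguments are treated as sizes -> uninitialized 2x3 tensor
b = torch.empty(2, 3)    # same: uninitialized 2x3 tensor
print(a.shape, b.shape)  # torch.Size([2, 3]) torch.Size([2, 3])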
Why do we not initialize parameters in __init__, even though this is technically possible?
Here is torch.Tensor in practice inside nn.Linear, creating the weight parameter:
self.weight = nn.Parameter(torch.Tensor(out_features, in_features))
We do not initialize it here by design. There is a separate reset_parameters() method, and because training may need to "reset" the parameters again, we call reset_parameters() at the end of the __init__() method.
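A simplified sketch of that pattern (not the actual nn.Linear source, just its shape; the kaiming_uniform_ call matches what nn.Linear uses for the weight):
import math
import torch
import torch.nn as nn

class MyLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # allocate uninitialized storage for the parameter ...
        self.weight = nn.Parameter(torch.Tensor(out_features, in_features))
        # ... and initialize it in a separate, re-callable method
        self.reset_parameters()

    def reset_parameters(self):
        nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))

    def forward(self, x):
        return x @ self.weight.t()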
Maybe in the future torch.empty() will replace torch.Tensor(), because they are the same in effect.
There is also one nice option with reset_parameters(): you may create your own version and alter the original initialization procedure if needed.
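For example, a hypothetical subclass (ZeroInitLinear is a made-up name) that overrides reset_parameters() to zero-initialize instead; since nn.Linear.__init__ calls reset_parameters(), the override takes effect immediately:
import torch.nn as nn

class ZeroInitLinear(nn.Linear):
    def reset_parameters(self):
        # replace the default kaiming initialization with zeros
        nn.init.zeros_(self.weight)
        if self.bias is not None:
            nn.init.zeros_(self.bias)

layer = ZeroInitLinear(4, 2)
print(layer.weight)  # all zeros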
Upvotes: 2
Reputation: 894
https://discuss.pytorch.org/t/difference-between-torch-tensor-and-torch-tensor/30786/2
torch.tensor infers the dtype automatically, while torch.Tensor returns a torch.FloatTensor. I would recommend sticking to torch.tensor, which also has arguments like dtype if you would like to change the type.
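A short sketch of that recommendation:
import torch
print(torch.tensor([1, 2, 3]).dtype)                       # torch.int64 (inferred)
print(torch.tensor([1, 2, 3], dtype=torch.float32).dtype)  # torch.float32 (explicit)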
Upvotes: 6
Reputation: 300
According to a discussion on the PyTorch forums, the torch.Tensor constructor is overloaded to do the same thing as both torch.tensor and torch.empty. This overloading was thought to make code confusing, so torch.Tensor was split into torch.tensor and torch.empty.
So yes, to some extent, torch.tensor works similarly to torch.Tensor (when you pass in data), and no, neither should be more efficient than the other. It's just that torch.empty and torch.tensor have a nicer API than the torch.Tensor constructor.
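The overload in question is easy to see (a small sketch; the contents of the uninitialized tensor will vary):
import torch
print(torch.Tensor(2, 3).shape)  # torch.Size([2, 3]) - ints are treated as sizes, like torch.empty
print(torch.Tensor([2, 3]))      # tensor([2., 3.])   - a list is treated as data, like torch.tensor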
Upvotes: 9