Reputation: 27946
To access a model's parameters in PyTorch, I have seen two methods: using state_dict and using parameters(). I wonder what the difference is, and whether one is good practice and the other bad practice.
Thanks
Upvotes: 34
Views: 19852
Reputation: 944
Besides the differences in @kHarshit's answer, the attribute requires_grad of trainable tensors in net.parameters() is True, while it is False for the tensors returned by net.state_dict(), because state_dict() detaches them by default (keep_vars=False).
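As a quick check, here is a minimal sketch (the two-layer net is just a made-up example) that prints the flag for both views of the same weights:

import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))

# Tensors from parameters() are the live, trainable leaves
print(all(p.requires_grad for p in net.parameters()))           # True

# state_dict() detaches its tensors by default (keep_vars=False)
print(any(t.requires_grad for t in net.state_dict().values()))  # False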
Upvotes: 6
Reputation: 12837
parameters() only gives the module parameters, i.e. the weights and biases. From the docs:

Returns an iterator over module parameters.
You can check the list of the parameters as follows:
# named_parameters() yields (name, tensor) pairs for the trainable weights
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name)
On the other hand, state_dict returns a dictionary containing the whole state of the module. Check its source code, which collects not just the parameters but also the buffers, etc. From the docs:

Both parameters and persistent buffers (e.g. running averages) are included. Keys are the corresponding parameter and buffer names.
You can check all the keys that the state_dict contains with:
model.state_dict().keys()
For example, in the state_dict you'll find buffer entries like bn1.running_mean and bn1.running_var, which are not present in .parameters().
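To see this concretely, here is a small sketch (the conv + batch-norm model is hypothetical) that diffs the two sets of names:

import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16))

param_names = {name for name, _ in model.named_parameters()}
state_keys = set(model.state_dict().keys())

# Buffers (the BatchNorm running statistics) appear only in state_dict()
print(state_keys - param_names)
# {'1.running_mean', '1.running_var', '1.num_batches_tracked'}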
If you only want to access the parameters, you can simply use .parameters(); but for purposes like saving and loading a model, as in transfer learning, you'll need to save the state_dict, not just the parameters.
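For completeness, the usual save/load round-trip looks like this ('model.pth' is just a placeholder path):

import torch

# Save the weights and buffers (not the model class itself)
torch.save(model.state_dict(), 'model.pth')

# Later: rebuild the same architecture, then restore the state
model.load_state_dict(torch.load('model.pth'))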
Upvotes: 30