Reputation: 3930
I'd like to initialize the parameters of an RNN with NumPy arrays.
In the following example, I want to pass w to the parameters of rnn. I know PyTorch provides many initialization methods such as Xavier, uniform, etc., but is there a way to initialize the parameters by passing NumPy arrays?
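import numpy as np
import torch.nn as nn

# Example sizes (assumed; the original snippet left these undefined)
input_size, hidden_size, num_layers = 3, 4, 2

rng = np.random.RandomState(313)
w = rng.randn(input_size, hidden_size).astype(np.float32)
rnn = nn.RNN(input_size, hidden_size, num_layers)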
Upvotes: 8
Views: 9600
Reputation: 37691
As a detailed answer has already been provided, I'll just add one point. The parameters of an nn.Module are Tensors (previously they were autograd Variables, which are deprecated as of PyTorch 0.4). So, essentially, you need to use torch.from_numpy() to convert the NumPy array to a Tensor and then use it to initialize the nn.Module parameters.
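A minimal sketch of that conversion for a single parameter, under some assumptions: the sizes are example values, and w is taken to be laid out as (input_size, hidden_size) as in the question, so it is transposed to match the (hidden_size, input_size) shape of weight_ih_l0. Whether a transpose is what you want depends on how your array was produced.

```python
import numpy as np
import torch
from torch import nn

input_size, hidden_size, num_layers = 3, 4, 1  # example sizes (assumed)

rng = np.random.RandomState(313)
w = rng.randn(input_size, hidden_size).astype(np.float32)

rnn = nn.RNN(input_size, hidden_size, num_layers)

# weight_ih_l0 has shape (hidden_size, input_size), so transpose w first.
# .copy() yields a contiguous array before handing it to torch.from_numpy.
w_tensor = torch.from_numpy(w.T.copy())

# Copy the values in place without recording the operation in autograd.
# copy_ fails if the shapes are incompatible, which catches layout mistakes early.
with torch.no_grad():
    rnn.weight_ih_l0.copy_(w_tensor)
```

Copying in place keeps the existing Parameter object (and its requires_grad flag) and only overwrites its values.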
Upvotes: 2
Reputation: 15119
First, note that nn.RNN has more than one weight variable; cf. the documentation:
Variables:
weight_ih_l[k] – the learnable input-hidden weights of the k-th layer, of shape (hidden_size, input_size) for k = 0. Otherwise, the shape is (hidden_size, hidden_size)
weight_hh_l[k] – the learnable hidden-hidden weights of the k-th layer, of shape (hidden_size, hidden_size)
bias_ih_l[k] – the learnable input-hidden bias of the k-th layer, of shape (hidden_size)
bias_hh_l[k] – the learnable hidden-hidden bias of the k-th layer, of shape (hidden_size)
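To check these names and shapes on a concrete instance, a small sketch along the following lines can help. The sizes are example values chosen for illustration, and shapes is just a local helper dictionary.

```python
from torch import nn

input_size, hidden_size, num_layers = 3, 4, 2  # example sizes (assumed)
rnn = nn.RNN(input_size, hidden_size, num_layers)

# Collect every parameter name together with its shape.
shapes = {name: tuple(param.shape) for name, param in rnn.named_parameters()}

for name, shape in shapes.items():
    print(name, shape)
```

Only weight_ih_l0 depends on input_size; the input-hidden weights of the deeper layers are sized by hidden_size, because those layers consume the previous layer's hidden state.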
Now, each of these variables (Parameter instances) is an attribute of your nn.RNN instance. You can access and edit them in two ways, as shown below:
Accessing the Parameter attributes by name (rnn.weight_hh_lK, rnn.weight_ih_lK, etc.):

import torch
from torch import nn
import numpy as np

input_size, hidden_size, num_layers = 3, 4, 2
use_bias = True
rng = np.random.RandomState(313)

rnn = nn.RNN(input_size, hidden_size, num_layers, bias=use_bias)

def set_nn_parameter_data(layer, parameter_name, new_data):
    # Look up the Parameter by name and replace its underlying tensor.
    param = getattr(layer, parameter_name)
    param.data = new_data

for i in range(num_layers):
    # Layer 0 reads the input; deeper layers read the previous hidden state.
    layer_input_size = input_size if i == 0 else hidden_size
    weights_hh_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    weights_ih_layer_i = rng.randn(hidden_size, layer_input_size).astype(np.float32)
    set_nn_parameter_data(rnn, "weight_hh_l{}".format(i),
                          torch.from_numpy(weights_hh_layer_i))
    set_nn_parameter_data(rnn, "weight_ih_l{}".format(i),
                          torch.from_numpy(weights_ih_layer_i))

    if use_bias:
        bias_hh_layer_i = rng.randn(hidden_size).astype(np.float32)
        bias_ih_layer_i = rng.randn(hidden_size).astype(np.float32)
        set_nn_parameter_data(rnn, "bias_hh_l{}".format(i),
                              torch.from_numpy(bias_hh_layer_i))
        set_nn_parameter_data(rnn, "bias_ih_l{}".format(i),
                              torch.from_numpy(bias_ih_layer_i))
Accessing the Parameter attributes through the rnn.all_weights list attribute:

import torch
from torch import nn
import numpy as np

input_size, hidden_size, num_layers = 3, 4, 2
use_bias = True
rng = np.random.RandomState(313)

rnn = nn.RNN(input_size, hidden_size, num_layers, bias=use_bias)

for i in range(num_layers):
    # Layer 0 reads the input; deeper layers read the previous hidden state.
    layer_input_size = input_size if i == 0 else hidden_size
    weights_hh_layer_i = rng.randn(hidden_size, hidden_size).astype(np.float32)
    weights_ih_layer_i = rng.randn(hidden_size, layer_input_size).astype(np.float32)
    # Per-layer order in all_weights: weight_ih, weight_hh, bias_ih, bias_hh.
    rnn.all_weights[i][0].data = torch.from_numpy(weights_ih_layer_i)
    rnn.all_weights[i][1].data = torch.from_numpy(weights_hh_layer_i)

    if use_bias:
        bias_hh_layer_i = rng.randn(hidden_size).astype(np.float32)
        bias_ih_layer_i = rng.randn(hidden_size).astype(np.float32)
        rnn.all_weights[i][2].data = torch.from_numpy(bias_ih_layer_i)
        rnn.all_weights[i][3].data = torch.from_numpy(bias_hh_layer_i)
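A third option, which is not part of the two approaches above, is to go through load_state_dict. The sketch below is only an illustration under assumed example sizes, and numpy_weights is a hypothetical dictionary standing in for your own arrays. Its appeal is that loading is strict by default, so a missing key or a mismatched shape raises an error instead of silently replacing a parameter with a tensor of the wrong size.

```python
import numpy as np
import torch
from torch import nn

input_size, hidden_size, num_layers = 3, 4, 2  # example sizes (assumed)
rng = np.random.RandomState(313)
rnn = nn.RNN(input_size, hidden_size, num_layers)

# One NumPy array per parameter, generated with each parameter's own shape.
# In practice these would be your precomputed arrays keyed by parameter name.
numpy_weights = {
    name: rng.randn(*param.shape).astype(np.float32)
    for name, param in rnn.named_parameters()
}

# Convert to tensors and load; values are copied into the existing parameters.
state = {name: torch.from_numpy(arr) for name, arr in numpy_weights.items()}
rnn.load_state_dict(state)

# Quick sanity check: a forward pass with (seq_len, batch, input_size) input.
output, hidden = rnn(torch.zeros(5, 1, input_size))
```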
Upvotes: 3