Reputation: 439
This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...
I then load the network to my sess default session as "net" using:
images = tf.placeholder(tf.float32, [1, 224, 224, 3])
net = VGGNet_xavier({'data': images, 'label' : 1})
with tf.Session() as sess:
net.load("vgg16.npy", sess)
After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
Now suppose that I make another graph that has as its first layer "h_conv1_b":
W_conv1_b = weight_variable([3,3,3,64])
b_conv1_b = bias_variable([64])
h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
My question is -- how do you get to assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b ?? (both are now tensors)
Upvotes: 12
Views: 9035
Reputation: 276
I suggest you have a detailed look at network.py from the https://github.com/ethereon/caffe-tensorflow, especially the function load()
. It would help you understand what happened when you called net.load(weight_path, session)
.
FYI, variables in Tensorflow can be assigned to a numpy array by using var.assign(np_array)
which is executed in the session. Here is the solution to your question:
with tf.Session() as sess:
W_conv1_b = weight_variable([3,3,3,64])
sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
b_conv1_b = bias_variable([64])
sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
I would like to kindly remind you the following points:
var.assign(data)
where 'data' is a numpy array and 'var' is a TensorFlow variable should be executed in the same session where you want to continue to execute your network either inference or training. var=tf.Variable(shape=data.shape)
. Otherwise, you need to create the 'var' by the method var=tf.Variable(validate_shape=False)
, which means the variable shape is feasible. Detailed explainations can be found in the Tensorflow's API doc.I extend the same repo caffe-tensorflow to support theano in caffe so that I can load the transformed model from caffe in Theano. Therefore, I am a reasonable expert w.r.t this repo's code. Please feel free to get in contact with me as you have any further question.
Upvotes: 19