Friday, May 5, 2017

TensorFlow: How can I assign pre-trained NumPy weights to subsections of a graph?


This is a simple thing which I just couldn't figure out how to do.

I converted a pre-trained VGG Caffe model to TensorFlow using the code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy.

I then load the network into my default session sess as "net" using:

    images = tf.placeholder(tf.float32, [1, 224, 224, 3])
    net = VGGNet_xavier({'data': images, 'label': 1})
    with tf.Session() as sess:
        net.load("vgg16.npy", sess)

After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.

Now suppose that I make another graph that has as its first layer "h_conv1_b":

    W_conv1_b = weight_variable([3, 3, 3, 64])
    b_conv1_b = bias_variable([64])
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)

My question is: how do I assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b? (Both are now tensors.)

2 Answers

Answer 1

I suggest you take a detailed look at network.py in https://github.com/ethereon/caffe-tensorflow, especially the load() function. It will help you understand what happens when you call net.load(weight_path, session).

FYI, a TensorFlow variable can be given the value of a NumPy array with var.assign(np_array), which has to be run in a session. Here is the solution to your question:
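To make the pointer to load() a bit more concrete: as far as I can tell, the converter stores the parameters as a pickled Python dict keyed by layer name, with 'weights' and 'biases' entries per layer. That layout is an assumption on my part, and the file name fake_vgg16.npy and the random values below are made up for illustration. A minimal sketch of reading such a file with NumPy alone:

```python
import os
import tempfile

import numpy as np

# Stand-in for vgg16.npy. ASSUMED layout:
# {layer_name: {'weights': array, 'biases': array}}
fake_params = {
    'conv1_1': {
        'weights': np.random.rand(3, 3, 3, 64).astype(np.float32),
        'biases': np.zeros(64, dtype=np.float32),
    }
}
path = os.path.join(tempfile.mkdtemp(), 'fake_vgg16.npy')  # hypothetical file
np.save(path, fake_params)

# np.save wraps the dict in a 0-d object array; .item() unwraps it again.
data_dict = np.load(path, allow_pickle=True).item()

w = data_dict['conv1_1']['weights']
b = data_dict['conv1_1']['biases']
print(w.shape, b.shape)
```

If the real file was written under Python 2, np.load may additionally need encoding='latin1'. Once the arrays are in hand, they are plain NumPy data and can be fed to any variable of the matching shape.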

    with tf.Session() as sess:
        W_conv1_b = weight_variable([3, 3, 3, 64])
        sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
        b_conv1_b = bias_variable([64])
        sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
        h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)

I would like to kindly remind you of the following points:

  1. var.assign(data), where 'data' is a NumPy array and 'var' is a TensorFlow variable, should be executed in the same session in which you then run your network, whether for inference or training.
  2. By default, 'var' must be created with the same shape as 'data'. Therefore, if you can obtain 'data' before creating 'var', I suggest you create 'var' with var=tf.Variable(shape=data.shape). Otherwise, you need to create 'var' with var=tf.Variable(validate_shape=False), which leaves the variable's shape flexible. Detailed explanations can be found in the TensorFlow API docs.
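The two points above can be sketched as follows. This is an illustration under assumptions, not the repo's code: random arrays stand in for the pre-trained conv1_1 parameters, tf.nn.conv2d with stride 1 and 'SAME' padding stands in for the question's conv2d helper, and tensorflow.compat.v1 is used so that the 1.x graph and session API is also available on newer installs (on old 1.x releases, plain `import tensorflow as tf` would take its place).

```python
import numpy as np
import tensorflow.compat.v1 as tf

# Stand-ins for the pre-trained conv1_1 parameters (random, NOT real VGG weights).
pretrained_w = np.random.rand(3, 3, 3, 64).astype(np.float32)
pretrained_b = np.random.rand(64).astype(np.float32)

graph = tf.Graph()
with graph.as_default():
    # Point 2: create each variable with the same shape as its data.
    W_conv1_b = tf.Variable(tf.zeros(pretrained_w.shape), name='W_conv1_b')
    b_conv1_b = tf.Variable(tf.zeros(pretrained_b.shape), name='b_conv1_b')

    im_batch = tf.placeholder(tf.float32, [1, 224, 224, 3])
    # Assumed stand-in for the question's conv2d helper.
    conv = tf.nn.conv2d(im_batch, W_conv1_b,
                        strides=[1, 1, 1, 1], padding='SAME')
    h_conv1_b = tf.nn.relu(conv + b_conv1_b)

    # Build the assign ops once so repeated runs do not grow the graph.
    assign_w = W_conv1_b.assign(pretrained_w)
    assign_b = b_conv1_b.assign(pretrained_b)

# Point 1: assign and run the network in the same session.
with tf.Session(graph=graph) as sess:
    sess.run([assign_w, assign_b])
    loaded_w = sess.run(W_conv1_b)
    out = sess.run(h_conv1_b,
                   {im_batch: np.zeros([1, 224, 224, 3], np.float32)})

print(np.allclose(loaded_w, pretrained_w), out.shape)
```

Running the assign ops also initializes the two variables, so no separate initializer is needed in this small sketch. In a larger graph you would typically run the global initializer first and the assign ops afterwards, so the pre-trained values are not overwritten.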

I extended the same repo, caffe-tensorflow, to support Theano in kaffe so that I can load models converted from Caffe in Theano, so I am reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.

Answer 2

You can get the variable values from the first network using the eval method of tf.Variable and load those values into the variables of the second network using the load method (also a method of tf.Variable).
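A minimal sketch of that eval/load round trip, under a few assumptions: a single randomly initialized variable stands in for each network, the names src_w and dst_w are hypothetical, and tensorflow.compat.v1 is used to get the 1.x session API. I believe newer releases mark load as deprecated in favor of assign, so treat this as an illustration of the idea rather than the recommended call.

```python
import numpy as np
import tensorflow.compat.v1 as tf

# First "network": one variable standing in for conv1_1's weights.
graph_a = tf.Graph()
with graph_a.as_default():
    src_w = tf.Variable(tf.random_normal([3, 3, 3, 64]), name='src_w')
    init_a = tf.global_variables_initializer()

# Second "network": a variable of the same shape that receives the values.
graph_b = tf.Graph()
with graph_b.as_default():
    dst_w = tf.Variable(tf.zeros([3, 3, 3, 64]), name='dst_w')

with tf.Session(graph=graph_a) as sess_a:
    sess_a.run(init_a)
    # eval returns the current value as a NumPy array.
    values = src_w.eval(session=sess_a)

with tf.Session(graph=graph_b) as sess_b:
    # load writes the array into the variable within this session.
    dst_w.load(values, sess_b)
    copied = dst_w.eval(session=sess_b)

print(np.allclose(values, copied))
```

The NumPy array is the bridge here: it is independent of either graph, so the two networks do not need to share a graph or a session.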
