This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG Caffe model to TensorFlow using the GitHub code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...
I then load the network into my default session sess as "net" using:
    images = tf.placeholder(tf.float32, [1, 224, 224, 3])
    net = VGGNet_xavier({'data': images, 'label': 1})
    with tf.Session() as sess:
        net.load("vgg16.npy", sess)
After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
Now suppose that I make another graph that has as its first layer "h_conv1_b":
    W_conv1_b = weight_variable([3, 3, 3, 64])
    b_conv1_b = bias_variable([64])
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
My question is: how do you assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b? (Both are now tensors.)
2 Answers
Answer 1
I suggest you have a detailed look at network.py from https://github.com/ethereon/caffe-tensorflow, especially the load() function. It will help you understand what happens when you call net.load(weight_path, session).
FYI, TensorFlow variables can be assigned values from a NumPy array using var.assign(np_array), executed inside a session. Here is a solution to your question:
    with tf.Session() as sess:
        W_conv1_b = weight_variable([3, 3, 3, 64])
        sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
        b_conv1_b = bias_variable([64])
        sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
        h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
I would like to kindly remind you of the following points:
- var.assign(data), where 'data' is a NumPy array and 'var' is a TensorFlow variable, should be executed in the same session in which you want to continue running your network, whether for inference or training.
- The 'var' must, by default, be created with the same shape as the 'data'. Therefore, if you can obtain the 'data' before creating the 'var', I suggest you create the 'var' with a matching shape, e.g. var = tf.Variable(tf.zeros(data.shape)), or initialize it directly from 'data'. Otherwise, you need to create the 'var' with validate_shape=False, e.g. var = tf.Variable(initial_value, validate_shape=False), which means the variable's shape is left flexible and is not checked on assignment. Detailed explanations can be found in TensorFlow's API docs. A minimal sketch of both cases is given right after this list.
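For example, here is a minimal sketch of both cases; np_data is just a hypothetical stand-in for a pre-trained weight array (TensorFlow 1.x API):

    import numpy as np
    import tensorflow as tf

    # Hypothetical stand-in for a pre-trained weight array, e.g. conv1_1 weights.
    np_data = np.random.rand(3, 3, 3, 64).astype(np.float32)

    # Case 1: the data is available before the variable is created,
    # so the variable can be built with a matching shape.
    var_a = tf.Variable(tf.zeros(np_data.shape))

    # Case 2: the shape is not known in advance, so shape validation is disabled.
    var_b = tf.Variable(tf.zeros([1]), validate_shape=False)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(var_a.assign(np_data))                             # shapes match
        sess.run(tf.assign(var_b, np_data, validate_shape=False))   # shape check skipped
        print(sess.run(var_a).shape, sess.run(var_b).shape)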
I extended the same caffe-tensorflow repo to support Theano in kaffe, so that I can load models converted from Caffe in Theano. Therefore, I am reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.
Answer 2
You can get the variable values using the eval method of tf.Variable from the first network, and load those values into the variables of the second network using the load method (also a method of tf.Variable).
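For example, here is a minimal sketch of that approach; w1 and w2 are hypothetical variables standing in for a matching layer in each of the two networks (TensorFlow 1.x API):

    import tensorflow as tf

    # Hypothetical variables standing in for the same layer in the two networks.
    w1 = tf.Variable(tf.random_normal([3, 3, 3, 64]))   # first (pre-trained) network
    w2 = tf.Variable(tf.zeros([3, 3, 3, 64]))           # second network

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        values = w1.eval(session=sess)    # read the values out as a numpy array
        w2.load(values, session=sess)     # write them into the second network's variable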