I'm trying to use the grad_loss parameter in optimizer.minimize(loss, grad_loss=) to modify the network gradients with existing gradients. I followed the comments here: Use of grads_ys parameter in tf.gradients - TensorFlow, and I would like to run a toy example in which I recreate the default 1 values for grad_ys, as specified in the documentation.
Here's the relevant code segment:
grads_and_vars = optimizer.compute_gradients(loss_op)
vars_with_grad = [v for g, v in grads_and_vars if g is not None]

grad_loss = []
for grad, var in grads_and_vars:
    grad_loss.append(tf.ones_like(grad))

train_op = optimizer.minimize(loss_op, grad_loss=grad_loss)
The first part extracts gradients using compute_gradients. The last line computes gradients of the loss function loss_op, but attempts to use 1-filled vectors for the gradients. As far as I understand, this should behave the same as running minimize without the grad_loss parameter.
Unfortunately, this fails because minimize expects grad_loss to be a Tensor (with a dtype), not a list. Looking into gradients_impl.py, I see that the function expects grad_loss to have the same dimensions as loss (which in this case is a scalar).
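For reference, here is my rough understanding of what minimize does internally in the TF1 API (var_list and grad_loss_tensor below are placeholders for illustration): grad_loss seems to be forwarded to tf.gradients as grad_ys, which would explain why it has to match the shape of loss rather than the shapes of the per-variable gradients.

# Rough sketch of optimizer.minimize(loss_op, grad_loss=grad_loss_tensor),
# as far as I can tell from optimizer.py / gradients_impl.py:
grads = tf.gradients(loss_op, var_list, grad_ys=grad_loss_tensor)
train_op = optimizer.apply_gradients(list(zip(grads, var_list)))
# Since loss_op is a scalar, grad_ys (i.e. grad_loss) must be a scalar Tensor too.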
I would appreciate any assistance with this simple example: how do I add elements to the gradients this way?
EDIT: I guess the question boils down to the definition of grad_loss: "A Tensor holding the gradient computed for loss." How do I generate such a tensor from a set of gradients obtained by compute_gradients?
Thanks.
1 Answer
Answer 1
You can use the tf.convert_to_tensor method to convert your list of gradients to a tensor, and then use tf.reduce_sum:
train_op = optimizer.minimize(loss_op, grad_loss=tf.reduce_sum(tf.convert_to_tensor(grad_loss)))
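For completeness, here is a sketch of how this fits with the code from the question (TF1-style; it assumes loss_op, optimizer, and grads_and_vars are defined as above, and that all gradients share the same shape so that convert_to_tensor can pack the list into a single tensor):

grad_loss = [tf.ones_like(g) for g, v in grads_and_vars if g is not None]

# Packing the list and summing it yields a single scalar Tensor, which
# matches the scalar shape of loss_op that minimize expects for grad_loss.
grad_loss_tensor = tf.reduce_sum(tf.convert_to_tensor(grad_loss))
train_op = optimizer.minimize(loss_op, grad_loss=grad_loss_tensor)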