Wednesday, July 12, 2017

Custom metric in TensorFlow, getting NaN


I'm trying to define the F1-score as a custom metric in TensorFlow for a DNNClassifier. To do that, I wrote a function

def metric_fn(predictions=[], labels=[], weights=[]):
    P, _ = tf.contrib.metrics.streaming_precision(predictions, labels)
    R, _ = tf.contrib.metrics.streaming_recall(predictions, labels)
    if P + R == 0:
        return 0
    return 2*(P*R)/(P+R)

that uses streaming_precision and streaming_recall from TensorFlow to calculate the F1 score. After that, I added a new entry to validation_metrics:

validation_metrics = {
    "accuracy":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_accuracy,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
    "precision":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_precision,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
    "recall":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_recall,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
    "f1score":
        tf.contrib.learn.MetricSpec(
            metric_fn=metric_fn,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES)
}

However, although I get correct precision and recall values, f1score is always nan:

INFO:tensorflow:Saving dict for global step 151: accuracy = 0.982456, accuracy/baseline_label_mean = 0.397661, accuracy/threshold_0.500000_mean = 0.982456, auc = 0.982867, f1score = nan, global_step = 151, labels/actual_label_mean = 0.397661, labels/prediction_mean = 0.406118, loss = 0.310612, precision = 0.971014, precision/positive_threshold_0.500000_mean = 0.971014, recall = 0.985294, recall/positive_threshold_0.500000_mean = 0.985294 

Something is wrong with my metric_fn, but I can't figure out what. The values P and R obtained inside metric_fn are of the form Tensor("precision/value:0", shape=(), dtype=float32), which I find a bit strange; I was expecting plain scalar values rather than graph tensors.

Any help is appreciated.

3 Answers

Answer 1

I think the problem comes from the fact that the streaming metrics you use inside your metric_fn never get updated.

Try the following (I also included minor modifications to my taste):

def metric_fn(predictions=None, labels=None, weights=None):
    P, update_op1 = tf.contrib.metrics.streaming_precision(predictions, labels)
    R, update_op2 = tf.contrib.metrics.streaming_recall(predictions, labels)
    eps = 1e-5
    # Return both the metric value and a combined update op.
    return (2*(P*R)/(P+R+eps), tf.group(update_op1, update_op2))
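As a quick sanity check, this metric_fn can also be exercised on its own in a session; the toy predictions and labels below are made up purely for illustration, and the sketch assumes TensorFlow 1.x with tf.contrib available:

import tensorflow as tf

# Illustrative toy data (not from the question).
predictions = tf.constant([1., 0., 1., 1.])
labels      = tf.constant([1., 0., 0., 1.])

f1, f1_update = metric_fn(predictions=predictions, labels=labels)

with tf.Session() as sess:
    # Streaming metrics keep their running counts in local variables.
    sess.run(tf.local_variables_initializer())
    sess.run(f1_update)   # update the internal precision/recall counters
    print(sess.run(f1))   # read the accumulated F1 value

Without running the update op, the internal counters stay at zero and the value op evaluates 0/0, which is exactly the nan seen in the question.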

Answer 2

If what the above answer suggested didn't work...

I don't know much about how custom metrics work in TF, but how about changing your function name to something else, like f1score?

Maybe a conflict occurred somewhere because the parameter and the value share the same name.

Answer 3

The first argument of tf.contrib.learn.MetricSpec's __init__ is metric_fn.

The documentation says:

metric_fn: A function to use as a metric. See _adapt_metric_fn for rules on how predictions, labels, and weights are passed to this function. This must return either a single Tensor, which is interpreted as a value of this metric, or a pair (value_op, update_op), where value_op is the op to call to obtain the value of the metric, and update_op should be run for each batch to update internal state.

Since you want to use streaming operations in your metric_fn, you can't just return a single Tensor; you have to take into account that the streaming operations have an internal state that must be updated.

Thus, the first part of your metric_fn should be:

def metric_fn(predictions=[], labels=[], weights=[]):
    P, update_precision = tf.contrib.metrics.streaming_precision(predictions, labels)
    R, update_recall = tf.contrib.metrics.streaming_recall(predictions, labels)

Then, if you want to return 0 when a condition is met, you can't use a Python if statement (it is not evaluated inside the TensorFlow graph); you have to use tf.cond (computation inside the graph).
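For example, a minimal sketch with made-up constants (assuming TF 1.x graph mode):

import tensorflow as tf

x = tf.constant(2.0)
y = tf.constant(3.0)

# A Python `if x + y == 0:` would compare Tensor objects at graph-construction
# time, not their runtime values, so it cannot express this branch.
# tf.cond builds both branches into the graph and picks one at run time.
safe_f1 = tf.cond(tf.equal(x + y, 0.),
                  lambda: tf.constant(0.),
                  lambda: 2 * (x * y) / (x + y))

with tf.Session() as sess:
    print(sess.run(safe_f1))  # 2.4 for these values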

Moreover, you want to check the value of P and R only after the update operation (otherwise the first value is undefined or nan).

To force the evaluation of tf.cond after the P and R updates, you can use tf.control_dependencies:

def metric_fn(predictions=[], labels=[], weights=[]):
    P, update_precision = tf.contrib.metrics.streaming_precision(predictions, labels)
    R, update_recall = tf.contrib.metrics.streaming_recall(predictions, labels)

    with tf.control_dependencies([P, update_precision, R, update_recall]):
        # Both tf.cond branches must return the same dtype, hence the float constant.
        score = tf.cond(tf.equal(P + R, 0.),
                        lambda: tf.constant(0, dtype=tf.float32),
                        lambda: 2*(P*R)/(P+R))
    return score, tf.group(update_precision, update_recall)
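This metric_fn then slots into the question's validation_metrics dict unchanged; to make the wiring explicit (names taken from the question, TF 1.x contrib API):

validation_metrics["f1score"] = tf.contrib.learn.MetricSpec(
    metric_fn=metric_fn,
    prediction_key=tf.contrib.learn.PredictionKey.CLASSES)

# e.g. passed as metrics=validation_metrics to evaluate() or a ValidationMonitor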