Wednesday, August 9, 2017

TensorFlow crashes when fitting TensorForestEstimator


I am trying to fit a TensorForestEstimator model with numerical floating-point data representing 7 features and 7 labels. That is, the shape of both features and labels is (484876, 7). I set num_classes=7 and num_features=7 in ForestHParams accordingly. The format of the data is as follows:

f1       f2     f3    f4      f5    f6    f7    l1       l2       l3       l4       l5       l6       l7
39000.0  120.0  65.0  1000.0  25.0  0.69  3.94  39000.0  39959.0  42099.0  46153.0  49969.0  54127.0  55911.0
32000.0  185.0  65.0  1000.0  75.0  0.46  2.19  32000.0  37813.0  43074.0  48528.0  54273.0  60885.0  63810.0
30000.0  185.0  65.0  1000.0  25.0  0.41  1.80  30000.0  32481.0  35409.0  39145.0  42750.0  46678.0  48595.0

When calling fit() Python crashes with the following message:

Python quit unexpectedly while using the _pywrap_tensorflow_internal.so plug-in.

Here is the output when enabling tf.logging.set_verbosity('INFO'):

INFO:tensorflow:training graph for tree: 0
INFO:tensorflow:training graph for tree: 1
...
INFO:tensorflow:training graph for tree: 9998
INFO:tensorflow:training graph for tree: 9999
INFO:tensorflow:Create CheckpointSaverHook.
2017-07-26 10:25:30.908894: F tensorflow/contrib/tensor_forest/kernels/count_extremely_random_stats_op.cc:404] Check failed: column < num_classes_ (39001 vs. 8)

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

I'm not sure what this error means; it doesn't seem to make sense, since num_classes=7, not 8. And given that the shape of both features and labels is (484876, 7), I don't know where the 39001 comes from.

Here is the code to reproduce:

import numpy as np
import pandas as pd
import os

import tensorflow as tf
# contrib import paths for TF 1.x
from tensorflow.contrib.tensor_forest.python.tensor_forest import ForestHParams, RandomForestGraphs
from tensorflow.contrib.tensor_forest.client.random_forest import TensorForestEstimator, TensorForestLossHook


def get_training_data():
    training_file = "data.txt"
    data = pd.read_csv(training_file, sep='\t')

    X = np.array(data.drop('Result', axis=1), dtype=np.float32)

    y = []
    for e in data.ResultStr:
        y.append(list(np.array(str(e).replace('[', '').replace(']', '').split(','))))

    y = np.array(y, dtype=np.float32)

    features = tf.constant(X)
    labels = tf.constant(y)

    return features, labels


hyperparameters = ForestHParams(
    num_trees=100,
    max_nodes=10000,
    bagging_fraction=1.0,
    num_splits_to_consider=0,
    feature_bagging_fraction=1.0,
    max_fertile_nodes=0,
    split_after_samples=250,
    min_split_samples=5,
    valid_leaf_threshold=1,
    dominate_method='bootstrap',
    dominate_fraction=0.99,
    # All parameters above are default
    num_classes=7,
    num_features=7
)

estimator = TensorForestEstimator(
    params=hyperparameters,
    # All parameters below are default
    device_assigner=None,
    model_dir=None,
    graph_builder_class=RandomForestGraphs,
    config=None,
    weights_name=None,
    keys_name=None,
    feature_engineering_fn=None,
    early_stopping_rounds=100,
    num_trainers=1,
    trainer_id=0,
    report_feature_importances=False,
    local_eval=False
)

estimator.fit(
    input_fn=lambda: get_training_data(),
    max_steps=100,
    monitors=[
        TensorForestLossHook(
            early_stopping_rounds=30
        )
    ]
)

The same error also occurs if I wrap the estimator with SKCompat. What is the cause of this crash?

1 Answer

Answer 1

regression=True needs to be specified in the ForestHParams, because by default TensorForestEstimator assumes it is being used to solve a classification problem, which outputs only a single value per example.
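Concretely, the fix is a one-line change to the hyperparameters. The sketch below shows the relevant part of the ForestHParams call with regression=True added (TF 1.x contrib API; the import path is the one used in that release line):

```python
# TF 1.x contrib API
from tensorflow.contrib.tensor_forest.python.tensor_forest import ForestHParams

hyperparameters = ForestHParams(
    num_trees=100,
    max_nodes=10000,
    num_classes=7,     # with regression=True, this is the number of targets
    num_features=7,
    regression=True    # tell the estimator this is a regression problem
)
```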

An implicit num_outputs variable is created when the estimator is initialized, and it is set to 1 if regression was not specified. If regression is specified, then num_outputs = num_classes and checkpoints are saved normally.
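That logic can be sketched as a tiny helper (a hypothetical illustration of the behavior described above, not the actual TensorForest source):

```python
def infer_num_outputs(num_classes, regression=False):
    """Mimic the implicit num_outputs described above (illustrative only)."""
    # Classification: a single predicted class per example.
    # Regression: one output per target column.
    return num_classes if regression else 1

# With num_classes=7 but regression unset, the estimator expects a single
# output, which conflicts with the 7-column labels in the question.
print(infer_num_outputs(7))                   # 1
print(infer_num_outputs(7, regression=True))  # 7
```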

