Inserting 1MM+ rows into the wide and deep learning model throws ValueError: GraphDef cannot be larger than 2GB
Full traceback:
Traceback (most recent call last):
  File "search_click.py", line 207, in <module>
    tf.app.run()
  File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "search_click.py", line 204, in main
    train_and_eval()
  File "search_click.py", line 181, in train_and_eval
    m.fit(input_fn=lambda: input_fn(df_train), steps=FLAGS.train_steps)
  File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 182, in fit
    monitors=monitors)
  File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 458, in _train_model
    summary_writer=graph_actions.get_summary_writer(self._model_dir))
  File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 76, in get_summary_writer
    graph=ops.get_default_graph())
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/summary_io.py", line 113, in __init__
    self.add_graph(graph=graph, graph_def=graph_def)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/training/summary_io.py", line 204, in add_graph
    true_graph_def = graph.as_graph_def(add_shapes=True)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2117, in as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.
I defined the same input_fn as in the example:
def input_fn(df):
  """Input builder function."""
  # Creates a dictionary mapping from each continuous feature column name (k) to
  # the values of that column stored in a constant Tensor.
  continuous_cols = {k: tf.constant(df[k].values) for k in CONTINUOUS_COLUMNS}
  # Creates a dictionary mapping from each categorical feature column name (k)
  # to the values of that column stored in a tf.SparseTensor.
  categorical_cols = {k: tf.SparseTensor(
      indices=[[i, 0] for i in range(df[k].size)],
      values=df[k].values,
      shape=[df[k].size, 1])
      for k in CATEGORICAL_COLUMNS}
  # Merges the two dictionaries into one.
  feature_cols = dict(continuous_cols)
  feature_cols.update(categorical_cols)
  # Converts the label column into a constant Tensor.
  label = tf.constant(df[LABEL_COLUMN].values)
  # Returns the feature columns and the label.
  return feature_cols, label
Is there a replacement for tf.constant and tf.SparseTensor that allows feeding the data in batches and avoids the memory error?
1 Answer

Answer 1
The wide-and-deep example loads the entire data set into memory as graph constants, which is why the GraphDef exceeds the 2GB limit. If you have a large data set, you may need tf.decode_csv to read CSV-formatted input from disk in batches instead. If your input format is self-defined, you should create a custom data reader.
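For illustration, here is a minimal sketch of such a batched input_fn using tf.TextLineReader, tf.decode_csv, and tf.train.shuffle_batch (the queue-based input pipeline matching the contrib.learn era of the question). The file name "train.csv", BATCH_SIZE, the record defaults, and the assumption that the CSV columns appear in the order CONTINUOUS_COLUMNS + CATEGORICAL_COLUMNS + [LABEL_COLUMN] are all placeholders; substitute your own schema:

import tensorflow as tf

BATCH_SIZE = 512  # hypothetical batch size
COLUMNS = CONTINUOUS_COLUMNS + CATEGORICAL_COLUMNS + [LABEL_COLUMN]
# One default per CSV column, which also fixes each column's dtype:
# 0.0 for continuous (float), "" for categorical (string), 0 for the label (int).
RECORD_DEFAULTS = ([[0.0]] * len(CONTINUOUS_COLUMNS) +
                   [[""]] * len(CATEGORICAL_COLUMNS) +
                   [[0]])

def batched_input_fn():
  """Reads train.csv in batches instead of embedding it as constants."""
  # Queue of input files; the reader pulls one line at a time.
  filename_queue = tf.train.string_input_producer(["train.csv"])
  reader = tf.TextLineReader(skip_header_lines=1)
  _, line = reader.read(filename_queue)

  # Decode one CSV line into a scalar tensor per column.
  fields = tf.decode_csv(line, record_defaults=RECORD_DEFAULTS)

  # Batch the decoded examples. Only a bounded queue of rows lives in the
  # graph at any time, so the GraphDef stays small.
  batched = tf.train.shuffle_batch(
      fields,
      batch_size=BATCH_SIZE,
      capacity=10 * BATCH_SIZE,
      min_after_dequeue=2 * BATCH_SIZE)

  features = dict(zip(COLUMNS, batched))
  label = features.pop(LABEL_COLUMN)

  # The estimator still expects categorical columns as SparseTensors.
  for k in CATEGORICAL_COLUMNS:
    features[k] = tf.SparseTensor(
        indices=[[i, 0] for i in range(BATCH_SIZE)],
        values=features[k],
        shape=[BATCH_SIZE, 1])
  return features, label

You would then call m.fit(input_fn=batched_input_fn, steps=FLAGS.train_steps) as before; because each step dequeues only one batch of rows, the full data set never becomes part of the graph.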