I want to train a network with planar curves, which I represent as numpy arrays with shape (L,2). The number 2 stands for the x,y coordinates, and L is the number of points, which varies across my dataset. I treat x,y as 2 different "channels".
I implemented a function, next_batch(batch_size), that provides the next batch as a 1D numpy array with shape (batch_size,), containing elements which are 2D arrays with shape (L,2). These are my curves, and as mentioned before, L differs between the elements. (I didn't want to restrict the curves to a fixed number of points.)
My question:
How can I manipulate the output from next_batch() so that I will be able to feed the network with the input curves, using a scheme similar to what appears in the Tensorflow tutorial (https://www.tensorflow.org/get_started/mnist/pros), i.e., using the feed_dict mechanism? In the given tutorial the input size was fixed; in the tutorial's code line:
train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
batch[0] has a fixed shape: (50,784) (50 = #samples, 784 = #pixels).
I cannot transform my input into a numpy array with shape (batch_size,L,2), since the array must have a fixed size in every dimension. So what can I do?
I already defined a placeholder (that can have unknown size):
# first dimension is the sample dim, second is the curve length, third is the x,y coordinates
x = tf.placeholder(tf.float32, [None, None, 2])
but how can I feed it properly?
Thank you very much
4 Answers
Answers 1
You can use inputs with different sizes in TF. Just feed the data in the same way as in the tutorial you listed, but make sure to define the changing dimensions in the placeholder as None.
Here's a simple example of feeding a placeholder with different shapes:
import tensorflow as tf
import numpy as np

# Three square arrays of different sizes
array1 = np.arange(9).reshape((3, 3))
array2 = np.arange(16).reshape((4, 4))
array3 = np.arange(25).reshape((5, 5))

# Both dimensions are left unknown (None) until data is fed
model_input = tf.placeholder(dtype='float32', shape=[None, None])
sqrt_result = tf.sqrt(model_input)

with tf.Session() as sess:
    print(sess.run(sqrt_result, feed_dict={model_input: array1}))
    print(sess.run(sqrt_result, feed_dict={model_input: array2}))
    print(sess.run(sqrt_result, feed_dict={model_input: array3}))
Answers 2
You can define the placeholder with the shape [None, ..., None]. Each None tells the graph that the size of that dimension is determined by the data you feed. For example, [None, None] means a matrix with any number of rows and columns. However, you should take care about which kind of NN you use: when you work with a CNN, you must specify the size of the tensor at the convolution and pooling layers.
Answers 3
Short answer that you're probably looking for: you can't without padding or grouping samples by length.
To elaborate a bit: in tensorflow, dimensions must be fixed throughout a batch, and jagged arrays are not natively supported.
Dimensions may be unknown a priori (in which case you set the placeholders' dimensions to None) but are still inferred at runtime, so your solution of having a placeholder:
x = tf.placeholder(tf.float32, [None, None, 2])
couldn't work because it's semantically equivalent to saying "I don't know the constant length of the curves in a batch a priori, infer it at runtime from the data".
This is not to say that your model in general can't accept inputs of different dimensions, if you structure it accordingly, but the data that you feed it each time you call sess.run() must have fixed dimensions.
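A quick NumPy check illustrates the constraint, since whatever is fed to a placeholder first has to form a rectangular array. This is only a sketch with made-up curves (curve1 and curve2 are placeholder names, filled with zeros):

```python
import numpy as np

# Two hypothetical curves with different numbers of points
curve1 = np.zeros((4, 2), dtype=np.float32)
curve2 = np.zeros((5, 2), dtype=np.float32)

# Curves of equal length stack into one rectangular (batch, L, 2) array.
same_length = np.stack([curve1, curve1])
print(same_length.shape)  # (2, 4, 2)

# Curves of different lengths cannot be stacked; NumPy raises ValueError.
try:
    np.stack([curve1, curve2])
    stacked_ragged = True
except ValueError:
    stacked_ragged = False
print(stacked_ragged)  # False
```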
Your options, then, are as follows:
- Pad your batches along the second dimension. Say that you have 2 curves of shape (4, 2) and (5, 2) and you know the maximum curve length in your dataset is 6; you could use np.pad as follows:

In [1]: import numpy as np
   ...: max_len = 6
   ...: curve1 = np.random.rand(4, 2)
   ...: curve2 = np.random.rand(5, 2)
   ...: batch = [curve1, curve2]

In [2]: for b in batch:
   ...:     dim_difference = max_len - b.shape[0]
   ...:     print(np.pad(b, [(0, dim_difference), (0, 0)], 'constant'))
   ...:
[[ 0.92870128  0.12910409]
 [ 0.41894655  0.59203704]
 [ 0.3007023   0.52024492]
 [ 0.47086336  0.72839691]
 [ 0.          0.        ]
 [ 0.          0.        ]]
[[ 0.71349902  0.0967278 ]
 [ 0.5429274   0.19889411]
 [ 0.69114597  0.28624011]
 [ 0.43886002  0.54228625]
 [ 0.46894651  0.92786989]
 [ 0.          0.        ]]
- Have your next_batch() function return batches of curves with the same length.
These are the standard ways of doing things when dealing with jagged arrays.
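Both options can be sketched with plain NumPy. The helper names pad_batch and group_by_length are made up for illustration, and unlike the np.pad example, this sketch pads to the longest curve in the batch rather than to a dataset-wide maximum. It also returns the original lengths, on the assumption that the model will want to mask out the padded points later:

```python
import numpy as np
from collections import defaultdict


def pad_batch(curves):
    """Zero-pad curves of shape (L_i, 2) to the longest curve in the batch."""
    lengths = np.array([c.shape[0] for c in curves], dtype=np.int32)
    max_len = int(lengths.max())
    padded = np.zeros((len(curves), max_len, 2), dtype=np.float32)
    for i, c in enumerate(curves):
        padded[i, :c.shape[0]] = c  # trailing rows stay zero
    return padded, lengths


def group_by_length(curves):
    """Map each curve length L to a stacked array of shape (n, L, 2)."""
    buckets = defaultdict(list)
    for c in curves:
        buckets[c.shape[0]].append(c)
    return {length: np.stack(group) for length, group in buckets.items()}


rng = np.random.default_rng(0)
curves = [rng.random((n, 2)) for n in (4, 5, 4)]

padded, lengths = pad_batch(curves)
print(padded.shape, lengths)  # (3, 5, 2) [4 5 4]

buckets = group_by_length(curves)
print({length: b.shape for length, b in buckets.items()})  # {4: (2, 4, 2), 5: (1, 5, 2)}
```

Either result is a rectangular array, so it can be passed to the [None, None, 2] placeholder from the question through feed_dict, one array per sess.run() call.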
Another possibility, if your task allows for it, is to concatenate all your points in a single tensor of shape (None, 2) and change your model to operate on single points as if they were samples in a batch. If you save the original sample lengths in a separate array, you can then recover the per-curve outputs by slicing the model output accordingly. This is highly inefficient and requires all sorts of assumptions on your problem, but it's a possibility.
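The bookkeeping for this approach might look like the sketch below. The "model" here is just a stand-in that doubles every point; a real model would go in its place, and the variable names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
curves = [rng.random((n, 2)) for n in (4, 5, 3)]

# Remember each curve's length, then flatten all points into one (N, 2) array.
lengths = [c.shape[0] for c in curves]
points = np.concatenate(curves, axis=0)
print(points.shape)  # (12, 2)

# Stand-in for a model that processes every point independently.
outputs = points * 2.0

# Split the outputs back into per-curve arrays at the cumulative offsets.
offsets = np.cumsum(lengths)[:-1]
per_curve = np.split(outputs, offsets, axis=0)
print([o.shape for o in per_curve])  # [(4, 2), (5, 2), (3, 2)]
```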
Cheers and good luck!
Answers 4
Tensorflow Fold might be of interest to you.
From the Tensorflow Fold README:
"TensorFlow Fold is a library for creating TensorFlow models that consume structured data, where the structure of the computation graph depends on the structure of the input data.
Fold implements dynamic batching. Batches of arbitrarily shaped computation graphs are transformed to produce a static computation graph. This graph has the same structure regardless of what input it receives, and can be executed efficiently by TensorFlow."
The graph structure can be set up so as to accept an arbitrary L value so that any structured input can be read in. This is especially helpful when building architectures such as recursive neural nets. The overall structure is very similar to what you are used to (feed dicts, etc). Since you need a dynamic computational graph for your application, this might be a good move for you in the long run.