Python 3.x Loop over tensor dimension 0 (NoneType) with second tensor values

I have a tensor a, and I'd like to loop over its rows and index values based on another tensor l; i.e. l gives the length of the vector I need from each row.

sess = tf.InteractiveSession()

a = tf.constant(np.random.rand(3,4)) # shape=(3,4)
a.eval()

Out:
array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
       [0.47296348, 0.96773956, 0.61336239, 0.6093023 ],
       [0.42492552, 0.2556728 , 0.86135674, 0.86679779]])

l = tf.constant(np.array([3,2,4])) # shape=(3,)
l.eval()

Out:
array([3, 2, 4])

Expected output:

[array([0.35879311, 0.35347166, 0.31525201]),
 array([0.47296348, 0.96773956]),
 array([0.42492552, 0.2556728 , 0.86135674, 0.86679779])]
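For reference, the expected result is just a per-row slice. Outside the TF graph, the equivalent eager/NumPy operation would be (a sketch, not graph code):

```python
import numpy as np

a = np.array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
              [0.47296348, 0.96773956, 0.61336239, 0.6093023 ],
              [0.42492552, 0.2556728 , 0.86135674, 0.86679779]])
l = np.array([3, 2, 4])

# slice each row to its own length
out = [row[:n] for row, n in zip(a, l)]
print(out)
```

This is exactly the ragged structure that is hard to express as a single dense Tensor.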

The tricky part is that a could have None as its first dimension, since that is usually the batch size defined through a placeholder.

I cannot simply use a mask and condition, as below, since I need to compute the variance of each row individually.

condition = tf.sequence_mask(l, tf.reduce_max(l))
a_true = tf.boolean_mask(a, condition)
a_true

Out:
array([0.35879311, 0.35347166, 0.31525201, 0.47296348, 0.96773956,
       0.42492552, 0.2556728 , 0.86135674, 0.86679779])
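As a side note (a NumPy sketch, not graph code): since tf.boolean_mask flattens the result, the row structure could in principle be recovered by splitting at the cumulative lengths, which makes per-row variances possible again. This still doesn't directly help inside a graph with an unknown batch size, though:

```python
import numpy as np

a = np.array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
              [0.47296348, 0.96773956, 0.61336239, 0.6093023 ],
              [0.42492552, 0.2556728 , 0.86135674, 0.86679779]])
l = np.array([3, 2, 4])

# what boolean_mask effectively returns: all kept values, flattened
flat = np.concatenate([row[:n] for row, n in zip(a, l)])

# split the flat array back into rows at the cumulative lengths
rows = np.split(flat, np.cumsum(l)[:-1])
variances = [r.var() for r in rows]  # per-row variance is possible again
```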

I also tried to use tf.map_fn, but can't get it to work.

elems = (a, l)
tf.map_fn(lambda x: x[0][:x[1]], elems)

Any help will be highly appreciated!


#1

A TensorArray object can store tensors of different shapes. However, it is still not that simple. Take a look at this example, which does what you want using tf.while_loop() with a tf.TensorArray and the tf.slice() function:

import tensorflow as tf
import numpy as np

batch_data = np.array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
                       [0.47296348, 0.96773956, 0.61336239, 0.6093023 ],
                       [0.42492552, 0.2556728 , 0.86135674, 0.86679779]])
batch_idx = np.array([3, 2, 4]).reshape(-1, 1)

x = tf.placeholder(tf.float32, shape=(None, 4))
idx = tf.placeholder(tf.int32, shape=(None, 1))

n_items = tf.shape(x)[0]
init_ary = tf.TensorArray(dtype=tf.float32,
                          size=n_items,
                          infer_shape=False)
def _first_n(i, ta):
    ta = ta.write(i, tf.slice(input_=x[i],
                              begin=tf.convert_to_tensor([0], tf.int32),
                              size=idx[i]))
    return i+1, ta

_, first_n = tf.while_loop(lambda i, ta: i < n_items,
                           _first_n,
                           [0, init_ary])
first_n = [first_n.read(i)                      # <-- extracts the tensors
           for i in range(batch_data.shape[0])] #     that you're looking for

with tf.Session() as sess:
    res = sess.run(first_n, feed_dict={x:batch_data, idx:batch_idx})
    print(res)
    # [array([0.3587931 , 0.35347167, 0.315252  ], dtype=float32),
    #  array([0.47296348, 0.9677396 ], dtype=float32),
    #  array([0.4249255 , 0.2556728 , 0.86135674, 0.8667978 ], dtype=float32)]

Note

  • We still have to use the batch size to extract elements one by one from the first_n TensorArray using the read() method. We can't use any other method that returns a Tensor, because the rows have different sizes (except the TensorArray.concat method, but it returns all elements stacked along one dimension).

  • If the TensorArray has fewer elements than the index you pass to TensorArray.read(index), you will get an InvalidArgumentError.

  • You can't use tf.map_fn because it must return a Tensor, whose elements all have to be of the same shape.
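The same-shape constraint from the last note can be illustrated with plain NumPy (a sketch, not TF code): stacking rows of different lengths into one array fails for the same reason tf.map_fn cannot return them.

```python
import numpy as np

rows = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])]

try:
    np.stack(rows)  # requires all inputs to have the same shape
except ValueError as e:
    print("cannot stack ragged rows:", e)
```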

The task is simpler if you only need to compute the variances of the first n elements of each row (without actually gathering the elements of different sizes together). In this case, we can directly compute the variance of each sliced tensor, write it to a TensorArray, and then stack it into a tensor:

n_items = tf.shape(x)[0]
init_ary = tf.TensorArray(dtype=tf.float32,
                          size=n_items,
                          infer_shape=False)
def _variances(i, ta, begin=tf.convert_to_tensor([0], tf.int32)):
    _, variance = tf.nn.moments(
        tf.slice(input_=x[i], begin=begin, size=idx[i]),
        axes=[0]) # <-- compute variance of the sliced row
    ta = ta.write(i, variance) # <-- write variance of each row to `TensorArray`
    return i+1, ta


_, variances = tf.while_loop(lambda i, ta: i < n_items,
                             _variances,
                             [0, init_ary])
variances = variances.stack() # <-- read from `TensorArray` to `Tensor`
with tf.Session() as sess:
    res = sess.run(variances, feed_dict={x:batch_data, idx:batch_idx})
    print(res) # [0.0003761  0.06120085 0.07217039]
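As a sanity check, the same three variances can be reproduced in plain NumPy. Note that tf.nn.moments computes the population variance (divide by n, i.e. ddof=0), which is also NumPy's default:

```python
import numpy as np

batch_data = np.array([[0.35879311, 0.35347166, 0.31525201, 0.24089784],
                       [0.47296348, 0.96773956, 0.61336239, 0.6093023 ],
                       [0.42492552, 0.2556728 , 0.86135674, 0.86679779]])
lengths = np.array([3, 2, 4])

# population variance of the first n elements of each row
variances = np.array([row[:n].var() for row, n in zip(batch_data, lengths)])
print(variances)  # ≈ [0.0003761  0.06120085 0.07217039]
```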

#2

Thanks for the answer. In my opinion, looping over the batch_size will most likely raise an IndexError on the last batch of the data, since the last batch usually has fewer rows than the predefined batch_size.

#3

If you only need the variances, read the second part that I just appended; it computes the variances of the sliced tensors directly.

#4

Glad that worked for you. Consider accepting my answer then.