Reputation: 11198
I am trying to train a text-to-speech model in TensorFlow. To generate batches, I'm using the bucket_by_sequence_length function.
After I get each batch, I need to know the shape of each tensor. The problem is that the tensors come from arrays of different sizes.
I have tried get_shape().as_list() and tf.shape, but neither gives me the shape at runtime. For the temporal dimension, I always get None. Since I am planning to add a custom layer that also needs to know the temporal dimension, I am getting dimension mismatch errors.
_, (texts, mels, mags, fnames) = tf.contrib.training.bucket_by_sequence_length(
    input_length=text_length,
    tensors=[text, mel, mag, fname],
    batch_size=hp.B,
    bucket_boundaries=[i for i in range(minlen + 1, maxlen - 1, 20)],
    num_threads=1,
    capacity=hp.B,
    dynamic_pad=True)
For each batch, get_shape().as_list() reports the following shape for the tensors:
(10, None, 156)
Here, mel and mag are two tensors with an unknown temporal dimension (it depends on the audio file length), 10 is the batch size, and 156 is the feature size. I also need the exact temporal dimension, where I currently get None.
tf.shape returns Tensor("Text2Mel/AudioEnc/strided_slice:0", shape=(), dtype=int32), which does not give me anything useful either.
How can I infer the exact shape at runtime?
Actual Source which I have modified: https://github.com/Kyubyong/dc_tts/blob/master/data_load.py
Upvotes: 0
Views: 399
Reputation: 1538
It looks like you are using an older version of TensorFlow. The API has since moved to tf.data.experimental. If you use the tf.data.Dataset API, you can use the following function to get the length of each element (the size of its first dimension).
import tensorflow as tf

def _element_length_fn(x, y=None):
    return tf.shape(x)[0]  # public equivalent of array_ops.shape; evaluated at runtime
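To show how this fits together, here is a minimal sketch, assuming TensorFlow 2.4 or later (the question uses the older tf.contrib API, so this is not a drop-in replacement). The toy generator, the single bucket boundary of 10, and the helper names toy_mels, element_length_fn, and temporal_length are hypothetical and only illustrate the idea. The static shape of a padded batch stays None along the time axis while a graph is being built, but tf.shape yields the real value once the graph runs. In TF 1.x graph mode the same holds: tf.shape returns a symbolic tensor that only has a value when it is evaluated in a session.

```python
import numpy as np
import tensorflow as tf

FEATURES = 156  # feature size taken from the question


def toy_mels():
    # Hypothetical stand-in for real mel spectrograms of varying length.
    for n_frames in [3, 5, 12, 14]:
        yield np.zeros((n_frames, FEATURES), dtype=np.float32)


def element_length_fn(x):
    # Number of frames (first dimension) of a single, unbatched element.
    return tf.shape(x)[0]


dataset = tf.data.Dataset.from_generator(
    toy_mels,
    output_signature=tf.TensorSpec(shape=[None, FEATURES], dtype=tf.float32),
).apply(
    tf.data.experimental.bucket_by_sequence_length(
        element_length_func=element_length_fn,
        bucket_boundaries=[10],     # two buckets: length < 10 and length >= 10
        bucket_batch_sizes=[2, 2],  # one batch size per bucket
    )
)


@tf.function(input_signature=[tf.TensorSpec([None, None, FEATURES], tf.float32)])
def temporal_length(batch):
    # batch.shape[1] is None while tracing; tf.shape gives the runtime value.
    return tf.shape(batch)[1]


# Eager iteration exposes concrete shapes; the traced function relies on tf.shape.
shapes = [tuple(batch.shape) for batch in dataset]
lengths = [int(temporal_length(batch)) for batch in dataset]
print(shapes, lengths)
```

Inside a custom layer, the same pattern applies: read the temporal dimension with tf.shape(inputs)[1] rather than inputs.shape[1], so that the layer works even when the static shape is unknown.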
Upvotes: 1