How to get the tensor shape dynamically for bucket_by_sequence_length with dynamic padding?

Question

I am trying to train a text-to-speech model in TensorFlow. To generate batches, I am using the bucket_by_sequence_length function.

After I get each batch, I need to know the shape of each tensor. The problem is that the tensors come from arrays of different sizes.

I have tried get_shape().as_list() and tf.shape, but neither gives me the shape at runtime. For the temporal dimension, I always get None. Because I plan to add a custom layer that also needs to know the temporal dimension, I get dimension-mismatch errors.

_, (texts, mels, mags, fnames) = tf.contrib.training.bucket_by_sequence_length(
    input_length=text_length,
    tensors=[text, mel, mag, fname],
    batch_size=hp.B,
    bucket_boundaries=list(range(minlen + 1, maxlen - 1, 20)),
    num_threads=1,
    capacity=hp.B,
    dynamic_pad=True)

Each time I get a batch, get_shape().as_list() returns the following shape for the tensors: (10, None, 156). Here, mel and mag are two tensors with an unknown temporal dimension (it depends on the audio file length).

Here, 10 is the batch size and 156 is the feature size. I also need the exact temporal dimension, which is where I get None.
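To illustrate why the temporal dimension cannot be fixed ahead of time, here is a minimal pure-Python sketch of dynamic padding: each batch is padded only up to the longest sequence it contains, so the middle dimension changes from batch to batch. The helpers pad_batch and make_clip are hypothetical names written for this illustration and are not part of TensorFlow; the clip lengths are made up.

```python
def pad_batch(sequences, pad_value=0.0):
    """Pad every sequence to the longest one in this batch (hypothetical helper)."""
    max_len = max(len(seq) for seq in sequences)
    feature_size = len(sequences[0][0])
    padded = []
    for seq in sequences:
        # Rows of padding needed to reach the longest sequence in the batch.
        pad_rows = [[pad_value] * feature_size for _ in range(max_len - len(seq))]
        padded.append([list(frame) for frame in seq] + pad_rows)
    return padded, (len(sequences), max_len, feature_size)


def make_clip(num_frames, feature_size=156):
    """Build a dummy clip of num_frames frames (made-up data for illustration)."""
    return [[1.0] * feature_size for _ in range(num_frames)]


# Two batches drawn from different length buckets.
batch_a = [make_clip(n) for n in (40, 52, 47)]
batch_b = [make_clip(n) for n in (90, 81, 99)]

_, shape_a = pad_batch(batch_a)
_, shape_b = pad_batch(batch_b)

print(shape_a)  # (3, 52, 156)
print(shape_b)  # (3, 99, 156)
```

The batch size and feature size stay constant, while the temporal dimension is only known once a concrete batch exists. That is what the None in (10, None, 156) stands for.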

tf.shape returns Tensor("Text2Mel/AudioEnc/strided_slice:0", shape=(), dtype=int32). I don't get anything useful from this either.
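For reference, the printed value above is a symbolic scalar tensor, not a number: in graph mode it only takes a concrete value when evaluated in a session. The sketch below shows the difference between the static shape and the dynamic shape. It assumes a TensorFlow build that exposes tf.compat.v1 (including disable_eager_execution), and it uses a placeholder as a stand-in for the mel tensor that the batching queue would produce; the length 37 is an arbitrary example.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # graph mode, as in the original TF1-style code

# Stand-in for a batched tensor with an unknown temporal dimension.
mel = tf1.placeholder(tf.float32, shape=(10, None, 156))

# Static shape: known at graph-construction time, so the time axis is None.
static_shape = mel.get_shape().as_list()

# Dynamic shape: a scalar int32 tensor that holds the real length at run time.
time_steps_op = tf.shape(mel)[1]

with tf1.Session() as sess:
    # Feed a dummy batch whose temporal length is 37.
    dummy = np.zeros((10, 37, 156), dtype=np.float32)
    time_steps = sess.run(time_steps_op, feed_dict={mel: dummy})

print(static_shape)  # [10, None, 156]
print(time_steps)    # 37
```

Inside the graph, time_steps_op can be passed directly to ops such as tf.reshape or tf.tile without being converted to a Python integer first.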

How can I infer the exact shape at runtime?

Original source, which I have modified: https://github.com/Kyubyong/dc_tts/blob/master/data_load.py

