Animesh Karnewar

Reputation: 436

tf.nn.conv2d_transpose output_shape dynamic batch_size

The documentation of tf.nn.conv2d_transpose says:

tf.nn.conv2d_transpose(
    value,
    filter,
    output_shape,
    strides,
    padding='SAME',
    data_format='NHWC',
    name=None
)

The output_shape argument requires a 1-D tensor specifying the shape of the tensor output by this op. Since my conv-net has been built entirely on placeholders with a dynamic batch dimension, I can't seem to devise a workaround for the static batch_size that output_shape appears to require.

There are many discussions around the web about this, but I couldn't find any solid solution. Most of them are hacky, relying on a global_batch_size variable. I'd like to know the best possible solution to this problem, since the trained model is going to be shipped as a deployed service.
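
To illustrate, here is a minimal sketch of the kind of hacky workaround I mean (the shapes and the GLOBAL_BATCH_SIZE name are just placeholders):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 8, 8, 6])  # dynamic batch dim
w = tf.get_variable('w', shape=[2, 2, 3, 6], dtype=tf.float32)

# hard-coding the batch size builds fine, but breaks at serving time
# whenever a request arrives with a different batch size
GLOBAL_BATCH_SIZE = 32
deconv = tf.nn.conv2d_transpose(
    x,
    filter=w,
    output_shape=[GLOBAL_BATCH_SIZE, 16, 16, 3],
    strides=[1, 2, 2, 1],
    padding='SAME')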

Upvotes: 2

Views: 3645

Answers (4)

Saten Harutyunyan

Reputation: 9

Just use tf.shape(X_batch)[0] wherever you need train_batch_size, for example:
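
A minimal sketch (the placeholder and filter shapes here are just illustrative):

import tensorflow as tf

X_batch = tf.placeholder(tf.float32, shape=[None, 8, 8, 6])
train_batch_size = tf.shape(X_batch)[0]  # dynamic, resolved at runtime

deconv_filter = tf.get_variable('deconv_w', shape=[2, 2, 3, 6], dtype=tf.float32)
deconv = tf.nn.conv2d_transpose(
    X_batch,
    filter=deconv_filter,
    # build output_shape as a tensor so the batch dim stays dynamic
    output_shape=tf.stack([train_batch_size, 16, 16, 3]),
    strides=[1, 2, 2, 1],
    padding='SAME')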

Upvotes: 0

Ali Salehi

Reputation: 1073

You can use the following code to calculate the output_shape parameter for tf.nn.conv2d_transpose from the input to this layer (input) and the number of output channels of this layer (num_outputs), given the filter size, stride, padding, and data_format.

def calculate_output_shape(input, filter_size_h, filter_size_w,
                           stride_h, stride_w, num_outputs,
                           padding='SAME', data_format='NHWC'):

    # static spatial size of the input, per data_format
    if data_format == "NHWC":
        input_size_h = input.get_shape().as_list()[1]
        input_size_w = input.get_shape().as_list()[2]
    elif data_format == "NCHW":
        input_size_h = input.get_shape().as_list()[2]
        input_size_w = input.get_shape().as_list()[3]
    else:
        raise ValueError("unknown data_format")

    if padding == 'VALID':
        output_size_h = (input_size_h - 1)*stride_h + filter_size_h
        output_size_w = (input_size_w - 1)*stride_w + filter_size_w
    elif padding == 'SAME':
        # the smallest output size compatible with SAME padding;
        # input_size*stride is equally valid
        output_size_h = (input_size_h - 1)*stride_h + 1
        output_size_w = (input_size_w - 1)*stride_w + 1
    else:
        raise ValueError("unknown padding")

    # take the batch dimension dynamically via tf.shape so the result
    # also works with a None (dynamic) batch size; the channel
    # dimension goes wherever data_format puts it
    if data_format == "NHWC":
        output_shape = tf.stack([tf.shape(input)[0],
                                 output_size_h, output_size_w,
                                 num_outputs])
    else:
        output_shape = tf.stack([tf.shape(input)[0], num_outputs,
                                 output_size_h, output_size_w])

    return output_shape
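
A hypothetical usage sketch (the shapes and variable names here are just for illustration):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 8, 8, 16])
# filter layout for conv2d_transpose is [h, w, output_channels, in_channels]
w = tf.get_variable('w', shape=[3, 3, 32, 16], dtype=tf.float32)

output_shape = calculate_output_shape(x, 3, 3, 2, 2, 32)
deconv = tf.nn.conv2d_transpose(
    x, filter=w, output_shape=output_shape,
    strides=[1, 2, 2, 1], padding='SAME')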

Upvotes: 2

nessuno

Reputation: 27042

You can use the dynamic shape of a reference tensor instead of the static one.

Usually, when you use the conv2d_transpose operation, you're "upsampling" a layer in order to match the shape of another tensor in your network.

If, for instance, you want to replicate the shape of the input_tensor tensor, you can do something like:

import tensorflow as tf

input_tensor = tf.placeholder(dtype=tf.float32, shape=[None, 16, 16, 3])
# static shape
print(input_tensor.shape)

conv_filter = tf.get_variable(
    'conv_filter', shape=[2, 2, 3, 6], dtype=tf.float32)
conv1 = tf.nn.conv2d(
    input_tensor, conv_filter, strides=[1, 2, 2, 1], padding='SAME')
# static shape
print(conv1.shape)

deconv_filter = tf.get_variable(
    'deconv_filter', shape=[2, 2, 3, 6], dtype=tf.float32)

deconv = tf.nn.conv2d_transpose(
    conv1,
    filter=deconv_filter,
    # use tf.shape to get the dynamic shape of the tensor,
    # known only at RUNTIME
    output_shape=tf.shape(input_tensor),
    strides=[1, 2, 2, 1],
    padding='SAME')
print(deconv.shape)

The program outputs:

(?, 16, 16, 3)
(?, 8, 8, 6)
(?, ?, ?, ?)

As you can see, the last shape is completely unknown at graph-construction time, because I'm setting the output shape of conv2d_transpose to the result of the tf.shape operation, which returns a tensor whose values are only known at runtime.
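
And, just to verify at runtime (any input values will do, so I feed zeros):

import numpy as np

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for batch_size in (1, 5):
        batch = np.zeros((batch_size, 16, 16, 3), dtype=np.float32)
        # the dynamic shape resolves to the fed batch size:
        # (1, 16, 16, 3), then (5, 16, 16, 3)
        print(sess.run(deconv, {input_tensor: batch}).shape)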

Upvotes: 3

Prasad

Reputation: 6034

You can use a value of -1 as a substitute for the exact batch_size. Consider the example below, in which I convert an input tensor with a variable batch size and shape (16, 16, 3) to one of shape (32, 32, 6).

import tensorflow as tf

input_tensor = tf.placeholder(dtype=tf.float32, shape=[None, 16, 16, 3])
print(input_tensor.shape)

my_filter = tf.get_variable('filter', shape=[2, 2, 6, 3], dtype=tf.float32)
conv = tf.nn.conv2d_transpose(input_tensor,
                              filter=my_filter,
                              # -1 stands in for the unknown batch size
                              output_shape=[-1, 32, 32, 6],
                              strides=[1, 2, 2, 1],
                              padding='SAME')
print(conv.shape)

This outputs:

(?, 16, 16, 3)
(?, 32, 32, 6)

Upvotes: 1
