Reputation: 61
I have a set of 1D time series that, after passing through a series of convolutional layers, end up with the shape:
(batch_size, time_series_length, num_filters)
I would like to manually upsample the tensors by inserting alternating zeros (much like a transposed convolution), such that the new dimensionality becomes
(batch_size, 2*time_series_length, num_filters)
in order to be able to include an additional step before a convolutional layer. It is simple to do this in numpy, for example, with np.insert, but how does one do it with tensors?
I have looked at a few similar posts such as this, but I don't understand how to do this with multiple dimensions while preserving the other dimensions. Any thoughts?
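For reference, the numpy version I have in mind looks roughly like this (shapes just for illustration):
import numpy as np

x = np.random.rand(2, 4, 3)  # (batch_size, time_series_length, num_filters)
# insert a zero after every time step; the indices refer to the original array
up = np.insert(x, np.arange(1, x.shape[1] + 1), 0, axis=1)
print(up.shape)  # (2, 8, 3)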
Upvotes: 5
Views: 1413
Reputation: 1
Here is a solution which inserts factor - 1 zeros in between time samples for a tensor of shape (batch_size, time_series_length, num_channels):
def upsample(x, factor):
    # x has shape (batch_size, time_series_length, num_channels)
    L = tf.shape(x)[1]  # time series length
    ## repeat each sample `factor` times
    x = tf.repeat(x, tf.repeat(factor, L), axis=1)
    ## create a mask in order to replace the inserted samples by zeroes;
    ## the kept samples are scaled by `factor` to compensate for the amplitude
    ## loss of the optional low-pass filter below (use 1.0 instead of
    ## float(factor) for pure zero insertion)
    mask = tf.reshape(tf.repeat([tf.concat([[float(factor)], tf.zeros(factor - 1)], 0)], L, axis=0), [-1])
    # mask looks like [factor, 0, 0, 0, factor, 0, 0, 0, factor, ...] (here factor = 4)
    ## multiply by mask
    x = x * mask[tf.newaxis, :, tf.newaxis]  # mask is reshaped to broadcast along axis 1
    ## optional low-pass filtering to interpolate between the samples:
    # from scipy.signal import firwin2
    # filters = tf.convert_to_tensor(firwin2(32 * factor, [0.0, 0.95 / factor, 1.0 / factor, 1.0], [1.0, 1.0, 0.0, 0.0], window="blackman"), tf.float32)[:, tf.newaxis, tf.newaxis]
    # x = tf.nn.conv1d(x, filters, 1, 'SAME')
    return x
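A quick check in eager mode (using the upsample function above; note the factor scaling on the kept samples):
x = tf.ones((2, 4, 3))   # (batch_size, time_series_length, num_channels)
y = upsample(x, factor=2)
print(y.shape)           # (2, 8, 3)
print(y[0, :, 0])        # [2. 0. 2. 0. 2. 0. 2. 0.]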
Upvotes: 0
Reputation: 2316
I was working on a similar problem with images. I wanted to go from (batch, height, width, in_channels) to (batch, 2*height, 2*width, in_channels). Like you said, this is very much like a transposed convolution, so I ended up using tf.nn.conv2d_transpose with strides=2 and filters=tf.ones([1, 1, 1, 1]):
# images: (batch, height, width, 1); output_shape: [batch, 2*height, 2*width, 1]
upsampled_images = tf.nn.conv2d_transpose(images, tf.ones([1, 1, 1, 1]), output_shape, strides=2, padding='VALID')
This worked perfectly, so I think the same will be true for 1D by just using tf.nn.conv1d_transpose with filters=tf.ones([1, 1, 1]).
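A minimal sketch of that 1D variant (untested, single channel only, shapes just for illustration):
import tensorflow as tf

x = tf.random.normal([8, 100, 1])  # (batch, time, 1)
output_shape = [8, 200, 1]         # twice the time length
y = tf.nn.conv1d_transpose(x, tf.ones([1, 1, 1]), output_shape, strides=2, padding='VALID')
# y contains the samples of x interleaved with zeros along the time axis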
I know this question is old and you probably figured out a way since, but I was looking for the answer for a long time myself, so it will probably help others.
As pointed out by @A Roebel, this answer works only for single-channel images. Here is an extension to the multi-channel case, with a complete example:
import tensorflow as tf

image = tf.random.normal(shape=[1, 2, 2, 2])

def enlarge_one_channel_images(images):
    batch_size, height, width, n_channels = tf.shape(images)  # might not work in graph mode
    output_shape = [batch_size, 2*height, 2*width, 1]
    upsampled_images = tf.nn.conv2d_transpose(images, tf.ones([1, 1, 1, 1]), output_shape, strides=2, padding='VALID')
    return upsampled_images

# move the channel axis to the front and treat each channel as a one-channel image
image_reshaped = tf.transpose(image, [3, 0, 1, 2])[..., None]
batch_size, height, width, n_channels = tf.shape(image)  # might not work in graph mode
expected_output_shape = [batch_size, 2*height, 2*width, 1]
image_reshaped_enlarged = tf.map_fn(
    enlarge_one_channel_images,
    image_reshaped,
    fn_output_signature=tf.TensorSpec(expected_output_shape)
)
# move the channel axis back to the last position
image_enlarged = tf.transpose(image_reshaped_enlarged[..., 0], [1, 2, 3, 0])
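A quick sanity check of the final shape in eager mode:
print(image_enlarged.shape)  # (1, 4, 4, 2): height and width doubled, channels preserved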
As also pointed out by @A Roebel in his answer, this might not be the most efficient solution. I have not run the tests myself, but I agree that the additional convolution with the identity filter will surely slow things down, although I am not sure exactly how much acceleration tf.function would provide.
Upvotes: 1
Reputation: 333
I just had the same problem and found an issue in the solution shared by zaccharie-ramzi. The given solution does not work for signals with more than a single channel. I suggest here a fix for the convXd_transpose solution, together with a more efficient solution by means of reshaping and padding.
If you store the code below in a script named ./upsample_with_padding.py you can reproduce the following experiments. The script starts with the tensor sig = tf.ones((60, 10000, args.n_channels)) that is supposed to be upsampled by a factor upfac by means of inserting 0s in the time direction for all channels. The default upfac is 4 and the default number of channels is 2.
You can run it with the --check argument to see the shapes and to verify that the results obtained with the padding solution and with the corrected transposed-convolution implementation are equivalent.
> ./upsample_with_padding.py --check
upsig_conv (60, 40000, 2)
upsig_pad (60, 40000, 2)
diff: tf.Tensor(0.0, shape=(), dtype=float32)
Comparing the computational speed, we can see that the use of padding is much more efficient:
> ./upsample_with_padding.py
timeit conv: 9.84551206199103
timeit pad : 1.459020125999814
This is expected because the convXd_transpose operation performs the padding as well, but then has to convolve with an identity filter. Here is the script:
#! /usr/bin/env python3
import os

# silence verbose TF feedback
if 'TF_CPP_MIN_LOG_LEVEL' not in os.environ:
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = "2"

from argparse import ArgumentParser
import tensorflow as tf
import timeit


def up_pad(sig, upfac):
    # add a dummy axis after the time axis, zero-pad it to length upfac,
    # then fold it into the time axis: (B, T, C) -> (B, T*upfac, C)
    upsigp = tf.expand_dims(sig, axis=2)
    upsigp = tf.pad(upsigp, ((0, 0), (0, 0), (0, upfac - 1), (0, 0)))
    return tf.reshape(upsigp, shape=(sig.shape[0], sig.shape[1] * upfac, sig.shape[2]))


def up_conv(sig, upfac):
    # treat the channels as the width of a single-channel image so that the
    # identity filter does not mix channels: (B, T, C) -> (B, T, C, 1)
    upsigc = tf.expand_dims(sig, axis=-1)
    filter = tf.ones([1, 1, 1, 1])
    return tf.nn.conv2d_transpose(upsigc, filters=filter, strides=(upfac, 1), padding="VALID",
                                  data_format="NHWC",
                                  output_shape=(sig.shape[0], sig.shape[1] * upfac, sig.shape[2], 1))[:, :, :, 0]


parser = ArgumentParser()
parser.add_argument("--check", action="store_true")
parser.add_argument("--upfac", default=4, type=int)
parser.add_argument("--n_channels", default=2, type=int)
args = parser.parse_args()

sig = tf.ones((60, 10000, args.n_channels))

if args.check:
    upsig_conv = up_conv(sig, upfac=args.upfac)
    upsig_pad = up_pad(sig, upfac=args.upfac)
    print(f"upsig_conv {upsig_conv.shape}")
    print(f"upsig_pad  {upsig_pad.shape}")
    print("diff:", tf.reduce_max(tf.abs(upsig_conv - upsig_pad)))
else:
    print("timeit conv:", timeit.timeit(f'up_conv(sig, upfac={args.upfac})', globals=globals(), number=3000))
    print("timeit pad :", timeit.timeit(f'up_pad(sig, upfac={args.upfac})', globals=globals(), number=3000))
Upvotes: 0
Reputation: 1082
The short answer is: use tf.scatter_nd. The tricky part is constructing the indices for this operation. The following code example shows how you can do this for tensors with arbitrarily many dimensions.
import itertools
import numpy as np
import tensorflow as tf


def pad_strided(x, strides, name=None):
    # Preparatory steps and sanity checks.
    input_shape = x.shape.as_list()
    # Because life gets easier, we let the consumer specify a striding value for EACH dimension
    assert len(strides) == len(input_shape), "Rank of strides and x.shape must be the same"
    output_shape = [s_in * s for s_in, s in zip(input_shape, strides)]

    # Calculate the striding indices for EACH dimension.
    index_ranges = [list(range(0, s_out, s)) for s_out, s in zip(output_shape, strides)]

    # Expand the indices per dimension. The resulting array has shape [n_elements, n_dims].
    # n_elements is the number of values in the input tensor x, i.e. the product of the
    # input shape. n_dims is the number of input (and output) dimensions.
    indices_flat = np.array(list(itertools.product(*index_ranges)))

    # Reshape the flat index array to have the same dimensions as the input plus an
    # additional dimension. If the input had shape [s0, s1, ..., sn], then indices will
    # have shape [s0, s1, ..., sn, n_dims], i.e. rank one higher than the input tensor.
    indices = np.reshape(indices_flat, input_shape + [-1])

    # Now we simply call the TensorFlow operator (TF 1.x style).
    with tf.variable_scope(name, default_name="pad_strided"):
        t_indices = tf.constant(indices, dtype=tf.int32, name="indices")
        t_output_shape = tf.constant(output_shape, name="output_shape")
        return tf.scatter_nd(t_indices, x, t_output_shape)


session = tf.Session()

batch_size = 1
time_series_length = 6
num_filters = 3
t_in = tf.random.uniform((batch_size, time_series_length, num_filters))
# Specify a stride of 2 for the time_series dimension
t_out = pad_strided(t_in, strides=[1, 2, 1])

original, strided = session.run([t_in, t_out])
print(f"Input Tensor:\n{original[:, :, :]}")
print(f"Output Tensor:\n{strided[:, :, :]}")
The output would then be, for instance:
Input Tensor:
[[[0.0678339 0.07883668 0.49193358]
[0.5029118 0.8639555 0.74302936]
[0.995087 0.6315181 0.11990702]
[0.95606446 0.29059124 0.12656784]
[0.8278991 0.8518325 0.4033165 ]
[0.78434443 0.7894305 0.6251142 ]]]
Output Tensor:
[[[0.0678339 0.07883668 0.49193358]
[0. 0. 0. ]
[0.5029118 0.8639555 0.74302936]
[0. 0. 0. ]
[0.995087 0.6315181 0.11990702]
[0. 0. 0. ]
[0.95606446 0.29059124 0.12656784]
[0. 0. 0. ]
[0.8278991 0.8518325 0.4033165 ]
[0. 0. 0. ]
[0.78434443 0.7894305 0.6251142 ]
[0. 0. 0. ]]]
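If you only need to stride the time axis of a (batch, time, channels) tensor, the same tf.scatter_nd idea can be written more compactly. Here is a minimal TF 2.x (eager) sketch, assuming static shapes; the helper name is just illustrative:
import tensorflow as tf

def pad_strided_time(x, stride):
    # insert stride - 1 zeros along axis 1 of a (batch, time, channels) tensor
    b, t, c = x.shape
    indices = tf.reshape(tf.range(t) * stride, [-1, 1])   # strided positions along time
    xt = tf.transpose(x, [1, 0, 2])                       # (time, batch, channels)
    out = tf.scatter_nd(indices, xt, [t * stride, b, c])  # scatter rows, zeros elsewhere
    return tf.transpose(out, [1, 0, 2])                   # back to (batch, time, channels)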
Upvotes: 0