Daniel Klauser

Reputation: 454

Convolution1D to Convolution2D

Summarize the problem: I have a raw signal from a sensor which is 76,000 data points long. I want to process this data with a CNN. To do that, I thought I could use a Lambda layer to compute a short-time Fourier transform (STFT) of the raw signal, such as

x = Lambda(lambda v: tf.abs(tf.signal.stft(v, frame_length=frame_length, frame_step=frame_step)))(x)

which works fine. But I want to go one step further and process the raw data in advance, in the hope that a Conv1D layer acts as a filter that lets some frequencies pass and blocks others.

What I tried: I have both separate models up and running (a Conv1D example for raw-data processing, and a Conv2D example where I process the STFT "image"). But I want to combine these.

Conv1D, where the input is: input = Input(shape=(76000,))

  x = Lambda(lambda v: tf.expand_dims(v, -1))(input)
  x = layers.Conv1D(filters=10, kernel_size=100, activation='relu')(x)
  x = Flatten()(x)
  output = Model(input, x)

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 76000)]           0         
_________________________________________________________________
lambda_2 (Lambda)            (None, 76000, 1)          0         
_________________________________________________________________
conv1d (Conv1D)              (None, 75901, 10)         1010      
________________________________________________________________

Conv2D, with the same input:

  x = Lambda(lambda v: tf.expand_dims(tf.abs(tf.signal.stft(v, frame_length=frame_length, frame_step=frame_step)), -1))(input)
  x = BatchNormalization()(x)
Model: "model_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_6 (InputLayer)         [(None, 76000)]           0         
_________________________________________________________________
lambda_8 (Lambda)            (None, 751, 513, 1)       0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 751, 513, 1)       4         
_________________________________________________________________
. . .
. . . 
flatten_4 (Flatten)          (None, 1360)              0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 1360)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 1361      

I'm looking for a way to connect the output of the "conv1d" layer to the "lambda_8" layer. If I put them together, I get:

  x = Lambda(lambda v: tf.expand_dims(v, -1))(input)
  x = layers.Conv1D(filters=10, kernel_size=100, activation='relu')(x)
  #x = Flatten()(x)
  x = Lambda(lambda v: tf.expand_dims(tf.abs(tf.signal.stft(v, frame_length=frame_length, frame_step=frame_step)), -1))(x)
Layer (type)                 Output Shape              Param #   
=================================================================
input_6 (InputLayer)         [(None, 76000)]           0         
_________________________________________________________________
lambda_17 (Lambda)           (None, 76000, 1)          0         
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 75901, 10)         1010      
_________________________________________________________________
lambda_18 (Lambda)           (None, 75901, 0, 513, 1)  0         <-- Wrong
=================================================================

This is not what I am looking for. It should look more like (None, 751, 513, 10, 1). So far I have not found a suitable solution. Can someone help me?

Thanks in advance!

Upvotes: 0

Views: 181

Answers (1)

Daniel Möller

Reputation: 86650

From the documentation, it seems that stft only accepts (..., length) inputs; it doesn't accept (..., length, channels).
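
For example, a minimal shape check (frame_length=1000 and frame_step=100 are assumed here; they reproduce the (751, 513) spectrogram shape from the question):

import tensorflow as tf

#tf.signal.stft transforms the last axis only; leading axes act as batch dims
signals = tf.random.normal([2, 10, 76000])   #(batch, channels, length)
spec = tf.abs(tf.signal.stft(signals, frame_length=1000, frame_step=100))
print(spec.shape)                            #(2, 10, 751, 513)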

Thus, the first suggestion is to move the channels to another dimension first, keeping the length at the last index so the function works.

Now, of course, you will need matching lengths; you can't match 76000 with 75901. Thus the second suggestion is to use padding='same' in the 1D convolutions to keep the lengths equal.
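
To illustrate (a small sketch using the question's kernel_size=100):

import tensorflow as tf
from tensorflow.keras.layers import Conv1D

x = tf.random.normal([1, 76000, 1])
print(Conv1D(10, 100)(x).shape)                  #(1, 75901, 10), default padding='valid'
print(Conv1D(10, 100, padding='same')(x).shape)  #(1, 76000, 10)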

And lastly, since you will already have 10 channels in the result of the stft, you don't need to expand dims in the last lambda.

Summarizing:

1D part

inputs = Input((76000,)) #(batch, 76000)

c1Out = Lambda(lambda x: K.expand_dims(x, axis=-1))(inputs) #(batch, 76000, 1)
c1Out = Conv1D(10, 100, activation = 'relu', padding='same')(c1Out) #(batch, 76000, 10)

#permute for putting length last, apply stft, put the channels back to their position
c1Stft = Permute((2,1))(c1Out) #(batch, 10, 76000)
c1Stft = Lambda(lambda v: tf.abs(tf.signal.stft(v,
                                                frame_length=frame_length,
                                                frame_step=frame_step)
                                 )
                )(c1Stft) #(batch, 10, probably 751, probably 513)
c1Stft = Permute((2,3,1))(c1Stft) #(batch, 751, 513, 10)

2D part, your code seems ok:

c2Out = Lambda(lambda v: tf.expand_dims(tf.abs(tf.signal.stft(v,
                                                              frame_length=frame_length,
                                                              frame_step=frame_step)
                                               ),
                                        -1))(inputs) #(batch, 751, 513, 1)

Now that everything has compatible dimensions:

#maybe
#c2Out = Conv2D(10, ..., padding='same')(c2Out) 

joined = Concatenate()([c1Stft, c2Out]) #(batch, 751, 513, 11) #maybe (batch, 751, 513, 20)

further = BatchNormalization()(joined)
further = Conv2D(...)(further)
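
For completeness, here is a minimal sketch assembling the pieces above into one trainable model. The frame parameters (1000/100), the Conv2D filter count, the pooling layer, and the Dense head are my assumptions, not part of the answer:

import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import (Input, Conv1D, Conv2D, Lambda, Permute,
                                     Concatenate, BatchNormalization,
                                     GlobalAveragePooling2D, Dense)

frame_length, frame_step = 1000, 100  #assumed; they reproduce (751, 513)

def stft_mag(v):
    return tf.abs(tf.signal.stft(v, frame_length=frame_length,
                                 frame_step=frame_step))

inputs = Input((76000,))

#filtered branch: Conv1D over time, then STFT per channel
x = Lambda(lambda v: tf.expand_dims(v, -1))(inputs)        #(batch, 76000, 1)
x = Conv1D(10, 100, activation='relu', padding='same')(x)  #(batch, 76000, 10)
x = Permute((2, 1))(x)                                     #(batch, 10, 76000)
x = Lambda(stft_mag)(x)                                    #(batch, 10, 751, 513)
c1Stft = Permute((2, 3, 1))(x)                             #(batch, 751, 513, 10)

#raw branch: STFT of the unfiltered signal
c2Out = Lambda(lambda v: tf.expand_dims(stft_mag(v), -1))(inputs)  #(batch, 751, 513, 1)

joined = Concatenate()([c1Stft, c2Out])                    #(batch, 751, 513, 11)
x = BatchNormalization()(joined)
x = Conv2D(16, 3, activation='relu')(x)                    #filter count assumed
x = GlobalAveragePooling2D()(x)
outputs = Dense(1, activation='sigmoid')(x)                #head assumed

model = Model(inputs, outputs)
model.summary()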

Warning: I don't know whether they made stft differentiable or not; the Conv1D part will only work if the gradients are defined.
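
If in doubt, here is a quick check (a sketch, not from the original answer; tf.nn.conv1d stands in for the Conv1D layer, and frame_length=1000 / frame_step=100 are assumed):

import tensorflow as tf

x = tf.random.normal([1, 76000])
w = tf.Variable(tf.random.normal([100, 1, 10]))  #stand-in for a Conv1D kernel
with tf.GradientTape() as tape:
    y = tf.nn.conv1d(tf.expand_dims(x, -1), w, stride=1, padding='SAME')  #(1, 76000, 10)
    y = tf.transpose(y, [0, 2, 1])                                        #(1, 10, 76000)
    s = tf.abs(tf.signal.stft(y, frame_length=1000, frame_step=100))      #(1, 10, 751, 513)
    loss = tf.reduce_mean(s)

#a non-None gradient means stft is differentiable and the Conv1D part can train
print(tape.gradient(loss, w) is not None)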

Upvotes: 1
