Reputation: 30677
I am new to deep learning, the keras API, and convolutional networks, so apologies beforehand if these mistakes are naive. I am trying to build a simple convolutional neural network for classification. The input data X has 286 samples, each with 500 timepoints of 4 dimensions. The dimensions are one-hot encodings of categorical variables. I wasn't sure what to do for Y, so I just clustered the samples and then one-hot-encoded the cluster labels to have data to experiment with for the modeling. The Y target data has 286 samples with one-hot encodings for 6 categories. My ultimate goal is just to get it to run so I can figure out how to change it for actually useful learning problems and use the hidden layers for feature extraction.
My problem is that I can't get the shapes to match up in the final layer.
The model I made does the following:
(1) Inputs the data
(2) Convolutional layer
(3) Maxpooling layer
(4) Dropout regularization
(5) Large fully connected layer
(6) Output layer
import tensorflow as tf
import numpy as np
# Data Description
print(X[0,:])
# [[0 0 1 0]
# [0 0 1 0]
# [0 1 0 0]
# ...,
# [0 0 1 0]
# [0 0 1 0]
# [0 0 1 0]]
print(Y[0,:])
# [0 0 0 0 0 1]
X.shape, Y.shape
# ((286, 500, 4), (286, 6))
# Tensorboard callback
tensorboard= tf.keras.callbacks.TensorBoard()
# Build the model
# Input Layer taking in 500 time points with 4 dimensions
input_layer = tf.keras.layers.Input(shape=(500,4), name="sequence")
# 1 Dimensional Convolutional layer with 320 filters and a kernel size of 26
conv_layer = tf.keras.layers.Conv1D(320, 26, strides=1, activation="relu", )(input_layer)
# Maxpooling layer
maxpool_layer = tf.keras.layers.MaxPooling1D(pool_size=13, strides=13)(conv_layer)
# Dropout regularization
drop_layer = tf.keras.layers.Dropout(0.3)(maxpool_layer)
# Fully connected layer
dense_layer = tf.keras.layers.Dense(512, activation='relu')(drop_layer)
# Softmax activation to get probabilities for output layer
activation_layer = tf.keras.layers.Activation("softmax")(dense_layer)
# Output layer with probabilities
output = tf.keras.layers.Dense(num_classes)(activation_layer)
# Build model
model = tf.keras.models.Model(inputs=input_layer, outputs=output, name="conv_model")
model.compile(loss="categorical_crossentropy", optimizer="adam", callbacks=[tensorboard])
model.summary()
# _________________________________________________________________
# Layer (type) Output Shape Param #
# =================================================================
# sequence (InputLayer) (None, 500, 4) 0
# _________________________________________________________________
# conv1d_9 (Conv1D) (None, 475, 320) 33600
# _________________________________________________________________
# max_pooling1d_9 (MaxPooling1 (None, 36, 320) 0
# _________________________________________________________________
# dropout_9 (Dropout) (None, 36, 320) 0
# _________________________________________________________________
# dense_16 (Dense) (None, 36, 512) 164352
# _________________________________________________________________
# activation_7 (Activation) (None, 36, 512) 0
# _________________________________________________________________
# dense_17 (Dense) (None, 36, 6) 3078
# =================================================================
# Total params: 201,030
# Trainable params: 201,030
# Non-trainable params: 0
model.fit(X,Y, batch_size=128, epochs=100)
# ValueError: Error when checking target: expected dense_17 to have shape (None, 36, 6) but got array with shape (286, 6, 1)
Upvotes: 4
Views: 1863
Reputation: 2552
Conv1D's output is a rank-3 tensor of shape (batch, observations, kernels):
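> from tensorflow.keras import backend as K
> from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D
> x = Input(shape=(500, 4))
> y = Conv1D(320, 26, strides=1, activation="relu")(x)
> y = MaxPooling1D(pool_size=13, strides=13)(y)
> print(K.int_shape(y))
(None, 36, 320)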
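The 475 and 36 in these shapes follow from the usual output-length arithmetic for layers without padding. A minimal sketch in plain Python, where valid_output_length is a hypothetical helper name introduced here for illustration and not a Keras function:

```python
def valid_output_length(input_length, window, stride=1):
    """Output length of a 1D conv or pooling layer with no ('valid') padding."""
    return (input_length - window) // stride + 1

# Conv1D(320, 26, strides=1) applied to 500 time points
conv_len = valid_output_length(500, window=26, stride=1)
# MaxPooling1D(pool_size=13, strides=13) applied to the conv output
pool_len = valid_output_length(conv_len, window=13, stride=13)

print(conv_len, pool_len)  # 475 36
```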
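However, the Dense layers that follow need a rank-2 tensor (batch, features) for the output to line up with your (286, 6) targets. Given a rank-3 input, Dense is applied along the last axis only, which is why your summary ends in (None, 36, 6). Placing a Flatten, GlobalAveragePooling1D or GlobalMaxPooling1D layer between the convolutional block and the dense layers is sufficient to fix this: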
Flatten will reshape a (batch, observations, kernels) tensor into a (batch, observations * kernels) one:
....
y = Conv1D(320, 26, strides=1, activation="relu")(x)
y = MaxPooling1D(pool_size=13, strides=13)(y)
y = Flatten()(y)
y = Dropout(0.3)(y)
y = Dense(512, activation='relu')(y)
....
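Filling in the elided parts, a complete version might look like the sketch below. The data is random stand-in data with the shapes from the question, num_classes = 6 is assumed from the shape of Y, and moving the softmax onto the final Dense layer is my assumption about what was intended, not something the original code does:

```python
import numpy as np
import tensorflow as tf

num_classes = 6  # assumption: matches the 6 one-hot categories in Y

# Random stand-in data with the shapes from the question
rng = np.random.default_rng(0)
X = np.eye(4, dtype="float32")[rng.integers(0, 4, size=(286, 500))]
Y = np.eye(num_classes, dtype="float32")[rng.integers(0, num_classes, size=286)]

inputs = tf.keras.Input(shape=(500, 4), name="sequence")
y = tf.keras.layers.Conv1D(320, 26, strides=1, activation="relu")(inputs)
y = tf.keras.layers.MaxPooling1D(pool_size=13, strides=13)(y)
y = tf.keras.layers.Flatten()(y)  # (None, 36, 320) -> (None, 36 * 320)
y = tf.keras.layers.Dropout(0.3)(y)
y = tf.keras.layers.Dense(512, activation="relu")(y)
# Assumption: softmax on the last layer so the outputs are class probabilities
outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(y)

model = tf.keras.Model(inputs=inputs, outputs=outputs, name="conv_model")
model.compile(loss="categorical_crossentropy", optimizer="adam")
print(model.output_shape)  # (None, 6)

# One epoch only, to confirm that the target shape is now accepted
history = model.fit(X, Y, batch_size=128, epochs=1, verbose=0)
```

With the output shape reduced to (None, 6), fit no longer complains about the (286, 6) targets.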
GlobalAveragePooling1D will average over all observations in a (batch, observations, kernels) tensor, resulting in a (batch, kernels) one:
....
y = Conv1D(320, 26, strides=1, activation="relu")(x)
y = GlobalAveragePooling1D()(y)
y = Dropout(0.3)(y)
y = Dense(512, activation='relu')(y)
....
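To see what each of the three options does to the shape, here is a small sketch on a random stand-in tensor (batch of 2, 36 observations, 320 kernels; the values themselves are arbitrary). It also checks the two global pooling layers against the equivalent NumPy reductions over the observations axis:

```python
import numpy as np
import tensorflow as tf

# Stand-in tensor: batch=2, observations=36, kernels=320
t = np.random.default_rng(0).random((2, 36, 320)).astype("float32")
x = tf.constant(t)

flat = tf.keras.layers.Flatten()(x)                # shape (2, 36 * 320)
avg = tf.keras.layers.GlobalAveragePooling1D()(x)  # shape (2, 320)
mx = tf.keras.layers.GlobalMaxPooling1D()(x)       # shape (2, 320)

print(tuple(flat.shape), tuple(avg.shape), tuple(mx.shape))

# Global pooling is a mean or max over the observations axis (axis 1)
print(np.allclose(avg.numpy(), t.mean(axis=1), atol=1e-5))
print(np.allclose(mx.numpy(), t.max(axis=1)))
```

Flatten keeps every value and therefore feeds far more weights into the next Dense layer. The global pooling layers keep one value per kernel.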
There also seems to be a problem with how you use the tensorboard callback: callbacks is an argument of model.fit, not of model.compile. This one is easy to fix by moving it to the fit call.
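A minimal sketch of the callback wiring, with callbacks passed to model.fit rather than model.compile. The tiny model, the random data and the temporary log directory are stand-ins chosen only to keep the example fast; they are not from the question:

```python
import tempfile

import numpy as np
import tensorflow as tf

# Tiny stand-in data and model, only to show where the callback goes
X = np.random.default_rng(0).random((32, 500, 4)).astype("float32")
Y = np.eye(6, dtype="float32")[np.random.default_rng(1).integers(0, 6, size=32)]

inputs = tf.keras.Input(shape=(500, 4))
y = tf.keras.layers.Conv1D(8, 26, activation="relu")(inputs)
y = tf.keras.layers.GlobalMaxPooling1D()(y)
outputs = tf.keras.layers.Dense(6, activation="softmax")(y)
model = tf.keras.Model(inputs, outputs)

log_dir = tempfile.mkdtemp()  # assumption: any writable directory will do
tensorboard = tf.keras.callbacks.TensorBoard(log_dir=log_dir)

# compile() gets the loss and optimizer only
model.compile(loss="categorical_crossentropy", optimizer="adam")
# fit() is where the callbacks are passed
history = model.fit(X, Y, batch_size=16, epochs=1, verbose=0,
                    callbacks=[tensorboard])
```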
For temporal data processing, take a look at the TimeDistributed wrapper.
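As a rough illustration of what the wrapper does, the sketch below applies the same Dense layer independently at each time step of a (36, 320) sequence. The input shape and the 64 units are arbitrary choices for the example:

```python
import tensorflow as tf

# Stand-in input: 36 time steps with 320 features each
inputs = tf.keras.Input(shape=(36, 320))

# The same Dense(64) weights are applied at every one of the 36 time steps
per_step = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(64, activation="relu"))(inputs)

model = tf.keras.Model(inputs, per_step)
print(model.output_shape)  # (None, 36, 64)
```

The time axis is preserved, so this suits per-time-step predictions rather than one label per sample.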
Upvotes: 2