Reputation: 10011
I have the following data:
overall reviewTime reviewerID \
0 4 08 24, 2010 u04428712
1 5 10 31, 2009 u06946603
2 4 10 13, 2015 u92735614
3 5 06 28, 2017 u35112935
4 4 10 12, 2015 u07141505
reviewText \
0 So is Katy Perry's new album "Teenage Dream" c...
1 I got this CD almost 10 years ago, and given t...
2 I REALLY enjoy this pairing of Anderson and Po...
3 Finally got it . It was everything thought it ...
4 Look at all star cast. Outstanding record, pl...
summary unixReviewTime \
0 Amazing that I Actually Bought This...More Ama... 1282608000
1 Excellent album 1256947200
2 Love the Music, Hate the Light Show 1444694400
3 Great 1498608000
4 Love these guys. 1444608000
category price itemID reviewHash image
0 Pop $35.93 p70761125 85559980 NaN
1 Alternative Rock $11.28 p85427891 41699565 NaN
2 Pop $89.86 p82172532 24751194 NaN
3 Pop $11.89 p15255251 22820631 NaN
4 Jazz $15.24 p82618188 53377470 NaN
Index(['overall', 'reviewTime', 'reviewerID', 'reviewText', 'summary',
'unixReviewTime', 'category', 'price', 'itemID', 'reviewHash', 'image'],
dtype='object')
(200000, 11)
and shape of data:
X train shape (160000, 100)
X test shape (40000, 100)
y train shape (160000, 5)
y test shape (40000, 5)
Code for modelling:
# Add sequential model
model = Sequential()
# Add embedding layer
# Output dimension is 100 because the embeddings come from GloVe 100d
Embed_Layer = Embedding(vocab_size, 100, weights=[embedding_matrix], input_length=(MAX_SEQUENCE_LENGTH,), trainable=True)
#define Inputs
review_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype= 'int32', name = 'review_input')
review_embedding = Embed_Layer(review_input)
Flatten_Layer = Flatten()
review_flatten = Flatten_Layer(review_embedding)
output_size = 2
dense1 = Dense(100,activation='relu')(review_flatten)
dense2 = Dense(32,activation='relu')(dense1)
predict = Dense(2,activation='softmax')(dense2)
model = Model(inputs=[review_input],outputs=[predict])
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['acc'])
print(model.summary())
Out:
Model: "functional_33"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
review_input (InputLayer) [(None, 100)] 0
_________________________________________________________________
embedding_17 (Embedding) (None, 100, 100) 22228800
_________________________________________________________________
flatten_17 (Flatten) (None, 10000) 0
_________________________________________________________________
dense_48 (Dense) (None, 100) 1000100
_________________________________________________________________
dense_49 (Dense) (None, 32) 3232
_________________________________________________________________
dense_50 (Dense) (None, 2) 66
=================================================================
Total params: 23,232,198
Trainable params: 23,232,198
Non-trainable params: 0
_________________________________________________________________
None
After running y_train = tf.one_hot(y_train, 5), fitting the model with
model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=True, validation_data=(X_test, y_test))
raises the following error:
ValueError: in user code:
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:806 train_function *
return step_function(self, iterator)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:796 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:1211 run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:2585 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:2945 _call_for_each_replica
return fn(*args, **kwargs)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:789 run_step **
outputs = model.train_step(data)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:748 train_step
loss = self.compiled_loss(
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/compile_utils.py:204 __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/losses.py:149 __call__
losses = ag_call(y_true, y_pred)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/losses.py:253 call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
return target(*args, **kwargs)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/losses.py:1535 categorical_crossentropy
return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
return target(*args, **kwargs)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/keras/backend.py:4687 categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
/home/x/.local/lib/python3.8/site-packages/tensorflow/python/framework/tensor_shape.py:1134 assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (32, 5, 5) and (32, 2) are incompatible
Can someone suggest how to solve this?
Upvotes: 0
Views: 707
Reputation: 302
If you are trying to classify between 5 classes, you should use 5 nodes in the last dense layer:
predict = Dense(5,activation='softmax')(dense2)
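That takes care of the (32, 2) half of the error. The (32, 5, 5) half looks like the labels being one-hot encoded twice: the question lists y_train as (160000, 5) already, and tf.one_hot appends another axis to whatever it is given. A rough NumPy sketch of that effect, where `one_hot` is a hypothetical stand-in for tf.one_hot and the class ids are made-up data:

```python
import numpy as np

def one_hot(indices, depth):
    # Hypothetical stand-in for tf.one_hot: appends a new last axis of size `depth`.
    indices = np.asarray(indices)
    return (indices[..., None] == np.arange(depth)).astype(np.float32)

rng = np.random.default_rng(0)
class_ids = rng.integers(0, 5, size=32)    # made-up batch of 32 labels in 0..4

encoded_once = one_hot(class_ids, 5)       # what a 5-unit softmax output expects
encoded_twice = one_hot(encoded_once, 5)   # encoding the encoded labels again

print(encoded_once.shape)   # (32, 5)
print(encoded_twice.shape)  # (32, 5, 5)
```

If that is what is happening, dropping the extra tf.one_hot call and keeping Dense(5) should leave both shapes at (32, 5).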
Upvotes: 1
Reputation: 6367
The model output shape is (None, 2). Try:
y_train = tf.one_hot(y_train, 2)
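One caveat, assuming the labels really cover five classes as the (160000, 5) shape in the question suggests: tf.one_hot returns an all-zero row for any index outside [0, depth), so a depth of 2 would blank out classes 2 to 4 without raising an error. The NumPy sketch below imitates that behaviour with a hypothetical `one_hot` helper and a few made-up labels; it also recovers integer class ids with argmax first, since the labels appear to be encoded already:

```python
import numpy as np

def one_hot(indices, depth):
    # Hypothetical stand-in for tf.one_hot; out-of-range ids give all-zero rows.
    indices = np.asarray(indices)
    return (indices[..., None] == np.arange(depth)).astype(np.float32)

# Made-up labels that are already one-hot encoded over 5 classes.
y_encoded = one_hot(np.array([0, 3, 4, 1]), 5)   # shape (4, 5)

# Recover integer class ids before encoding again.
class_ids = y_encoded.argmax(axis=1)

# With depth 2, only classes 0 and 1 get a non-zero row.
two_wide = one_hot(class_ids, 2)

print(class_ids)             # [0 3 4 1]
print(two_wide.shape)        # (4, 2)
print(two_wide.sum(axis=1))  # [1. 0. 0. 1.]
```

So a depth of 2 probably only makes sense if the task is meant to be binary in the first place.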
Upvotes: 1