Reputation: 77
I'm learning about ragged tensors and how I can use them with TensorFlow. Here is my example:
import numpy as np
import tensorflow as tf

xx = tf.ragged.constant([
    [0.1, 0.2],
    [0.4, 0.7, 0.5, 0.6]
])
yy = np.array([[0, 0, 1], [1, 0, 0]])
mdl = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=[None], batch_size=2, dtype=tf.float32, ragged=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(3, activation='softmax')
])
mdl.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            optimizer=tf.keras.optimizers.Adam(1e-4),
            metrics=['accuracy'])
mdl.summary()
history = mdl.fit(xx, yy, epochs=10)
The error
Input 0 of layer lstm_152 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [2, None]
I am not sure if I can use ragged tensors like this. All the examples I found have an embedding layer before the LSTM, but I don't want to create an additional embedding layer.
Upvotes: 3
Views: 4445
Reputation: 1
I don't know if this helps, but RNNs and LSTMs need a fixed-size input to process a batch and move forward. We use ragged tensors because they let us keep the data as it is, and most of the time we use embeddings together with an RNN or LSTM. I don't know exactly where, but in one of these layers some implicit padding happens (even though the shapes differ at the start, the RNN/LSTM needs same-shaped inputs, so the tensor gets padded by some internal mechanism). What I am trying to say is: if you do not want to use embeddings, you can try padding the data yourself (see the rough sketch at the end of this answer); then you don't need a ragged tensor at all and can simply use a normal tensor. Or you can use embeddings. I hope this is helpful, and I really hope my explanation is correct; if it's wrong, please let me know and I will take the answer down. Thanks.
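For what it's worth, here is a rough sketch of the "pad it yourself" idea using the asker's toy data. The choice of pad_sequences plus a Masking layer is just my guess at a reasonable setup, not the only way to do it:

import numpy as np
import tensorflow as tf

sequences = [[0.1, 0.2], [0.4, 0.7, 0.5, 0.6]]

# Pad to the longest sequence -> dense array of shape (2, 4)
xx = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, padding='post', dtype='float32')
# The LSTM wants 3D input (batch, timesteps, features) -> shape (2, 4, 1)
xx = np.expand_dims(xx, axis=-1)

yy = np.array([[0, 0, 1], [1, 0, 0]])

mdl = tf.keras.Sequential([
    # Masking is optional; it tells the LSTM to skip the zero-padded timesteps
    tf.keras.layers.Masking(mask_value=0.0, input_shape=(None, 1)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(3, activation='softmax')
])
mdl.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = mdl.fit(xx, yy, epochs=10)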
Upvotes: 0
Reputation: 1151
I recommend using an Input layer rather than InputLayer; you usually don't need InputLayer directly. Anyway, the problem is that the shape of your input and the input shape expected by the LSTM layer were wrong. Here is the modification I made, with some comments.
# xx should be 3D for the LSTM
xx = tf.ragged.constant([
    [[0.1, 0.2]],
    [[0.4, 0.7, 0.5, 0.6]]
])
"""
Labels represented as OneHotEncoding so you
should use CategoricalCrossentropy instade of SparseCategoricalCrossentropy
"""
yy = np.array([[0, 0, 1], [1,0,0]])
# For a ragged tensor, get the maximum sequence length
# (cast to a plain Python int for use in the Input shape)
max_seq = int(xx.bounding_shape()[-1])
mdl = tf.keras.Sequential([
    # Input layer with shape = [any number of timesteps, maximum sequence length]
    tf.keras.layers.Input(shape=[None, max_seq], batch_size=2, dtype=tf.float32, ragged=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(3, activation='softmax')
])
# CategoricalCrossentropy; the Dense layer already applies softmax,
# so the outputs are probabilities rather than logits
mdl.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
            optimizer=tf.keras.optimizers.Adam(1e-4),
            metrics=['accuracy'])
mdl.summary()
history = mdl.fit(xx, yy, epochs=10)
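As a side note, if you would rather keep SparseCategoricalCrossentropy, you could instead encode each label as a class index rather than a one-hot vector. For example (same model as above, only the labels and the loss change):

# Alternative: sparse integer labels equivalent to [[0, 0, 1], [1, 0, 0]]
yy_sparse = np.array([2, 0])
mdl.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
            optimizer=tf.keras.optimizers.Adam(1e-4),
            metrics=['accuracy'])
history = mdl.fit(xx, yy_sparse, epochs=10)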
Upvotes: 3