kneki

Reputation: 75

LSTM using word embeddings and TFIDF vectors

I am trying to run an LSTM on a dataset that has text attributes and TFIDF vectors. I embed the text with word embeddings and feed it into an LSTM layer, then concatenate the LSTM output with the TFIDF vectors. However, line 2 of the code below throws the following error:

"ValueError: Layer lstm_1 was called with an input that isn't a symbolic tensor. Received type: . Full input: []. All inputs to the layer should be tensors."

The code is given below, where len(term_Index)+1 = 9891, emb_Dim=100, emb_Mat contains floats and has shape [9891,100], and sen_Len=1000:

    embed = Embedding(len(term_Index) + 1, emb_Dim, weights=[emb_Mat], 
    input_length=sen_Len, trainable=False)
    lstm = LSTM(60, dropout=0.1, recurrent_dropout=0.1)(embed)
    tfidf_i = Input(shape=(max_terms_art,))
    conc = Concatenate()(lstm, tfidf_i)
    drop = Dropout(0.2)(conc)
    dens = Dense(1)(drop)
    acti = Activation('sigmoid')(dens)

    model = Model([embed, tfidf_i], acti)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics = ['accuracy'])
    history = model.fit([features_Train, TFIDF_Train], target_Train, epochs = 50, batch_size=128, validation_split=0.20)

Upvotes: 1

Views: 2309

Answers (1)

Chompakorn CChaichot

Reputation: 309

It seems that I cannot reproduce your error once the code is fixed. In your snippet the Embedding layer is never called on an Input tensor, so embed is a layer object rather than a symbolic tensor when it reaches the LSTM, and Concatenate expects its inputs as a list. After adding an Input layer and the brackets, the code runs perfectly. See my code below:

    from tensorflow.keras.layers import Input, Embedding, LSTM, Concatenate, Dropout, Dense, Activation
    from tensorflow.keras import Model
    import tensorflow as tf
    import numpy as np

    # Dummy data so the snippet is self-contained
    emb_Mat = tf.random.normal((9891, 100)).numpy()    # stand-in for the pretrained embedding matrix
    term_Index = tf.random.uniform((9890,)).numpy()    # stand-in vocabulary: len(term_Index) + 1 = 9891
    sen_Len = 1000
    emb_Dim = 100
    max_terms_art = 500

    # Word-index input; the Embedding layer is called on it, so it produces a symbolic tensor
    inp = Input(shape=(len(term_Index),))
    embed = Embedding(len(term_Index) + 1, emb_Dim, weights=[emb_Mat], input_length=sen_Len, trainable=False)(inp)
    lstm = LSTM(60, dropout=0.1, recurrent_dropout=0.1)(embed)
    tfidf_i = Input(shape=(max_terms_art,))             # TFIDF vector input
    conc = Concatenate()([lstm, tfidf_i])               # Concatenate takes its inputs as a list
    drop = Dropout(0.2)(conc)
    dens = Dense(1)(drop)
    acti = Activation('sigmoid')(dens)

    Model([inp, tfidf_i], acti).summary()

Outputs:

Model: "model_2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_16 (InputLayer)           [(None, 9890)]       0                                            
__________________________________________________________________________________________________
embedding_15 (Embedding)        (None, 9890, 100)    989100      input_16[0][0]                   
__________________________________________________________________________________________________
lstm_8 (LSTM)                   (None, 60)           38640       embedding_15[0][0]               
__________________________________________________________________________________________________
input_17 (InputLayer)           [(None, 500)]        0                                            
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 560)          0           lstm_8[0][0]                     
                                                                 input_17[0][0]                   
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 560)          0           concatenate_2[0][0]              
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1)            561         dropout_1[0][0]                  
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 1)            0           dense_1[0][0]                    
==================================================================================================
Total params: 1,028,301
Trainable params: 39,201
Non-trainable params: 989,100
__________________________________________________________________________________________________
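
For completeness, the corrected model can then be compiled and fit the same way as in your question. Below is a minimal sketch with random dummy arrays standing in for features_Train, TFIDF_Train and target_Train (their shapes simply follow the variables defined above, not your real data):

    model = Model([inp, tfidf_i], acti)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    # Dummy training data: integer word indices for the Embedding, dense TFIDF
    # vectors, and binary labels matching the sigmoid output. Note that a
    # 9890-step LSTM with recurrent_dropout trains slowly; this only shows the call.
    n_samples = 32
    features_Train = np.random.randint(0, len(term_Index) + 1, size=(n_samples, len(term_Index)))
    TFIDF_Train = np.random.rand(n_samples, max_terms_art)
    target_Train = np.random.randint(0, 2, size=(n_samples,))

    history = model.fit([features_Train, TFIDF_Train], target_Train,
                        epochs=1, batch_size=16, validation_split=0.2)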

Upvotes: 3
