priegueee

Reputation: 57

Bag of Words embedding layer in Keras?

I have a very simple Keras model that looks like:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(hidden_size, input_dim=n_inputs, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))

The embedding that I am using is Bag of Words.

I want to include the embedding step as part of the model. I thought of doing it as an embedding layer, but I don't know whether it is possible to implement a Bag of Words model as a Keras Embedding layer. I know you can pass pre-trained embeddings such as GloVe to Embedding layers, so I was wondering if something like that could be done with BoW?

Any ideas will be much appreciated! :D

Upvotes: 0

Views: 578

Answers (1)

Jindřich

Reputation: 11240

The Embedding layer in Keras (and in basically all deep learning frameworks) does a lookup: given a token index, it returns a dense embedding vector.
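To make the lookup concrete, here is a small sketch; the vocabulary size, embedding dimension, and token indices below are made up for illustration:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

# An Embedding layer with a vocabulary of 1000 tokens, each mapped to an 8-dimensional vector.
lookup = Sequential([Embedding(input_dim=1000, output_dim=8)])

# A batch of one "sentence" consisting of three token indices.
token_indices = np.array([[3, 17, 42]])
vectors = lookup.predict(token_indices)
print(vectors.shape)  # (1, 3, 8): one dense embedding per token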

The question is how you want to embed a bag-of-words representation. I think one reasonable option would be to:

  1. Do the embedding lookup for every word,
  2. Average the token embeddings and thus get a single vector representing the BoW. In Keras, you can use a GlobalAveragePooling1D layer for that.

Averaging is probably a better option than summing because the output will be of the same scale for sequences of different lengths.

Note that for the embedding lookup, you need the input to have a shape of batch × sequence length, with integers corresponding to token indices in a vocabulary.
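Putting it together, a minimal sketch of such a model might look like the following; vocab_size, embedding_dim, max_len, hidden_size and n_outputs are illustrative values, not taken from the question:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, GlobalAveragePooling1D, Dense

vocab_size = 10000    # assumed vocabulary size
embedding_dim = 64    # assumed embedding dimension
max_len = 50          # assumed (padded) sequence length
hidden_size = 128
n_outputs = 5

model = Sequential()
# Lookup: maps each token index (0 .. vocab_size - 1) to a dense vector.
model.add(Embedding(vocab_size, embedding_dim, input_length=max_len))
# Average the token embeddings into a single vector representing the bag of words.
model.add(GlobalAveragePooling1D())
model.add(Dense(hidden_size, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Input: a batch × sequence-length matrix of integer token indices.
dummy_batch = np.random.randint(0, vocab_size, size=(8, max_len))
print(model.predict(dummy_batch).shape)  # (8, n_outputs)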

Upvotes: 1
