Reputation: 13
I understand that ELMo uses a CNN over characters to build character embeddings. However, I do not understand how the character embeddings are concatenated with word embeddings in the highway network. In the ELMo paper, most of the evaluations use GloVe word embeddings together with the CNN character embeddings, which makes sense because the word embeddings are named explicitly. But for pre-trained models such as the one on TF-Hub, which word embeddings are concatenated with the character embeddings in the highway layer?
Please help me understand if you can.
Upvotes: 0
Views: 184
Reputation: 213
Concatenation happens inside the https://tfhub.dev/google/elmo/3 model. With the word_emb output, one can get the embedding for each token in the input. These embeddings can be used for classification or other modeling tasks, similar to BERT/transformer-based models. The model also provides direct access to the hidden states of the LSTM layers through lstm_outputs1 and lstm_outputs2.
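To make "concatenation inside the model" a bit more concrete, here is a rough NumPy sketch of how a character-based token vector can be built: convolution filters of several widths run over the character embeddings, each filter is max-pooled over character positions, the pooled outputs are concatenated, and the result goes through a highway layer and a linear projection. As far as I can tell from the ELMo paper, the pre-trained biLM builds its token representation this way from characters only (2048 character n-gram filters, two highway layers, projection to 512), and GloVe is only concatenated later in the downstream task models. The function names `char_cnn` and `highway`, the toy sizes, the ReLU activations, and the random weights are all my own assumptions for illustration; this is not the TF-Hub implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def char_cnn(char_embs, filters):
    """char_embs: (num_chars, char_dim); filters: list of (width, W),
    where W has shape (width * char_dim, num_filters)."""
    pooled = []
    for width, W in filters:
        # Slide a window of `width` characters over the token.
        windows = np.stack([char_embs[i:i + width].ravel()
                            for i in range(len(char_embs) - width + 1)])
        conv = np.maximum(windows @ W, 0.0)   # ReLU (assumed activation)
        pooled.append(conv.max(axis=0))       # max over character positions
    # This is the concatenation step: one vector per filter width, joined.
    return np.concatenate(pooled)

def highway(x, W_h, b_h, W_g, b_g):
    """One highway layer: y = g * H(x) + (1 - g) * x."""
    g = 1.0 / (1.0 + np.exp(-(x @ W_g + b_g)))   # transform gate in (0, 1)
    h = np.maximum(x @ W_h + b_h, 0.0)           # candidate transform H(x)
    return g * h + (1.0 - g) * x

# Toy sizes, chosen only for illustration.
char_dim, proj_dim = 4, 6
filter_spec = [(1, 3), (2, 5), (3, 8)]           # (width, number of filters)
filters = [(w, rng.normal(size=(w * char_dim, n))) for w, n in filter_spec]
cnn_dim = sum(n for _, n in filter_spec)         # 3 + 5 + 8 = 16

token = "playing"
char_table = {c: rng.normal(size=char_dim) for c in sorted(set(token))}
char_embs = np.stack([char_table[c] for c in token])   # (7, 4)

x = char_cnn(char_embs, filters)                 # concatenated CNN features
W_h = rng.normal(size=(cnn_dim, cnn_dim))
W_g = rng.normal(size=(cnn_dim, cnn_dim))
b_h, b_g = np.zeros(cnn_dim), np.zeros(cnn_dim)
y = highway(x, W_h, b_h, W_g, b_g)               # same size as x
W_p = rng.normal(size=(cnn_dim, proj_dim))
token_vec = y @ W_p                              # context-independent token vector

print(x.shape, y.shape, token_vec.shape)
```

Note that no separate word-embedding table appears anywhere in this sketch: the only lookup is per character, which is also why such a model can produce a vector for a token it has never seen.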
Upvotes: 1