Edward Lin

Reputation: 11

What model is Rasa NLU entity extraction using? Is it an LSTM or just a simple neural network?

What kind of model does Rasa NLU use to extract entities and intents after word embedding?

Upvotes: 0

Views: 554

Answers (1)

Patrizio G

Reputation: 362

This blog post from Rasa clarifies some aspects.

With Rasa you will first train a vectorizer that transforms each document into an N-dimensional vector, where N is the size of your vocabulary. This is exactly what scikit-learn's CountVectorizer does.
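As a minimal sketch of that step (the example sentences are made up):

    # Bag-of-words featurization with scikit-learn's CountVectorizer.
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["book a flight to Rome", "what is the weather today"]
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)

    print(X.shape)                             # (2, N), N = vocabulary size
    print(vectorizer.get_feature_names_out())  # the learned vocabulary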

Each intent embedding is instead built as a one-hot vector (or a vector with more 1s if you have "mixed" intents). Each of these vectors has the same dimensionality as a document embedding, so I guess N may actually be (vocabulary size) + (number of intents).
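For illustration, with made-up intent names:

    # One-hot intent vectors; a "mixed" intent would simply carry more
    # than one 1.
    import numpy as np

    intents = ["book_flight", "check_weather", "greet"]
    intent_vectors = np.eye(len(intents))
    print(intent_vectors)
    # [[1. 0. 0.]
    #  [0. 1. 0.]
    #  [0. 0. 1.]]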

At that point Rasa will train a neural network (default: 2 hidden layers) where the loss function is designed to maximise the similarity between document d and intent i if d is labelled as i in the training set (and to minimise d's similarity with all the other intent embeddings). By default the similarity is calculated as cosine similarity.
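A toy sketch of that objective, not Rasa's actual implementation (the hinge form and the margin value are my own assumptions):

    import numpy as np

    def cosine_sim(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def intent_loss(doc_emb, intent_embs, true_idx, margin=0.8):
        # Push the labelled intent's similarity up towards the margin...
        pos = cosine_sim(doc_emb, intent_embs[true_idx])
        # ...and push the best wrong intent's similarity down below zero.
        neg = max(cosine_sim(doc_emb, e)
                  for i, e in enumerate(intent_embs) if i != true_idx)
        return max(0.0, margin - pos) + max(0.0, neg)

    print(intent_loss(np.array([1.0, 0.0]),
                      np.array([[0.9, 0.1], [0.0, 1.0]]), true_idx=0))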

Each new, unseen document is embedded by the neural network, and its similarity to each intent embedding is computed. The intent most similar to the new document is returned as the predicted label.
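Under the same assumptions as the sketch above, prediction is just an argmax over similarities:

    import numpy as np

    def predict(doc_emb, intent_embs, intents):
        # Cosine similarity of the document against every intent vector.
        sims = intent_embs @ doc_emb / (
            np.linalg.norm(intent_embs, axis=1) * np.linalg.norm(doc_emb))
        return intents[int(np.argmax(sims))]

    print(predict(np.array([1.0, 0.0]),
                  np.array([[0.9, 0.1], [0.0, 1.0]]),
                  ["greet", "goodbye"]))  # -> greet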


Old answer:

It's not an LSTM. They say their approach is inspired by Facebook's StarSpace.

I didn't find the paper above very enlightening; however, according to StarSpace's GitHub repo, the text classification use case has the same setting as their previous work, TagSpace.

The TagSpace paper is clearer and explains how they use a CNN to embed each document in a space such that its distance to the associated class vector is minimised. Words, documents and classes ("tags") are all embedded in the same d-dimensional space, and their distance is measured via cosine similarity or inner product.
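A toy illustration of that shared embedding space (random stand-in vectors, not the TagSpace model itself):

    import numpy as np

    d = 64
    rng = np.random.default_rng(0)
    doc_emb = rng.normal(size=d)        # would come from the CNN in TagSpace
    tag_embs = rng.normal(size=(5, d))  # one learned vector per tag/class

    inner = tag_embs @ doc_emb          # inner-product scores
    cosine = inner / (np.linalg.norm(tag_embs, axis=1)
                      * np.linalg.norm(doc_emb))
    print(int(np.argmax(cosine)))       # index of the best-matching tag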

Upvotes: 2
