thinkdeep

Reputation: 1033

Embedding Layer with Changing input_dim

According to the documentation, the TensorFlow Embedding layer has a fixed input_dim, i.e., the vocabulary size.

When we train a DNN model in a streaming fashion (online learning), the number of unique features, i.e., input_dim, is not known beforehand. It can increase over time as new data comes in, so we cannot declare the Embedding layer with a fixed input_dim. How should we handle the embedding in this case?

Thanks in advance!

Upvotes: 1

Views: 805

Answers (1)

Anton Panchishin

Reputation: 3773

Dynamic Embedding Size

You are correct: when you declare an embedding layer like so

tf.keras.layers.Embedding(input_dim, output_dim)

you do have to pass static values for input_dim and output_dim. Under the covers, the embedding is implemented as a large matrix of shape input_dim x output_dim, and the layer uses tf.keras.backend.gather to pull out the rows for the input indices that you pass.

Knowing how the layer works doesn't solve your problem by itself, but it points to two ways to work around it.
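The matrix-plus-gather lookup described above can be sketched in NumPy. This is only an illustration of the idea, not the actual TensorFlow implementation; the names embedding_matrix and token_ids and the small sizes are made up for the example.

```python
import numpy as np

# Illustrative sizes only: 6 known ids, 3-dimensional vectors.
input_dim, output_dim = 6, 3
rng = np.random.default_rng(0)

# The embedding is one matrix with a row per id.
embedding_matrix = rng.normal(size=(input_dim, output_dim))

# Looking up ids is a row gather: one row is returned per index.
token_ids = np.array([4, 0, 4])
vectors = embedding_matrix[token_ids]
print(vectors.shape)

# An id that is >= input_dim has no row, which is why a growing
# vocabulary is a problem for a fixed-size matrix.
try:
    embedding_matrix[np.array([input_dim])]
    new_id_has_row = True
except IndexError:
    new_id_has_row = False
print(new_id_has_row)
```

Because every id maps to a row, handling a new id means having more rows available, which is what both fixes below do.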

Fix 1 - Allocate enough for a while

It is as simple as that. Allocate 20%, 50%, or 200% more input_dim than you need today. Estimate how many ids you will need before you next deploy a new model. If that is next month, allocate enough rows to get you there, plus a little more as a buffer.
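One way to turn that estimate into a number is a small helper like the one below. It is a sketch under assumed inputs: the function name padded_input_dim, its parameters, and the linear growth model are hypothetical and should be replaced by whatever estimate fits your data.

```python
import math


def padded_input_dim(current_vocab_size, growth_per_period,
                     periods_until_redeploy, buffer_fraction=0.2):
    """Estimate how many embedding rows to allocate today.

    Assumes (hypothetically) that the vocabulary grows by a roughly
    constant number of new ids per period until the next redeploy.
    """
    projected = current_vocab_size + growth_per_period * periods_until_redeploy
    # Add a safety margin on top of the projection and round up.
    return math.ceil(projected * (1 + buffer_fraction))


# 300 ids today, about 25 new ids per week, redeploy in 4 weeks, 25% buffer.
input_dim = padded_input_dim(300, 25, 4, buffer_fraction=0.25)
print(input_dim)  # 500
```

The resulting value is what you would pass as input_dim when declaring the layer. Rows for ids that have not appeared yet simply stay at their initial values until those ids show up in the data.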

Fix 2 - Reuse your weights

Eventually the space allocated in Fix 1 will run out. When it does, create a new model that reuses all the weights from your previously trained model except those of the Embedding layer. Create a new embedding layer of the size you want next (plus some buffer). Remembering that the rows correspond to the input ids and the columns to the output dimensions, simply copy the previous embedding matrix into the top rows of the new embedding matrix.

For example, if the old layer had input_dim=300 and output_dim=20, its matrix is 300x20. If the new layer has input_dim=500 and output_dim=20, its matrix is 500x20. Copy the 300 rows of the previous matrix into the first 300 rows of the freshly initialized 500x20 matrix. The remaining 200 rows keep their initial values and are trained as new ids appear.
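The row copy can be sketched in NumPy as follows. The helper grow_embedding_matrix is a hypothetical name, and the uniform initialization in the range -0.05 to 0.05 for the new rows is an assumption chosen for the example; use whatever initializer your new layer uses.

```python
import numpy as np


def grow_embedding_matrix(old_matrix, new_input_dim, rng=None, scale=0.05):
    """Return a larger embedding matrix whose top rows are old_matrix.

    Hypothetical helper: the extra rows are drawn uniformly from
    [-scale, scale), which is an assumed initialization.
    """
    old_input_dim, output_dim = old_matrix.shape
    if new_input_dim < old_input_dim:
        raise ValueError("new_input_dim must be >= the old input_dim")
    if rng is None:
        rng = np.random.default_rng()
    # Start from a freshly initialized matrix of the new size.
    new_matrix = rng.uniform(-scale, scale, size=(new_input_dim, output_dim))
    new_matrix = new_matrix.astype(old_matrix.dtype)
    # Copy the trained rows into the top of the new matrix.
    new_matrix[:old_input_dim] = old_matrix
    return new_matrix


# Stand-in for a trained 300x20 embedding matrix.
old = np.random.default_rng(0).normal(size=(300, 20)).astype("float32")
new = grow_embedding_matrix(old, 500)
print(new.shape)  # (500, 20)
```

In Keras, the old matrix would typically come from the trained layer's get_weights()[0], and the grown matrix could be loaded into the new Embedding layer with set_weights([new]) once that layer has been built. The other layers' weights can be transferred between the two models the same way.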

Upvotes: 3
