TensorFlow - Train only a subset of an embedding matrix

Question

I have an embedding matrix e defined as follows:

e = tf.get_variable(name="embedding", shape=[n_e, d], 
              initializer=tf.contrib.layers.xavier_initializer(uniform=False))

where n_e refers to the number of entities and d is the number of latent dimensions. For this example, say d=10.

Training:

optimizer = tf.train.GradientDescentOptimizer(0.01)
grads_and_vars = optimizer.compute_gradients(loss)
train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)

The model is saved after training. Later, new entities (e.g., 2) are added, giving n_e_new entities in total. I would now like to retrain the model while keeping the embeddings of the already-trained entities fixed, i.e., training only the delta (the 2 new entities).

I load the saved e and build the enlarged variable from it:

init_e = np.zeros((n_e_new, d), dtype=np.float32)
r = list(range(n_e_new - 2))
init_e[r, :] = # load e from saved model

e = tf.get_variable(name="embedding", initializer=init_e)
gather_e = tf.nn.embedding_lookup(e, [n_e, n_e+1])

Training:

optimizer = tf.train.GradientDescentOptimizer(0.01)
grads_and_vars = optimizer.compute_gradients(loss, gather_e)
train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)

I get an error at compute_gradients: NotImplementedError: ('Trying to optimize unsupported type ', )

I understand that gather_e, passed as the second argument to compute_gradients, is a tensor rather than a variable, but I cannot figure out how to achieve this partial training/update.

P.S. I also had a look at this post, but could not find a solution there either.

EDIT: Code sample (following the approach suggested by @meruf):

if new_data_available:
    e = tf.get_variable(name="embedding", shape=[n_e_new, 1, d],
              initializer=tf.contrib.layers.xavier_initializer(uniform=False))
    e_old = tf.get_variable(name="embedding_old", initializer=, trainable=False)
    e_new = tf.concat([e_old, e], 0)

else:
    e = tf.get_variable(name="embedding", shape=[n_e, d], 
              initializer=tf.contrib.layers.xavier_initializer(uniform=False))

Lookup is as follows:

if new_data_available:
    var_p = tf.nn.embedding_lookup(e_new, indices)
else:
    var_p = tf.nn.embedding_lookup(e, indices)

loss = # some operations on var_p and other variables that are a result of the lookup above

The issue is that when new_data_available is true, neither e nor e_new changes from one epoch to the next. Both stay the same.

Answers (1)

First, load the libraries.

Declare the placeholders.

Create the network.

Calculate the loss.

Optimize the loss.

Load and run the graph.

Print the initialized values.

Test case:

Check that the value matches the one printed above.

We send z=True to indicate which embedding the lookup should use.

After training, let's check whether it behaves correctly.

Let's try the opposite.
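
A minimal sketch of the general idea, not the original answer's code: keep the restored rows in a variable created with trainable=False, keep the rows for the added entities in a separate trainable variable, and look up from their concatenation. Everything concrete here is an assumption made for illustration: the sizes, the name embedding_added, the random saved_e array standing in for the embeddings restored from the checkpoint, the toy squared-error loss, and the use of tf.compat.v1 so the graph/session style of the question runs on a recent TensorFlow. The z switch mentioned above is not reproduced.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1

n_old, n_added, d = 3, 2, 4  # hypothetical sizes
rng = np.random.RandomState(0)
# Stand-in for the embeddings restored from the saved model (assumption).
saved_e = rng.randn(n_old, d).astype(np.float32)

graph = tf.Graph()
with graph.as_default():
    # Frozen rows: trainable=False keeps them out of the trainable variables.
    e_old = tf1.get_variable("embedding_old", initializer=saved_e,
                             trainable=False)
    # Trainable rows, one per newly added entity.
    e_added = tf1.get_variable(
        "embedding_added", shape=[n_added, d],
        initializer=tf1.random_normal_initializer(stddev=0.1, seed=0))
    # Full table: old rows first, so existing entity indices stay valid.
    e_full = tf.concat([e_old, e_added], axis=0)

    indices = tf1.placeholder(tf.int32, shape=[None])
    targets = tf1.placeholder(tf.float32, shape=[None, d])
    looked_up = tf.nn.embedding_lookup(e_full, indices)
    # Toy loss, only there to produce gradients (assumption).
    loss = tf.reduce_mean(tf.square(looked_up - targets))

    optimizer = tf1.train.GradientDescentOptimizer(0.1)
    # var_list takes variables, not tensors such as a lookup result.
    train_op = optimizer.minimize(loss, var_list=[e_added])
    init = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    old_before, added_before = sess.run([e_old, e_added])
    feed = {
        indices: np.arange(n_old + n_added, dtype=np.int32),
        targets: np.ones((n_old + n_added, d), dtype=np.float32),
    }
    for _ in range(10):
        sess.run(train_op, feed_dict=feed)
    old_after, added_after = sess.run([e_old, e_added])

print("old rows unchanged:", np.array_equal(old_before, old_after))
print("added rows changed:", not np.allclose(added_before, added_after))
```

Passing var_list=[e_added] is redundant once e_old is created with trainable=False, but it makes explicit which variable the optimizer is allowed to update. With real data, saved_e would be replaced by the array loaded from the saved model.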
