Sasquatch Man

Reputation: 73

How to gradually train with more and more classes?

I'm trying to create an incremental classifier that is trained on data containing n classes for some set number of epochs, then n+m classes for another set number of epochs, then n+m+k classes, and so on, where each successive set of classes contains the previous set as a subset.

To do this without having to train the model, save it, manually edit the graph, re-train, and repeat, I'm defining all the weights needed to classify the complete set of classes up front, but keeping the weights corresponding to unseen classes frozen at 0 until the classifier is introduced to those classes.

My strategy is to define a placeholder that is fed an array of Boolean values indicating whether or not a given set of weights is trainable.

Relevant code below:

output_train = tf.placeholder(tf.int32, shape=(num_incremental_grps,), name="output_train")
.
.
.
weights = []
biases = []
for i in range(num_incremental_grps):
    W = tf.Variable(tf.zeros([batch_size, classes_per_grp]),
                    trainable=tf.cond(tf.equal(output_train[i], tf.constant(1)),
                                      lambda: tf.constant(True),
                                      lambda: tf.constant(False)))
    weights.append(W)
    b = tf.Variable(tf.zeros([classes_per_grp]),
                    trainable=tf.cond(tf.equal(output_train[i], tf.constant(1)),
                                      lambda: tf.constant(True),
                                      lambda: tf.constant(False)))
    biases.append(b)

out_weights = tf.reshape(tf.stack(weights, axis=1), (batch_size, -1))
out_biases = tf.reshape(tf.stack(biases, axis=1), (batch_size, -1))
outputs = tf.identity(tf.matmul(inputs, out_weights) + out_biases, name='values')
.
.
.
# Will change this to an array that progressively updates as classes are added.
output_trainable = np.ones(num_incremental_grps, dtype=bool)
.
.
.
with tf.Session() as sess:
    init.run()
    for epoch in range(epochs):
        for iteration in range(iterations):
            X_batch, y_batch = batch.getBatch()
            fd={X: X_batch, y: y_batch, training: True, output_train: output_trainable}
            _, loss_val = sess.run([training_op, loss], feed_dict=fd)

This returns the error message

Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of
`if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute
subgraphs conditioned on the value of a tensor.

I've tried tinkering with this, like making the initial placeholder's datatype tf.bool instead of tf.int32. I've also tried feeding a slice of the tensor directly into the trainable argument of the weights/biases, like this:

W = tf.Variable(tf.zeros([batch_size, classes_per_grp]), trainable=output_train[i])

but I get the same error message. I'm not sure how to proceed from here, aside from trying a completely different approach to updating the number of predictable classes. Any help would be much appreciated.

Upvotes: 1

Views: 64

Answers (1)

P-Gn

Reputation: 24581

The error occurs because tf.cond makes a decision based on a single boolean, much like an if statement, whereas what you want here is a choice per element of your tensor.

You could use tf.where to fix that problem, but then you would run into another one: trainable is not a property that you can set at runtime, it is part of the definition of a variable. If a variable will be trained at some point, perhaps not at the beginning but definitely later, then it must be trainable.
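To illustrate why (a minimal TF 1.x sketch of my own, not from the original answer): trainable is consumed once, when the variable is created, to decide whether it joins the trainable-variables collection, so it cannot be toggled through a feed later.

import tensorflow as tf

# `trainable` only determines membership in the
# GraphKeys.TRAINABLE_VARIABLES collection at creation time.
v1 = tf.Variable(tf.zeros([2]), trainable=True, name="v1")
v2 = tf.Variable(tf.zeros([2]), trainable=False, name="v2")

# Only v1 appears; nothing fed at session run time can add or
# remove variables from this collection.
print([v.name for v in tf.trainable_variables()])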

I would suggest taking a much simpler route: define output_train as an array of tf.float32

output_train = tf.placeholder(tf.float32, shape=(num_incremental_grps,), name="output_train")

then later simply multiply your weights and biases by this vector.

W = tf.Variable(...)
W = W * output_train  # entries of 0 zero out the corresponding weights
...

Provide values of 1 to output_train where you want training to happen, 0 otherwise.
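For concreteness, here is a sketch of how the per-group masking could look with the names from the question; input_dim (the feature size of inputs) is my assumption, since the weight rows must match it for the matmul:

# Sketch only: one weight/bias group per increment, gated by output_train.
masked_weights = []
masked_biases = []
for i in range(num_incremental_grps):
    # input_dim is assumed here: the weight rows must match the
    # feature dimension of `inputs` for the matmul below.
    W = tf.Variable(tf.truncated_normal([input_dim, classes_per_grp]))
    b = tf.Variable(tf.zeros([classes_per_grp]))
    # Multiplying by the 0/1 flag zeroes both the outputs and the
    # gradients of inactive groups, so their weights stay frozen
    # without ever touching `trainable`.
    masked_weights.append(W * output_train[i])
    masked_biases.append(b * output_train[i])

out_weights = tf.concat(masked_weights, axis=1)  # [input_dim, total_classes]
out_biases = tf.concat(masked_biases, axis=0)    # [total_classes]
logits = tf.matmul(inputs, out_weights) + out_biases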

Be careful to also mask your loss to ignore the output from unwanted channels, because even though they now always output 0, that may still affect your loss. For example,

logits = ...
# The mask needs one boolean entry per column being masked out.
class_mask = tf.equal(output_train, 1)
logits = tf.matrix_transpose(tf.boolean_mask(
    tf.matrix_transpose(logits), class_mask))
loss = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels)
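One caveat (my note, not from the original answer): softmax_cross_entropy_with_logits_v2 expects labels and logits to have the same shape, so the labels must be masked with the same column mask before computing the loss.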

Upvotes: 1
