Y.Z.

Reputation: 19

Custom TensorFlow loss function with batch size > 1?

I have a neural network with the following code snippets; note that batch_size == 1 and input_dim == output_dim:

net_in = tf.Variable(tf.zeros(shape=[batch_size, input_dim]), dtype=tf.float32)
input_placeholder = tf.compat.v1.placeholder(shape=[batch_size, input_dim], dtype=tf.float32)
assign_input = net_in.assign(input_placeholder)
# Some matmuls, activations, dropouts, normalizations...
net_out = tf.tanh(output_before_activation)


def loss_fn(output, input):
    # input.shape == output.shape == (batch_size, input_dim)
    output = tf.reshape(output, [input_dim])  # flatten to a 1-D vector; only valid because batch_size == 1
    input = tf.reshape(input, [input_dim])
    return my_fn_that_only_takes_in_vectors(output, input)

# Create session, preprocess data ...

train_op = optimizer.minimize(loss_fn(net_out, net_in))  # build the op once, outside the loop

for epoch in range(epoch_num):
    for batch in range(total_example_num // batch_size):
        sess.run(assign_input, feed_dict={input_placeholder: some_appropriate_numpy_array})
        sess.run(train_op)

Currently the neural network above works fine, but it is very slow because it updates the gradients after every single sample (batch size = 1). I would like to set the batch size > 1, but my_fn_that_only_takes_in_vectors cannot accommodate matrices whose first dimension is larger than 1. Due to the nature of my custom loss, flattening the batch input into a single vector of length batch_size * input_dim does not seem to work either.

How would I write my new custom loss_fn now that the input and output are N x input_dim with N > 1? In Keras this would not have been an issue, because Keras averages the gradients over the examples in the batch. For my TensorFlow function, should I take each row as a vector individually, pass it to my_fn_that_only_takes_in_vectors, and then average the results? (A sketch of that idea follows.)
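For reference, the row-by-row idea could look something like this with tf.map_fn; batched_loss_fn is a hypothetical name, and this assumes my_fn_that_only_takes_in_vectors returns a scalar loss for a pair of 1-D tensors:

# Hypothetical sketch of the row-wise approach, not tested code:
# map the per-example function over the batch dimension, then average.
def batched_loss_fn(output, input):
    # output.shape == input.shape == (batch_size, input_dim)
    per_example = tf.map_fn(
        lambda pair: my_fn_that_only_takes_in_vectors(pair[0], pair[1]),
        (output, input),
        dtype=tf.float32)  # shape: (batch_size,)
    return tf.reduce_mean(per_example)  # average over the batch, like Keras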

Upvotes: 0

Views: 464

Answers (1)

hammockman

Reputation: 37

You can use a function that computes the loss on the whole batch and works independently of the batch size. The operations are simply applied along the whole first dimension of the input (the first dimension indexes the elements of the batch). Here is an example; I hope it helps to see how the operations are carried out:

    def my_loss(y_true, y_pred):
        dx2 = tf.math.squared_difference(y_true[:, 0], y_true[:, 2])  # shape: (BatchSize,)
        dy2 = tf.math.squared_difference(y_true[:, 1], y_true[:, 3])  # shape: (BatchSize,)
        denominator = dx2 + dy2  # shape: (BatchSize,)

        dst_vec = tf.math.squared_difference(y_true, y_pred)  # shape: (BatchSize, n_labels)
        numerator = tf.reduce_sum(dst_vec, axis=-1)  # shape: (BatchSize,)

        # vector containing the loss of each element of the batch
        loss_vector = tf.cast(numerator / denominator, dtype="float32")  # shape: (BatchSize,)

        loss = tf.reduce_sum(loss_vector)  # if you want to sum the losses

        return loss

I am not sure whether you need to return the sum or the average of the losses for the batch. If you sum, make sure to use a validation dataset with the same batch size; otherwise the losses are not comparable.
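If you average instead, the value stays comparable across batch sizes; it is a one-line change to my_loss above:

        loss = tf.reduce_mean(loss_vector)  # average over the batch instead of summing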

Upvotes: 1
