Reputation: 43
I am working on a binary classifier using the custom Estimator API; the code is below.
I would like to experiment with different loss functions. The code runs with either the sigmoid_cross_entropy or the sparse_softmax_cross_entropy call, but when I try mean_squared_error I get a stack trace ending in:
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'dense/kernel:0' shape=(350, 18) dtype=float32_ref>", "<tf.Variable 'dense/bias:0' shape=(18,) dtype=float32_ref>", "<tf.Variable 'OUTPUT/kernel:0' shape=(18, 2) dtype=float32_ref>", "<tf.Variable 'OUTPUT/bias:0' shape=(2,) dtype=float32_ref>"] and loss Tensor("mean_squared_error/value:0", shape=(), dtype=float32).
Here is the code. I suspect a newbie mistake; any insights would be appreciated. Thanks.
# input layer
net = tf.feature_column.input_layer( features, params['feature_columns'] )
# hidden layer 1
net = tf.layers.dense(net, units=18, activation=tf.nn.relu)
# output layer computes logits
logits = tf.layers.dense(net, params['n_classes'], activation=None, name='OUTPUT')
# sigmoid cross entropy
#multi_class_labels = tf.one_hot( labels, 2 )
#loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=multi_class_labels, logits=logits)
# sparse softmax cross entropy
# loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
# mean squared error
predicted_classes = tf.argmax(logits, 1)
loss = tf.losses.mean_squared_error(labels=labels, predictions=predicted_classes)
# TRAINING MODE
assert mode == tf.estimator.ModeKeys.TRAIN
optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
The demo_model custom estimator is called like this:
classifier = tf.estimator.Estimator(
    model_fn=demo_model,
    model_dir=cur_model_dir,
    params={
        'feature_columns': feature_columns,
        # The model must choose between 2 classes.
        'n_classes': 2
    })
Upvotes: 3
Views: 744
Reputation: 59721
The problem is that tf.argmax has no defined gradient, so no gradient can flow from the loss back to the model's variables. You can still use the mean squared error by comparing the logits with the one-hot encoded labels instead:
loss = tf.losses.mean_squared_error(labels=tf.one_hot(labels, 2), predictions=logits)
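To see why the two formulations behave so differently, here is a rough pure-Python sketch (no TensorFlow) that estimates gradients by finite differences. The helper names and the sample logits are made up for illustration. Note that TensorFlow reports the argmax gradient as missing rather than as zero, but the underlying reason is the same: argmax is piecewise constant, so small changes to the logits do not change the loss.

```python
# Pure-Python sketch: finite differences stand in for the gradients
# that the optimizer would request from the graph.

def argmax(values):
    """Index of the largest entry (one row of tf.argmax(logits, 1))."""
    return max(range(len(values)), key=lambda i: values[i])

def mse(labels, predictions):
    """Mean squared error over the elements of one example."""
    return sum((l - p) ** 2 for l, p in zip(labels, predictions)) / len(labels)

def one_hot(label, depth):
    """One-hot encode an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(depth)]

def loss_argmax(logits, label):
    # Formulation from the question: label vs. predicted class index.
    return mse([float(label)], [float(argmax(logits))])

def loss_one_hot(logits, label):
    # Formulation from the answer: one-hot label vs. raw logits.
    return mse(one_hot(label, len(logits)), logits)

def numeric_grad(loss_fn, logits, label, eps=1e-4):
    """Central finite-difference estimate of d(loss)/d(logits)."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        up[i] += eps
        down = list(logits)
        down[i] -= eps
        grads.append((loss_fn(up, label) - loss_fn(down, label)) / (2 * eps))
    return grads

logits = [0.3, 1.2]  # hypothetical OUTPUT-layer values for one example
label = 0            # true class for that example

grad_argmax = numeric_grad(loss_argmax, logits, label)
grad_one_hot = numeric_grad(loss_one_hot, logits, label)

print("argmax-based loss gradient: ", grad_argmax)   # all zeros
print("one-hot-based loss gradient:", grad_one_hot)  # non-zero
```

With the argmax-based loss, nudging either logit leaves the predicted class (and therefore the loss) unchanged, so there is nothing for the optimizer to follow. With the one-hot version the loss is a smooth function of the logits, so every variable upstream of them receives a usable gradient.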
Upvotes: 4