Michael Phillips

Reputation: 11

Tensorflow Custom Cost Function

I have 2 concrete classes, say A and C. I want to use an NN to classify samples into classes A, B, and C, such that samples that are too close to call confidently are simply classified as B. The cost function should be as follows: a misclassification (an A classified as C, or vice versa) has a very large cost; a correct classification has zero cost; classifying an item as B has a very low cost. The result is that we only distinguish samples that we are VERY SURE fit into their respective classes.

I have only worked through the simple tutorials in TensorFlow, and they didn't cover how to define more specific cost functions such as this. Can anyone explain how this can be accomplished in TensorFlow?

Here is my relevant code, where I currently classify using only 2 classes. It is straight from the TensorFlow tutorial:

# binary cross-entropy loss; keyword arguments avoid relying on positional order
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y, labels=y_))
train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

y is the output of the NN (it will look like [[1,0,0],[0,1,0]] for a two-sample batch with 3 classes), and y_ holds the correct classes for those samples, which might be [[1,0,0],[0,0,1]]. In this example, we would have classified the second sample as B because we were uncertain, but the true class was C.
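To make the idea concrete, here is a rough sketch of the kind of asymmetric cost I have in mind, assuming y holds softmax probabilities and y_ holds the one-hot true labels; the penalty values are just placeholders:

# hypothetical penalty matrix: rows = true class (A, B, C), columns = predicted class
cost_matrix = tf.constant([[0.0,  0.1, 10.0],   # true A: correct = 0, B = low, C = huge
                           [0.1,  0.0,  0.1],   # true B (never appears in the training data)
                           [10.0, 0.1,  0.0]])  # true C: A = huge, B = low, correct = 0
# expected penalty per sample: pick the row for the true class, weight it by predicted probabilities
per_sample_cost = tf.reduce_sum(tf.matmul(y_, cost_matrix) * y, axis=1)
cost = tf.reduce_mean(per_sample_cost)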

Upvotes: 0

Views: 1045

Answers (1)

Mad Wombat

Reputation: 15155

I think you have some fundamental misunderstanding of how NN classifiers work. You should probably read up on that a bit if you are going to go deeper into coding them. I highly recommend the online book Neural Networks and Deep Learning by Michael Nielsen.

That said, the solution you are looking for is not a special cost function, but in how you interpret the results you get from the NN. You do not have 3 classes, you have 2. "I have no idea what this is" is not a class by itself, but rather a measure of the NN's confidence in its answer. So your network should have 2 outputs, one for each class, just like in the TensorFlow guides, and you should train it just like in the guides.

Once your network is trained, when you feed it a sample to classify, you get 2 numbers, let's call them A' and C'. These numbers indicate the NN's confidence about which class the sample belongs to. For example, if you get A' == 0.999 and C' == 0.00001, the network is pretty damn sure that your sample is class A. If you get A' == 0.6 and C' == 0.59, your network has no idea whether the sample is A or C, but slightly favors the theory that it is class A. It is now up to you to decide what your confidence thresholds are.

To make this easier, you should probably use softmax as the output layer non-linearity (the way the TensorFlow MNIST guides do). One of the useful features of softmax is that the outputs for all your classes always sum to 1, so you can easily make decisions based on the difference between A' and C'.
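A minimal sketch of that interpretation step (the 0.95 cutoff and the function name are just illustrative choices, not anything from the guides):

def decide(probs, threshold=0.95):
    # probs is the two-element softmax output [A', C'] for one sample
    p_a, p_c = probs
    if p_a >= threshold:
        return 'A'
    if p_c >= threshold:
        return 'C'
    return 'B'  # not confident enough to commit either way

decide([0.999, 0.001])  # -> 'A'
decide([0.60, 0.40])    # -> 'B'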

Upvotes: 1
