Andnp

Reputation: 674

Can a Neural Network using SGD change only one output of many with backprop?

Let's say I have a neural network with this structure: network([256, 100, 4]), i.e. 256 input neurons, 100 hidden, and 4 outputs. The network uses the sigmoid as its activation function, so each output neuron returns a value in the range [0, 1].

With every epoch, I know whether one of the four outputs is right or wrong. For instance, the network might give me [1, 0, 1, 0] when I know that the first output should be a 0, while I know nothing about the other three outputs.

Is there a way I can train the network so that only the first output is affected?

My intuition tells me that using backprop with the target set as [0,0,1,0] will solve my problem, but I'm also curious if [0, .5, .5, .5] makes more sense.

Upvotes: 2

Views: 220

Answers (2)

Zaw Lin

Reputation: 5708

What you should do is set the gradient of the unknown outputs to zero during the backpropagation stage. You should not set the label itself to any value, because if the number of samples with unknown labels is large, you will bias the network output toward that value. For example, if you set [0, .5, .5, .5] and the ratio of unknown to known labels is around 20:1, it's likely the network will simply output a constant [.5, .5, .5, .5].
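A minimal sketch of this masking idea, assuming PyTorch and the question's [256, 100, 4] layout (the variable names and the 0.5 learning-rate choice are illustrative, not from the answer). Zeroing the per-element loss for unknown outputs means no gradient flows back from them:

```python
import torch
import torch.nn as nn

# Network matching the question's structure: 256 -> 100 -> 4, sigmoid activations.
model = nn.Sequential(
    nn.Linear(256, 100), nn.Sigmoid(),
    nn.Linear(100, 4), nn.Sigmoid(),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
criterion = nn.MSELoss(reduction="none")  # per-element loss so we can mask it

x = torch.rand(1, 256)                       # dummy input
target = torch.tensor([[0.0, 0.0, 0.0, 0.0]])
mask = torch.tensor([[1.0, 0.0, 0.0, 0.0]])  # 1 = label known, 0 = unknown

output = model(x)
# Masked-out elements contribute zero loss, hence zero gradient.
loss = (criterion(output, target) * mask).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Only the first output's error influences the weight update; the other three outputs are left alone regardless of what the network produced for them.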

Upvotes: 2

Felipe Oriani

Reputation: 38608

Yes, you can define a training set that provides an output such as [0, 0, 1, 0], but the neural network can still make errors on an unseen set. The backpropagation algorithm can do this for you, and you can reduce this error by using a validation set to select a validated neural network (which generalizes the result better), as I explain in this post.

The problem is (actually, it's not a big problem) that it will not give you exactly the result you want; you have to interpret the output and map it to the right answer. Suppose you are expecting a result like [0, 0, 1, 0] and the neural network gives you [0.1278, 0.1315, 0.981554, 0.2102]. As you can see, the third output is much closer to 1 than the others, so you can convert the output accordingly.

Since you normalize the training set between 0 and 1, and normalize future inputs the same way before testing them on the neural network, you should not have problems. You could treat output values lower than 0.5 as 0 and values greater than or equal to 0.5 as 1; alternatively, you could take only the greatest value as 1.
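As a small illustration of both interpretations, assuming NumPy and the output vector from the example above:

```python
import numpy as np

# Raw network output from the example above.
output = np.array([0.1278, 0.1315, 0.981554, 0.2102])

binary = (output >= 0.5).astype(int)  # threshold at 0.5 -> [0, 0, 1, 0]
winner = np.argmax(output)            # index of the single largest activation -> 2
```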

My intuition tells me that using backprop with the target set as [0,0,1,0] will solve my problem, but I'm also curious if [0, .5, .5, .5] makes more sense.

You could use the hyperbolic tangent as the activation function for your neural network and normalize the data between -1 and 1, so the search space for the output values is wider than with the sigmoid.
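A small sketch of the range difference, assuming NumPy (the rescaling formula is my own illustration, not from the answer):

```python
import numpy as np

z = np.linspace(-4, 4, 9)           # pre-activation values

sigmoid = 1.0 / (1.0 + np.exp(-z))  # squashes into [0, 1]
tanh = np.tanh(z)                   # squashes into [-1, 1], twice the span

# Rescale [0, 1]-normalized inputs to [-1, 1] to match tanh's range.
x01 = np.array([0.0, 0.25, 0.5, 1.0])
x11 = 2.0 * x01 - 1.0               # -> [-1.0, -0.5, 0.0, 1.0]
```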


Even if you have a result near [0, 0, 1, 0], you will only ever get close to that value: when you train the network on new patterns, the model adjusts to fit them, so the outputs remain approximations. You could also search for a better architecture for your neural network model to get better results, for example using pruning methods.

Upvotes: 0
