Reputation: 39
I have recently been looking into machine learning and neural networks, and am currently trying to get a better grasp of the impact that the training process has on the performance of a network over its lifetime.
My current understanding is that, given a training set, the network adjusts its weight and bias variables in a way that brings its output closer to the correct answer as it processes thousands of data points, learning and adjusting at each step. This paper discusses the example of the MNIST data set, where a network could be trained to guess the correct digit up to 97% of the time, given a dataset whose labels were 100% accurate.
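For concreteness, here is a minimal sketch of the kind of per-example adjustment I mean; the single sigmoid neuron and toy data are my own illustration, not the paper's network:

```python
# Minimal sketch (toy setup, not the paper's network): one sigmoid neuron
# nudging its weights and bias toward the label after every example it sees.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))                  # toy inputs
y = (X[:, 0] + X[:, 1] > 0).astype(float)       # toy "correct answers"

w, b, lr = np.zeros(2), 0.0, 0.1
for xi, yi in zip(X, y):                        # one pass over the data
    pred = 1.0 / (1.0 + np.exp(-(xi @ w + b)))  # sigmoid activation
    err = pred - yi                             # how far off we are
    w -= lr * err * xi                          # adjust weights...
    b -= lr * err                               # ...and bias toward the label

print("accuracy after one pass:", np.mean(((X @ w + b) > 0) == y))
```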
But, what happens if the data set it is given is only 80% accurate or 50% accurate? Is this the upper bound on the accuracy the network can achieve after training? Is there a model that allows one to train better than the given data set?
For lack of a better example, if a neural network learned from beginner chess players, could it theoretically be trained to a grandmaster level?
Upvotes: 0
Views: 376
Reputation: 7838
Short Answer: Check this paper that talks about training NNs based on unreliable labels, where they explain in depth an approach to train a network given some mislabeled data, using noise filters and other techniques. They even exemplify it with the MNIST dataset.
Longer Answer:
But, what happens if the data set it is given is only 80% accurate or 50% accurate?
In general, we can say that neural networks are similar to regressions. Quoting from the (great) book Artificial Intelligence: A Modern Approach, 3rd Ed., by Stuart J. Russell and Peter Norvig (p. 732):
And because the function represented by a network can be highly nonlinear—composed, as it is, of nested nonlinear soft threshold functions—we can see neural networks as a tool for doing nonlinear regression.
Therefore, if you give this network (or regressor) some data, it will try to fit it in the best possible way. That is, the network can't know a priori what the labels are going to be in the real world; it is trained to fit the data you give it, trying to learn hidden patterns and adapt to possible variations of the inputs.
So, considering this and following a traditional approach (not the one in the paper), if you give your NN mislabeled data, it will learn to classify that kind of mislabeled data. That is, your training error and loss function may look good, but when you evaluate against the real labels it may perform worse, as it was trained to classify the mislabeled data. If you tested its performance on similarly mislabeled data, it would surely perform great.
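To make this concrete, here is a toy sketch (scikit-learn and synthetic data are my assumptions, not the paper's setup): a high-capacity model fit on partially flipped labels scores perfectly against the noisy labels it memorized, yet noticeably worse against the true ones.

```python
# Toy sketch: flip 30% of the labels and let a high-capacity model
# (an unpruned decision tree) memorize the mislabeled training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y_true = make_classification(n_samples=2000, random_state=0)
rng = np.random.default_rng(0)
flip = rng.random(len(y_true)) < 0.30           # mislabel 30% of the points
y_noisy = np.where(flip, 1 - y_true, y_true)

tree = DecisionTreeClassifier(random_state=0).fit(X, y_noisy)
print("accuracy vs the mislabeled data:", tree.score(X, y_noisy))  # ~1.00
print("accuracy vs the true labels:   ", tree.score(X, y_true))    # ~0.70
```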
Is there a model that allows one to train better than the given data set?
The referenced paper presents a model to do so. However, be careful: with NNs, "training better" may often lead to overfitting.
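For a flavor of what noise filtering can look like, here is a generic sketch of my own (not necessarily the paper's exact method): fit once, drop the points whose given label the first model finds least plausible, and retrain on the rest.

```python
# Generic noise-filtering sketch (not necessarily the paper's method):
# discard training points whose given label looks implausible, then retrain.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y_true = make_classification(n_samples=2000, random_state=0)
rng = np.random.default_rng(0)
flip = rng.random(len(y_true)) < 0.30
y_noisy = np.where(flip, 1 - y_true, y_true)

first = LogisticRegression().fit(X, y_noisy)
p_given = first.predict_proba(X)[np.arange(len(X)), y_noisy]  # P(given label)
keep = p_given > 0.3                      # threshold is an arbitrary choice

second = LogisticRegression().fit(X[keep], y_noisy[keep])
print("true-label accuracy, no filter:  ", first.score(X, y_true))
print("true-label accuracy, with filter:", second.score(X, y_true))
```

Whether the filtered model actually improves depends on the data, the noise pattern, and the threshold; with heavy or structured noise, simple filtering like this can throw away good points too.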
Hope this helps.
Upvotes: 2
Reputation: 77867
Yes -- in theory, a network can achieve higher accuracy than its training data, especially when that data is known to contain errors. Among other things, it's quite possible for "training by committee" to surpass the accuracy of any individual member. In such cases, the controlling factor is how far above "random guess" the training set's accuracy lies: if it's near or below random, you're going to fail.
One highly visible example of this is on the quiz show "Who Wants to Be a Millionaire". Although the studio audience are "regular folks" with a passing interest in trivia and large prizes, the "ask the audience" lifeline has a high degree of accuracy (91%).
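A quick back-of-the-envelope calculation (with illustrative numbers, not the show's actual statistics) shows why: if each audience member votes independently and is right well above chance, the majority is right far more often than any individual.

```python
# Majority-vote sketch with made-up numbers: 100 independent voters,
# each correct 65% of the time. How often is the majority correct?
from math import comb

n, p = 100, 0.65
majority_right = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                     for k in range(n // 2 + 1, n + 1))
print(f"single voter: {p:.0%}  majority of {n}: {majority_right:.2%}")
# -> around 99.9%, comfortably above any single voter
```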
Upvotes: 1