Morano88

Reputation: 2077

Randomness in Artificial Intelligence & Machine Learning

This question came to my mind while working on two projects in AI and ML. What if I'm building a model (e.g. a classification neural network, k-NN, etc.) and this model uses some function that involves randomness? If I don't fix the seed, then I'm going to get different accuracy results every time I run the algorithm on the same training data. However, if I fix it, then some other seed might give better results.

Is averaging a set of accuracies enough to say that the accuracy of this model is xx%?
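To make this concrete, here is a minimal sketch of the kind of loop I mean (a hypothetical example using scikit-learn's MLPClassifier on synthetic data; the dataset and settings are placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy data with a fixed split, so only the model's random
# weight initialisation changes between runs.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

accuracies = []
for seed in range(10):  # a different seed => a different initialisation
    clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500,
                        random_state=seed)
    clf.fit(X_train, y_train)
    accuracies.append(clf.score(X_test, y_test))

print("per-seed accuracies:", np.round(accuracies, 3))
print("mean accuracy:", np.mean(accuracies))
```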

I'm not sure if this is the right place to ask such a question or open such a discussion.

Upvotes: 12

Views: 2990

Answers (5)

luispedro

Reputation: 7014

There are models which are naturally dependent on randomness (e.g., random forests) and models which only use randomness as part of exploring the space (e.g., initialisation of values for neural networks), but which actually have a well-defined, deterministic objective function.

For the first case, you will want to use multiple seeds and report the average accuracy, the standard deviation, and the minimum you obtained. It is often good to have a way to reproduce this, so just use multiple fixed seeds.
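A sketch of what that reporting could look like (assuming scikit-learn's RandomForestClassifier; the dataset and seed list here are just placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

seeds = [0, 1, 2, 3, 4]  # a fixed list of seeds keeps the results reproducible
scores = []
for seed in seeds:
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X_train, y_train)
    scores.append(clf.score(X_test, y_test))

print("mean: %.3f  std: %.3f  min: %.3f"
      % (np.mean(scores), np.std(scores, ddof=1), np.min(scores)))
```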

For the second case, you can always tell, just from the training data, which run is best (although it might not actually be the one which gives you the best test accuracy!). Thus, if you have the time, it is good to do, say, 10 runs and then keep the one with the best training error (or validation error; just never use the test set for this decision). You can go a level up and do multiple sets of runs to get a standard deviation too. However, if you find that this variation is significant, it probably means you weren't trying enough initialisations or that you are not using the right model for your data.
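A sketch of that selection procedure (hypothetical again; note the three-way split, so the test set is only touched once at the very end):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, random_state=0)
# Train on one part, pick the best run on the validation part,
# and evaluate the winner once on the test part.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, random_state=0)

best_clf, best_val = None, -1.0
for seed in range(10):  # 10 random initialisations
    clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500,
                        random_state=seed)
    clf.fit(X_train, y_train)
    val = clf.score(X_val, y_val)
    if val > best_val:
        best_clf, best_val = clf, val

print("validation accuracy of the chosen run:", best_val)
print("test accuracy (evaluated once):", best_clf.score(X_test, y_test))
```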

Upvotes: 5

Shaunak

Reputation: 18008

Generalizing from what I understand of your question: reported accuracy is usually the average accuracy over multiple runs, together with the standard deviation. So if you are considering the accuracy you get using different seeds for the random generator, aren't you actually covering a greater range of inputs (which should be a good thing)? But you do have to report the standard deviation alongside the accuracy. Or did I get your question totally wrong?

Upvotes: 2

Yuval F

Reputation: 20621

I believe cross-validation may give you what you ask about: an averaged, and therefore more reliable, estimate of classification performance. It contains no randomness, except in permuting the data set initially. The variation comes from choosing different train/test splits.
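For example (a sketch using scikit-learn's cross_val_score; the classifier and fold count are placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, random_state=0)

# With an integer cv the folds are assigned deterministically;
# permute/shuffle the data first if you want a random split into folds.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=10)
print("fold accuracies:", np.round(scores, 3))
print("mean: %.3f  std: %.3f" % (scores.mean(), scores.std(ddof=1)))
```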

Upvotes: -1

John McFarlane

Reputation: 6077

Stochastic techniques are typically used to search very large solution spaces where exhaustive search is not feasible. So it's almost inevitable that you will be trying to iterate over a large number of sample points with as even a distribution as possible. As mentioned elsewhere, basic statistical techniques will help you determine when your sample is big enough to be representative of the space as a whole.

To test accuracy, it is a good idea to set aside a portion of your input patterns and avoid training against those patterns (assuming you are learning from a data set). Then you can use that held-out set to test whether your algorithm is learning the underlying pattern correctly, or whether it's simply memorizing the examples.
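A sketch of that check (assuming scikit-learn; the classifier is a placeholder, and a large train/test gap is the warning sign):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
# Hold out 25% of the patterns; the model never sees them in training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# A big gap between these two numbers suggests the model is
# memorizing the examples rather than learning the pattern.
print("training accuracy:", clf.score(X_train, y_train))
print("held-out accuracy:", clf.score(X_test, y_test))
```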

Another thing to think about is the randomness of your random number generator. Standard random number generators (such as rand from <stdlib.h>) may not make the grade in many cases, so look around for a more robust algorithm.
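In Python, for instance, numpy's default_rng gives you a modern PCG64 generator with explicit seeding (a sketch; what you do with the draws is hypothetical here):

```python
import numpy as np

# PCG64 via default_rng is far better behaved than the old
# linear congruential generators behind many rand() implementations.
rng = np.random.default_rng(seed=42)   # fixed seed for reproducibility
weights = rng.standard_normal(10)      # e.g. initial network weights
print(weights)
```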

Upvotes: 2

deong

Reputation: 3870

Simple answer: yes, you randomize it and use statistics to show the accuracy. However, it's not sufficient to just average a handful of runs. You need, at a minimum, some notion of the variability as well. It's important to know whether "70% accurate" means "70% accurate for each of 100 runs" or "100% accurate once and 40% accurate once".

If you're just trying to play around a bit and convince yourself that some algorithm works, then you can just run it 30 or so times and look at the mean and standard deviation and call it a day. If you're going to convince anyone else that it works, you need to look into how to do more formal hypothesis testing.
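A sketch of such a comparison (the accuracy numbers are made up for illustration; Welch's t-test via scipy is one reasonable choice of formal test):

```python
import numpy as np
from scipy import stats

# Pretend these are test accuracies from 30 independent runs of
# two competing algorithms.
rng = np.random.default_rng(0)
acc_a = rng.normal(loc=0.70, scale=0.05, size=30)
acc_b = rng.normal(loc=0.73, scale=0.05, size=30)

print("A: mean %.3f  std %.3f" % (acc_a.mean(), acc_a.std(ddof=1)))
print("B: mean %.3f  std %.3f" % (acc_b.mean(), acc_b.std(ddof=1)))

# Welch's t-test: is the difference in means larger than the
# run-to-run noise would explain?
t, p = stats.ttest_ind(acc_a, acc_b, equal_var=False)
print("t = %.2f, p = %.4f" % (t, p))
```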

Upvotes: 6
