Reputation: 1142
I'm doing some classification with Python and scikit-learn. I have a question which doesn't seem to be covered in the documentation: if I'm doing, for example, classification with an SVM, does the order of the input examples matter? If I have binary labels, will the results be less accurate if I put all the examples with label 0 next to each other and all the examples with label 1 next to each other, or would it be better to mix them up? What about the other algorithms scikit-learn provides?
Upvotes: 0
Views: 143
Reputation: 66795
No, the ordering of the patterns in the training set does not matter. The ordering of samples can affect stochastic gradient descent learning algorithms (for example, the one used to train neural networks), but in most cases these are implemented so that they shuffle the data internally. An SVM, on the other hand, is globally convergent: training solves a convex optimization problem, so it arrives at the same solution regardless of the ordering, up to the solver's numerical tolerance.
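If you want to check this yourself, here is a minimal sketch. The synthetic data, the linear kernel, and all variable names are my own assumptions for illustration, not something from your setup. It fits the same `SVC` once on label-sorted data and once on shuffled data, then compares the two models on the same points.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Assumed toy data: two well-separated 2-D clusters with binary labels.
X, y = make_blobs(
    n_samples=200,
    centers=[[-3.0, -3.0], [3.0, 3.0]],
    cluster_std=1.0,
    random_state=0,
)

# Ordering 1: all label-0 examples first, then all label-1 examples.
order = np.argsort(y, kind="stable")
X_sorted, y_sorted = X[order], y[order]

# Ordering 2: the same examples in a random order.
rng = np.random.default_rng(0)
perm = rng.permutation(len(y))
X_shuffled, y_shuffled = X[perm], y[perm]

clf_sorted = SVC(kernel="linear", C=1.0).fit(X_sorted, y_sorted)
clf_shuffled = SVC(kernel="linear", C=1.0).fit(X_shuffled, y_shuffled)

# Compare both models on the same points (original order).
agreement = np.mean(clf_sorted.predict(X) == clf_shuffled.predict(X))
max_gap = np.max(
    np.abs(clf_sorted.decision_function(X) - clf_shuffled.decision_function(X))
)

print("fraction of identical predictions:", agreement)
print("largest difference in decision values:", max_gap)
```

Any difference in the decision values should be on the order of the solver tolerance (the `tol` parameter of `SVC`). For the SGD-based estimators the situation is the one described above: `SGDClassifier`, for instance, has a `shuffle` parameter that is `True` by default, so the data is reshuffled after each epoch unless you turn that off.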
Upvotes: 3