Reputation: 45
I am working on a multiclass model with a huge number of classes (approx. 3500). Can a large number of classes influence the performance of my model? I would like to use SVM and Random Forest. Does anyone know if there is any limitation on the number of classes for these methods? Thanks in advance.
Upvotes: 4
Views: 186
Reputation: 3823
Yes, it can take a performance hit, especially because most libraries solve the multiclass problem by combining binary problems. There are different strategies (one-vs-all with a winner-takes-all decision, one-vs-one with voting, etc.), and you have to try them and see which performs well enough for you (assuming that you have control over the strategy).
https://en.wikipedia.org/wiki/Support_vector_machine#Multiclass_SVM
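To get a feel for why the strategy matters with about 3500 classes, here is a rough sketch that only counts how many binary problems each decomposition would train. The function name `binary_problem_counts` is made up for illustration and is not part of any library; it assumes the textbook definitions of one-vs-all (one classifier per class) and one-vs-one (one classifier per pair of classes).

```python
def binary_problem_counts(n_classes: int) -> dict[str, int]:
    """Count the binary classifiers trained by each decomposition.

    Hypothetical helper for illustration only, not a library function.
    """
    if n_classes < 2:
        raise ValueError("need at least two classes")
    return {
        # one-vs-all (one-vs-rest): one classifier per class
        "one_vs_rest": n_classes,
        # one-vs-one: one classifier per unordered pair of classes
        "one_vs_one": n_classes * (n_classes - 1) // 2,
    }


counts = binary_problem_counts(3500)
print(counts)  # {'one_vs_rest': 3500, 'one_vs_one': 6123250}
```

So one-vs-one would mean roughly 6.1 million binary problems (each on a small slice of the data), while one-vs-all means 3500 problems that each see the full training set. Which strategy your library uses depends on the implementation; scikit-learn's `SVC`, for example, documents a one-vs-one scheme, so it is worth checking before training.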
A dirty hack that has sometimes worked for me in the past is to treat the problem as a regression problem instead of a multiclass one, but that might not be valid in your case (for example, if your classes have no natural order). I would have to see the problem in detail to tell.
Upvotes: 2