Reputation: 335
I am using k-means on a dataset of more than 150k documents, but I don't know what a good value of k is.
I have tried the elbow method to find it, but the inertia value doesn't change much. (I am using sklearn.)
Upvotes: 0
Views: 164
Reputation: 6299
If the elbow method does not give a clear answer, then possibly no number of clusters is particularly good. k-means implicitly assumes roughly spherical clusters of similar size, which might be limiting. You could try other feature representations, such as something based on word embeddings.
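For reference, here is a minimal sketch of the elbow check with sklearn. It assumes TF-IDF features, which the question does not specify, and the tiny `docs` list is a made-up placeholder for the real corpus. If the inertia curve stays nearly flat, that supports the reading above.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder corpus (hypothetical); substitute the real documents.
docs = [
    "cats and dogs are popular pets",
    "dogs chase cats in the garden",
    "pets need food and water",
    "stock markets fell sharply today",
    "investors sold stocks as markets dropped",
    "the central bank raised interest rates",
    "the team won the football match",
    "football fans cheered the winning goal",
]

# Assumed representation: TF-IDF vectors with English stop words removed.
X = TfidfVectorizer(stop_words="english").fit_transform(docs)

# Fit k-means for a range of k and record the inertia
# (sum of squared distances to the nearest centroid).
inertias = {}
for k in range(2, 6):
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(X)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(k, round(value, 3))
```

On 150k documents a full `KMeans` fit per k can be slow; `MiniBatchKMeans` from the same module is a possible substitute, at the cost of slightly noisier inertia values.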
For a document grouping task, you might want to use a topic modelling approach instead of clustering, such as Latent Dirichlet Allocation (LDA) or Non-negative Matrix Factorization (NMF).
Upvotes: 1