Reputation: 616
The adjusted Rand index (ARI) is a popular measure for comparing two clusterings. Unfortunately, I often get a negative ARI when I compare the results of my clustering analyses. How can I interpret a negative ARI in terms of the differences between the clusterings? And if negative ARIs are meaningless, can anyone suggest a more appropriate measure?
Upvotes: 4
Views: 4928
Reputation: 368
I recently had a similar interpretation question and found these toy examples in R useful:
> aricode::ARI(c(1,1,2,2), c(1,1,2,2))
[1] 1
> aricode::ARI(c(1,1,2,2), c(2,2,1,1))
[1] 1
> aricode::ARI(c(1,1,2,2), c(1,2,1,2))
[1] -0.5
> aricode::ARI(c(1,1,2,1), c(1,2,1,2))
[1] 0
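For what it's worth, the -0.5 in the third example can be checked by hand. This assumes the usual pair-counting (Hubert-Arabie) form of the ARI, where the n_ij are the contingency-table counts, a_i and b_j are its row and column sums, and E is the expected index under random labelling. The symbols are my own notation, not something taken from the aricode documentation.

```latex
\mathrm{ARI} = \frac{\sum_{i,j}\binom{n_{ij}}{2} - E}
                    {\tfrac{1}{2}\left[\sum_i \binom{a_i}{2} + \sum_j \binom{b_j}{2}\right] - E},
\qquad
E = \frac{\sum_i \binom{a_i}{2}\,\sum_j \binom{b_j}{2}}{\binom{n}{2}}

% Third example: x = (1,1,2,2), y = (1,2,1,2), n = 4.
% Every contingency cell is n_{ij} = 1, and a_i = b_j = 2 for all i, j.
\sum_{i,j}\binom{n_{ij}}{2} = 0, \qquad
\sum_i \binom{a_i}{2} = \sum_j \binom{b_j}{2} = 2, \qquad
E = \frac{2 \cdot 2}{6} = \frac{2}{3}

\mathrm{ARI} = \frac{0 - \tfrac{2}{3}}{2 - \tfrac{2}{3}} = -\frac{1}{2}
```

So no pair of points is grouped together in both clusterings, which is fewer co-clustered pairs than the 2/3 expected by chance, and the index goes negative.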
Upvotes: 0
Reputation: 77485
They aren't "meaningless" at all.
A negative ARI says that the agreement is less than what is expected from a random result. This means the results are 'orthogonal' or 'complementary' to some extent.
But this shouldn't happen often, unless you deliberately look for alternative clusterings. Maybe there is an implementation error?
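To make "less than what is expected from a random result" concrete, here is a minimal Python sketch of the standard pair-counting ARI that exposes the observed index, the expected index, and the maximum index separately. The function names `ari_parts` and `adjusted_rand_index` are my own, and returning 1.0 when both partitions are trivial is an assumed convention; this is not how any particular library implements it.

```python
from collections import Counter
from math import comb


def ari_parts(labels_a, labels_b):
    """Return (index, expected, max_index) for two label sequences."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items to form a pair")
    # Contingency counts: items in cluster i of A and cluster j of B.
    cells = Counter(zip(labels_a, labels_b))
    # Observed index: pairs placed together in both clusterings.
    index = sum(comb(c, 2) for c in cells.values())
    # Pairs placed together within each clustering on its own.
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Expected index under random labelling with the same cluster sizes.
    expected = pairs_a * pairs_b / comb(n, 2)
    max_index = (pairs_a + pairs_b) / 2
    return index, expected, max_index


def adjusted_rand_index(labels_a, labels_b):
    """ARI = (index - expected) / (max_index - expected)."""
    index, expected, max_index = ari_parts(labels_a, labels_b)
    if max_index == expected:
        # Assumed convention: both partitions are trivial (one cluster
        # or all singletons), so they are identical.
        return 1.0
    return (index - expected) / (max_index - expected)


if __name__ == "__main__":
    a, b = [1, 1, 2, 2], [1, 2, 1, 2]
    print(ari_parts(a, b))
    print(adjusted_rand_index(a, b))
```

The sign of the ARI is just the sign of `index - expected`: it is negative exactly when fewer pairs are co-clustered in both results than chance would give for clusters of those sizes.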
Upvotes: 8