Dhiraj

Reputation: 1720

Probabilities of all classifications in knn in R

The knn() function in the R package class has an argument called "prob". If I set it to TRUE, I get the probability of each observation belonging to the class it was assigned to. I have a dataset where the class variable has 9 levels. Is there any way to get the probability of a particular observation for all 9 levels?

Upvotes: 2

Views: 8116

Answers (3)

polkas

Reputation: 4194

This question still requires a proper answer.

If only the probability of the most probable class is needed, the class package is still suitable. The key is to set the argument prob to TRUE and k to a value higher than the default of 1: class::knn(train, test, cl, k = 5, prob = TRUE). With k = 1 the single nearest neighbour decides the vote alone, so every observation would get a probability of 100%.

However, if you want the probabilities for each of the classes, I recommend the caret::knn3 function together with predict.

data(iris3)
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))

# class package
# note: k must be greater than 1 and prob must be TRUE
model <- class::knn(train, test, cl, k = 5, prob = TRUE)
tail(attributes(model)$prob, 10)
#>  [1] 1.0 1.0 1.0 1.0 1.0 1.0 0.8 1.0 1.0 0.8

# caret package
model2 <- predict(caret::knn3(train, cl, k = 3), test)
tail(model2, 10)
#>               c s         v
#> [66,] 0.0000000 0 1.0000000
#> [67,] 0.0000000 0 1.0000000
#> [68,] 0.0000000 0 1.0000000
#> [69,] 0.0000000 0 1.0000000
#> [70,] 0.0000000 0 1.0000000
#> [71,] 0.0000000 0 1.0000000
#> [72,] 0.3333333 0 0.6666667
#> [73,] 0.0000000 0 1.0000000
#> [74,] 0.0000000 0 1.0000000
#> [75,] 0.3333333 0 0.6666667

Created on 2021-07-20 by the reprex package (v2.0.0)

Upvotes: 3

user12525636

Reputation: 21

I know an answer has already been marked here, but this can be done without another function or package.

What you can do instead is build your knn model knn_model (with prob = TRUE) and check its attributes for the "prob" output, like this:

attributes(knn_model)$prob
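
Note that this attribute holds only the proportion of votes for the winning class. If the vote proportions for every level are wanted without an extra package, they can be computed by hand. The following is a minimal base-R sketch, not part of the class package: the helper name knn_class_probs is hypothetical, it assumes Euclidean distance, and it breaks distance ties by row order, so its results may differ slightly from class::knn, which handles ties differently. It reuses the iris3 data from the answer above.

```r
# Hypothetical helper (not from the class package): per-class vote
# proportions among the k nearest training rows, Euclidean distance.
knn_class_probs <- function(train, test, cl, k = 5) {
  train <- as.matrix(train)
  test <- as.matrix(test)
  cl <- as.factor(cl)
  probs <- t(apply(test, 1, function(x) {
    # squared Euclidean distance from x to every training row
    d <- rowSums(sweep(train, 2, x)^2)
    # indices of the k closest training rows (ties broken by row order)
    nn <- order(d)[seq_len(k)]
    # share of those k neighbours that fall in each class level
    as.vector(table(cl[nn])) / k
  }))
  colnames(probs) <- levels(cl)
  probs
}

data(iris3)
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))

probs <- knn_class_probs(train, test, cl, k = 5)
dim(probs)   # one row per test observation, one column per class level
```

Each row of probs sums to 1, and with 9 class levels the result would simply have 9 columns.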

Upvotes: 2

Maximilian Scherer

Reputation: 69

As far as I know, the knn() function in class only returns the probability of the winning class.

However, you can use the knnflex package, which returns the probabilities for all levels via knn.probability (see here, pages 9-10).

Upvotes: 2
