Reputation: 237
I am trying to find the Mahalanobis distance between the different species in the iris dataset in R. I was able to find the distance between setosa and versicolor with the following code:
library(HDMD)
#To get Mahalanobis distances between Setosa and Versicolor,
set.vers<-pairwise.mahalanobis(x=iris[1:100,1:4], grouping=iris[1:100,]$Species)
md= sqrt(set.vers$distance)
However, I am struggling to do the same for setosa and virginica. I am not sure how to select the first 50 rows and the last 50 rows of the data set (i.e. exclude all of the versicolor data).
Upvotes: 1
Views: 297
Reputation: 76432
Here is a way to get all combinations of the levels in iris$Species with combn and compute the Mahalanobis distances for each pair.
library(HDMD)
inx <- sapply(levels(iris$Species), function(l) which(iris$Species == l), simplify = FALSE)
inx <- combn(inx, 2, function(x) unlist(x), simplify = FALSE)
set.vers_all <- lapply(inx, function(i) {
  pairwise.mahalanobis(x = iris[i, 1:4], grouping = droplevels(iris$Species[i]))
})
set.vers_all
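If it helps to see the pairing logic without HDMD, here is a minimal base-R sketch of the same idea. Note that pair_md is a hypothetical helper of mine, not part of HDMD, and it assumes the pooled within-group covariance of the two species as the covariance estimate. The numbers need not match the output of pairwise.mahalanobis if that function estimates the covariance differently.

```r
# Hypothetical helper (not part of HDMD): Mahalanobis distance between the
# mean vectors of two groups, using their pooled within-group covariance.
pair_md <- function(pair, data = iris, vars = 1:4, group = "Species") {
  a <- data[data[[group]] == pair[1], vars]
  b <- data[data[[group]] == pair[2], vars]
  pooled <- ((nrow(a) - 1) * cov(a) + (nrow(b) - 1) * cov(b)) /
    (nrow(a) + nrow(b) - 2)
  # stats::mahalanobis() returns the squared distance, hence sqrt()
  sqrt(mahalanobis(colMeans(a), colMeans(b), pooled))
}

# All unordered pairs of species, in the order combn() generates them
sp_pairs <- combn(levels(iris$Species), 2, simplify = FALSE)

md_all <- sapply(sp_pairs, pair_md)
names(md_all) <- sapply(sp_pairs, paste, collapse = "-")
md_all
```

The names on md_all show which species pair each distance belongs to, which the unnamed list above does not.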
Upvotes: 1
Reputation: 70643
This is a basic subsetting question. You want to subset based on Species, something along the lines of (not tested):
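# The level names in iris are lowercase; droplevels() removes the unused versicolor level
ss <- droplevels(iris[iris$Species %in% c("setosa", "virginica"), ])
# Pass only the four numeric columns as x, not the Species factor
pairwise.mahalanobis(x = ss[, 1:4], grouping = ss$Species)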
You can of course change the species pair you want to compare in many ways.
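For example, selecting by row position, as asked in the question, could look like the sketch below. It assumes the built-in iris ordering (rows 1-50 setosa, 51-100 versicolor, 101-150 virginica). The names rows, ss and set.virg are just illustrative, and the HDMD call only runs if the package is installed.

```r
# Rows 1-50 are setosa and rows 101-150 are virginica in the built-in iris data
rows <- c(1:50, 101:150)
ss <- droplevels(iris[rows, ])   # drop the now-empty versicolor level
table(ss$Species)                # 50 setosa, 50 virginica

# Same call as in the question; run only if HDMD is installed
if (requireNamespace("HDMD", quietly = TRUE)) {
  set.virg <- HDMD::pairwise.mahalanobis(x = ss[, 1:4], grouping = ss$Species)
  md <- sqrt(set.virg$distance)
  print(md)
}
```

Subsetting by species name is more robust than row positions if the data are ever reordered, but both give the same rows for the unmodified iris data.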
Upvotes: 2