Reputation: 179
I am trying to apply the Simpson's Diversity Index across a number of different datasets with a variable number of species ('nuse') captured. As such I am trying to construct code which can cope with this automatically without needing to manually construct a formula each time I do it. Example dataset for a manual formula is below:
diverse <- data.frame(nuse1=c(0,20,40,20), nuse2=c(5,5,3,20), nuse3=c(0,2,8,20), nuse4=c(5,8,2,20), total=c(10,35,53,80))
simp <- function(x) {
total <- x[,"total"]
nuse1 <- x[,"nuse1"]
nuse2 <- x[,"nuse2"]
nuse3 <- x[,"nuse3"]
nuse4 <- x[,"nuse4"]
div <- round(((1-(((nuse1*(nuse1 - 1)) + (nuse2*(nuse2 - 1)) + (nuse3*(nuse3 - 1)) + (nuse4*(nuse4 - 1)))/(total*(total - 1))))),digits=4)
return(div)
}
diverse$Simpson <- simp(diverse)
diverse
As you can see this works fine. However, how would I be able to create a function which could automatically adjust to, for example, 9 species (so up to nuse9)?
I have experimented with the paste function + as.formula as indicated here Formula with dynamic number of variables; however it is the expand form of (nuse1 * (nuse1 - 1)) that I'm struggling with. Does anyone have any suggestions please? Thanks.
Upvotes: 0
Views: 57
Reputation: 3194
How about something like:
diverse <- data.frame(nuse1=c(0,20,40,20), nuse2=c(5,5,3,20), nuse3=c(0,2,8,20), nuse4=c(5,8,2,20), total=c(10,35,53,80))
simp <- function(x, species) {
spcs <- grep(species, colnames(x)) # which column names have "nuse"
total <- rowSums(x[,spcs]) # sum by row
div <- round(1 - rowSums(apply(x[,spcs], 2, function(s) s*(s-1))) / (total*(total - 1)), digits = 4)
return(div)
}
diverse$Simpson2 <- simp(diverse, species = "nuse")
diverse
# nuse1 nuse2 nuse3 nuse4 total Simpson2
# 1 0 5 0 5 10 0.5556
# 2 20 5 2 8 35 0.6151
# 3 40 3 8 2 53 0.4107
# 4 20 20 20 20 80 0.7595
All it does is find out which columns start with "nuse" or any other species you have in your dataset. It constructs the "total" value within the function and does not require a total column in the dataset.
Upvotes: 1