Reputation: 21
I plan to use the iNEXT function (from the iNEXT package) to analyse potential differences between two pond types. The data are presence/absence, stored in a sites-by-species matrix, e.g.
Pond.type <- c("A", "A", "A", "B", "B", "B")
Sample.no <- c(1,2,3,1,2,3)
Species1 <- c(0,1,1,1,0,0)
Species2 <- c(0,1,1,0,1,0)
Species3 <- c(1,1,1,1,0,1)
Species4 <- c(0,1,0,1,0,0)
Species5 <- c(1,1,1,0,0,0)
mydata <- cbind.data.frame(Pond.type, Sample.no, Species1, Species2, Species3, Species4, Species5)
If I split my data frame by pond type and keep each part as a matrix, I can run the function, e.g. iNEXT(pond.type.a.dataframe), but then I cannot compare the different pond types on one plot.
My question is: is there a way to convert my data into the same format as the ciliates example data provided with the iNEXT package? That dataset is given as a list of matrices.
Upvotes: 1
Views: 3586
Reputation: 21
I came to a solution, probably more long-winded than it needs to be, but it works.
library(reshape2)  # for dcast
library(iNEXT)     # for as.incfreq and iNEXT
# I've altered the dataset as I usually type these data in long format
Pond.type <- c(rep("A", 15), rep("B", 15))
Site.no <- rep(seq(1,6, by=1), 5)
Species <- c(rep("Spp1", 6), rep("Spp2", 6), rep("Spp3", 6), rep("Spp4", 6), rep("Spp5", 6))
Presence <- rep(1, 30)
# join the vectors together into a dataframe
mydata.long <- cbind.data.frame(Pond.type, Site.no, Species, Presence)
# I then cast the data into a species x site matrix (dcast is from the reshape2 library)
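# value.var is named explicitly so dcast does not have to guess the value column
mydata.cast <- dcast(mydata.long, Species + Pond.type ~ Site.no, value.var = "Presence")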
# Changes the NAs to zeros.
mydata.cast[is.na(mydata.cast)] <- 0
# Separate the dataframe into each pond type
pondA <- mydata.cast[mydata.cast$Pond.type == "A",]
pondB <- mydata.cast[mydata.cast$Pond.type == "B",]
# Calculate the frequency counts using the function from the iNEXT library
pondA.freq <- as.incfreq(pondA[,3:ncol(pondA)])
pondB.freq <- as.incfreq(pondB[,3:ncol(pondB)])
# join them as a list
pond.list = list(A = pondA.freq, B = pondB.freq)
# ready for comparison
pond.out <- iNEXT(pond.list, datatype = "incidence_freq", q=0)
pond.out
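For anyone wondering what the frequency vectors look like: as far as I understand it, as.incfreq collapses a species x site presence/absence matrix into a vector whose first entry is the number of sampling units, followed by the number of units each species was found in. Below is a minimal base R sketch of that calculation. The helper name incfreq_by_hand is made up for illustration and is not part of iNEXT, and the toy matrix is pond type A from the question.

```r
# Toy species x site incidence matrix: pond type A from the question
pondA.mat <- rbind(
  Species1 = c(0, 1, 1),
  Species2 = c(0, 1, 1),
  Species3 = c(1, 1, 1),
  Species4 = c(0, 1, 0),
  Species5 = c(1, 1, 1)
)
colnames(pondA.mat) <- paste0("Sample", 1:3)

# Hypothetical helper (not an iNEXT function): number of sampling units first,
# then one incidence count per species
incfreq_by_hand <- function(x) {
  x <- (as.matrix(x) > 0) * 1   # treat any positive value as a presence
  c(ncol(x), rowSums(x))
}

pondA.byhand <- incfreq_by_hand(pondA.mat)
pondA.byhand
```

Here the first entry is 3 (the number of samples) and the remaining entries are the per-species counts 2, 2, 3, 1 and 3.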
Upvotes: 0
Reputation: 21
I tried the answer from Oliver Burdekin, which worked great, thanks.
One addition if you are using abundance data: apply the as.abucount function to the list of matrices before running the iNEXT function.
# create a list of your matrices (named so the output looks nice)
pondABM = list(A = mPondA, B = mPondB)
# apply as.abucount to the list of matrices
pondABM2 = lapply(pondABM, as.abucount)
# run the iNEXT function
iNEXT(pondABM2, datatype="abundance")
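If it helps to see what ends up going into iNEXT here: my understanding is that as.abucount reduces each species x site abundance matrix to one total count per species, which is the vector form that datatype="abundance" works with. A rough base R stand-in is rowSums. The matrices abuA and abuB below hold invented counts purely for illustration; they are not real data.

```r
# Invented abundance matrices (species x site) for two pond types
abuA <- matrix(c(5, 0, 2,
                 1, 3, 0,
                 0, 0, 4),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("Species", 1:3), paste0("Sample", 1:3)))
abuB <- matrix(c(0, 2, 2,
                 7, 1, 0,
                 0, 0, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("Species", 1:3), paste0("Sample", 1:3)))

abu.list <- list(A = abuA, B = abuB)

# Total count per species within each pond type
# (base R stand-in for as.abucount, assumed to behave like a row sum)
abu.counts <- lapply(abu.list, rowSums)
abu.counts
```

Each list element is now a named vector with one total per species, rather than a matrix.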
Upvotes: 2
Reputation: 1108
If you have two .csv files structured like this:
Pond type A:
Species, Sample1, Sample2, Sample3
Species1, 0,1,1
Species2, 0,1,1
Species3, 1,1,1
Species4, 0,1,0
Species5, 1,1,1
Pond type B:
Species, Sample1, Sample2, Sample3
Species1, 1,0,0
Species2, 0,1,0
Species3, 1,0,1
Species4, 1,0,0
Species5, 0,0,0
Then, assuming you read the two files into data frames called pondA and pondB (e.g. with read.csv), you could do the following:
library(iNEXT)
library(ggplot2)
# make a matrix from pondA as type "integer"
mPondA <- as.matrix(apply(pondA[,-1],2,as.integer))
# use your species names as row names
row.names(mPondA) <- pondA[,1]
# do the same for pondB
mPondB <- as.matrix(apply(pondB[,-1],2,as.integer))
row.names(mPondB) <- pondB[,1]
# create a list of your matrices (named so the output looks nice)
pondABM = list(A = mPondA, B = mPondB)
# have a look at the raw data
out.raw <- iNEXT(pondABM, datatype="incidence_raw", endpoint=20)
ggiNEXT(out.raw)
This plots one rarefaction/extrapolation curve per pond type (A and B) on the same figure.
If you want to do more with the data using this input type, don't forget to set the datatype argument, which defaults to "abundance":
# note that datatype is set to "incidence_raw" rather than the default "abundance"
iNEXT(pondABM, q=0, datatype="incidence_raw", size=NULL, se=TRUE, conf=0.95, nboot=50)
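If your data start out as one sites-by-species data frame, as in the question, rather than one .csv per pond type, something along these lines should get you to the same kind of list. It is only a sketch in base R: split the rows by Pond.type, drop the two identifier columns, and transpose so that species are rows. The names species.cols and pond.list are just ones I picked.

```r
# Data frame from the question: one row per sample, one column per species
mydata <- data.frame(
  Pond.type = c("A", "A", "A", "B", "B", "B"),
  Sample.no = c(1, 2, 3, 1, 2, 3),
  Species1  = c(0, 1, 1, 1, 0, 0),
  Species2  = c(0, 1, 1, 0, 1, 0),
  Species3  = c(1, 1, 1, 1, 0, 1),
  Species4  = c(0, 1, 0, 1, 0, 0),
  Species5  = c(1, 1, 1, 0, 0, 0)
)

# Columns holding the presence/absence values
species.cols <- setdiff(names(mydata), c("Pond.type", "Sample.no"))

# One species x sample matrix per pond type
pond.list <- lapply(split(mydata, mydata$Pond.type), function(d) {
  m <- t(as.matrix(d[, species.cols]))        # transpose: species become rows
  colnames(m) <- paste0("Sample", d$Sample.no)
  m
})

str(pond.list)
```

pond.list can then be passed to iNEXT with datatype="incidence_raw" in the same way as pondABM.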
Upvotes: 3