Transform community data collected by two sampling methods into matrix for vegan

Question

I have community data collected by two sampling methods. I want to transform it into a matrix (or two? I'm not sure which is the correct input) for downstream analysis with the vegan package, to compare how well each method detects community dissimilarity (using Bray-Curtis dissimilarity and ANOSIM).

Here are two example dataframes:

method1 <- data.frame(site = c('site1','site1','site1','site1','site2','site2','site2','site2','site3','site3','site3','site3'),
                    sampleID  = c("site1.net1.2018", "site1.net2.2018","site1.net1.2019", "site1.net2.2019", "site2.net1.2018", "site2.net2.2018","site2.net1.2019","site2.net2.2019","site3.net1.2018", "site3.net2.2018", "site3.net1.2019", "site3.net2.2019"),
                    year = c("2018", "2018", "2019", "2019","2018", "2018", "2019", "2019","2018", "2018", "2019", "2019"),
                    species = c('Sp1','Sp2','Sp1','Sp3','Sp4','Sp2','Sp1','Sp2','Sp1','Sp3','Sp4','Sp2'),
                    abundance = c(1,7,1,6,2,5,2,1,6,3,2,1),
                    method = c("method1","method1","method1","method1","method1","method1","method1","method1","method1","method1","method1","method1"))
                    
method2 <- data.frame(site = c('site1','site1','site1','site1','site2','site2','site2','site2','site3','site3','site3','site3'),
                      sampleID  = c("site1.net1.2018", "site1.net2.2018","site1.net1.2019", "site1.net2.2019", "site2.net1.2018", "site2.net2.2018","site2.net1.2019","site2.net2.2019","site3.net1.2018", "site3.net2.2018", "site3.net1.2019", "site3.net2.2019"),
                      year = c("2018", "2018", "2019", "2019","2018", "2018", "2019", "2019","2018", "2018", "2019", "2019"),
                      species = c('Sp2','Sp4','Sp5','Sp1','Sp3','Sp1','Sp6','Sp1','Sp3','Sp4','Sp1','Sp5'),
                      abundance = c(2,1,3,3,5,2,10,6,4,2,1,1),
                      method = c("method2","method2","method2","method2","method2","method2","method2","method2","method2","method2","method2","method2"))

> head(method1)
   site        sampleID year species abundance  method
1 site1 site1.net1.2018 2018     Sp1         1 method1
2 site1 site1.net2.2018 2018     Sp2         7 method1
3 site1 site1.net1.2019 2019     Sp1         1 method1
4 site1 site1.net2.2019 2019     Sp3         6 method1
5 site2 site2.net1.2018 2018     Sp4         2 method1
6 site2 site2.net2.2018 2018     Sp2         5 method1

It's unclear to me how the data should be formatted as a matrix for input into vegan, especially since there are multiple years, samples, and methods. For example, the vegan documentation shows the following, which indicates that a separate data frame is used for categorical/environmental variables:

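library(vegan)
data(dune)      # community matrix: rows = sites, columns = species
data(dune.env)  # environmental variables, one row per site
dune.dist <- vegdist(dune)
attach(dune.env)
dune.ano <- anosim(dune.dist, Management)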

This example uses one community matrix for multiple management types, but it's unclear to me whether I need one combined matrix or a separate matrix for each sampling method, and how to reshape the data into a binary presence/absence matrix organized by method, year, and sampleID.
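
To make the layout in question concrete, here is a minimal sketch of one possible arrangement, not necessarily the correct or only one: a single combined community matrix with one row per method/sampleID combination, plus a matching data frame of grouping variables, mirroring the dune / dune.env pairing. The names make_df, long, unit, comm, env, and pa are hypothetical helpers introduced only for this illustration, and the choice of site as the ANOSIM grouping is an assumption. The example data frames are rebuilt compactly so the block runs on its own, and the vegan calls are skipped if the package is not installed.

```r
# Rebuild the two example data frames from the question in compact form.
# make_df() is a hypothetical helper used only for this sketch.
make_df <- function(species, abundance, method) {
  site <- rep(c("site1", "site2", "site3"), each = 4)
  net  <- rep(c("net1", "net2"), times = 6)
  year <- rep(rep(c("2018", "2019"), each = 2), times = 3)
  data.frame(site = site,
             sampleID = paste(site, net, year, sep = "."),
             year = year,
             species = species,
             abundance = abundance,
             method = method)
}
method1 <- make_df(c("Sp1","Sp2","Sp1","Sp3","Sp4","Sp2","Sp1","Sp2","Sp1","Sp3","Sp4","Sp2"),
                   c(1, 7, 1, 6, 2, 5, 2, 1, 6, 3, 2, 1), "method1")
method2 <- make_df(c("Sp2","Sp4","Sp5","Sp1","Sp3","Sp1","Sp6","Sp1","Sp3","Sp4","Sp1","Sp5"),
                   c(2, 1, 3, 3, 5, 2, 10, 6, 4, 2, 1, 1), "method2")

# Stack both methods and give every method/sample combination its own row label.
long <- rbind(method1, method2)
long$unit <- paste(long$method, long$sampleID, sep = "_")

# Wide abundance matrix: rows = sampling units, columns = species.
# Species not recorded in a unit are filled with 0.
comm <- with(long, tapply(abundance, list(unit, species), sum, default = 0))

# Grouping variables, one row per sampling unit, in the same row order as comm.
env <- unique(long[, c("unit", "method", "site", "year")])
env <- env[match(rownames(comm), env$unit), ]

# Binary presence/absence version of the same matrix.
pa <- (comm > 0) * 1

# Assumption: site is the grouping of interest, and ANOSIM is run once per
# method so the resulting R statistics can be compared side by side.
if (requireNamespace("vegan", quietly = TRUE)) {
  for (m in unique(env$method)) {
    keep <- env$method == m
    d <- vegan::vegdist(comm[keep, , drop = FALSE], method = "bray")
    fit <- vegan::anosim(d, grouping = env$site[keep])
    cat(m, "ANOSIM R =", fit$statistic, "\n")
  }
}
```

Keeping both methods in one matrix with method as a column of env leaves both options open: subsetting rows gives a separate analysis per method, while the full matrix allows method itself to be used as a grouping factor. Swapping comm for pa in vegdist() would run the same comparison on presence/absence data.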
