pprest00

Reputation: 1

How to calculate species accumulation for many points in a dataset?

I have a dataset that consists of 120 survey points, each of which was surveyed 4 times in a year (so 480 total surveys). We recorded the presence or absence of each species in every 3-minute interval of a 15-minute survey. I am interested in determining how many more species are observed with each additional 3-minute interval (i.e., the species accumulation).

I have tried using vegan to do a species accumulation curve and I have figured out how to apply the specaccum function to multiple points. Here is my code:

# Loading packages
library(vegan)

# setwd first
Toronto_data=read.csv(file.choose()) 

# Converting columns that won't be used in the species accumulation curve into factors
cols_to_convert <- c("SurveyDay", "TimeBlock")
Toronto_data[cols_to_convert] <- lapply(Toronto_data[cols_to_convert], as.factor) 

# Split the data frame by multiple factors: Point ID and Number of Days Survey Occurred (~120 points x 4 survey dates)
list_of_samples <- split(Toronto_data, interaction(Toronto_data$PointID, Toronto_data$SurveyDay))

# Remove empty data frames from split_data
list_of_samples <- list_of_samples[sapply(list_of_samples, nrow) > 0]

# Create function to apply specaccum to numeric columns. I want to use "collector" rather than "exact", but it will not work.
specaccum_function <- function(data) {
  # Ensure that only numeric columns are passed
  numeric_data <- data[sapply(data, is.numeric)]
  
  # Check if numeric_data has more than one row and one column
  if (nrow(numeric_data) > 1 && ncol(numeric_data) > 1) {
    return(specaccum(numeric_data, "collector"))
  } else {
    stop("Not enough data to calculate species accumulation.")
   
  }
}

# Apply specaccum to each subset
results <- lapply(list_of_samples, function(list_of_samples) {
  specaccum_function(list_of_samples)
})

However, I believe I need to use the "collector" method, and when I do so I continually get the same error:

Error in rowSums(apply(x[ind, , drop = FALSE], 2, cumsum) > 0) : 
  'x' must be an array of at least two dimensions

Looking at the list created by my code, all of the objects in the list seem to be data frames of 5 x 76, which have two dimensions, so I can't figure out what the issue is.

A subset of my data from dput is here:

structure(list(PointID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), SurveyDay = c(1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
4L, 4L, 4L), TimeBlock = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 
5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), SpeciesA = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), SpeciesB = c(1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L), SpeciesC = c(0L, 
0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L), SpeciesD = c(1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), SpeciesE = c(0L, 
0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L), SpeciesF = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-20L))

Any help provided would be much appreciated!

Upvotes: 0

Views: 44

Answers (1)

Michael Dewar

Reputation: 3293

I assume the data from your dput is supposed to be Toronto_data. When I run it I get your error for only one part of the data. That is,

specaccum_function(list_of_samples$`2.2`)

gives me the error. But that's because

list_of_samples$`2.2`

only has one row of data. If you change the stop() to a warning(), you'll see the other elements of your list return without problem.
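
A minimal sketch of that change, assuming toy data with the same grouping as the dput subset. The names toy, Sp1, Sp2, species_cols and safe_specaccum are made up for illustration and the species values are not from the question. It warns and returns NULL for a short group instead of stopping, so lapply() finishes. It also picks the species columns by name, because PointID is still an integer column in the dput and would pass the is.numeric filter as if it were a species.

```r
library(vegan)

# Toy data (illustrative values): same PointID/SurveyDay/TimeBlock layout as
# the dput subset, so the group "2.2" ends up with a single row.
toy <- data.frame(
  PointID   = rep(1:2, times = c(9, 11)),
  SurveyDay = rep(1:4, each = 5),
  TimeBlock = rep(1:5, times = 4),
  Sp1       = rep(c(1L, 0L), length.out = 20),
  Sp2       = rep(c(0L, 0L, 1L, 1L, 0L), times = 4)
)

# drop = TRUE discards PointID x SurveyDay combinations with no rows
groups <- split(toy, interaction(toy$PointID, toy$SurveyDay), drop = TRUE)

# Hypothetical: name the species columns instead of relying on is.numeric
species_cols <- c("Sp1", "Sp2")

safe_specaccum <- function(d) {
  # "collector" adds rows in the order given, so sort by time block first
  comm <- d[order(d$TimeBlock), species_cols, drop = FALSE]
  if (nrow(comm) < 2) {
    warning("Skipping PointID ", d$PointID[1], ", SurveyDay ", d$SurveyDay[1],
            ": fewer than two rows")
    return(NULL)
  }
  specaccum(comm, method = "collector")
}

results <- lapply(groups, safe_specaccum)

# Keep only the groups that produced a curve
kept <- Filter(Negate(is.null), results)
```

Each element of kept is a specaccum object, so kept[["1.1"]]$richness gives the cumulative species count per time block for that point and survey day.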

If you need to update your dput, then please edit the question.
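
Before editing, it may help to check whether the full dataset has incomplete groups too. This is a rough sketch in base R, again on toy keys that mirror the dput subset (the name toy is made up, and the expected 5 rows per survey is my reading of "3-minute intervals in a 15-minute survey"):

```r
# Toy keys only: same PointID/SurveyDay/TimeBlock layout as the dput subset
toy <- data.frame(
  PointID   = rep(1:2, times = c(9, 11)),
  SurveyDay = rep(1:4, each = 5),
  TimeBlock = rep(1:5, times = 4)
)

# Count rows for every PointID x SurveyDay combination
counts <- as.data.frame(
  table(PointID = toy$PointID, SurveyDay = toy$SurveyDay),
  stringsAsFactors = FALSE
)

# Surveys that exist but have fewer than the expected 5 time blocks
incomplete <- counts[counts$Freq > 0 & counts$Freq < 5, ]
print(incomplete)
```

Any survey listed in incomplete with a Freq of 1 would trigger the rowSums error under the "collector" method.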

Upvotes: 0
