AnnK

Reputation: 189

Pool information based on the spatial location of survey locations within a spatial grid

I need to run a habitat use analysis using camera trap data within an occupancy framework. I have several camera locations (survey points) with repeated surveys over a period of 13 years. For the occupancy analysis I need a sample grid and detection/non-detection information for each grid cell (i.e., a detection history). I will create this detection history file in camtrapR, but first I need to pool my data so that the information is organized per grid cell rather than per camera trap station, as it is now.

The problem is that some grid cells contain more than one camera. I need to pool all cameras that lie within the same grid cell IF they belong to the same year of study, while keeping track of the IDs and the total number of cameras that were pooled together for each grid cell (as more cameras within a cell will possibly result in higher detection of the species).

What I currently have: 1) The file ‘camtrap_ca’, with the camera station id in rows, and in columns the xy coordinates, start and end dates of survey. 2) The file ‘recordtable’ that has in each row every record of presence of the species of study, and in columns, the associated date, location and camera id for each presence record. 3) The grid: raster file with each cell with an associated consecutive number.

What I want to do:

1) Overlay a raster with a grid cell resolution of 926.6254 m on top of my camera trap locations (this is the spatial resolution of my GIS data). I have chosen a raster instead of a polygon grid because my study area covers 523,780 km2 and creating such a large polygon grid in R was too slow.

2) Pool/collapse in my ‘camtrap_ca’ file the information of all camera traps from the same year of study that are located in the same grid cell into a single record (row), while adding new fields that store how many cameras were pooled together for each grid cell (as detection of the species will increase with a higher number of cameras per grid cell) and the IDs of the pooled cameras.

The closest threads that I could find regarding this were https://gis.stackexchange.com/questions/48416/aggregating-points-to-grid-using-r and "Counting species occurrence in a grid". However, they are not quite what I need.

A reproducible example of my data is the following: 

Camera trap locations and operation dates

    camtrap_ca <-read.table(text = "station_code    latitude    longitude   date_start  date_end
    BF09-1  -2955950    1247610 23-09-05    30-09-05
    BF09-10 -2955950    1247610 01-10-05    10-10-05
    BF09-11 -2955950    1247610 23-09-05    16-10-06
    BF09-12 -2958100    1245020 23-09-05    30-09-05
    BF09-13 -2958550    1244090 23-09-05    30-09-05
    BF09-14 -2958130    1244300 23-09-05    30-09-05
    BF09-15 -2958130    1244300 23-09-05    30-09-05
    BF09-16 -2958260    1245340 23-09-05    30-09-05
    BF09-17 -2955950    1247610 11-10-06    16-10-06
    BF09-18 -2963780    1240270 23-09-05    30-09-05
    BF09-19 -2963780    1240270 11-10-06    16-10-06",
                            header = TRUE)

Species records, location, record date and year

    recordtable <- read.table(text = "station_code  latitude    longitude   DateTimeOriginal    year
    BF09-1  -2955950    1247610 24-09-05    2005
    BF09-10 -2955950    1247610 09-10-05    2005
    BF09-11 -2955950    1247610 26-09-05    2005
    BF09-12 -2958100    1245020 29-09-05    2005
    BF09-13 -2958550    1244090 29-09-05    2005
    BF09-14 -2958130    1244300 27-09-05    2005
    BF09-15 -2958130    1244300 28-09-05    2005
    BF09-16 -2958260    1245340 24-09-05    2005
    BF09-17 -2955950    1247610 15-10-06    2006
    BF09-18 -2963780    1240270 24-09-05    2005
    BF09-19 -2963780    1240270 15-10-06    2006
    ", header= TRUE)

Raster to use as the reference grid for pooling data

    library(raster)
    r <- raster()
    crs(r) <- "+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=m +no_defs"
    ext.r <- extent(-2963780, -2955950, 1240270, 1247610)
    extent(r) <- ext.r
    res(r) <- 926.6254
    values(r) <- 1:ncell(r) # Numbering each grid cell consecutively (64 cells here)

Final product: the same ‘camtrap_ca’ file I have, but with each row representing a 926.6254 m grid cell of the raster instead of a single camera trap station as it does now. The columns should hold the start and end dates of survey, the total number of cameras that were pooled together (if any), and the IDs of the cameras pooled in each grid cell. This means that I need not only to pool cameras and record the grid cell where they are located, but also to update the start and end dates for each grid cell whenever cameras were pooled together. Finally, I need to join each grid cell number to the corresponding camera ID (station_code) in the 'recordtable' file.

If someone could help me build a code to do this I would be very grateful!

Upvotes: 0

Views: 77

Answers (1)

Robert Hijmans

Reputation: 47081

We only need recordtable, not camtrap_ca

tab <- read.table(text = "station_code  latitude    longitude   DateTimeOriginal    year
    BF09-1  -2955950    1247610 24-09-05    2005
    BF09-10 -2955950    1247610 09-10-05    2005
    BF09-11 -2955950    1247610 26-09-05    2005
    BF09-12 -2958100    1245020 29-09-05    2005
    BF09-13 -2958550    1244090 29-09-05    2005
    BF09-14 -2958130    1244300 27-09-05    2005
    BF09-15 -2958130    1244300 28-09-05    2005
    BF09-16 -2958260    1245340 24-09-05    2005
    BF09-17 -2955950    1247610 15-10-06    2006
    BF09-18 -2963780    1240270 24-09-05    2005
    BF09-19 -2963780    1240270 15-10-06    2006", 
    header=TRUE,  stringsAsFactors=FALSE)   

You have

colnames(tab)[2:3]
#[1] "latitude"  "longitude"

These names are clearly wrong (the unit is not degrees but likely meters; as in the crs that you specify for the raster). We can fix that like this

colnames(tab)[2:3] <- c("x", "y")

It does not affect the computations, but I am worried that your x and y might be reversed, because latitude is normally "y", not "x". I will assume this is not the case.

Given a RasterLayer, you can get the cell numbers for each coordinate pair --- and that is what you need to group records.

library(raster)
r <- raster(crs="+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=m", ext = extent(-2966000, -2954000, 1240000, 1248000), res = 926.6254)

#To illustrate you could do 
#values(r) <- 1:ncell(r)
#plot(r)
#points(tab[, c("x", "y")]) 

Get the cell numbers

tab$cell <- cellFromXY(r, tab[, c("x", "y")])
tab
#   station_code        x       y DateTimeOriginal year cell
#1        BF09-1 -2955950 1247610         24-09-05 2005   11
#2       BF09-10 -2955950 1247610         09-10-05 2005   11
#3       BF09-11 -2955950 1247610         26-09-05 2005   11
#4       BF09-12 -2958100 1245020         29-09-05 2005   48
#5       BF09-13 -2958550 1244090         29-09-05 2005   61
#6       BF09-14 -2958130 1244300         27-09-05 2005   48
#7       BF09-15 -2958130 1244300         28-09-05 2005   48
#8       BF09-16 -2958260 1245340         24-09-05 2005   35
#9       BF09-17 -2955950 1247610         15-10-06 2006   11
#10      BF09-18 -2963780 1240270         24-09-05 2005  107
#11      BF09-19 -2963780 1240270         15-10-06 2006  107
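For intuition, the cell number is a row-major index counted from the top-left corner of the grid. The following is a minimal base R sketch of that arithmetic, not the raster implementation itself: `cell_from_xy` is a hypothetical helper, the grid origin and resolution are taken from the raster defined above, and `ncol = 13` is assumed from the rounded extent/resolution ratio. It ignores points outside the extent or exactly on cell edges, which `cellFromXY` handles for you.

```r
# Hypothetical helper (not part of the raster package): row-major cell index
# for a regular grid whose cells are numbered from the top-left corner.
cell_from_xy <- function(x, y, xmin, ymax, res, ncol) {
  col <- floor((x - xmin) / res) + 1   # column, counted from the left
  row <- floor((ymax - y) / res) + 1   # row, counted from the top
  (row - 1) * ncol + col
}

# Three of the station coordinates from the table above
x <- c(-2955950, -2958100, -2963780)
y <- c(1247610, 1245020, 1240270)

cells <- cell_from_xy(x, y, xmin = -2966000, ymax = 1248000,
                      res = 926.6254, ncol = 13)
cells
# 11 48 107, matching the cell column above
```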

With this you can compute summaries with base R functions aggregate or tapply (or other approaches)

# transform your character dates to Date objects
tab$date <- as.Date(tab$DateTimeOriginal, "%d-%m-%y")
datemin <- aggregate(tab[, "date", drop=FALSE], tab[, "cell", drop=FALSE], min)
colnames(datemin)[2] <- "first_date"
# use max (not min) to get the last date per cell
datemax <- aggregate(tab[, "date", drop=FALSE], tab[, "cell", drop=FALSE], max)
colnames(datemax)[2] <- "last_date"
out <- merge(datemin, datemax)

# number of observations
n <- aggregate(tab$cell, tab[, "cell", drop=FALSE], length)
colnames(n)[2] <- "nobs"
out <- merge(out, n)

# number of cameras
ncam <- aggregate(tab$station_code, tab[, "cell", drop=FALSE], function(i) length(unique(i)))
colnames(ncam)[2] <- "n_stations"
out <- merge(out, ncam)

# add the coordinates for the cells
out <- cbind(out, xyFromCell(r, out$cell))
out
#  cell first_date  last_date nobs n_stations        x       y
#1   11 2005-09-24 2006-10-15    4          4 -2956270 1247537
#2   35 2005-09-24 2005-09-24    1          1 -2958124 1245683
#3   48 2005-09-27 2005-09-29    3          3 -2958124 1244757
#4   61 2005-09-29 2005-09-29    1          1 -2958124 1243830
#5  107 2005-09-24 2006-10-15    2          2 -2963683 1240124
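The question also asks for pooling within each year of study and for the IDs of the pooled cameras. That could be sketched by grouping on both `cell` and `year`. This is only an illustration in base R: the cell numbers are hard-coded from the `cellFromXY` result above so the snippet runs on its own, and the names `pooled` and `stations` and the ";" separator are arbitrary choices, not anything required by camtrapR.

```r
# Records with cell numbers already assigned (copied from the cellFromXY step)
tab <- data.frame(
  station_code = c("BF09-1", "BF09-10", "BF09-11", "BF09-12", "BF09-13",
                   "BF09-14", "BF09-15", "BF09-16", "BF09-17", "BF09-18",
                   "BF09-19"),
  year = c(2005, 2005, 2005, 2005, 2005, 2005, 2005, 2005, 2006, 2005, 2006),
  cell = c(11, 11, 11, 48, 61, 48, 48, 35, 11, 107, 107),
  stringsAsFactors = FALSE
)

# IDs of the cameras pooled in each cell/year combination
pooled <- aggregate(station_code ~ cell + year, data = tab,
                    FUN = function(i) paste(sort(unique(i)), collapse = ";"))
colnames(pooled)[3] <- "stations"

# number of distinct cameras per cell/year combination
ncam <- aggregate(station_code ~ cell + year, data = tab,
                  FUN = function(i) length(unique(i)))
colnames(ncam)[3] <- "n_stations"

# one row per cell and year, with the pooled IDs and the camera count
pooled <- merge(pooled, ncam)
pooled
```

The same two-column grouping could be applied to the date summaries if first and last dates are needed per cell and year rather than per cell.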

Upvotes: 1
