Reputation: 23
I am running into issues with proper data extraction using Terra. It appears to be obtaining the mean and I am not seeing an option to select the most frequent or largest coverage for a selected area.
I have two raster stack datasets I am working with for a given area.
Dataset1: is irrelevant to the task at hand aside from the fact that it was used to create the extent that will be used to extract data from Dataset2. Dataset 1 consisted of 30x30 meter pixels that were aggregated into groups of 9 to define a zone/extent along a grid. This grid is what I will be working with to extract Dataset2 values.
Dataset2:Also consists of 30x30 meter pixels but these pixels do not line up with Dataset1 pixels. Everything is projected properly, this seems to just be how the data lines up.
For each grid I obtained the "mean" value of Dataset1. This part worked well. Dataset2, however is Land cover data, so even though it's represented by a "number" in the data, the number is not mathematical- it just represents specific value, for example, 42 (forest) and 71(grassland) cannot be averaged to create a number that makes any sense. I was extracting the data and noticed it gave me 9 "values" for each grid for each raster, and that despite the data pixels not lining up with the grid it was still slicing them into 9 30x30m sections along the grid-line. I didn't tell it to do this with the code. I used: LCValues<- extract(LC, Grid)
where "LC" is my spatraster stack and "Grid" is my vector polygon grid data layer.
I presumed it would give me each value in the grid and that I could obtain how much area each value covered to select the value with the most weight, but what happened with the 30x30 subdivided areas was that it was giving whole number where it could (when the value was constant across the 30x30m area) but sometimes the value obtained for each 30x30m was a weighted average of the 30x30m section, for example if it was 75% grassland and 25% forested, my value would be closer to the grassland value (60s) but if it was 25% forested and 75% grassland it would be in the upper 40s. I do not need the 30x30 meter "sub grids" at all, but at the very least I would like it to at least choose one of the values in the grid not average them because that data is un-useable to me. I would prefer to extract the value that is most constant (covers the greatest area) across the 90x90m grid area if possible.
I have tried adding exact=TRUE and weights=TRUE but this doesn't work because of the weird way it's automatically subdividing into multiple 30x30 m sub-grids.
The simplest solution would be to make my grid follow along the data but this isn't possible because I need to be able to merge this with Dataset1 eventually for further analysis.
I also tried using a spatial join (so I could use intersect and largest=TRUE) but that isn't possible with the SpatRaster. Is there another option I am missing? And does anyone know where these 9 30x30 meter grids within my grid are coming from?
Upvotes: 0
Views: 78