Sarah

Reputation: 41

Memory efficient way to convert large raster stack to dataframe and merge with another dataframe

I am trying to convert a large raster stack (4 layers) to a data frame and then combine it with a smaller descriptor data frame in R. I need the 40 rows from the descriptor data frame to be repeated for every grid cell of the raster data frame. Is there a way to do this that is not so memory hungry (it crashes on an HPC running out of memory at 1000GB)?
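For scale, here is a rough back-of-envelope estimate of the fully expanded table. It is only a sketch: it assumes no cells are dropped as NA, 8 bytes per numeric value, 4 bytes per factor code, and the column counts from the raster summary and descriptor table shown further down (x, y and 4 layers, plus 2 factor and 6 numeric descriptor columns).

```r
n_cells      <- 469124061   # ncell from the raster summary
n_treatments <- 40          # rows in the descriptor data frame

# One row per cell x treatment pair
n_rows <- n_cells * n_treatments

# Assumed bytes per row: 6 raster doubles + 6 descriptor doubles + 2 factor codes
bytes_per_row <- (6 + 6) * 8 + 2 * 4

total_gb <- n_rows * bytes_per_row / 1e9
total_gb
```

Under those assumptions the result is roughly 1,950 GB before any intermediate copies, so the full table would not fit in 1000 GB however the merge is written.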

The reason I am doing this is that I need to feed a dataframe into a model that predicts fire spread (package 'firebehavioR', function rothermel()), and I need the model to run for each treatment combination and each grid cell. As far as I can tell, the model only accepts a dataframe (not a raster), and all input variables must have the same length: the parameters that provide things like plant biomass and moisture content (from my descriptor dataset) cannot be shorter than the parameters that provide things like slope and wind (from the spatial dataset).

I have been doing this:

# this is the raster stack
rs <- stack('SLOPE_WIND_SM_ras.tif') 
> rs
class      : RasterStack 
dimensions : 18001, 26061, 469124061, 4  (nrow, ncol, ncell, nlayers)
resolution : 0.0002777778, 0.0002777778  (x, y)
extent     : -97.23903, -89.99986, 43.99986, 49.00014  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs 
names      : SLOPE_WIND_SM_ras.1, SLOPE_WIND_SM_ras.2, SLOPE_WIND_SM_ras.3, SLOPE_WIND_SM_ras.4 
min values :          56.0000000,          35.0000000,           0.8514639,           0.0000000 
max values :           93.500000,           82.000000,            7.471955,           57.738941 

# this is the descriptor data
# there are 40 treatment combinations
# I need all 40 treatment combos to be repeated for each grid cell of the raster data
data_treatment <- read.csv("data_treat_avg.csv")
head(data_treatment)
# A tibble: 6 × 8
# Groups:   Species [3]
  Species              CombinedT LWC_mean H..cm._mean Biomass.g.m2_mean LWC_sd H..cm._sd Biomass.g.m2_sd
  <fct>                <fct>        <dbl>       <dbl>             <dbl>  <dbl>     <dbl>           <dbl>
1 Achillea millefolium Ambient       71.1        12.2              85    NA        NA               NA  
2 Agropyron repens     Ambient       66.0        28.9              10.4   3.97      8.65            19.8
3 Agropyron repens     +N            68.0        25.0              10.3   6.19      5.99            20.1
4 Agropyron repens     +CO2          62.5        28.0              15.5   2.57      4.55            11.3
5 Amorpha canescens    Ambient       56.9        35.0              33.0   4.69      3.03            29.6
6 Amorpha canescens    +N            56.9        27.6              60.7   4.17      6.73            85.4
### Converting the raster stack to a dataframe
rs_dat <- rasterToPoints(rs)

### Merging the converted raster with the descriptor dataframe
rep_data_spatial <- as.data.frame(lapply(data_treatment, rep, each=nrow(rs_dat)))
final_data <- cbind(rs_dat, rep_data_spatial)
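
One possible direction, sketched here under assumptions rather than tested on the real data: never build the full table, and instead expand one block of cells at a time, run the model on that block, and keep only the output. The sketch uses a small made-up data frame in place of the raster values and a hypothetical `run_model()` placeholder where `rothermel()` would go; `expand_block()` and `chunk_size` are also illustrative names, not part of any package. With the raster package, blocks of rows could presumably be read with `blockSize()` and `getValues()` instead of indexing a data frame.

```r
# Toy stand-ins for the real inputs (assumed shapes, made-up values)
cells <- data.frame(cell = 1:10, slope = runif(10), wind = runif(10))
treatments <- data.frame(treat = c("Ambient", "+N", "+CO2"),
                         biomass = c(10.4, 10.3, 15.5))

# Pair every cell in a block with every treatment row
expand_block <- function(block, treatments) {
  i <- rep(seq_len(nrow(block)), each = nrow(treatments))
  j <- rep(seq_len(nrow(treatments)), times = nrow(block))
  out <- cbind(block[i, , drop = FALSE], treatments[j, , drop = FALSE])
  rownames(out) <- NULL
  out
}

# Hypothetical placeholder for the model call; returns one value per row
run_model <- function(d) d$slope * d$wind * d$biomass

chunk_size <- 4   # cells per block; tune to available memory
starts <- seq(1, nrow(cells), by = chunk_size)

results <- lapply(starts, function(s) {
  idx <- s:min(s + chunk_size - 1, nrow(cells))
  d <- expand_block(cells[idx, , drop = FALSE], treatments)
  # Keep only what is needed downstream, not the expanded inputs
  data.frame(cell = d$cell, treat = d$treat, out = run_model(d))
})
results <- do.call(rbind, results)
nrow(results)
```

Peak memory then scales with `chunk_size` times the number of treatments instead of the whole raster. On the real data each block's output would probably need to be written to disk or summarised rather than collected in a list, since the combined output would still have one row per cell and treatment.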
