89_Simple

Reputation: 3805

Running a function after grouping in dplyr

I have written a function that calculates the timing of growth stages in crops (growing degree days). As background: after a crop is planted, it moves from one stage to the next once it has accumulated a certain number of heat units. For example, for a given planting day, a crop needs 300°C, 500°C and 600°C of accumulated heat units to reach stage 1, stage 2 and stage 3 respectively.

The function takes a vector of temperatures (temp.vec), a plant.date (the day from which to start accumulating heat units), a base temperature, an optimum temperature and a critical temperature.

set.seed(123)
sample.temp <- data.frame(day = 1:365, tmean = sample(25:32, 365, replace = TRUE))

gdd.func <- function(temp.vec,plant.date,t.base,t.opt,t.cri){

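 # NOTE: this compares temperature values (not day positions) with plant.date;
 # for the data here (25-32) and plant.date = 10 it keeps the whole vector
 x <- temp.vec[temp.vec > plant.date]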

 fT <- ifelse(x >= t.base & x <= t.opt,(x - t.base)/(t.opt - t.base),
           ifelse(t.opt <= x & x <= t.cri,(t.cri - x)/(t.cri - t.opt),0))

 Te <- t.base + fT*(t.opt - t.base)
 thermal.units <- Te - t.base
 # first day on which the cumulative thermal units reach 300 heat units
 day.stage1 <- which.max(cumsum(thermal.units) >= 300)

 # once growth stage 1 is reached, t.base,t.opt and t.cri are updated
 t.base <- t.base - 2
 t.opt <- t.opt - 2
 t.cri <- t.cri - 2

 idx <- (day.stage1 + 1):length(fT)  # days after stage 1 was reached
 fT[idx] <- ifelse(x[idx] >= t.base & x[idx] <= t.opt,(x[idx] - t.base)/(t.opt - t.base),
                   ifelse(t.opt <= x[idx] & x[idx] <= t.cri,(t.cri - x[idx])/(t.cri - t.opt),0))

 Te[idx] <- t.base + fT[idx]*(t.opt - t.base)

 thermal.units[idx] <- Te[idx] - t.base

 day.stage2 <- which.max(cumsum(thermal.units) >= 500)

 # once growth stage 2 is reached, t.base,t.opt and t.cri are updated again
 t.base <- t.base - 1
 t.opt <- t.opt - 1
 # NOTE: t.cri is derived from the updated t.opt here (stage 1 used t.cri - 2);
 # t.cri - 1 may have been intended
 t.cri <- t.opt - 1

 idx <- (day.stage2 + 1):length(fT)  # days after stage 2 was reached
 fT[idx] <- ifelse(x[idx] >= t.base & x[idx] <= t.opt,(x[idx] - t.base)/(t.opt - t.base),
                   ifelse(t.opt <= x[idx] & x[idx] <= t.cri,(t.cri - x[idx])/(t.cri - t.opt),0))

 Te[idx] <- t.base + fT[idx]*(t.opt - t.base)

 thermal.units[idx] <- Te[idx] - t.base

 day.stage3 <- which.max(cumsum(thermal.units) >= 600)

 list(day.stage1,day.stage2,day.stage3)
   }
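
The same piecewise temperature response is written out three times in the function. As a minimal sketch, it could be wrapped in a helper; `temp.response` is a hypothetical name that does not appear in the original code, and the values below simply reuse the thresholds from the test run (24, 32, 36).

```r
# Hypothetical helper wrapping the piecewise response used in gdd.func:
# rises linearly from 0 at t.base to 1 at t.opt, falls back to 0 at t.cri,
# and is 0 outside the [t.base, t.cri] range.
temp.response <- function(x, t.base, t.opt, t.cri) {
  ifelse(x >= t.base & x <= t.opt, (x - t.base) / (t.opt - t.base),
         ifelse(x > t.opt & x <= t.cri, (t.cri - x) / (t.cri - t.opt), 0))
}

temp.response(c(20, 28, 32, 34, 40), t.base = 24, t.opt = 32, t.cri = 36)
# 0.0 0.5 1.0 0.5 0.0
```

With such a helper, each stage would only need to recompute the response for the remaining days with the updated thresholds.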

Doing a test run

   t.base <- 24
   t.opt <- 32
   t.cri <- 36

   plant.dates <- gdd.func(temp.vec = sample.temp$tmean,plant.date = 10,t.base,t.opt,t.cri)
   unlist(plant.dates)
  # [1]  66 117 144

The output is a vector of three days giving the occurrence of stage 1, stage 2 and stage 3 for plant.date = 10.
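
One thing to keep in mind about `which.max(cumsum(thermal.units) >= 300)`: when the threshold is never reached, the logical vector is all `FALSE` and `which.max` returns 1 rather than signalling a miss. The sketch below illustrates this with made-up daily units; `first.day.reaching` is a hypothetical helper, not part of the original function.

```r
# Hypothetical helper: first position at which the running total reaches
# `threshold`, or NA when it is never reached.
first.day.reaching <- function(units, threshold) {
  hit <- which(cumsum(units) >= threshold)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

units <- c(100, 100, 100, 100)      # made-up daily thermal units

which.max(cumsum(units) >= 300)     # 3: threshold reached on day 3
which.max(cumsum(units) >= 1000)    # 1: all FALSE, so the first position is returned
first.day.reaching(units, 300)      # 3
first.day.reaching(units, 1000)     # NA
```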

My question is how to run the above function for multiple planting dates across multiple locations and years. For example, consider this data:

   sample.data <- data.frame(id1 = rep(1:20, each = 730*36),
                             year = rep(rep(1980:2015, each = 365*2), times = 20),
                             day = rep(rep(1:730, times = 36), times = 20),
                             tmean = sample(25:32, 20*730*36, replace = TRUE))

   head(sample.data)
   id1 year day tmean
1   1 1980   1    26
2   1 1980   2    32
3   1 1980   3    25
4   1 1980   4    26
5   1 1980   5    28
6   1 1980   6    28

The data consist of 20 locations, each with 36 years of data. Each year has 730 days (365*2), with the mean temperature of each day.

I have three planting dates:

     plant.vec <- c(250,290,302)

I want to take each planting day in turn and generate the three growth stages for each location × year combination:

    for(p in seq_along(plant.vec))

        plant.date <- plant.vec[p]

        sample.data %>% group_by(id1,year) %>% # how to insert my gdd.func here so that it runs for each id1 and year combination?

Thank you

Upvotes: 1

Views: 307

Answers (1)

Prem

Reputation: 11955

Does this help?

library(dplyr)

plant.vec <- c(10, 20, 30)

final_lst <- lapply(plant.vec, function(x)
  sample.data %>% 
    group_by(id1,year) %>%
    summarise(plant.dates = paste(gdd.func(temp.vec = tmean, plant.date = x,
                                           t.base, t.opt, t.cri),
                                  collapse = ",")))
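
A possible variation, as a sketch only: return one column per stage instead of a pasted string, and keep the planting date alongside each result. Note that `temp.vec[temp.vec > plant.date]` in `gdd.func` filters by temperature value, so with temperatures of 25 to 32 a planting date such as 250 leaves an empty vector; the stand-in below subsets by position instead. `stage.days`, `toy.data` and `final_df` are hypothetical names, the thermal-unit rule inside `stage.days` is a deliberately simplified assumption rather than the original logic, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical, simplified stand-in for gdd.func (not the original logic):
# daily thermal units are max(temp - t.base, 0), and each stage is the first
# day after planting on which the running total reaches its threshold.
stage.days <- function(temp.vec, plant.date, t.base, thresholds = c(300, 500, 600)) {
  x <- temp.vec[seq_along(temp.vec) > plant.date]  # subset by position (day), not by value
  cum.units <- cumsum(pmax(x - t.base, 0))
  sapply(thresholds, function(th) {
    hit <- which(cum.units >= th)
    if (length(hit) == 0) NA_integer_ else hit[1]  # NA if the stage is never reached
  })
}

# Small toy data set with the same columns as sample.data
set.seed(123)
toy.data <- data.frame(id1 = rep(1:2, each = 730 * 2),
                       year = rep(rep(1980:1981, each = 730), times = 2),
                       day = rep(1:730, times = 4),
                       tmean = sample(25:32, 730 * 4, replace = TRUE))

plant.vec <- c(250, 290, 302)

# One data frame per planting date, stacked into a single result
final_df <- bind_rows(lapply(plant.vec, function(p) {
  toy.data %>%
    arrange(id1, year, day) %>%          # make sure days are in order within each group
    group_by(id1, year) %>%
    summarise(days = list(stage.days(tmean, plant.date = p, t.base = 24)),
              .groups = "drop") %>%
    mutate(plant.date = p,
           stage1 = sapply(days, `[`, 1),
           stage2 = sapply(days, `[`, 2),
           stage3 = sapply(days, `[`, 3)) %>%
    select(id1, year, plant.date, stage1, stage2, stage3)
}))
```

In this sketch the stage columns count days from the planting date; adding `plant.date` converts them back to positions in the original series.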

Upvotes: 1
