Reputation: 55
I'm struggling with the syntax for performing operations on the elements of nested data frames. Using this example:
> library(tidyverse)
> library(lubridate)
> df1 <- tibble(P=c(101,101,102,102,103,103,101,101,102,102,103,103))
> df2 <- tibble(C=c(1,2,1,2,1,2,1,2,1,2,1,2))
> df3 <- tibble(SmpDate=as.Date(c("2019-11-01","2019-11-01","2019-11-01","2019-11-01","2019-11-01","2019-11-01","2019-11-02","2019-11-02","2019-11-02","2019-11-02","2019-11-02","2019-11-02")))
> df4 <- tibble(Fl=round(runif(12,0.1,5),2))
> df <- data.frame(df1,df2,df3,df4) #create the data.frame
> df_n <- df %>% group_by(P,C,SmpDate) %>% nest(data=c(SmpDate,Fl))
>
> glimpse(df_n)
Observations: 6
Variables: 3
Groups: P, C [6]
$ P <dbl> 101, 101, 102, 102, 103, 103
$ C <dbl> 1, 2, 1, 2, 1, 2
$ data <list<df[,2]>> 18201.00, 18202.00, 0.50, 3.11, 18201.00, 18202.00, 2.04, 0.86, 18201.00, 18202.00, 2.07, 1.59, 18201.00, 18202.00, 4.51, 2.83, 18201.0...
>
I want to perform operations on the Fl and SmpDate variables in the data list-column using lag functions and some conditional statements. I understand that I should be able to use the purrr::map functions, but I can't get the syntax right to address the individual elements. For example (I realize this will not work as written):
cp1<-function(df){
day(SmpDate)*Fl
}
cp2<-function(df){
(SmpDate-lag(SmpDate,n=1L))*Fl
}
I would then use mutate with conditions based on SmpDate to select which function to apply.
Upvotes: 0
Views: 401
Reputation: 1718
Here's an attempt at something, though your post lacks enough context for me to be sure this approach makes sense.
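df_n %>%
  mutate(
    # pmap() calls the function once per row of each nested data frame
    cp1 = data %>% map(. %>% pmap(function(SmpDate, Fl, ...) {
      lubridate::day(SmpDate)*Fl
    })),
    # note: with one row at a time, lag() has no previous value, so this yields NA
    cp2 = data %>% map(. %>% pmap(function(SmpDate, Fl, ...) {
      (SmpDate-lag(SmpDate,n=1L))*Fl
    }))
  ) %>%
  # unnest(c(cp1, cp2))
  identity()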
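If the lag is meant to run across the rows inside each nested data frame, a column-wise version may be closer to what you want. This is only a sketch under some assumptions: `add_cp()` is a hypothetical helper name, the data is nested without the `group_by()` step, the seed is fixed just for reproducibility, and the rule for choosing between `cp1` and `cp2` (first sample date versus later ones) is made up for illustration because the question does not state the real condition.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(lubridate)

set.seed(1)  # assumption: fixed seed so the random Fl values are reproducible

# Same layout as the example data in the question
df <- tibble(
  P = rep(rep(c(101, 102, 103), each = 2), times = 2),
  C = rep(c(1, 2), times = 6),
  SmpDate = as.Date(rep(c("2019-11-01", "2019-11-02"), each = 6)),
  Fl = round(runif(12, 0.1, 5), 2)
)

# One row per P/C pair, with SmpDate and Fl packed into the `data` list-column
df_n <- df %>% nest(data = c(SmpDate, Fl))

# Hypothetical helper: works on whole columns of one nested data frame,
# so lag() can see the previous row
add_cp <- function(d) {
  d %>%
    arrange(SmpDate) %>%
    mutate(
      cp1 = lubridate::day(SmpDate) * Fl,
      # convert the difftime to a plain number of days before multiplying
      cp2 = as.numeric(SmpDate - dplyr::lag(SmpDate, n = 1L), units = "days") * Fl,
      # made-up selection rule: cp1 on the first sample date, cp2 afterwards
      cp = if_else(SmpDate == min(SmpDate), cp1, cp2)
    )
}

res <- df_n %>%
  mutate(data = map(data, add_cp)) %>%
  unnest(data)
```

Here `map()` hands each nested data frame to `add_cp()` as a whole, so `cp2` is `NA` only for the first date in each P/C pair rather than for every row.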
Upvotes: 0