Andrew Leach

Reputation: 167

Using summarize (or equivalent?) to create a column of functions in an R dataframe

I'm working with electricity data that has, for each hour, day, and asset, a step function specifying the asset's offer of power at escalating prices. I'd like to collapse those data into a data frame (or tibble, etc.) with date, time, asset, and a row-specific step function. I'll then use that step function to populate other columns later on.

Here's a quick reproducible example of what I want to do.

library(dplyr)
df_test<-data.frame(rep(1:25, times=1, each=4))
names(df_test)[1]<-"asset"
df_test$block<-rep(1:4, times=25)
df_test$from<-rep(seq(0,150,50), times=25)
df_test$to<-df_test$from+50
df_test$index<-runif(100)*100

df_test<-df_test %>% group_by(asset) %>% mutate(price=cumsum(index))

This is basically an example of what I would have for each hour of each day, except that in my case the number of blocks varies by asset (some firms bid a single block, others bid up to 7 blocks, but that's likely not material to the problem here).

Now, what I would like to do is, for each asset, calculate a step function using the from, to, and price blocks and store it in a data frame by asset (again, in my extended case, it will be by date, hour, and asset).

For example, using the first group, I could do this:

generate_func<-function(x,y){
  stepfun(x, y, f = as.numeric(0), ties = "ordered",right = FALSE)
}

eg_func<-generate_func(df_test$from[2:4],df_test$price[1:4])

The function eg_func lets me find the implied price at any value x for asset 1.

eg_func(500)
[1] 43.10305
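To make the behaviour of stepfun() concrete, here is a minimal sketch with fixed, made-up numbers (the breaks and prices vectors are illustrative assumptions, not values from the data above). With right = FALSE the intervals are closed on the left, so the price changes exactly at each breakpoint.

```r
# Hypothetical offer curve: four blocks with illustrative prices.
breaks <- c(50, 100, 150)      # interior breakpoints, i.e. from[-1]
prices <- c(10, 20, 30, 40)    # one price per block, so one longer than breaks

offer <- stepfun(breaks, prices, f = 0, ties = "ordered", right = FALSE)

# Below 50 the first price applies; at 50 exactly the second block starts;
# anything past the last breakpoint gets the last price.
offer(c(0, 49.9, 50, 120, 500))
# [1] 10 10 20 30 40
```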

What I'd like to do is group my data by asset and then store a version of eg_func for each asset in a second column of a data frame or equivalent.

Basically, what I want to do is something like:

df_sum<-df_test %>% group_by(asset) %>% summarize(
  step_func=generate_func(from[-1],price)
) 

But I get:

Error: Column `step_func` is of unsupported type function
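As far as I can tell, the error comes from the fact that a data frame column has to be a vector (atomic or list), and a function on its own is not one. A small base R sketch of the distinction, using a throwaway function f that is purely illustrative:

```r
# f is a hypothetical stand-in for a step function.
f <- function(x) x + 1

is.vector(f)         # FALSE: a bare function is not a vector
is.vector(list(f))   # TRUE: a list holding a function is a vector

# A list element can be pulled out again and called as usual.
cell <- list(f)
cell[[1]](2)
# [1] 3
```

This is why a list-column is a natural container for one function per row.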

Update:

@akrun has gotten me a step down the road. So, if I wrap the function in a list, I can do what I want to do...at least the first step:

df_func<-df_test %>%
  group_by(asset) %>% 
  summarize(step_func=list(generate_func(from[-1],price)))
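One caveat if some assets offer a single block: from[-1] is then empty, and stepfun() needs at least one breakpoint, so the call fails for those groups. A possible workaround, sketched here with a hypothetical helper named generate_func_safe (not part of the original code), is to fall back to a constant function in that case.

```r
# Hypothetical variant of generate_func() that tolerates single-block offers.
generate_func_safe <- function(x, y) {
  if (length(x) == 0) {
    # One block means one price at every quantity: return a constant function.
    return(function(q) rep(y[1], length(q)))
  }
  stepfun(x, y, f = 0, ties = "ordered", right = FALSE)
}

single <- generate_func_safe(numeric(0), 12)   # one block priced at 12
multi  <- generate_func_safe(c(50), c(10, 20)) # two blocks, break at 50

single(c(0, 80))
# [1] 12 12
multi(c(0, 50))
# [1] 10 20
```

Swapping this helper into the summarize() call should leave multi-block assets unchanged.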

So now I have a data frame with a step function for each asset. Now, my next quest is to be able to evaluate that function to create a new column evaluating the step function at a particular value. So, for example, I can evaluate the first asset's bid at a value of 50:

df_func[1,2][[1]][[1]](50)
[1] 49.60776

I'd like to be able to do this in a mutate command, so something akin to:

df_func <-df_func %>% mutate(bid_50=step_func[[2]](50))

But that applies the second step function to everyone. How do I fill column bid_50 with each asset's step function evaluated at 50?

Update #2: @akrun again with the solution:

library(purrr)  # map_dbl() comes from purrr, which dplyr alone does not attach
df_func <- df_func %>% mutate(bid_50 = map_dbl(step_func, ~ .x(50)))

Upvotes: 1

Views: 117

Answers (1)

akrun

Reputation: 887851

Because generate_func returns a function, which cannot be stored directly in a column, wrap it in a list. Then loop over the list elements with map_dbl, apply each function to the argument you pass, and store the result in a new column 'bid_50'.

library(tidyverse)
df_test %>%
  group_by(asset) %>% 
  summarize(step_func=list(generate_func(from[-1],price))) %>%
  mutate(bid_50 = map_dbl(step_func, ~ .x(50)))
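The same pattern should extend to a different quantity per row by using map2_dbl() in place of map_dbl(). Below is a self-contained sketch on a small made-up data set; the bids tibble and the dispatch column are illustrative assumptions, not part of the question's data.

```r
library(dplyr)
library(purrr)

# Hypothetical data: asset 1 offers three blocks, asset 2 offers two.
bids <- tibble(
  asset = c(1, 1, 1, 2, 2),
  from  = c(0, 50, 100, 0, 50),
  price = c(10, 25, 45, 5, 30)
)

curves <- bids %>%
  group_by(asset) %>%
  # One step function per asset, stored in a list-column.
  summarize(step_func = list(stepfun(from[-1], price, f = 0, right = FALSE))) %>%
  mutate(
    bid_50          = map_dbl(step_func, ~ .x(50)),  # same quantity for every row
    dispatch        = c(75, 20),                     # hypothetical row-specific quantity
    bid_at_dispatch = map2_dbl(step_func, dispatch, ~ .x(.y))
  )

curves$bid_50
# [1] 25 30
curves$bid_at_dispatch
# [1] 25  5
```

map2_dbl() walks the list-column and the quantity column in parallel, so each function is only ever called with its own row's value.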

Upvotes: 2
