Reputation: 75
I've searched here and on Google and haven't found an answer that I can apply to my situation.
Let's say I have a dataframe with columns for Element 1, Element 2, Element 3, Metric, Other. I have another internal function that takes three arguments (input_dataframe, element_position, metric_position) and that I use to perform calculations one element at a time. It outputs a dataframe, let's say 1 row by 3 columns.
I have been trying to use either lapply or for loops to write code that lets me specify the range of columns containing the elements (in the example above, columns 1-3 of the dataframe), run the function for each of those columns against the metric column, and then combine the results into one table with one row per run of the function. I haven't had any luck making this work with variations of lapply and for loops with seq_along. Any suggestions? Sample data, code, and output below for my current inefficient solution:
#example data
element1 <- c("control", "control", "variation", "variation")
element2 <- c("control", "variation", "variation", "control")
element3 <- c("variation", "control", "variation", "variation")
metric <- c(10,15,20,25)
other <- c(2,4,2,6)
data<-data.frame(element1, element2, element3, metric, other)
#example function (ddply comes from the plyr package)
library(plyr)
test_func <- function(input_df,element_position,metric_position)
{
df <- input_df[,c(element_position,metric_position)]
colnames(df) <- c("element","metric")
mean <- ddply(df,~element,summarise,mean(metric))
control <- mean[1,2]
variation <- mean[2,2]
lift <- (variation-control)/control
df_table <<- data.frame(control,variation,lift)
}
#call function three times, once for each element, compile results
test_func(data,1,4)
element1 <- df_table
test_func(data,2,4)
element2 <- df_table
test_func(data,3,4)
element3 <- df_table
summary_output <- rbind(element1,element2,element3)
Upvotes: 2
Views: 897
Reputation: 3597
The problem is in the line df_table <<- data.frame(control,variation,lift). The operator <<- assigns in the global environment instead of the function's local environment, so each call overwrites the value left by the previous one. Replacing <<- with <- and using lapply and rbind gives the result you expected.
library(plyr)  # provides ddply
test_func_modif <- function(input_df,element_position,metric_position)
{
df <- input_df[,c(element_position,metric_position)]
colnames(df) <- c("element","metric")
mean <- ddply(df,~element,summarise,mean(metric))
control <- mean[1,2]
variation <- mean[2,2]
lift <- (variation-control)/control
df_table <- data.frame(control,variation,lift)
}
element_vec = 1:3
metric_position_value = 4
result_list = lapply(element_vec,function(x) test_func_modif(data,x,metric_position_value))
result_DF = do.call(rbind,result_list)
# > result_DF
# control variation lift
# 1 12.5 22.50000 0.8000000
# 2 17.5 17.50000 0.0000000
# 3 15.0 18.33333 0.2222222
# > all.equal(summary_output,result_DF)
# [1] TRUE
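To make the difference between the two assignment styles concrete, here is a minimal sketch. The function names set_global and return_local and the variable shared are hypothetical and used only for illustration; they are not part of the question's code.

```r
# Hypothetical helper: stores its result with <<-, i.e. outside the function
set_global <- function(x) {
  shared <<- x * 2   # overwrites the same outside variable on every call
  invisible(NULL)
}

# Hypothetical helper: simply returns its result
return_local <- function(x) {
  x * 2              # the last expression is the return value
}

set_global(1)
set_global(2)
shared                                  # 4: only the last call's value is left

results <- lapply(1:3, return_local)    # one returned value per call
unlist(results)                         # 2 4 6
```

Returning the value lets lapply collect one result per call, which is what do.call(rbind, ...) needs.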
Upvotes: 0
Reputation: 20811
I made some minor changes to your function. You should just return the object and save the result of the function call rather than using <<-.
#example data
element1 <- c("control", "control", "variation", "variation")
element2 <- c("control", "variation", "variation", "control")
element3 <- c("variation", "control", "variation", "variation")
metric <- c(10,15,20,25)
other <- c(2,4,2,6)
data<-data.frame(element1, element2, element3, metric, other)
#example function
test_func <- function(input_df,element_position,metric_position)
{
require('plyr')
df <- input_df[,c(element_position,metric_position)]
colnames(df) <- c("element","metric")
mean <- ddply(df,~element,summarise,mean(metric))
control <- mean[1,2]
variation <- mean[2,2]
lift <- (variation-control)/control
data.frame(control,variation,lift)
}
This will just map each set of parameters to test_func:
data, element_position = 1, metric_position = 4
data, element_position = 2, metric_position = 4
data, element_position = 3, metric_position = 4
etc.
do.call('rbind', Map(test_func, rep(list(data), 3), 1:3, rep(4, 3)))
# control variation lift
# 1 12.5 22.50000 0.8000000
# 2 17.5 17.50000 0.0000000
# 3 15.0 18.33333 0.2222222
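If you also want each row labelled with the column it came from, one possible variant is sketched below. It is an assumption-laden rewrite rather than the function above: lift_by_element is a hypothetical name, it swaps ddply for base R split and sapply so that plyr is not needed, and it looks up the "control" and "variation" groups by name instead of by row position.

```r
# Example data from the question
data <- data.frame(
  element1 = c("control", "control", "variation", "variation"),
  element2 = c("control", "variation", "variation", "control"),
  element3 = c("variation", "control", "variation", "variation"),
  metric   = c(10, 15, 20, 25),
  other    = c(2, 4, 2, 6)
)

# Hypothetical variant of test_func: base R only, groups looked up by name
lift_by_element <- function(input_df, element_position, metric_position) {
  # mean of the metric within each level of the chosen element column
  means <- sapply(split(input_df[[metric_position]],
                        input_df[[element_position]]), mean)
  control   <- means[["control"]]
  variation <- means[["variation"]]
  data.frame(element   = names(input_df)[element_position],
             control   = control,
             variation = variation,
             lift      = (variation - control) / control)
}

# Fixed arguments are passed by name; lapply supplies element_position
element_cols <- 1:3
result <- do.call(rbind, lapply(element_cols, lift_by_element,
                                input_df = data, metric_position = 4))
result
```

Passing input_df and metric_position by name avoids repeating them with rep(), and the lookup by name does not depend on "control" sorting before "variation".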
Upvotes: 1