Death Metal

Reputation: 892

function for data.table to perform group by actions

I have a sample data.table:

library(data.table)
sampledt <- data.table(BP = rep(c(1:3, 1:2), 2), STATUS = rep(c("CASE", "CONTROL"), each = 5), value = rep(c(0.8, 0.9, 0.10, 0.9, 0.10), 2))

The columns are BP (base pair), STATUS (either CASE or CONTROL), and value (a value per BP, stratified by STATUS). I need the mean of value grouped by BP and STATUS, which I get with the following code:

sampledt[,.("meaned_group"=mean(value)),by=.(BP,STATUS)] ## this achieves desired results 

However, I'd like to wrap this in a function, because sometimes I need the mean grouped only by BP or only by STATUS, and sometimes I want the sum instead of the mean.

join_group_datatable<-function(temp_datat,temp_namecolumn,column_value,func_join, list_groupby){

## temp_datat - input data.table
## temp_namecolumn - output column name, e.g. grouped_mean or meaned_group
## column_value - column to which the function is applied
## func_join - function to apply, e.g. mean or sum
## list_groupby - vector of grouping columns

temp_datat[,.(temp_namecolumn=func_join(column_value) , by=.(list_groupby))]

}

I set the function and run following line of code:

join_group_datatable(sampledt,"meaned_group","value",mean,c("BP","STATUS"))

This gives me error/warning:

Warning message:
In mean.default(column_value) :
  argument is not numeric or logical: returning NA

The value column of the input data.table is numeric. I cannot work out how to write a function that takes the column names and the function as arguments and returns the desired result.

Upvotes: 0

Views: 62

Answers (1)

Dave Ross

Reputation: 703

If you replace your function body with the following it should work.

temp_datat[, setNames(.(func_join(get(column_value))), temp_namecolumn), by = mget(list_groupby)]

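This works because column_value and list_groupby hold column names as strings: get() and mget() look up the columns with those names inside the data.table, and setNames() applies temp_namecolumn as the name of the output column. In your version, func_join(column_value) calls mean() on the string "value" itself rather than on the column, which is why you get the "argument is not numeric or logical" warning.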
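For reference, here is a minimal sketch of the whole function with that body in place. It makes two assumptions that are not in the original post: the sample data is rebuilt with every column written out at full length, and the grouping vector is passed straight to by (which accepts a character vector of column names) instead of going through mget(). Treat it as one possible arrangement, not the only one.

```r
library(data.table)

# Sample data rebuilt for illustration, with all columns at full length (10 rows)
sampledt <- data.table(
  BP     = rep(c(1:3, 1:2), 2),
  STATUS = rep(c("CASE", "CONTROL"), each = 5),
  value  = rep(c(0.8, 0.9, 0.10, 0.9, 0.10), 2)
)

join_group_datatable <- function(temp_datat, temp_namecolumn, column_value,
                                 func_join, list_groupby) {
  # get() resolves the column whose name is stored in column_value;
  # setNames() names the one-element result list after temp_namecolumn;
  # by is given the character vector of grouping column names directly
  # (assumption: used here in place of mget(list_groupby))
  temp_datat[, setNames(list(func_join(get(column_value))), temp_namecolumn),
             by = list_groupby]
}

# Mean of value per BP and STATUS
res_mean <- join_group_datatable(sampledt, "meaned_group", "value", mean,
                                 c("BP", "STATUS"))

# Sum of value per BP only
res_sum <- join_group_datatable(sampledt, "summed_group", "value", sum, "BP")
```

Because the grouping columns and the summary function are ordinary arguments, the same call covers grouping by BP alone, by STATUS alone, or by both.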

Upvotes: 1
