Kuo-Hsien Chang

Reputation: 935

R: how to properly pass a column name to a function as an input

Here is a small example of the dataset I wish to process:

df   = setNames(data.frame(matrix(1:100,10)), c("Dis_N1", "Dis_N2", "Dis_N3", "Dis_N4", "Dis_N5", "Dis_N6", "Dis_N7", "Dis_N8", "Dis_N9", "Dis_N10"))

FilterGap   = setNames(data.frame(matrix(1:10,1)), c("Dis_N1", "Dis_N2", "Dis_N3", "Dis_N4", "Dis_N5", "Dis_N6", "Dis_N7", "Dis_N8", "Dis_N9", "Dis_N10"))

I have a function (FrcGap, see below) that processes the df dataset based on the values in FilterGap.

The old function (not working):

FrcGap = function(Var){length(na.omit(df$Var[df$Var > FilterGap$Var])) / length(na.omit(df$Var))}

I reviewed other posts and noticed that I need to convert $ to [[ in the function, so I modified the old function into the new one.

The new function (not working):

FrcGap = function(Var){length( na.omit( df[[Var[df$Var > FilterGap$Var]]] ) ) / length( na.omit( df[[Var]] ) )}

I also realized that the new function is not easy to understand, and it also produces errors.

The errors:

> FrcGap("Dis_N1")
 Error in .subset2(x, i, exact = exact) : no such index at level 1 

Manual procedure (it works): If I manually insert each column name into the expression one by one, it works.

length(na.omit(df$Dis_N1[df$Dis_N1 > FilterGap$Dis_N1])) / length(na.omit(df$Dis_N1))
length(na.omit(df$Dis_N2[df$Dis_N2 > FilterGap$Dis_N2])) / length(na.omit(df$Dis_N2))
length(na.omit(df$Dis_N10[df$Dis_N10 > FilterGap$Dis_N10])) / length(na.omit(df$Dis_N10))

Could you please provide your insights, comments, and suggestions for this type of work in R?

Thanks a lot.

Upvotes: 1

Views: 77

Answers (1)

flee

Reputation: 1335

OK, thanks for adding the example data. I can get the "old" function working if it takes the two vectors as arguments instead of a column name:

FrcGap = function(var1, var2){
  length(na.omit(var1[var1 > var2])) / length(na.omit(var1)) 
}

If you want to run it on a single set of values you can do this:

FrcGap(df$Dis_N1, FilterGap$Dis_N1)

[1] 0.9

Or, if you want to run it over both data frames in their entirety, you can use mapply:

mapply(FrcGap, df, FilterGap)

Dis_N1  Dis_N2  Dis_N3  Dis_N4  Dis_N5  Dis_N6  Dis_N7  Dis_N8  Dis_N9 Dis_N10 
    0.9     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0 
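
If you would rather keep passing the column name as a string, as in the question, something along these lines might work. It is only a sketch: FrcGapByName and cols are names made up for this illustration, and it assumes df and FilterGap share the same column names. The original attempts fail because `$` does not evaluate its right-hand side, so df$Var looks for a column literally called "Var" and returns NULL, while df[[Var]] uses the string stored in Var.

```r
# Example data from the question (cols is a helper name introduced here)
cols      <- paste0("Dis_N", 1:10)
df        <- setNames(data.frame(matrix(1:100, 10)), cols)
FilterGap <- setNames(data.frame(matrix(1:10, 1)), cols)

# `$` takes the name literally; `[[` evaluates the variable
Var <- "Dis_N1"
is.null(df$Var)   # TRUE: there is no column named "Var"
df[[Var]]         # the Dis_N1 column

# Hypothetical name-based variant of FrcGap (not from the original post)
FrcGapByName <- function(Var) {
  x         <- df[[Var]]         # column selected by its name
  threshold <- FilterGap[[Var]]  # matching threshold for that column
  length(na.omit(x[x > threshold])) / length(na.omit(x))
}

# Single column
FrcGapByName("Dis_N1")

# All columns, returned as a named numeric vector
res <- sapply(cols, FrcGapByName)
res
```

On the example data this should give the same values as the mapply call above. The mapply version is probably still the cleaner choice, since it does not depend on df and FilterGap existing as global variables.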

Upvotes: 1
