How to use variable names as arguments

Question

For a homework assignment, I wrote a function that performs forward stepwise regression. It takes three arguments: the dependent variable, a list of potential independent variables, and the data frame in which these variables are found. Currently, all of my inputs except the data frame, including the list of independent variables, are strings.

Many built-in functions, as well as functions from high-profile packages, accept variable inputs that are not strings. Which approach is best practice, and why? If non-string input is best practice, how can I implement it, given that one of the arguments is a list of variables in the data frame rather than a single variable?

mrip · Accepted Answer

Personally, I don't see any problem with using strings if that accomplishes what you need. If you want, you could rewrite your function to take a formula as input, rather than strings, to designate the independent and dependent variables. In that case, your function calls would look like this:

fitmodel(x ~ y + z,data)

rather than this:

fitmodel("x",list("y","z"),data)
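The two call styles are easy to bridge, so existing string-based code does not have to be thrown away. As a minimal sketch (the variable names `response`, `predictors`, and `f` are only illustrative), `reformulate()` from the stats package builds a formula from strings, and `all.vars()` recovers the variable names from a formula:

```r
# Build a formula from string inputs.
response   <- "x"
predictors <- c("y", "z")
f <- reformulate(predictors, response = response)
print(f)       # x ~ y + z

# Going the other way: recover the variable names from a formula.
all.vars(f)    # "x" "y" "z"
```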

Using formulas would allow you to specify simple algebraic combinations of variables to use in your regression, like x ~ y + log(z). If you go this route, then you can build the data frame specified by the formula by calling model.frame and then use this new data frame to run your algorithm. For example:

> df<-data.frame(x=1:10,y=10:1,z=sqrt(1:10))
> model.frame(x ~ y + z,df)
    x  y        z
1   1 10 1.000000
2   2  9 1.414214
3   3  8 1.732051
4   4  7 2.000000
5   5  6 2.236068
6   6  5 2.449490
7   7  4 2.645751
8   8  3 2.828427
9   9  2 3.000000
10 10  1 3.162278
> model.frame(x ~ y + z + I(x^2) + log(z) + I(x*y),df)
    x  y        z I(x^2)    log(z) I(x * y)
1   1 10 1.000000      1 0.0000000       10
2   2  9 1.414214      4 0.3465736       18
3   3  8 1.732051      9 0.5493061       24
4   4  7 2.000000     16 0.6931472       28
5   5  6 2.236068     25 0.8047190       30
6   6  5 2.449490     36 0.8958797       30
7   7  4 2.645751     49 0.9729551       28
8   8  3 2.828427     64 1.0397208       24
9   9  2 3.000000     81 1.0986123       18
10 10  1 3.162278    100 1.1512925       10
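Putting this together, a formula-based front end might look roughly like the sketch below. This is only an outline under assumptions: `fitmodel` here is a hypothetical wrapper, not your actual function, and the selection step itself is left as a comment, since it depends on the criterion your assignment uses. It relies on `model.frame` placing the response in the first column.

```r
# Hypothetical wrapper: accept a formula, then hand plain columns
# to the existing forward stepwise code.
fitmodel <- function(formula, data) {
  mf <- model.frame(formula, data)   # evaluates terms such as log(z)
  y  <- model.response(mf)           # the dependent variable
  X  <- mf[, -1, drop = FALSE]       # candidate predictors, one per column
  # ... forward stepwise selection over the columns of X goes here ...
  list(response = names(mf)[1], candidates = names(X), y = y, X = X)
}

df <- data.frame(x = 1:10, y = 10:1, z = sqrt(1:10))
parts <- fitmodel(x ~ y + log(z), df)
parts$response     # "x"
parts$candidates   # "y" "log(z)"
```

Because the transformed terms are already evaluated as columns of the model frame, the selection loop can treat `log(z)` like any other candidate variable.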
