pradeepvaranasi

Reputation: 171

How can I extract a function argument to use inside the function in R?

I'm trying to write a function that fits a model and predicts the target variable for any given data.frame (e.g. mtcars).

# Function to create a model for predicting a target variable
myRegModel = function(myFormula, myData){
  sampleIndex = sample(1:nrow(myData), size = 0.7*nrow(myData), replace = FALSE)
  myTraining = myData[sampleIndex, ]
  myTesting = myData[-sampleIndex, ]
  myDataFit = lm(myFormula, data = myTraining)
  myTesting$predVar <- predict(myDataFit, myTesting)
  myTesting$predErr <- abs(((myTesting$mpg - myTesting$predVar) / myTesting$mpg) * 100)
  print(cor(myTesting$mpg, myTesting$predVar))
  print(mean(myTesting$predErr))
  print(summary(myDataFit))
}

myRegModel(mpg ~ ., myMtCars)

However, I've hard-coded my target variable (mpg) when computing the prediction error and the correlation above. Since I'm already passing the target variable to the function as part of the first argument, is there a way to extract it and refer to it dynamically in the myTesting data.frame (e.g. myTesting$target)?

Upvotes: 1

Views: 42

Answers (2)

akrun

Reputation: 887158

Just to extend @RuiBarradas's approach: we can extract the name of the response variable directly from the formula using all.vars, and then use [[ as @RuiBarradas suggested.

myRegModel <- function(myFormula, myData){
    # First variable in a two-sided formula is the response, e.g. "mpg"
    nm1 <- all.vars(myFormula)[1]
    # Random 70/30 split into training and testing rows
    sampleIndex <- sample(seq_len(nrow(myData)), size = 0.7*nrow(myData), replace = FALSE)
    myTraining <- myData[sampleIndex, ]
    myTesting <- myData[-sampleIndex, ]
    myDataFit <- lm(myFormula, data = myTraining)
    myTesting$predVar <- predict(myDataFit, myTesting)
    # Absolute percentage error, using the response column looked up by name
    myTesting$predErr <- abs(((myTesting[[nm1]] -
               myTesting$predVar) / myTesting[[nm1]]) * 100)
    myTesting
}

myMtCars <- mtcars
myRegModel(mpg ~ ., myMtCars)
#                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb  predVar   predErr
#Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 26.43998 15.964845
#Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 20.84027  2.615556
#Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 20.30464 12.180316
#Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 18.10403  5.708192
#Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4 11.22245  7.908153
#Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1 27.88747 13.927557
#Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1 25.47992 18.511254
#Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2 16.11037 16.091819
#Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2 25.64254 15.649525
#Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8 11.47808 23.479490
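
To see what all.vars returns, here is a small sketch (the formula f is only an illustration, not part of the function above). It assumes a two-sided formula whose left-hand side is a single, untransformed column:

```r
# Hypothetical formula, used only to show what all.vars() returns
f <- mpg ~ wt + hp

all.vars(f)
# [1] "mpg" "wt"  "hp"

# The first element is the response, which is what nm1 holds above
all.vars(f)[1]
# [1] "mpg"

# With `.` on the right-hand side the response still comes first
all.vars(mpg ~ .)
# [1] "mpg" "."

# Caveat: a transformed response yields the raw column name
all.vars(log(mpg) ~ wt)[1]
# [1] "mpg"
```

Note the last case: with log(mpg) ~ wt, the predictions are on the log scale while myTesting[["mpg"]] is not, so the error calculation would need to be adjusted accordingly.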

Upvotes: 2

Rui Barradas

Reputation: 76432

Yes, there is a way of doing what you want. You just have to use a different notation for the columns of a data.frame. Generally speaking, in interactive mode it's fine to use dat$col. But when you write a function it's much better to use dat[[col]], where col is a character string holding the column name. Both return exactly the same vector, but the latter is far more flexible because the column name can be stored in a variable.

So, in your case this would become myTesting[[target]], where target holds the name of the response column (for example "mpg").
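
A minimal sketch of the difference, using the built-in mtcars data (the variable name target is just an illustration):

```r
# 'target' is a hypothetical variable holding the column name as a string
target <- "mpg"

# $ looks for a column literally named "target", which mtcars does not have
is.null(mtcars$target)
# [1] TRUE

# [[ evaluates 'target' first and then looks up the column "mpg"
identical(mtcars[[target]], mtcars$mpg)
# [1] TRUE
```

This is why $ cannot be used with a column name that is only known at run time, while [[ can.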

Upvotes: 1
