SAS_converted_to_R

Reputation: 17

Regression in R using a function

I am trying to smooth the data for each variable in a data frame. Let's say it looks like this:

data <- data.frame(v1 = c(0.5,1.1,2.9,3.4,4.1,5.7,6.3,7.4,6.9,8.5,9.1),
                   v2 = c(0.1,0.8,0.5,1.1,1.9,2.4,0.8,3.4,2.9,3.1,4.2),
                   v3 = c(1.3,2.1,0.8,4.1,5.9,8.1,4.3,9.1,9.2,8.4,7.4))

data$x <- 1:nrow(data)

I then specify my x and y variables as:

x <- data$x
y <- data$v1

I can fit the predicted line I want (and I am happy with the process):

f <- function (x,a,b,d) {(a*x^2) + (b*x) + d}
order_two <- nls(y ~ f(x,a,b,d), start = c(a=1, b=1, d=1)) 
co2 <- coef(order_two)
data$order_two_predicted_v1 <- (co2[1] * (data$x)^2) + (co2[2] * data$x) + co2[3] 

I therefore end up with an appropriately titled new variable (the predicted values for v1). I now want to do this for each of the other 100 variables in my data frame (v2 and v3 in this example).

I tried using a function to do this but can't get it to work as intended. Here is my attempt:

myfunction <- function(xaxis,yaxis){
  # Specify my "y" and "x"
  x <- data$xaxis
  y <- data$yaxis

  f <- function (x,a,b,d) {(a*x^2) + (b*x) + d}
  order_two <- nls(y ~ f(x,a,b,d), start = c(a=1, b=1, d=1)) 
  co2 <- coef(order_two)
  data$order_two_predicted_yaxis <- (co2[1] * (data$x)^2) + (co2[2] * data$x) + co2[3]
}

myfunction(x,v1)
myfunction(x,v2)
myfunction(x,v3)

The function does not work as intended. I would also like to avoid calling it separately for each of the 100 variables and instead loop over them.

This is really simple to do in SAS using macros but I am struggling to get this to work in R.

Upvotes: 1

Views: 90

Answers (1)

PascalIv

Reputation: 615

You can model your data directly with the lm() function:

data <- data.frame(v1 = c(0.5,1.1,2.9,3.4,4.1,5.7,6.3,7.4,6.9,8.5,9.1),
                   v2 = c(0.1,0.8,0.5,1.1,1.9,2.4,0.8,3.4,2.9,3.1,4.2),
                   v3 = c(1.3,2.1,0.8,4.1,5.9,8.1,4.3,9.1,9.2,8.4,7.4))

x  <- 1:nrow(data)

# initialize a list to store the models 

models = vector("list", length = (ncol(data)))

# create a loop running over the columns of data

for (i in 1:(ncol(data))){
  models[[i]] = lm(data[,i] ~ poly(x,2, raw = TRUE))
}

You can also use lapply instead of the for-loop, as stated in the comments.
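A minimal sketch of the lapply variant, assuming the same data and x as above. Calling lapply on a data frame iterates over its columns, so the result is a list of models named after the columns:

```r
data <- data.frame(v1 = c(0.5,1.1,2.9,3.4,4.1,5.7,6.3,7.4,6.9,8.5,9.1),
                   v2 = c(0.1,0.8,0.5,1.1,1.9,2.4,0.8,3.4,2.9,3.1,4.2),
                   v3 = c(1.3,2.1,0.8,4.1,5.9,8.1,4.3,9.1,9.2,8.4,7.4))

x <- seq_len(nrow(data))

# fit one quadratic model per column; y is the current column
models <- lapply(data, function(y) lm(y ~ poly(x, 2, raw = TRUE)))

# the list keeps the column names, so a model can be looked up by name
smoothed_v1 <- predict(models$v1, newdata = data.frame(x = x))
```

The named list makes it harder to mix up indices once there are 100 columns.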

Use predict() to get the values of the models:

smoothed_v1 = predict(models[[1]], newdata=data.frame(x = x))

Edit: Regarding your comment - you can store the new values in data with:

for (i in (length(models):1)){
  data <- cbind(predict(models[[i]], newdata=data.frame(x = x)), data)

  # set the name for the new column
  names(data)[1] = paste("pred_v",i, sep ="")
}
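If you would rather keep the function-based approach from the question, here is a rough sketch of how it could be made to work. The helper name add_quadratic_fit is made up for illustration, and lm() is swapped in for nls(). The two things that likely broke the original attempt are that data$xaxis looks for a column literally named "xaxis", and that changes made to data inside a function are local unless the result is returned and assigned:

```r
data <- data.frame(v1 = c(0.5,1.1,2.9,3.4,4.1,5.7,6.3,7.4,6.9,8.5,9.1),
                   v2 = c(0.1,0.8,0.5,1.1,1.9,2.4,0.8,3.4,2.9,3.1,4.2),
                   v3 = c(1.3,2.1,0.8,4.1,5.9,8.1,4.3,9.1,9.2,8.4,7.4))
data$x <- seq_len(nrow(data))

# hypothetical helper: takes column names as strings
add_quadratic_fit <- function(df, yname, xname = "x") {
  # df[[name]] uses the string stored in `name`; df$name would not
  fit <- lm(df[[yname]] ~ poly(df[[xname]], 2, raw = TRUE))
  df[[paste0("order_two_predicted_", yname)]] <- fitted(fit)
  df  # return the modified copy so the caller can keep it
}

# loop over every column except x and reassign the result each time
for (v in setdiff(names(data), "x")) {
  data <- add_quadratic_fit(data, v)
}
```

This appends the predicted columns at the end of data with the naming scheme used in the question.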

Upvotes: 2
