Reputation: 11
I'm trying to test the predictive power of a model. I split the observations into one-quarter and three-quarter groups (test and train, respectively), fit a first-order regression on the training sample, and use the resulting coefficients to compute predicted values from the independent variable in the test sample. On each iteration of the loop, I would like to add these predicted values as a new column alongside the dependent variable test data.
For context: TSIP500 is the full sample, iv is the independent variable, and dv is the dependent variable. The maximum of 50 iterations is simply a test value that keeps the number of iterations manageable.
I was having trouble with the predict function so I did the equation manually. My code is below:
for(i in 1:50){
test_index <- sample(nrow(TSIP500iv), (1/4)*nrow(TSIP500iv), replace=FALSE)
train_500iv <- TSIP500[-test_index,"distance"]
test_500iv <- TSIP500[test_index,"distance"]
train_500dv <- TSIP500[-test_index,"percent_of_max"]
test_500dv <- TSIP500[test_index,"percent_of_max"]
reg_model <- lm(train_500dv~train_500iv)
int <- reg_model$coeff[1]
B1 <- reg_model$coeff[2]
predicted <- (int + B1*test_500iv)
predicted <- data.frame(predicted)
test_500dv <- data.frame(test_500dv)
test_500dv[,i] <- apply(predicted)
}
I've tried different approaches for the last line, but I always end up with just a single column added. Any help would be tremendously appreciated.
Upvotes: 1
Views: 53
Reputation: 341
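# Collect one vector of predictions per iteration, keyed by column name.
pred_list <- list()
for(i in 1:50){
  test_index <- sample(nrow(TSIP500iv), (1/4)*nrow(TSIP500iv), replace=FALSE)
  train_500iv <- TSIP500[-test_index,"distance"]
  test_500iv <- TSIP500[test_index,"distance"]
  train_500dv <- TSIP500[-test_index,"percent_of_max"]
  test_500dv <- TSIP500[test_index,"percent_of_max"]
  reg_model <- lm(train_500dv~train_500iv)
  int <- coef(reg_model)[1]
  B1 <- coef(reg_model)[2]
  # Build the column name pred_1, pred_2, ... and store the predictions under it.
  # test_500dv is rebuilt on every pass, so the columns have to accumulate in an
  # object that outlives the iteration.
  temp_results <- paste('pred',i,sep='_')
  pred_list[[temp_results]] <- int + B1*test_500iv
}
# One column per iteration; each column comes from a different random test sample.
all_preds <- as.data.frame(pred_list)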
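Since the question mentions trouble with the predict function, here is a minimal sketch of the same loop using predict() instead of the manual equation. It assumes the difficulty came from fitting on bare vectors: when the model is fitted with a formula and a data argument, predict() can look up the predictor by name in newdata. The toy data frame, the seed, n_iter, n_test, and all_preds are made up for illustration; only the column names distance and percent_of_max come from the question.

```r
set.seed(1)  # only so the sketch is reproducible

# Hypothetical stand-in for TSIP500; the real data is not shown in the question.
TSIP500 <- data.frame(distance = runif(40, 0, 100))
TSIP500$percent_of_max <- 90 - 0.5 * TSIP500$distance + rnorm(40, sd = 5)

n_iter <- 5                          # the question uses 50
n_test <- floor(nrow(TSIP500) / 4)   # one quarter of the rows held out for testing
pred_list <- vector("list", n_iter)

for (i in seq_len(n_iter)) {
  test_index <- sample(nrow(TSIP500), n_test)
  train <- TSIP500[-test_index, ]
  test  <- TSIP500[test_index, ]

  # Fitting with data = train keeps the predictor's name as "distance",
  # so predict() can find that column in newdata.
  fit <- lm(percent_of_max ~ distance, data = train)
  pred_list[[i]] <- unname(predict(fit, newdata = test))
}

# Name the columns pred_1, pred_2, ... and bind them into one data frame.
names(pred_list) <- paste("pred", seq_len(n_iter), sep = "_")
all_preds <- as.data.frame(pred_list)
```

Each column of all_preds comes from a different random test sample, so the rows do not refer to the same observations across columns. If the observed values are needed for comparison, test$percent_of_max could be stored per iteration in the same way.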
Upvotes: 0