user1403848

Reputation: 103

How to perform least squares regression in R given training and testing data with class labels?

I have a 63 x 62 training set that includes the class labels. The test data is 25 x 62 and includes the class labels too. Given this, how would I perform least squares regression? I am using the code:

res = lm(height~age)

What do height and age correspond to? When I have 61 features + 1 class label (making 62 columns in the training data), how would I input the parameters?

Also, how do I apply the model to the testing data?

Upvotes: 1

Views: 1213

Answers (2)

Wilmer E. Henao

Reputation: 4302

If you have 62 columns, you may want to use the more general formula:

res = lm(height ~ . , data = mydata)

Notice how the period '.' represents all the remaining columns of mydata, so they do not have to be typed out one by one. But the other answer is completely right that there are about as many variables as observations (61 features plus an intercept against 63 rows), so the fit, if one can be computed at all, is essentially useless for prediction.
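As a rough illustration of the whole workflow (fit on the training set, then apply to the test set), here is a minimal sketch. The data frames train and test, the column names f1, f2, f3 and label, and the helper make_data are all made up for the example, and only 3 features are used instead of 61 to keep it short. It also assumes the label column is numeric; predict() with newdata is the standard way to apply a fitted lm to new rows.

```r
set.seed(1)

# Hypothetical helper: builds a data frame with 3 features and a numeric label.
make_data <- function(n) {
  x <- as.data.frame(matrix(rnorm(n * 3), ncol = 3))
  names(x) <- c("f1", "f2", "f3")
  x$label <- 2 * x$f1 - x$f2 + rnorm(n, sd = 0.1)
  x
}

train <- make_data(63)  # same row counts as in the question
test  <- make_data(25)

# '.' means "every column of train except the response".
fit <- lm(label ~ ., data = train)

# Apply the fitted model to the test rows; column names must match train.
pred <- predict(fit, newdata = test)

# Compare predictions with the known test labels (root mean squared error).
rmse <- sqrt(mean((test$label - pred)^2))
print(rmse)
```

The test set needs the same feature column names as the training set; its own label column is ignored by predict() and is only used afterwards to measure the error.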

Upvotes: 2

sashkello

Reputation: 17871

height and age are simply the names of columns in your data frame. height is the response (the variable being predicted). You can put as many predictors on the right-hand side as you wish: res = lm(height ~ age + weight + gender)

However, I must say that the question seems a bit strange to me: if you fit a regression with roughly as many variables as there are points in the training set (here 61 features plus an intercept against 63 rows), you will get an exact or almost exact fit to the training data, which tells you nothing about new data. The training set should always be (significantly) larger than the number of variables used.
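The point about exact solutions can be checked with a small sketch. Everything here is synthetic and the sizes are chosen for illustration only: 10 observations, 9 random predictors plus an intercept, and a response that is pure noise. Even though the predictors carry no information about y, the fit reproduces the training values up to floating-point error.

```r
set.seed(42)
n <- 10

# 9 random predictors + intercept = 10 coefficients for 10 observations.
d <- as.data.frame(matrix(rnorm(n * (n - 1)), nrow = n))
d$y <- rnorm(n)  # pure noise, unrelated to the predictors

fit <- lm(y ~ ., data = d)

# No residual degrees of freedom are left, so the residuals are
# zero up to rounding error.
df_left   <- df.residual(fit)
max_resid <- max(abs(residuals(fit)))
print(df_left)
print(max_resid)
```

A perfect fit to noise like this is why such a model cannot be expected to predict the test labels well.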

Upvotes: 1
