Reputation: 103
I have a 63*62 training set that includes the class labels. The test data is 25*62 and includes the class labels too. Given this, how would I perform least squares regression? I am using the code:
res = lm(height~age)
What do height and age correspond to? When I have 61 features + 1 class (making 62 columns in the training data), how would I specify the inputs?
Also, how do I apply the model to the test data?
Upvotes: 1
Views: 1213
Reputation: 4302
If you have 62 columns, you may want to use the more general formula:
res = lm(height ~ . , data = mydata)
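Notice how the period '.' represents the rest of the variables, that is, every column in the data frame other than the response. But the other answer is completely right in the sense that there are almost as many variables as observations (61 predictors plus an intercept against 63 rows), so the fit on the training data will be essentially exact and the resulting model, if it can be estimated at all, is of little use for prediction.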
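For the second part of the question (applying the model to the test data), predict() with the newdata argument is the usual route. Here is a minimal sketch on simulated data. The names train, test and label are assumptions for illustration, not taken from the question, and the random numbers only stand in for the real 63*62 and 25*62 tables.

```r
set.seed(1)

# Simulated stand-ins for the real data: 61 feature columns plus a response.
# `train`, `test` and `label` are hypothetical names.
make_data <- function(n, p = 61) {
  x <- as.data.frame(matrix(rnorm(n * p), nrow = n))  # columns V1..V61
  x$label <- rnorm(n)
  x
}
train <- make_data(63)
test  <- make_data(25)

# `.` expands to every column of `train` other than `label`
fit <- lm(label ~ ., data = train)

# Predict on the test set; its columns must have the same names as in `train`
pred <- predict(fit, newdata = test)

# Root mean squared error on the held-out rows
rmse <- sqrt(mean((test$label - pred)^2))
```

The test data frame needs the same column names as the training one, because predict() matches predictors by name rather than by position.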
Upvotes: 2
Reputation: 17871
height and age would simply be the names of columns in your data frame. height is the predicted (response) variable and age is a predictor. You can have as many predictors on the right-hand side as you wish:
res = lm(height ~ age + weight + gender)
However, I must say that the question seems a bit strange to me: if you perform a regression with about as many variables as there are points in the training set (here 61 predictors plus an intercept against 63 rows), you will get an exact or nearly exact fit to the training data whether or not there is any real relationship. The training set should always be (significantly) larger than the number of variables used.
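The exact-fit effect is easy to reproduce with pure noise. In this sketch (simulated data, all names hypothetical) there are 62 rows and 61 predictors, so together with the intercept lm() estimates as many coefficients as there are observations. The residuals come out as zero up to floating-point error even though the response is unrelated to the predictors.

```r
set.seed(42)

n <- 62   # observations
p <- 61   # predictors; with the intercept that makes 62 coefficients

# Predictors and response are independent random noise
d <- as.data.frame(matrix(rnorm(n * p), nrow = n))
d$y <- rnorm(n)

fit <- lm(y ~ ., data = d)

# No residual degrees of freedom remain, so the training fit is exact
df_left   <- df.residual(fit)
max_resid <- max(abs(residuals(fit)))
```

A perfect fit on the training rows therefore says nothing about how the model will behave on the 25 test rows.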
Upvotes: 1