Reputation: 23
I know this question has been asked before. I have checked other sites but didn't find an answer that works.
I am doing a school project, following a sample manual that uses a medical records data set.
I began by splitting the data set into two parts, train and test. There are 569 records and 31 variables:
wbcd_train : 469 obs. 31 var.
wbcd_test : 100 obs. 31 var.
wbcd_train_labels : 469 obs. 1 var.
Here are sample entries from wbcd_train, wbcd_test, and wbcd_train_labels.
wbcd_train :
Radius 17.99 Texture 10.38 Perimeter 122.8 Area 0.1184
wbcd_test :
Radius 17.6 Texture 23.33 Perimeter 119 Area 980.5
wbcd_train_labels :
Diagnosis M
When I call knn():
wbcd_test_pred <- knn(train = wbcd_train, test = wbcd_test, cl = wbcd_train_labels, k = 21)
I get the following error:
knn(train = wbcd_train, test = wbcd_test, cl = wbcd_train_labels, : 'train' and 'class' have different lengths
Upvotes: 1
Views: 4168
Reputation: 101
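The cl argument should be provided as a vector (or factor), not as a data frame. knn() compares length(cl) with the number of rows in train, and the length of a data frame is its number of columns (1 here), not its 469 rows, which is why you get the "different lengths" error. Referring to the Diagnosis column of your wbcd_train_labels data frame should work: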
wbcd_test_pred <- knn(train = wbcd_train, test = wbcd_test,
cl = wbcd_train_labels$Diagnosis,...)
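To see why this happens, here is a minimal sketch on made-up data. The toy_* objects and their values are hypothetical stand-ins, not your wbcd records, and I am assuming knn() comes from the class package as in your call. It shows the length mismatch, reproduces the error, and then runs the corrected call:

```r
library(class)  # provides knn()

# Hypothetical toy data standing in for wbcd_train / wbcd_test
toy_train <- data.frame(
  Radius  = c(11.0, 11.5, 12.0, 12.5, 13.0, 17.0, 17.5, 18.0, 18.5, 19.0),
  Texture = c(14, 15, 16, 15, 14, 21, 22, 23, 22, 21)
)
toy_test <- data.frame(Radius = c(12, 18), Texture = c(15, 22))

# Labels stored as a one-column data frame, like wbcd_train_labels
toy_train_labels <- data.frame(
  Diagnosis = factor(rep(c("B", "M"), each = 5))
)

# The length of a data frame is its number of columns, not rows
length(toy_train_labels)            # 1
length(toy_train_labels$Diagnosis)  # 10, same as nrow(toy_train)

# Passing the whole data frame reproduces the error; capture its message
res <- tryCatch(
  knn(train = toy_train, test = toy_test, cl = toy_train_labels, k = 3),
  error = function(e) conditionMessage(e)
)
print(res)

# Passing the column as a vector works
toy_pred <- knn(train = toy_train, test = toy_test,
                cl = toy_train_labels$Diagnosis, k = 3)
print(toy_pred)
```

If your labels object has only one column, wbcd_train_labels[[1]] or wbcd_train_labels[, 1] would give the same vector as wbcd_train_labels$Diagnosis.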
Upvotes: 1