Pranav Khanna

Reputation: 23

Why do we get an error saying 'train' and 'class' have different lengths when using the knn function in R?

I know this question has been asked before. I have checked other portals but did not find a correct answer.

I am working on a school project, following a sample manual that uses medical records.

I began by splitting the data set into two parts, train and test. There are 569 records and 31 variables:

wbcd_train: 469 obs., 31 variables

wbcd_test: 100 obs., 31 variables

wbcd_train_labels: 469 obs., 1 variable

Here are sample entries for wbcd_train, wbcd_test, and wbcd_train_labels.

wbcd_train:

Radius 17.99 Texture 10.38 Perimeter 122.8 Area 0.1184

wbcd_test:

Radius 17.6 Texture 23.33 Perimeter 119 Area 980.5

wbcd_train_labels:

Diagnosis M

When I call the function:

wbcd_test_pred <- knn(train = wbcd_train, test = wbcd_test, cl = wbcd_train_labels, k = 21)

I get the following error:

knn(train = wbcd_train, test = wbcd_test, cl = wbcd_train_labels,  : 
  'train' and 'class' have different lengths

Upvotes: 1

Views: 4168

Answers (1)

Oleksii Pasichnyi

Reputation: 101

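The cl argument must be a vector (or factor), not a data frame. knn() compares length(cl) with the number of rows in train, and the length of a data frame is its number of columns (here 1), not its number of rows (469), hence the error. Referring to the Diagnosis column of your wbcd_train_labels data frame should work: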

wbcd_test_pred <- knn(train = wbcd_train, test = wbcd_test,
cl = wbcd_train_labels$Diagnosis,...)
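
To see why this fixes it, here is a small sketch that reproduces the error and then the working call. The data is made up for illustration (random numbers, 20 training rows, 5 test rows, k = 3) and is not your wbcd data; only the Diagnosis column name is taken from the question. It assumes the class package, which provides knn(), is installed.

```r
library(class)  # provides knn()

set.seed(1)
# Made-up stand-in data, NOT the asker's records
train <- data.frame(radius = rnorm(20), texture = rnorm(20))
test  <- data.frame(radius = rnorm(5),  texture = rnorm(5))
train_labels <- data.frame(Diagnosis = factor(rep(c("B", "M"), times = 10)))

# length() of a data frame counts its columns, not its rows
length(train_labels)            # 1
length(train_labels$Diagnosis)  # 20, same as nrow(train)

# Passing the whole data frame as cl triggers the length check in knn()
err <- tryCatch(
  knn(train = train, test = test, cl = train_labels, k = 3),
  error = function(e) conditionMessage(e)
)
print(err)

# Passing the column (a factor vector) works: one prediction per test row
pred <- knn(train = train, test = test, cl = train_labels$Diagnosis, k = 3)
length(pred)  # 5
```

train_labels[[1]] or train_labels[, 1] would extract the same vector if you prefer not to refer to the column by name.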

Upvotes: 1
