Nicolas Duaut

Reputation: 27

Variable lengths differ with random forest

I'm really new to R and I want to build a random forest. However, I keep getting the same error:

Error in model.frame.default, lengths of variables differ.

I know this issue was solved in another topic by constructing a formula from strings with as.formula, but I really have no idea how to do it. Can you help me, please? Thank you.

#Randomly assign each row to group 1 (training, ~70%) or group 2 (testing, ~30%)
index = sample(2,nrow(df), replace = TRUE, prob=c(0.7,0.3)) 

#Training data
training = df[index==1,]

#Testing data
testing = df[index==2,]

#Random forest model 
RFM = randomForest(df$Rating~., df$Customer_type, data = training)

Upvotes: 1

Views: 817

Answers (1)

Kylian

Reputation: 329

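The error comes from mixing two data frames. Your response (dependent) variable is df$Rating, which has nrow(df) values, but data = training makes the predictors come from training, which holds only about 70% of those rows, so the variable lengths differ. df$Customer_type is also passed as a separate argument rather than as part of the formula. Refer to the columns by name only and let data = training supply them; I'd expect randomForest(Rating ~ Customer_type, data = training) to work. If you want every other column as a predictor, use Rating ~ . instead.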
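To make this concrete, here is a minimal sketch rather than a tested solution for your data. The data frame below is made up: only the column names Rating and Customer_type come from the question, and Quantity is a hypothetical extra column. It reproduces the length mismatch with base R's model.frame, builds a formula from strings with as.formula as the other topic suggested, and fits the model only if the randomForest package is installed.

```r
# Hypothetical stand-in for the asker's df; Quantity is an invented column.
set.seed(42)
df <- data.frame(
  Rating        = runif(200, min = 4, max = 10),
  Customer_type = factor(sample(c("Member", "Normal"), 200, replace = TRUE)),
  Quantity      = sample(1:10, 200, replace = TRUE)
)

# Same 70/30 split as in the question.
index    <- sample(2, nrow(df), replace = TRUE, prob = c(0.7, 0.3))
training <- df[index == 1, ]
testing  <- df[index == 2, ]

# Reproduce the problem: df$Rating has nrow(df) values, while the
# predictors are taken from training, which has fewer rows.
err <- tryCatch(
  model.frame(df$Rating ~ ., data = training),
  error = function(e) conditionMessage(e)
)
print(err)

# Build the formula from strings; column names only, no df$ prefix.
predictors <- c("Customer_type", "Quantity")
f <- as.formula(paste("Rating ~", paste(predictors, collapse = " + ")))

# With bare column names, every variable is looked up in training.
mf <- model.frame(f, data = training)

# Fit and predict only when the randomForest package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  RFM  <- randomForest::randomForest(f, data = training)
  pred <- predict(RFM, newdata = testing)
  print(head(pred))
}
```

The key point is that the formula never mentions df: every name in it is resolved inside whatever you pass as data, so the same formula works for training and, via predict, for testing.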

Upvotes: 2
