GeoCat333

Reputation: 79

Why is my random forest regression predicting values not found in my training set?

I have a random forest regression model predicting plant height from a set of variables.

library(randomForest)

training <- read.csv('/sers/me/Desktop/training_data.csv')

rf_model <- randomForest(
  height ~ EVI + NDVI + Annual_Mean_Temperature +
    Annual_Precipitation + Precipitation_of_Wettest_Month,
  data = training, importance = TRUE, na.action = na.roughfix
)

But when I look at the predicted values I see some negative numbers, even though there are no negative values for the dependent variable in my training dataset. Since I'm predicting plant height, a negative value is physically impossible.

> min(rf_model$predicted)
  -4.433786671143025159836e-12

I've checked my training set and there are no negative values there, so how can this happen, and what should I do?

> min(training$height)
  0

Upvotes: 1

Views: 2048

Answers (1)

Walker Harrison

Reputation: 537

First, the negative number you listed is extremely small: it is equal to 0 out to 11 or 12 decimal places, so you can safely treat that fitted value as 0.
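In practice you can floor the predictions at zero. A minimal sketch, using a stand-in vector since I don't have your `rf_model` object:

```r
# Stand-in for rf_model$predicted; replace with your own fitted values
preds <- c(-4.4e-12, 0.5, 1.2)

# pmax() clamps any negligible negative prediction up to zero
preds_clamped <- pmax(preds, 0)

min(preds_clamped)  # no longer negative
```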

Second, without some sort of transformation, the range for a response variable in linear regression is the entire real line. The coefficients are chosen based on what minimizes the loss function (sum of squares in the basic case), so the model doesn't really care if it produces fitted values that aren't in the exact same range as the original response.

Take this misspecified model for example. We know the data-generating process requires Y to be positive, but a simple linear model will create negative fitted values in an effort to draw the best line through the data:


library(dplyr)    # provides the %>% pipe
library(ggplot2)

set.seed(0)
n <- 1000
x <- rnorm(n)
y <- exp(x + rnorm(n))  # y is strictly positive by construction

data.frame(x, y) %>%
  ggplot(aes(x, y)) +
  geom_point() +
  geom_smooth(method = 'lm')  # the fitted line still dips below zero

[Plot: scatterplot of y against x, with the fitted regression line dipping below y = 0 at the left edge]

In order to restrict the range of your response, you can transform it, which is the idea behind GLMs. For example, if you take the logarithm of your response variable and then fit the model, you will have to exponentiate the resulting fitted values to get them back on the original scale, which guarantees that they are positive.
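To make this concrete, here is a sketch of that log-transform approach applied to the simulated data above (the variable names `log_fit` and `fitted_orig` are illustrative, not from the question):

```r
set.seed(0)
n <- 1000
x <- rnorm(n)
y <- exp(x + rnorm(n))

# Fit on the log scale, where the linear model is correctly specified
log_fit <- lm(log(y) ~ x)

# Exponentiate the fitted values to return to the original scale;
# exp() of any real number is positive, so no fitted value can be negative
fitted_orig <- exp(fitted(log_fit))

min(fitted_orig) > 0
```

The same idea applies to your height data: model `log(height)` (after handling any exact zeros, e.g. with a small offset) and exponentiate the predictions.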

Upvotes: 1

Related Questions