user18259592

Reputation: 25

Random Forest Regression predictions: overestimates negative actual values and underestimates positive values

[Image: Predictions]

Hello everyone, I am completely new to ML and trying to teach myself what is out there in the books, so apologies for my ignorance in advance.

Basically, I am trying to predict stock return values one period ahead based on a set of 15 predictors in a Random Forest Regression (using tidymodels in R, thank you kindly for your videos @Julia Silge :-)).

What bothers me is that the regression overestimates bad stocks and underestimates good stocks. I would like to just rotate this whole point cloud a few degrees counter-clockwise and my life would be easier. Is there an expert on random forest regressions with a trick up their sleeve for solving this?

Thank you in advance.

Upvotes: 1

Views: 526

Answers (1)

dipetkov

Reputation: 3700

You are right to be concerned: the model basically returns the average for all stocks (the predictions lie on a flat horizontal line with a bit of noise). As you point out, this means that the model is biased: it underpredicts positive returns and overpredicts negative returns.

In short, this model predicts no returns and no losses (in the next period). This is boring but actually doesn't seem wrong.
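To see why a flat cloud is not necessarily a bug, here is a minimal simulation sketch. It is in plain Python rather than your tidymodels setup, and everything in it is an assumption for illustration: the simulated returns, the 0.2 and 1.0 standard deviations, and the helper names `simulate`, `slope` and `mse`. It stands in for any model, not a random forest specifically: it uses the best prediction possible (the true signal) and shows that the predicted-vs-actual cloud is still nearly flat when the signal is weak, and that "rotating" the predictions makes the error much worse.

```python
import random
import statistics


def simulate(n=20000, signal_sd=0.2, noise_sd=1.0, seed=0):
    """Simulated returns = weak predictable signal + unpredictable noise.

    Returns (actual, ideal), where ideal is the best possible prediction:
    the signal itself, i.e. the conditional mean of the return.
    """
    rng = random.Random(seed)
    ideal = [rng.gauss(0.0, signal_sd) for _ in range(n)]
    actual = [s + rng.gauss(0.0, noise_sd) for s in ideal]
    return actual, ideal


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def mse(actual, predicted):
    """Mean squared prediction error."""
    return statistics.fmean([(a - p) ** 2 for a, p in zip(actual, predicted)])


actual, ideal = simulate()

# Slope of the predicted-vs-actual cloud. In theory it equals
# Var(signal) / Var(actual) = 0.04 / 1.04, i.e. close to zero.
flat_slope = slope(actual, ideal)

# Average error (prediction minus actual) split by the sign of the actual return.
err_pos = statistics.fmean([p - a for a, p in zip(actual, ideal) if a > 0])
err_neg = statistics.fmean([p - a for a, p in zip(actual, ideal) if a < 0])

# "Rotate" the cloud: stretch the predictions until the slope is exactly 1.
rotated = [p / flat_slope for p in ideal]

print("slope of predicted vs actual:", flat_slope)
print("mean error when actual > 0:", err_pos)  # negative: underprediction
print("mean error when actual < 0:", err_neg)  # positive: overprediction
print("MSE of ideal predictions:", mse(actual, ideal))
print("MSE of rotated predictions:", mse(actual, rotated))
```

The pattern you see in your plot (good stocks underpredicted, bad stocks overpredicted) appears here even though the predictions are as good as they can be. Stretching them to follow the diagonal only amplifies the noise, so the rotated predictions have a far larger squared error.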

Since you are doing this to learn about machine learning, consider picking an "easier" problem. Stock returns are not very predictable in general.

Upvotes: 1
