Reputation: 1
I am working on a house price prediction task where I logarithmically transformed the target variable (price) because of its non-normal distribution. I am using RMSE, MAE, and MAPE as metrics, and for model evaluation I used cross_val_score.
After obtaining the cross-validated scores, I took the exponential of the MAE and MAPE values to revert them to the original scale. However, I got unexpectedly small values; both metrics came out approximately equal to 1, which I suspect is incorrect.
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.metrics import (make_scorer, mean_squared_error,
                             mean_absolute_error, mean_absolute_percentage_error)
from lightgbm import LGBMRegressor

kf = KFold(n_splits=5, random_state=42, shuffle=True)

def rmse_cv(model):
    # Errors are computed in log-price space, since y_train is the log-transformed target
    mse_scorer = make_scorer(mean_squared_error)
    rmse = np.sqrt(cross_val_score(model, train, y_train, scoring=mse_scorer, cv=kf))
    return rmse

def mae_cv(model):
    mae_scorer = make_scorer(mean_absolute_error)
    mae = cross_val_score(model, train, y_train, scoring=mae_scorer, cv=kf)
    return mae

def mape_cv(model):
    mape_scorer = make_scorer(mean_absolute_percentage_error)
    mape = cross_val_score(model, train, y_train, scoring=mape_scorer, cv=kf)
    return mape

lightgbm = LGBMRegressor(num_leaves=6, max_depth=7, random_state=42,
                         n_estimators=500, objective='regression')

rmse = rmse_cv(lightgbm)
mae = mae_cv(lightgbm)
mape = mape_cv(lightgbm)

print('Lightgbm rmse %.4f' % rmse.mean())
print('Lightgbm mae %.4f' % mae.mean())
print('Lightgbm mape %.4f' % mape.mean())
Lightgbm rmse 0.1331
Lightgbm mae 0.0874
Lightgbm mape 0.0073
I expected interpretable error values that reflect the model's performance on the original price scale. Instead, after exponentiating, both MAE and MAPE came out to roughly 1, which does not look right. How do I get a meaningful representation of the model error on the original price scale?
Upvotes: 0
Views: 129
Reputation: 74
After obtaining predictions, I took the exponential of MAE and MAPE metrics to revert them to the original scale.
Perhaps I am reading this wrong, but you would want to calculate the MAE of the exponentiated predictions -- i.e. MAE(exp(preds)) -- and not exp(MAE(preds)).
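A minimal sketch of what that looks like with your setup, assuming y_train holds np.log(price) (swap np.exp for np.expm1 if you used np.log1p): get out-of-fold predictions in log space with cross_val_predict, exponentiate both predictions and targets, and only then compute MAE/MAPE.

import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error

# Out-of-fold predictions, still in log-price space
log_preds = cross_val_predict(lightgbm, train, y_train, cv=kf)

# Back-transform predictions and targets to the original price scale
price_true = np.exp(y_train)
price_pred = np.exp(log_preds)

# MAE(exp(preds)): error in actual price units; MAPE stays a fraction of the true price
mae_orig = mean_absolute_error(price_true, price_pred)
mape_orig = mean_absolute_percentage_error(price_true, price_pred)
print('MAE on original scale:  %.2f' % mae_orig)
print('MAPE on original scale: %.4f' % mape_orig)

That way MAE is expressed in price units and MAPE remains a percentage, whereas exp(MAE(log_preds)) will always hover around 1 because the log-scale MAE is close to 0.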
Upvotes: 0