katharsis

Reputation: 1

Issue with Metrics Interpretation after Log Transformation in Regression Task

I am currently working on a house price prediction task where I have logarithmically transformed the target variable (price) due to its non-normal distribution. I am using metrics such as RMSE, MAE, and MAPE, and for model training, I utilized cross_val_score.

After obtaining the cross-validated scores, I took the exponential of the MAE and MAPE values to revert them to the original scale. However, both results came out at roughly 1, which is far too small for a price error, so I suspect this approach is incorrect.

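import numpy as np
from lightgbm import LGBMRegressor
from sklearn.metrics import make_scorer, mean_absolute_error, mean_absolute_percentage_error, mean_squared_error
from sklearn.model_selection import KFold, cross_val_score

kf = KFold(n_splits=5, random_state=42, shuffle=True)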

def rmse_cv(model):
    mse_scorer = make_scorer(mean_squared_error)
    rmse = np.sqrt(cross_val_score(model, train, y_train, scoring=mse_scorer, cv=kf))
    return rmse

def mae_cv(model):
    mae_scorer = make_scorer(mean_absolute_error)
    mae = cross_val_score(model, train, y_train, scoring=mae_scorer, cv=kf)
    return mae

def mape_cv(model):
    mape_scorer = make_scorer(mean_absolute_percentage_error)
    mape = cross_val_score(model, train, y_train, scoring=mape_scorer, cv=kf)
    return mape

lightgbm = LGBMRegressor(num_leaves=6, max_depth=7, random_state=42, n_estimators=500, objective='regression')

rmse = rmse_cv(lightgbm)
mae = mae_cv(lightgbm)
mape = mape_cv(lightgbm)
print('Lightgbm rmse %.4f' % (rmse.mean()))
print('Lightgbm mae %.4f' % (mae.mean()))
print('Lightgbm mape %.4f' % (mape.mean()))

Lightgbm rmse 0.1331
Lightgbm mae 0.0874
Lightgbm mape 0.0073

I expected the exponentiated metrics to be interpretable values that reflect the model's error on the original price scale. Instead, exp(MAE) and exp(MAPE) are both approximately 1, which does not look like a meaningful error in price units.

Upvotes: 0

Views: 129

Answers (1)

joeday

Reputation: 74

"After obtaining predictions, I took the exponential of MAE and MAPE metrics to revert them to the original scale."

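Perhaps I am reading this wrong, but you would want to calculate the MAE on the exponentiated values, i.e. MAE(exp(y_true), exp(preds)), and not exp(MAE(y_true, preds)). The same applies to MAPE: transform the targets and predictions back first, then compute the metric.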
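To make the difference concrete: with the scores from the question, exp(0.0874) is about 1.09 and exp(0.0073) is about 1.007, which is why both look like "1". Exponentiating a log-scale error gives something closer to a multiplicative ratio than an error in price units. A minimal sketch of scoring on the original scale follows. It assumes the target was transformed with np.log (use np.expm1 instead of np.exp if np.log1p was used). The synthetic data, the LinearRegression stand-in model, and the helper names mae_original_scale and mape_original_scale are hypothetical and only there to keep the example runnable on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_absolute_error, mean_absolute_percentage_error
from sklearn.model_selection import KFold, cross_val_score


def mae_original_scale(y_true_log, y_pred_log):
    # Undo the log on both arrays first, then compute MAE in price units.
    return mean_absolute_error(np.exp(y_true_log), np.exp(y_pred_log))


def mape_original_scale(y_true_log, y_pred_log):
    # Same idea for MAPE: back-transform first, then score.
    return mean_absolute_percentage_error(np.exp(y_true_log), np.exp(y_pred_log))


# Synthetic stand-in for the house data (hypothetical, for illustration only).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
log_price = 12.0 + X @ np.array([0.3, 0.2, 0.1]) + rng.normal(scale=0.1, size=500)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
model = LinearRegression()  # stand-in for the LGBMRegressor in the question

# Log-scale MAE, as computed in the question.
mae_log = cross_val_score(model, X, log_price,
                          scoring=make_scorer(mean_absolute_error), cv=kf)
# Original-scale metrics: the model still trains on log(price).
mae_price = cross_val_score(model, X, log_price,
                            scoring=make_scorer(mae_original_scale), cv=kf)
mape_price = cross_val_score(model, X, log_price,
                             scoring=make_scorer(mape_original_scale), cv=kf)

print('exp(MAE on log scale): %.4f' % np.exp(mae_log.mean()))
print('MAE on price scale:    %.2f' % mae_price.mean())
print('MAPE on price scale:   %.4f' % mape_price.mean())
```

In the question's setup, the same two scorers could be passed to cross_val_score in place of make_scorer(mean_absolute_error) and make_scorer(mean_absolute_percentage_error), keeping train, y_train and kf unchanged.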

Upvotes: 0
