Significance of Choosing Error / Evaluation Metrics: Part 1
I have gone through many blogs to understand which metrics to choose for a given dataset, both to evaluate how well the model predicts the target and to understand which kinds of errors it makes most often.
We use evaluation metrics to understand how well a model predicts. With the right error metric, we can tell whether the model is biased towards any class, and whether its errors matter or can be downplayed or ignored.
Ideally, the difference between actual and predicted values should be small, and the model should perform consistently across the train, validation, and test data.
Most beginners know evaluation metrics such as SSE, RMSE, and MAE, or the confusion matrix for classification problems. What many of us miss is that the right error metric depends on the underlying problem, such as outliers or class imbalance (which can lead to biased models). There is no single metric that fits all kinds of data.
Let’s go through the metrics used for regression problems. For each one, this will help you understand the following:
- What type of data should it be used for?
- What are those metrics signifying?
- Do we want to downplay outliers or give them importance?
All the formulas below use this notation:
- y — Actual data point
- ŷ — Predicted data point
- ȳ — Mean of actual data points
Sum of Squared Errors (SSE)
Signifies how far each predicted value (ŷ) is from the corresponding actual value (y).
Sum of Squared Total (SST)
Signifies how far each actual value (y) is from the mean of the actual values (ȳ).
Sum of Squares due to Regression (SSR)
Signifies how far each predicted value (ŷ) is from the mean of the actual values (ȳ).
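For reference, the three sums of squares described above are usually written as follows. This is the standard textbook form; the index i and the number of data points n are notation assumed here, not taken from the original text.

```latex
\begin{aligned}
\mathrm{SSE} &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \\
\mathrm{SST} &= \sum_{i=1}^{n} (y_i - \bar{y})^2 \\
\mathrm{SSR} &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
\end{aligned}
```

For an ordinary least-squares linear regression with an intercept, evaluated on the data it was fitted to, these satisfy SST = SSE + SSR.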
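R-Squared
Also called the coefficient of determination.
R-Squared quantifies the proportion of the variance in the actual values that is explained by the regression model.
For a least-squares linear regression with an intercept, evaluated on the data it was fitted to, R-Squared lies between 0 and 1, i.e. 0% to 100%, and a value closer to 1 is considered a good fit. On unseen data, or for a model that fits worse than simply predicting the mean, it can be negative.
Since R-Squared = 1 - SSE / SST, it gets closer to 1 as the SSE decreases.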
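To make the link between SSE, SST, and R-Squared concrete, here is a minimal sketch in plain Python. The function name `r_squared` and the sample numbers are made up for illustration, and the sketch assumes the actual values are not all identical (otherwise SST is zero).

```python
def r_squared(actual, predicted):
    """Return 1 - SSE / SST for two equal-length sequences of numbers."""
    mean_actual = sum(actual) / len(actual)
    # SSE: squared distance between each actual value and its prediction
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    # SST: squared distance between each actual value and the mean
    sst = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - sse / sst


actual = [3.0, 5.0, 7.0, 9.0]
close_predictions = [2.5, 5.5, 7.0, 9.0]      # small errors
mean_only_predictions = [6.0, 6.0, 6.0, 6.0]  # always predicts the mean
poor_predictions = [9.0, 7.0, 5.0, 3.0]       # worse than predicting the mean

print(r_squared(actual, close_predictions))      # close to 1
print(r_squared(actual, mean_only_predictions))  # 0: no better than the mean
print(r_squared(actual, poor_predictions))       # negative
```

The last call shows that the value can drop below 0 when the predictions are worse than simply predicting the mean.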
A word of caution: adding more independent variables never lowers R-Squared on the training data, even when some of those variables are not significant, which can be misleading when evaluating a regression model.
The simple solution for this is Adjusted R-Squared.
Adjusted R-Squared is usually preferred over R-Squared when the model has several independent variables.
Adjusted R-Squared penalizes the model for each independent variable that is added without improving the fit enough to justify it.
Adjusted R-Squared is usually positive, but it can be negative when R-Squared is close to 0, that is, when SSE approaches SST.
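The penalty can be sketched with the commonly used formula Adjusted R-Squared = 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of data points and p the number of independent variables. The function name and the numbers below are illustrative assumptions, not results from a real model.

```python
def adjusted_r_squared(r2, n_samples, n_features):
    """Adjusted R-Squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Same R-Squared, more independent variables: the adjusted value drops.
few_features = adjusted_r_squared(0.80, n_samples=50, n_features=2)
many_features = adjusted_r_squared(0.80, n_samples=50, n_features=20)

# A weak fit with several variables can fall below zero.
weak_fit = adjusted_r_squared(0.05, n_samples=30, n_features=5)

print(few_features, many_features, weak_fit)
```

With the same R-Squared of 0.80, the model with 20 variables gets a noticeably lower adjusted value than the model with 2.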
Mean Absolute Error (MAE)
MAE is the easiest regression metric to understand. We take the absolute value of each residual, so that negative errors do not cancel out positive errors, and then average these absolute residuals.
MAE is the most intuitive metric, but because it uses absolute values it cannot tell us whether the model is over-predicting or under-predicting.
Each data point contributes to the total error in proportion to the size of its absolute error, so an error that is twice as large counts exactly twice as much.
Because it uses absolute rather than squared errors, MAE is more robust to outliers, i.e. it changes relatively little when the data contains a few noisy points.
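A minimal sketch of MAE in plain Python, with made-up numbers, shows why the absolute value matters: the signed errors below average out to zero even though every prediction is off by 1. The function name `mean_absolute_error` is chosen here for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]  # every prediction is off by exactly 1

# Without the absolute value, positive and negative errors cancel out.
mean_signed_error = sum(y - y_hat for y, y_hat in zip(actual, predicted)) / len(actual)
mae = mean_absolute_error(actual, predicted)

print(mean_signed_error)  # 0.0
print(mae)                # 1.0
```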
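Mean Absolute Percentage Error (MAPE)
MAPE is the percentage equivalent of MAE. Each absolute error is divided by the corresponding actual value, and the average of these ratios is expressed as a percentage.
The main problem with MAPE is that it can be unexpectedly high when the actual values are very small, and it is undefined when an actual value is zero.
MAPE is also asymmetric. For non-negative actuals and predictions, a prediction below the actual value can contribute at most 100%, while a prediction above it has no upper limit, so MAPE tends to favor models that under-predict.
Like MAE, MAPE uses absolute rather than squared errors, so a few large errors do not dominate it the way they do in squared-error metrics.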
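The sensitivity to small actual values can be seen in a short sketch. The function name and data are illustrative assumptions; both prediction sets miss every point by the same absolute amount of 1.

```python
def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent; undefined when any actual value is zero."""
    if any(y == 0 for y in actual):
        raise ValueError("MAPE is undefined when an actual value is 0")
    ratios = [abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Every prediction is off by exactly 1 in both cases.
mape_large_actuals = mean_absolute_percentage_error(
    [100.0, 200.0, 400.0], [101.0, 201.0, 401.0]
)
mape_small_actual = mean_absolute_percentage_error(
    [0.5, 200.0, 400.0], [1.5, 201.0, 401.0]
)

print(mape_large_actuals)  # well under 1%
print(mape_small_actual)   # dominated by the point whose actual value is 0.5
```

A single small actual value is enough to push the metric from under 1% to well over 50%, even though the absolute errors are identical.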
Mean Percentage Error (MPE)
MPE is exactly like MAPE; the only difference is that we remove the absolute value operation.
As there is no absolute value, negative errors will cancel out positive errors.
But that is exactly what makes this metric useful: it shows the bias when the errors lean more in one direction than the other.
It allows us to evaluate whether our model systematically underestimates or overestimates. With errors computed as y - ŷ, a positive MPE means the predictions are mostly below the actual values (underestimation) and a negative MPE means they are mostly above (overestimation); the signs flip if errors are computed as ŷ - y.
You can’t use MPE as a measure of error size the way you use MAPE, but it can tell us about the direction of the errors our model makes.
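Here is a small sketch of MPE as a bias indicator. How the sign is read depends on the convention: this sketch assumes the error is computed as actual minus predicted, so a positive result means the predictions sit below the actual values on average, and the signs flip under the opposite convention. The function name and data are made up for illustration.

```python
def mean_percentage_error(actual, predicted):
    """MPE in percent, using the signed error (actual - predicted)."""
    if any(y == 0 for y in actual):
        raise ValueError("MPE is undefined when an actual value is 0")
    ratios = [(y - y_hat) / y for y, y_hat in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


actual = [100.0, 200.0, 400.0]
always_low = [90.0, 180.0, 360.0]   # every prediction is 10% too low
always_high = [110.0, 220.0, 440.0]  # every prediction is 10% too high
mixed = [110.0, 180.0, 400.0]        # one too high, one too low, one exact

mpe_low = mean_percentage_error(actual, always_low)
mpe_high = mean_percentage_error(actual, always_high)
mpe_mixed = mean_percentage_error(actual, mixed)

print(mpe_low)    # positive: systematic under-prediction
print(mpe_high)   # negative: systematic over-prediction
print(mpe_mixed)  # errors cancel, so MPE says nothing about their size
```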
Mean Squared Error (MSE)
Instead of using absolute errors, in MSE we square the errors before averaging them.
Because the errors are squared, every error larger than 1 is inflated and the result is in squared units of the target, so we might overestimate how bad the model is.
In MAE each error contributes proportionally to the metric, but in MSE each error contributes quadratically. This means that MSE gives more weight to outliers and reports a higher error because of them. In other words, a large difference between actual and predicted values is penalized more in MSE than in MAE.
Whenever you want large errors and outliers to weigh heavily in the evaluation, MSE should be used.
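The difference between proportional and quadratic contributions shows up clearly with a single outlier. This is a minimal sketch with made-up data and illustrative function names.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)


actual = [10.0, 12.0, 14.0, 16.0]
clean = [11.0, 11.0, 15.0, 15.0]         # every error is 1
with_outlier = [11.0, 11.0, 15.0, 26.0]  # last error is 10

mae_clean = mean_absolute_error(actual, clean)
mse_clean = mean_squared_error(actual, clean)
mae_outlier = mean_absolute_error(actual, with_outlier)
mse_outlier = mean_squared_error(actual, with_outlier)

print(mae_clean, mse_clean)      # 1.0 1.0
print(mae_outlier, mse_outlier)  # 3.25 25.75
```

One bad prediction roughly triples the MAE but multiplies the MSE by more than 25.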
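Root Mean Squared Error (RMSE)
RMSE is the square root of MSE. Many prefer MSE because it is slightly cheaper to compute, but MSE is in squared units of the target. Taking the square root brings the metric back to the original scale of the data, and that’s where RMSE helps us.
Same as MSE, RMSE punishes larger errors more heavily, making it sensitive to outliers.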
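As a sketch of the scaling point, RMSE can be computed by taking the square root of the mean of the squared errors. The function name and the data are illustrative assumptions.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean of the squared errors."""
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


actual = [10.0, 12.0, 14.0, 16.0]
clean = [11.0, 11.0, 15.0, 15.0]         # every error is 1
with_outlier = [11.0, 11.0, 15.0, 26.0]  # last error is 10

rmse_clean = root_mean_squared_error(actual, clean)
rmse_outlier = root_mean_squared_error(actual, with_outlier)

print(rmse_clean)    # 1.0, in the same units as the target
print(rmse_outlier)  # about 5.07, pulled up by the single large error
```

With the outlier, the MAE of these predictions would be 3.25, so RMSE still reacts more strongly to the large error while staying on the scale of the data.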
There is no one perfect metric for all data. Choosing a metric depends on the objective and the data quality.