Regression error

- Mean Absolute Error (MAE), np.mean(np.abs(y_true - y_pred)), is the L1-norm of the errors divided by n; minimizing it gives Least Absolute Deviations regression.

- Mean Squared Error (MSE), np.mean(np.square(y_true - y_pred)).

- Root Mean Squared Error (RMSE), np.sqrt(MSE(y_true, y_pred)), where MSE is the mean squared error defined above, equals the Euclidean norm (L2-norm) of the errors divided by sqrt(n).

- Mean Absolute Percentage Error (MAPE), np.mean(np.abs((y_true - y_pred) / y_true)) * 100. It is undefined when any y_true is 0.

- Mean Percentage Error (MPE), np.mean((y_true - y_pred) / y_true) * 100. Positive and negative errors cancel, so it indicates bias rather than error magnitude.
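The five expressions above can be collected into small NumPy helpers. This is only a minimal sketch: the function names (mae, mse, rmse, mape, mpe) and the sample arrays are chosen here for illustration and are not from any library, and the percentage metrics assume that y_true contains no zeros.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))


def mse(y_true, y_pred):
    """Mean Squared Error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean(np.square(y_true - y_pred))


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return np.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error (assumes no zeros in y_true)."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100


def mpe(y_true, y_pred):
    """Mean Percentage Error (signed; assumes no zeros in y_true)."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) / y_true) * 100


# Illustrative data (made up for this example): errors are -10, 20 and 0.
y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 180.0, 400.0])

for name, fn in [("MAE", mae), ("MSE", mse), ("RMSE", rmse), ("MAPE", mape), ("MPE", mpe)]:
    print(f"{name}: {fn(y_true, y_pred):.4f}")
```

With this sample the relative errors are -10% and +10%, so they cancel in MPE while MAPE stays positive, which is why MPE reads as a bias indicator rather than a magnitude.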

* footnote

Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance, https://www.jstor.org/stable/24869236

Measures of average error (such as RMSE) that are based on the sum of squared errors (i.e. on the sum of e_i^2; note that e_i^2, or |e_i|^2, is not a metric because it does not satisfy the triangle inequality) are functions of the average error (MAE), the distribution of error magnitudes (or squared errors), and n^{1/2}; therefore, they do not describe average error alone. Among the disturbing characteristics of RMSE are: it tends to become increasingly larger than MAE (but not necessarily in a monotonic fashion) as the distribution of error magnitudes becomes more variable; and, it tends to grow larger than MAE with n^{1/2}, since its lower limit is fixed at MAE and its upper limit (n^{1/2} · MAE) increases with n^{1/2}. For these reasons, it seems to us that there is no clear interpretation of RMSE or related measures, and we recommend that such measures no longer be reported in the literature. It also occurs to us that previous model-performance evaluations and inter-comparisons, which were based primarily on RMSE or related measures, are questionable and should be reconsidered. Other commonly used bivariate statistics that share RMSE’s reliance on the sum of squares (e.g. certain correlation and skill measures) also are questionable model-performance statistics.
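The claim that RMSE depends on how variable the error magnitudes are, and not only on their average, can be illustrated with a few hand-made error sets. The sketch below is an illustration only: the three arrays are invented for this example so that they share the same MAE, and the helper names are hypothetical.

```python
import numpy as np


def mae_of(errors):
    """MAE computed directly from an array of errors."""
    return np.mean(np.abs(errors))


def rmse_of(errors):
    """RMSE computed directly from an array of errors."""
    return np.sqrt(np.mean(np.square(errors)))


# Three made-up error sets with n = 4 and identical MAE (= 2),
# ordered from least to most variable error magnitudes.
error_sets = {
    "even": np.array([2.0, 2.0, 2.0, 2.0]),
    "mixed": np.array([1.0, 1.0, 3.0, 3.0]),
    "spiky": np.array([0.0, 0.0, 2.0, 6.0]),
}

results = {name: (mae_of(e), rmse_of(e)) for name, e in error_sets.items()}

for name, (m, r) in results.items():
    print(f"{name:>5}: MAE = {m:.3f}, RMSE = {r:.3f}")
```

All three sets report the same MAE, while RMSE rises as the magnitudes spread out, so an RMSE value on its own does not pin down the average error.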

Our analysis indicates that MAE is the most natural measure of average error magnitude, and that (unlike RMSE) it is an unambiguous measure of average error magnitude. It seems to us that all dimensioned evaluations and inter-comparisons of average model-performance error should be based on MAE.

[MAE] (lower limit) ≤ [RMSE] ≤ [MAE * sqrt(n)] (upper limit)
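The two limits can be checked numerically. The following is a rough sketch rather than a proof: the helper names, the random seed, and the test arrays are assumptions made for illustration. It checks the inequality on random errors, then constructs the two extreme cases: errors of equal magnitude (RMSE equals MAE) and a single nonzero error (RMSE equals sqrt(n) * MAE).

```python
import numpy as np


def mae_rmse_upper(errors):
    """Return (MAE, RMSE, sqrt(n) * MAE) for an array of errors."""
    errors = np.asarray(errors, dtype=float)
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(np.square(errors)))
    upper = np.sqrt(errors.size) * mae
    return mae, rmse, upper


# 1) Random errors: the inequality should hold for any n
#    (a small tolerance absorbs floating-point rounding).
rng = np.random.default_rng(0)
random_ok = True
for n in (2, 10, 1000):
    mae, rmse, upper = mae_rmse_upper(rng.normal(size=n))
    random_ok = bool(random_ok and mae <= rmse + 1e-12 and rmse <= upper + 1e-12)

# 2) Lower limit: all error magnitudes equal, so RMSE == MAE.
equal_case = mae_rmse_upper([3.0, -3.0, 3.0, -3.0])


# 3) Upper limit: one nonzero error, so RMSE == sqrt(n) * MAE.
#    The error is set to n so that MAE stays at 1 while n grows.
def single_error_case(n):
    errors = np.zeros(n)
    errors[0] = n
    return mae_rmse_upper(errors)


single_cases = {n: single_error_case(n) for n in (4, 16, 100)}

print("random errors satisfy the bounds:", random_ok)
print("equal magnitudes (MAE, RMSE, upper):", equal_case)
for n, (mae, rmse, upper) in single_cases.items():
    print(f"n = {n:>3}: MAE = {mae:.1f}, RMSE = {rmse:.1f}, upper = {upper:.1f}")
```

In the single-error case MAE stays fixed while RMSE grows with sqrt(n), which is the n^{1/2} dependence described in the footnote.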

