Enhancement
An oft-overlooked aspect of statistical models is that parameters are dependent on each other. Ignoring such dependencies can have important consequences, and in extreme cases can even undermine the assumptions of a forecasting model. However, in a regression model the correlations between regressor variables can sometimes have unexpectedly positive results. To illustrate this, consider a sequence of fits of a survival model for a Makeham-Perks mortality law (Richards, 2008) defined as follows:
μx = [exp(ε) + exp(α + βx)] / [1 + exp(α + βx)]
where μx is the mortality hazard at age x and the parameter α is allowed to vary by gender, health status at retirement, or both. The results for a large portfolio of pensions in payment are shown in Table 1 below:
Table 1. A sequence of model fits, their corresponding AICs and the improvement in AIC over an age-only model. Source: Own calculations using a large portfolio of pensions in payment.
Model | AIC | Improvement in AIC over age-only model |
---|---|---|
Age | 240,946 | n/a |
Age+Gender | 239,459 | 1,487 |
Age+Health | 240,211 | 735 |
Age+Gender+Health | 238,512 | 2,434 |
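The models in Table 1 differ only in how α is allowed to vary, so it may help to see what a shift in α does to the hazard. The following is a minimal sketch of the mortality law given above. The parameter values and the additive form of the shift in α are hypothetical, chosen purely for illustration; they are not fitted values from the portfolio.

```python
import math


def makeham_perks_hazard(x, alpha, beta, epsilon):
    """Mortality hazard at age x under the Makeham-Perks law in the text."""
    z = math.exp(alpha + beta * x)
    return (math.exp(epsilon) + z) / (1.0 + z)


# Hypothetical values for illustration only (NOT fitted parameters).
ALPHA_BASE = -12.0
BETA = 0.12
EPSILON = -6.0
ALPHA_SHIFT = -0.5  # assumed additive effect on alpha for one subgroup

for age in (60, 70, 80, 90):
    base = makeham_perks_hazard(age, ALPHA_BASE, BETA, EPSILON)
    shifted = makeham_perks_hazard(age, ALPHA_BASE + ALPHA_SHIFT, BETA, EPSILON)
    print(f"age {age}: baseline {base:.4f}, shifted alpha {shifted:.4f}")
```

With exp(ε) below 1, lowering α lowers the hazard at every age, which is how a factor such as gender or health status could enter a model of this form.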
For data sets of this size, an improvement of more than 4 AIC units is regarded as significant. Table 1 shows a number of expected features, namely that gender is a very important risk factor for mortality, albeit one that is now illegal in the EU for insurance pricing. Similarly unsurprising is that health status at retirement is also significant, and that a model which includes both gender and health status is better than a model which leaves out either factor.
However, Table 1 does contain one seemingly curious feature: the improvement in fit from including both gender and health is 2,434 AIC units, whereas the sum of the two improvements for gender and health on their own is 2,222 AIC units (2,222 = 1,487 + 735). How can the improvement from having both factors in the model be greater than the sum of their individual contributions?
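The comparison can be reproduced directly from the AIC column of Table 1. The sketch below only checks the arithmetic; the variable names are our own.

```python
# AIC values copied from Table 1; the keys are labels chosen for this sketch.
aic = {
    "Age": 240_946,
    "Age+Gender": 239_459,
    "Age+Health": 240_211,
    "Age+Gender+Health": 238_512,
}

# Improvement of each model over the age-only baseline (lower AIC is better).
baseline = aic["Age"]
improvement = {model: baseline - value for model, value in aic.items()}

sum_of_single_factors = improvement["Age+Gender"] + improvement["Age+Health"]
joint = improvement["Age+Gender+Health"]
surplus = joint - sum_of_single_factors

print(f"Gender alone:        {improvement['Age+Gender']:,}")
print(f"Health alone:        {improvement['Age+Health']:,}")
print(f"Sum of the two:      {sum_of_single_factors:,}")
print(f"Both factors jointly: {joint:,}")
print(f"Surplus:             {surplus:,}")
```

The joint improvement exceeds the sum of the two individual improvements by 212 AIC units.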
As it happens, this is no anomaly. Rather, the phenomenon is known as enhancement, and was discussed in the context of bivariate regression by Currie & Korabinski (1984):
"enhancement occurs often [...]. Further, as stepwise regression proceeds [...] enhancement becomes more frequent"
Source: Currie & Korabinski (1984), page 292.
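Enhancement is easy to reproduce in ordinary linear regression with two regressors. The sketch below uses synthetic data constructed for this purpose: two positively correlated regressors, x1 and x2, and a response that depends on their difference. The seed, sample size and coefficients are arbitrary assumptions, and the set-up is only one way in which enhancement can arise; it is not a claim about the mechanism in the pension data. The two-regressor R² is computed from the pairwise correlations using the standard identity for least squares with an intercept.

```python
import math
import random


def correlation(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def r_squared_two_regressors(r_y1, r_y2, r_12):
    """R^2 of a least-squares fit of y on x1 and x2, from the correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


rng = random.Random(1984)  # arbitrary seed so the run is repeatable
n = 5000
x1, x2, y = [], [], []
for _ in range(n):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    a = z1
    b = 0.8 * z1 + 0.6 * z2  # unit variance, correlation 0.8 with a
    x1.append(a)
    x2.append(b)
    y.append(a - b + rng.gauss(0, math.sqrt(0.1)))  # response plus noise

r_y1, r_y2, r_12 = correlation(y, x1), correlation(y, x2), correlation(x1, x2)
r2_x1 = r_y1**2  # R^2 using x1 alone
r2_x2 = r_y2**2  # R^2 using x2 alone
r2_both = r_squared_two_regressors(r_y1, r_y2, r_12)

print(f"R^2 with x1 alone: {r2_x1:.3f}")
print(f"R^2 with x2 alone: {r2_x2:.3f}")
print(f"Sum of the two:    {r2_x1 + r2_x2:.3f}")
print(f"R^2 with both:     {r2_both:.3f}")
```

By construction each regressor on its own explains about 8% of the variance in the response, whereas the pair together explains about 80%, far more than the sum of the individual contributions.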
We have seen survival models fitted to many different data sets of pensioners and annuitants, and we can confirm that Currie & Korabinski's points also hold true for survival models. Adding a significant risk factor not only enhances a survival model's fit, it often also improves the ability of existing risk factors to explain variation. This can result in a virtuous circle: the more relevant new risk factors you can add, the better the explanatory power of the first ones.
References:
Currie, I. D. and Korabinski, A. (1984) Some comments on bivariate regression, The Statistician, 33, 283–293.
Richards, S. J. (2008) Applying survival models to pensioner mortality data, British Actuarial Journal, 15(II), No. 65, 317–365 (with discussion).