Residual concerns
One of the most important means of checking a model's fit is to look at the residuals, i.e. the standardised differences between the actual data observed and what the model predicts. One common definition, known as the Pearson residual, is as follows:
\[r = \frac{D-E}{\sqrt{E}}\qquad(1)\]
where \(r\) is the residual, \(D\) is the observed number of deaths and \(E\) is the expected number of deaths. This definition is quick and easy to apply, and works well where there are relatively large numbers of observed and expected deaths. If the underlying model used to generate the expected values in \(E\) is correct, the residuals should have an approximate N(0, 1) distribution. The sum of the \(r^2\) values can be compared with the appropriate point of a \(\chi^2\) (chi-squared) distribution to test for fit.
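As a minimal sketch of equation (1), the Pearson residuals and the sum of their squares could be computed as follows. The function names and the figures for observed and expected deaths are hypothetical, chosen purely for illustration.

```python
import math

def pearson_residual(deaths, expected):
    """Pearson residual (D - E) / sqrt(E) for a single category."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return (deaths - expected) / math.sqrt(expected)

def pearson_chi_squared(deaths, expected):
    """Sum of squared Pearson residuals across all categories."""
    return sum(pearson_residual(d, e) ** 2 for d, e in zip(deaths, expected))

# Hypothetical illustrative figures, not taken from any real portfolio.
observed = [12, 30, 45]
expected = [9.0, 36.0, 49.0]

residuals = [pearson_residual(d, e) for d, e in zip(observed, expected)]
chi_squared = pearson_chi_squared(observed, expected)
```

The resulting `chi_squared` would then be compared with the chosen percentage point of a \(\chi^2\) distribution; the degrees of freedom depend on the number of categories and fitted parameters, which this sketch does not attempt to determine.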
The Pearson definition above relies on a normal approximation to the distribution of the number of deaths (a central-limit argument), so it works less well where the number of deaths in each category is relatively small. One solution is to collapse data across categories until the number of deaths in each is large enough for the approximation to hold. However, this restricts your ability to look at localised areas of the model fit.
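One way to collapse categories is to merge adjacent groups until each merged group reaches a minimum number of expected deaths. The sketch below assumes the categories are ordered (for example, by age) and uses a threshold of 5 expected deaths, a commonly quoted rule of thumb rather than anything prescribed here. The function name `collapse_categories` is hypothetical.

```python
def collapse_categories(deaths, expected, min_expected=5.0):
    """Merge adjacent categories until each group has at least
    min_expected expected deaths. Returns (deaths, expected) pairs."""
    groups = []
    d_sum, e_sum = 0, 0.0
    for d, e in zip(deaths, expected):
        d_sum += d
        e_sum += e
        if e_sum >= min_expected:
            groups.append((d_sum, e_sum))
            d_sum, e_sum = 0, 0.0
    # Any leftover tail is folded into the last complete group.
    if d_sum > 0 or e_sum > 0:
        if groups:
            last_d, last_e = groups[-1]
            groups[-1] = (last_d + d_sum, last_e + e_sum)
        else:
            groups.append((d_sum, e_sum))
    return groups

# Hypothetical illustrative figures: five sparse categories become two.
merged = collapse_categories([1, 0, 2, 3, 1], [1.5, 2.0, 2.5, 4.0, 1.0])
```

The cost is visible in the result: five categories shrink to two, so any misfit confined to a single original category can no longer be seen.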
Small or medium-sized pension schemes usually do have small numbers of deaths, and we would prefer not to have to collapse across groups if it could be avoided. Fortunately, there is a much better definition of a residual, known as the deviance residual. Below is the definition of the deviance residual for a Poisson variable:
\[r = {\rm sign}(D-E)\sqrt{2\left[D\log\left(\frac{D}{E}\right)-(D-E)\right]}\qquad(2)\]
and the following definition applies for a Binomial variable with a sample size of \(n\):
\[r={\rm sign}(D-E)\sqrt{2\left[D\log\left(\frac{D}{E}\right)+(n-D)\log\left(\frac{n-D}{n-E}\right)\right]}\qquad(3)\]
where \({\rm sign}()\) denotes the sign function:
\[{\rm sign}(x)=\begin{cases}1&x>0\\ 0&x=0\\ -1&x<0\end{cases}\qquad(4)\]
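Equations (2) to (4) might be coded along the following lines. This is a sketch rather than a definitive implementation: the function names are hypothetical, the term \(x\log(x/y)\) is taken to be zero when \(x=0\), and the `max(..., 0.0)` guard is an added assumption to protect the square root against tiny negative values from floating-point rounding.

```python
import math

def sign(x):
    """Sign function as in equation (4): 1, 0 or -1."""
    return (x > 0) - (x < 0)

def x_log_ratio(x, y):
    """x * log(x / y), taken to be 0 when x == 0."""
    return 0.0 if x == 0 else x * math.log(x / y)

def poisson_deviance_residual(deaths, expected):
    """Deviance residual for a Poisson variable, equation (2)."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    deviance = 2.0 * (x_log_ratio(deaths, expected) - (deaths - expected))
    return sign(deaths - expected) * math.sqrt(max(deviance, 0.0))

def binomial_deviance_residual(deaths, expected, n):
    """Deviance residual for a Binomial variable with sample size n, equation (3)."""
    if not 0 < expected < n:
        raise ValueError("expected deaths must lie strictly between 0 and n")
    deviance = 2.0 * (x_log_ratio(deaths, expected)
                      + x_log_ratio(n - deaths, n - expected))
    return sign(deaths - expected) * math.sqrt(max(deviance, 0.0))
```

For example, `poisson_deviance_residual(0, 2.0)` gives a residual of -2.0, since the logarithmic term vanishes and the deviance reduces to \(2E=4\).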
There is little difference between Pearson residuals and deviance residuals where the number of deaths is large, but the deviance residual has better theoretical properties when the number is small. There is a small amount of extra programming for deviance residuals, but it is worth it to avoid the limitations of Pearson residuals for small data sets.
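To illustrate the comparison, the sketch below evaluates both residuals for a category with many deaths and for one with none. The figures are hypothetical, and only the Poisson form of the deviance residual is used.

```python
import math

def pearson(d, e):
    """Pearson residual (D - E) / sqrt(E)."""
    return (d - e) / math.sqrt(e)

def poisson_deviance(d, e):
    """Poisson deviance residual, with D log(D/E) taken as 0 when D == 0."""
    log_term = 0.0 if d == 0 else d * math.log(d / e)
    deviance = 2.0 * (log_term - (d - e))
    sign = (d > e) - (d < e)
    return sign * math.sqrt(max(deviance, 0.0))

# Hypothetical cases: (observed deaths, expected deaths).
cases = {"large": (1050, 1000.0), "small": (0, 2.0)}

gaps = {}
for label, (d, e) in cases.items():
    gaps[label] = abs(pearson(d, e) - poisson_deviance(d, e))
    print(label, round(pearson(d, e), 3), round(poisson_deviance(d, e), 3))
```

In the large case the two residuals nearly coincide, whereas in the small case they differ by more than half a standard deviation.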
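And in case you are wondering what happens when there are no deaths: the term \(D\log\left(\frac{D}{E}\right)\) is taken to be 0 when \(D=0\), which is its limiting value as \(D\to 0\). The same convention applies to the second logarithmic term in (3) when \(D=n\).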
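As a worked illustration of the zero-deaths case, setting \(D=0\) in (2) and (3), and assuming \(E>0\) (and \(E<n\) in the Binomial case) so that \({\rm sign}(D-E)=-1\), gives the Poisson and Binomial residuals respectively:
\[r=-\sqrt{2E}\qquad{\rm and}\qquad r=-\sqrt{2n\log\left(\frac{n}{n-E}\right)}\]
By contrast, the Pearson residual in (1) at \(D=0\) is \(-\sqrt{E}\), which is smaller in magnitude than the Poisson deviance residual by a factor of \(\sqrt{2}\).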