Underflow
Earlier I described a problem in mathematical computing for mortality modelling, where an intermediate step produced a number too big for the computer to handle, causing the entire calculation to overflow and fail. The cause lies in the compromises inherent in how computers deal with real numbers, and the solution is a bit of careful programming.
Unfortunately, the computer representation of real numbers involves compromises in accuracy as well as scale. A particular problem area lies in subtracting numbers which differ by a very small amount. On each computer there is usually a limiting number, sometimes denoted epsilon, such that 1+epsilon or 1-epsilon is still distinguishable from 1. For anything smaller than this limit, however, an underflow occurs and the result becomes exactly 1 as far as the computer is concerned. As with overflow, this can cause an entire calculation to fail, even if the end result is well within the computer's ability to represent numbers.
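As a minimal sketch of this limit, the following Python snippet assumes standard double-precision floating-point arithmetic, which is what Python floats use on typical platforms. The variable names are purely illustrative.

```python
import sys

# Machine epsilon for double precision: the gap between 1.0 and the
# next representable number above it (about 2.22e-16).
eps = sys.float_info.epsilon

# Adding epsilon to 1 gives a result that is still distinguishable from 1.
distinguishable = (1.0 + eps) != 1.0

# Adding a quarter of epsilon is lost entirely: the sum rounds back to 1.
absorbed = (1.0 + eps / 4) == 1.0

print(eps, distinguishable, absorbed)
```

Both flags should come out as true: the first addition survives, while the second is silently discarded.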
As an example, consider the integrated hazard function for the Gompertz law of mortality:
([exp(βt) - 1] / β) * exp(α + βt)
Any mathematician can tell you what happens as β tends to zero: [exp(βt) - 1] / β tends to t and exp(α + βt) tends to exp(α), so the expression becomes t*exp(α) in the limit. If t=1 and α=-5, then as β tends to zero the integrated hazard function approaches a limiting value of 0.0067379. Now contrast this with what happens when you just use the formula above in Excel for various decreasing values of β:
Table 1. Evaluation of the Gompertz integrated hazard function for various values of β. (t=1 and α=-5).
β | Gompertz integrated hazard function |
---|---|
0.1 | 0.0078316 |
0.00001 | 0.0067380 |
0.000000001 | 0.0067379 |
1E-13 | 0.0067326 |
1E-17 | 0 |
1E-21 | 0 |
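The problem is that for very small values of β the computer's arithmetic cannot distinguish exp(βt) from 1. The computer therefore evaluates [exp(βt) - 1] as zero, and this underflow carries through the rest of the calculation. Even before that point, as at β=1E-13, the few digits that survive the subtraction are already enough to make the result visibly inaccurate.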
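The same behaviour can be reproduced outside Excel. The sketch below is a direct Python transcription of the formula, assuming double-precision floats. The function name `gompertz_integrated_hazard_naive` is hypothetical, and the exact digits printed may differ slightly from a spreadsheet.

```python
import math

def gompertz_integrated_hazard_naive(alpha, beta, t):
    """Direct transcription of the formula; unreliable for tiny beta."""
    return (math.exp(beta * t) - 1.0) / beta * math.exp(alpha + beta * t)

alpha, t = -5.0, 1.0
limit = t * math.exp(alpha)  # limiting value as beta tends to zero

for beta in (1e-1, 1e-5, 1e-9, 1e-13, 1e-17, 1e-21):
    value = gompertz_integrated_hazard_naive(alpha, beta, t)
    # The flag shows where exp(beta*t) has become indistinguishable from 1,
    # so that the subtraction in the numerator returns exactly zero.
    collapsed = math.exp(beta * t) == 1.0
    print(f"{beta:8.0e}  {value:.7f}  exp(beta*t) == 1: {collapsed}")
```

For the two smallest values of β the flag should be true and the computed hazard exactly zero, even though the true answer is close to the limiting value.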
Underflow is obviously related to overflow through the compromises and limitations of machine approximations to real numbers. However, underflow is the more insidious of the two. The "advantage" of overflow is that when it occurs the result is given as #NUM! (or similar), which is immediately and easily recognised as a calculation gone wrong. In contrast, underflow occurs silently: the end result is a valid number, even if it is the wrong number. As with overflow, avoiding underflow requires some careful mathematics in the programming.
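One common form of such careful mathematics, offered here as a sketch rather than as the method used in any particular mortality-modelling software, is to avoid forming exp(βt) - 1 by subtraction at all. Python's standard library provides `math.expm1`, which computes exp(x) - 1 accurately for small x. The function name `gompertz_integrated_hazard` and the special case for β=0 are illustrative assumptions.

```python
import math

def gompertz_integrated_hazard(alpha, beta, t):
    """Gompertz integrated hazard, written to stay accurate for tiny beta."""
    if beta == 0.0:
        # Use the analytical limit directly rather than dividing by zero.
        return t * math.exp(alpha)
    # expm1(x) returns exp(x) - 1 without the loss of accuracy that
    # comes from subtracting two nearly equal numbers.
    return math.expm1(beta * t) / beta * math.exp(alpha + beta * t)

alpha, t = -5.0, 1.0
for beta in (1e-1, 1e-13, 1e-17, 1e-21, 0.0):
    print(f"{beta:8.0e}  {gompertz_integrated_hazard(alpha, beta, t):.7f}")
```

With this formulation the values for very small β should stay close to the limiting value t*exp(α) instead of collapsing to zero.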