One small step

When fitting mortality models, the foundation of modern statistical inference is the log-likelihood function. The point at which the log-likelihood has its maximum value gives you the maximum-likelihood estimates of your parameters, while the curvature of the log-likelihood tells you about the standard errors of those parameter estimates.

The log-likelihood function is maximised by finding where the gradient is zero. To find this, one requires the first derivatives of the function with respect to each parameter. Similarly, the curvature is measured by the second partial derivatives with respect to each possible pair of parameters. In both cases, one requires either the derivatives themselves, or else a numerical approximation to those derivatives.
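To put that relationship in symbols (a standard result rather than anything specific to mortality work), if \(\ell(\theta)\) denotes the log-likelihood for a parameter vector \(\theta\), then the maximum-likelihood estimate \(\hat\theta\) satisfies the score equations and the standard errors come from the curvature at the maximum:

\[
\left.\frac{\partial \ell}{\partial \theta_j}\right|_{\hat\theta} = 0,
\qquad
\widehat{\operatorname{Var}}(\hat\theta) \approx \left[-\left.\frac{\partial^2 \ell}{\partial\theta\,\partial\theta^{\mathsf T}}\right|_{\hat\theta}\right]^{-1},
\]

with the approximate standard errors being the square roots of the diagonal entries of that inverse matrix.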

Approximate derivatives can be obtained relatively straightforwardly by numerical differentiation. This involves perturbing a parameter by a small amount, h say, and expressing the change in function value relative to h. In theory, the smaller the value of h, the closer the numerical approximation will be to the actual derivative. Such numerical derivatives have the advantage of avoiding the need to work out large numbers of formulaic derivatives for each mortality law. By way of illustration, Table 1 shows the estimates and approximated standard errors for one parameter in a simple Gompertz model using numerical differentiation:

Table 1. Estimated value and approximate standard error for Intercept parameter in a simple Age+Gender Gompertz model for mortality. Numerical differentiation is used with varying values for the step size, h.

h           Intercept   Std. Err.
0.001       -13.0085    0.119561
0.0001      -13.0404    0.116770
0.00001     -13.0408    0.115219
0.000001    -13.0408    0.119749
0.0000001   -13.0408    0.008756

The estimate of the Intercept parameter in Table 1 is broadly stable across all of the given values of h, with convergence apparently achieved for values of 0.00001 and smaller. However, there is no such convergence for the approximated standard error, which becomes unstable for very small values of h. The likely cause is the same sort of arithmetic underflow described in an earlier post.
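To make the mechanics concrete, the following is a minimal Python sketch (not Longevitas code) of central-difference approximations to the gradient and second derivatives of a log-likelihood; loglik, params and h are illustrative placeholders:

import numpy as np

def gradient_cd(loglik, params, h=1e-5):
    # Central-difference approximation to the gradient of loglik at params.
    params = np.asarray(params, dtype=float)
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = h
        grad[i] = (loglik(params + step) - loglik(params - step)) / (2.0 * h)
    return grad

def hessian_cd(loglik, params, h=1e-5):
    # Central-difference approximation to the matrix of second derivatives.
    params = np.asarray(params, dtype=float)
    n = params.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (loglik(params + ei + ej) - loglik(params + ei - ej)
                          - loglik(params - ei + ej) + loglik(params - ei - ej)) / (4.0 * h * h)
    return hess

# Approximate standard errors: square roots of the diagonal of the inverse
# of the negative Hessian, evaluated at the maximum-likelihood estimates.
# std_err = np.sqrt(np.diag(np.linalg.inv(-hessian_cd(loglik, mle, h))))

In exact arithmetic, smaller h gives a better approximation; in floating-point arithmetic the differences being divided eventually become so small that precision is lost, which is consistent with the instability seen in Table 1.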

For true reliability one simply has to do the maths and find the formulaic expressions for the derivatives. This has the advantage that points of possible arithmetic underflow can be identified and programmed around, and it also usually results in much faster fitting times. Using proper derivatives, the Intercept in Table 1 is estimated as -13.0408 with a standard error of 0.116768. From this we can see that h=0.0001 was the best-performing step size for the standard error in Table 1. This will not be the case for all models, however, so the only way to estimate a standard error accurately and reliably is to work out the derivatives. It is admittedly a lot of work to do all the necessary maths, but, as this example shows, it is the fastest and most reliable way to fit the model parameters.
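As an illustration of what "doing the maths" involves, consider a simple Gompertz hazard \(\mu_x = e^{\alpha+\beta x}\) fitted to death counts \(d_x\) and central exposures \(E_x\) under a Poisson assumption; this is a sketch of one common formulation, not necessarily the exact likelihood behind Table 1. The log-likelihood and its derivatives are then:

\[
\ell(\alpha,\beta) = \sum_x \left[ d_x(\alpha+\beta x) - E_x e^{\alpha+\beta x} \right] + \text{constant},
\]
\[
\frac{\partial \ell}{\partial \alpha} = \sum_x \left( d_x - E_x e^{\alpha+\beta x} \right),
\qquad
\frac{\partial \ell}{\partial \beta} = \sum_x x\left( d_x - E_x e^{\alpha+\beta x} \right),
\]
\[
\frac{\partial^2 \ell}{\partial \alpha^2} = -\sum_x E_x e^{\alpha+\beta x},
\qquad
\frac{\partial^2 \ell}{\partial \alpha\,\partial \beta} = -\sum_x x\,E_x e^{\alpha+\beta x},
\qquad
\frac{\partial^2 \ell}{\partial \beta^2} = -\sum_x x^2 E_x e^{\alpha+\beta x}.
\]

Each expression can be evaluated directly, so there is no step size h to choose and any potential numerical trouble spots are visible in advance.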

Derivatives in Longevitas

Most models in Longevitas use analytical derivatives to maximise the log-likelihood function and to estimate standard errors. Users can switch between analytical derivatives and numerical approximation using the configuration option Gradient Calculation Method. The size of the step used in the numerical approximation is controlled with the related configuration option Gradient Approximation Step. Where numerical approximation is chosen, Longevitas uses central differences.
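For the avoidance of doubt, the central difference referred to here estimates a first derivative as

\[
\frac{\partial \ell}{\partial \theta_j} \approx \frac{\ell(\theta + h\,e_j) - \ell(\theta - h\,e_j)}{2h},
\]

where \(e_j\) is the unit vector for the j-th parameter and h is the value of Gradient Approximation Step.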
