Are you allergic to statistical models?

Or do you know someone who is? Some people are uncomfortable with the idea of statistical models, especially ones with parameters. For them it is worth remembering that in 1958 Kaplan and Meier introduced the idea of an empirical survival curve, also called the product-limit estimator. The basic idea is to rearrange the mortality experience data so as to show the survival rates of different sub-groups. The key feature of the Kaplan-Meier curve is that there are no parameters involved: the empirical survival curve is simply a rearrangement of the experience data, and involves no model fitting and no parameter estimation.
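To make the "no parameters" point concrete, here is a minimal Python sketch of the product-limit calculation. It assumes right-censoring only; a real annuity portfolio would usually also need to allow for lives entering observation at different ages (left truncation), which this sketch ignores. The function name `kaplan_meier` and the toy data are hypothetical and purely illustrative.

```python
from collections import Counter


def kaplan_meier(durations, died):
    """Product-limit estimate of the survival curve.

    durations: observed time for each life (to death or to censoring).
    died:      True if the life was observed to die, False if censored.
    Returns a list of (time, survival probability) at each distinct death time.
    Assumes right-censoring only (no left truncation).
    """
    deaths = Counter(t for t, d in zip(durations, died) if d)
    exits = Counter(durations)        # deaths plus censorings at each time
    at_risk = len(durations)          # lives still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if deaths[t] > 0:
            # Multiply by the observed proportion surviving past time t.
            survival *= 1.0 - deaths[t] / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six lives, two of them censored.
durations = [2, 3, 3, 5, 7, 8]
died = [True, True, False, True, False, True]
curve = kaplan_meier(durations, died)
for t, s in curve:
    print(f"t={t}: S(t)={s:.4f}")
```

Each step only counts deaths and lives at risk, so the result is a rearrangement of the data rather than a fitted model.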

In the chart below we show the Kaplan-Meier curves for males and females in a large annuity portfolio:

Actual survival curve for males and females in an annuity portfolio

We see what we would have expected in any case: at any given age, females have a higher probability of survival than males. The red area is the average time lived by a male, while the blue area is the average additional time lived by a female.

Survival curves are widely used in the analysis of medical trials, where a key piece of information is the point at which half of the members of a group are dead (or, equally, the point to which half of all members survive). In the chart above, the point at which half of all lives are dead is simply the median survival age. It is higher for females due to their lower mortality and longer life expectancy.
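As a rough illustration of reading the median off an empirical survival curve, the Python sketch below takes a curve as a list of (age, survival probability) steps and returns the first age at which survival has fallen to one half or below. The function name `median_survival` and the numbers are hypothetical, not taken from the portfolio in the chart.

```python
def median_survival(curve):
    """Return the first time at which the survival probability is <= 0.5.

    curve: list of (time, survival probability) pairs in increasing time
           order, as a step function that only drops at death times.
    Returns None if survival never falls to one half (for example,
    because of heavy censoring).
    """
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical survival curves from age 60 (illustrative values only).
male_curve = [(70, 0.90), (80, 0.70), (85, 0.48), (90, 0.25)]
female_curve = [(70, 0.93), (80, 0.78), (85, 0.60), (88, 0.49), (90, 0.35)]

print(median_survival(male_curve))    # 85
print(median_survival(female_curve))  # 88
```

If the curve never drops to one half, the median is not defined from the data alone, which is why the sketch returns `None` rather than extrapolating.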

Survival analysis provides a very visual means of communicating results to people who are unfamiliar with parametric models. Empirical, model-free survival curves are possible for any categorisation you like, including socio-economic or lifestyle group, as shown in this article on analysing annuitant mortality without parametric models.

Written by: Stephen Richards

Kaplan-Meier in Longevitas

Longevitas users can choose whether or not to generate Kaplan-Meier curves with each model fitted. The default option is to have Kaplan-Meier curves generated, but it can be controlled in the Advanced Options section of the modelling screen. The Kaplan-Meier curves themselves can be plotted in the Curves tab of the model report. 
