Dealing with missing data

In an earlier post we looked at how to create a proxy for ill-health early retirements based on age at commencement.  This is an example of dealing with missing data — we infer a useful proxy to replace the lost or missing health status at retirement.

Another common problem occurs during data or system migrations, where historical experience data is often not carried across to a new administration system.  Migrations happen when a life office consolidates multiple systems into one, or when a pension scheme changes administrator.  System migrations aren't easy, and migrating historical data is usually low on the priority list.  As a result, it is unfortunately one of the first tasks to be dropped when time gets tight, which has left many systems containing only partial mortality data.

Such situations naturally affect mortality investigations.  In particular, exposure calculations must exclude the pre-migration period if the deaths from that period have not been migrated as well.  Including it would under-estimate mortality rates, as exposure periods would be counted without the corresponding deaths.  Migrations are not always done as a single action, so it might not be as simple as counting exposure from a single date for all policies.  At worst, the data might have been migrated in stages, raising the problem of deduplicating records which were migrated at different times.
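The size of the bias can be illustrated with a small sketch.  All figures below are invented for illustration: a portfolio with a constant 1,000 life-years of exposure per year, a true mortality rate of 2% per annum, ten years of policy history, and deaths recorded only for the four years since migration.

```python
def crude_rate(deaths, exposure_years):
    """Crude central mortality rate: deaths divided by life-years of exposure."""
    return deaths / exposure_years

# Hypothetical portfolio (illustrative figures only).
annual_exposure = 1000.0   # life-years exposed in each calendar year
deaths_per_year = 20       # deaths occurring each year (a 2% rate)
years_of_history = 10      # years of policy records on the system
years_since_migration = 4  # years for which deaths were actually recorded

# Only post-migration deaths survive on the new system.
recorded_deaths = deaths_per_year * years_since_migration

# Naive: exposure from all ten years, but deaths from only four.
naive = crude_rate(recorded_deaths, annual_exposure * years_of_history)

# Corrected: exposure restricted to the period with matching deaths.
corrected = crude_rate(recorded_deaths, annual_exposure * years_since_migration)

print(f"naive rate:     {naive:.4f}")
print(f"corrected rate: {corrected:.4f}")
```

Under these assumptions the naive calculation recovers less than half of the true rate, simply because six years of exposure have no deaths to match them.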

One solution is to look at the transaction records linked to the policy, and to use the earliest date of activity as an indicator of the migration date.  The earliest evidence of activity on the new administration system is likely to be very close to the date when the policy was actually migrated.  For pensions in payment, one would look at the earliest payment date.  For term-assurance policies, one would look at the premium-collection records and use the date of the earliest premium collected on the new system.  For deferred pensions there is unfortunately neither payment nor premium collection, which is one reason why the data for such business is seldom usable for mortality investigations.
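As a minimal sketch of this approach, the earliest activity date per policy can be found in a single pass over the transaction records.  The record layout (a policy identifier paired with a transaction date) and the function name are assumptions made for illustration, and a real extract would need its own field mapping.

```python
from datetime import date

def earliest_activity_dates(transactions):
    """Return the earliest transaction date for each policy.

    `transactions` is an iterable of (policy_id, transaction_date) pairs,
    e.g. pension payments or premium collections on the new system.
    Policies with no transactions do not appear in the result.
    """
    earliest = {}
    for policy_id, transaction_date in transactions:
        current = earliest.get(policy_id)
        if current is None or transaction_date < current:
            earliest[policy_id] = transaction_date
    return earliest

# Hypothetical payment records, deliberately out of date order.
payments = [
    ("P001", date(2015, 4, 1)),
    ("P001", date(2015, 3, 1)),
    ("P002", date(2016, 7, 15)),
    ("P001", date(2015, 5, 1)),
]

migration_proxy = earliest_activity_dates(payments)
```

A policy absent from the result, such as a deferred pension with no payments or premiums, has no proxy date and would need to be handled separately.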

Actuaries often find themselves in the position of working with less-than-perfect data.  However, intelligent use of other data items can compensate for missing information.

Written by: Stephen Richards

Handling missing data in Longevitas

Longevitas users can allow for missing or archived deaths or claims by populating the EarliestActivityDate field.  Longevitas will then only count the exposure from the later of this date and the policy commencement date. 
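The rule described here can be sketched in a few lines.  This is an illustration of the "later of the two dates" logic only, not Longevitas code: the function names are hypothetical, and treating a missing earliest-activity date as "use the commencement date" is an assumption.

```python
from datetime import date

def exposure_start(commencement_date, earliest_activity_date=None):
    """Start of countable exposure: the later of the two dates.

    Assumption: if no earliest-activity date is supplied, exposure
    starts at commencement.
    """
    if earliest_activity_date is None:
        return commencement_date
    return max(commencement_date, earliest_activity_date)

def exposure_days(start, end):
    """Days of exposure between start and end, floored at zero."""
    return max((end - start).days, 0)

# Policy commenced long before migration: exposure counts from migration.
old_policy = exposure_start(date(2001, 5, 1), date(2015, 3, 1))

# Policy written after migration: exposure counts from commencement.
new_policy = exposure_start(date(2018, 1, 1), date(2015, 3, 1))
```

For a policy whose exposure ends before its start date, for example an exit recorded before the first activity on the new system, the zero floor means the policy contributes no exposure.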

