The Mystery of the Non-fatal Deaths
In a recent investigation with my colleagues Dr Oytun Haçarız and Professor Torsten Kleinow, a key parameter was the mortality rate of persons suffering from Hypertrophic Cardiomyopathy (HCM), an inherited heart disorder characterized by thickening of the left ventricular muscle wall. HCM is quite rare, so precision is not to be expected, and indeed an annual mortality rate of 1% \((q_x=0.01)\), independent of age \(x\), is widely cited. It has found its way into actuarial applications; for example, Howard (2014) estimates the adverse selection costs that would arise in Canada were life insurers to be denied access to genetic test results.
Where did 1% come from? One hopes that it is an estimate from clinical studies of HCM, and indeed it is --- sort of. The problem is that HCM is quite rare, affecting maybe 1 in 500 people, so the usual kind of randomized clinical trial would be infeasibly large and expensive. The result is a long series of studies going back as far as the 1950s, starting small and gradually refining estimates of key quantities, including mortality rates.
Selective 'recruitment' to studies is a big problem for rare disorders. Researchers have no choice but to actively seek out sufferers. Naturally, they may be more likely to find serious cases and overlook more 'typical' sufferers, severely over-estimating mortality rates. But, as the disorder becomes better known in mainstream medicine, study populations become bigger and better defined. By now, after several decades of studying HCM, that 1% mortality rate was beginning to look quite reliable.
Or so we thought. We did take the precaution of actually reading the epidemiological literature on HCM, and saw this process of refinement in action. So far so good. Then by chance we noticed something odd.
Every survival study needs an endpoint, which is the event whose rate of occurrence (or dependence on age) we wish to measure. In many cases, observation of an individual subject ends before the endpoint is observed. This is right-censoring, a characteristic feature of survival studies, and the subject's survival up to the time of censoring still contributes information; see Macdonald, Richards and Currie (2018). In the case of a study of HCM mortality the endpoint is pretty obvious, namely death. But that is not what was used in studies of HCM. A typical definition was the following, from Elliott et al. (2006) (emphasis added):
"The following endpoints were used in the survival analysis: (1) sudden cardiac death --- witnessed sudden death with or without documented ventricular fibrillation, death within one hour of new symptoms, nocturnal death with no antecedent history of worsening symptoms, and successfully resuscitated cardiac arrest; (2)..."
In other words, 'death' was not necessarily fatal. Fortunately, three very large recent studies, although using the same endpoint, included enough detail to let us re-estimate the mortality rate. Instead of 1%, our estimate was 0.55% --- quite a difference. The details are in Haçarız, Kleinow and Macdonald (2021).
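To see how the choice of endpoint moves the estimate, here is a minimal sketch of an occurrence-exposure rate (events divided by person-years observed) computed under two endpoint definitions. The `Subject` class, the `annual_rate` function and the five-person cohort are all hypothetical, invented purely for illustration; they are not data from any HCM study, nor the method of the papers cited. For simplicity, each subject's follow-up is assumed to end at the recorded outcome, and the resulting rate is a central rate, which is close to \(q_x\) only when the rate is small.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    years_observed: float  # follow-up time contributed, including censored subjects
    outcome: str           # "death", "resuscitated" or "censored"


def annual_rate(subjects, endpoint_outcomes):
    """Occurrence-exposure rate: endpoint events per person-year observed."""
    events = sum(s.outcome in endpoint_outcomes for s in subjects)
    exposure = sum(s.years_observed for s in subjects)
    return events / exposure


# Hypothetical toy cohort (NOT real study data): 50 person-years in total.
toy_cohort = [
    Subject(8.0, "censored"),
    Subject(12.0, "censored"),
    Subject(5.0, "death"),
    Subject(10.0, "resuscitated"),  # non-fatal, but counted by the composite endpoint
    Subject(15.0, "censored"),
]

rate_deaths_only = annual_rate(toy_cohort, {"death"})
rate_composite = annual_rate(toy_cohort, {"death", "resuscitated"})

print(f"deaths only:        {rate_deaths_only:.3f} per person-year")
print(f"composite endpoint: {rate_composite:.3f} per person-year")
```

In this toy cohort the single resuscitated subject doubles the estimated rate. Right-censored subjects contribute exposure to the denominator but no event to the numerator, which is how their survival still counts as information.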
Why do HCM researchers consistently include non-fatal 'deaths' in their endpoint? We never found out. If anyone knows, do tell us. It is not the case for cardiomyopathies in general; when we went on to model Arrhythmogenic Right Ventricular Cardiomyopathy (ARVC) the endpoints used were as we would expect, see Haçarız, Kleinow and Macdonald (2022). Nor is it accidental; there exist reviews of historic HCM mortality studies with and without the non-fatal 'deaths'. But it is estimates 'with' that make the headlines.
Mortality rates are not the only parameters to be handled with care. The other widely-cited 'fact' about HCM is its prevalence of 1 in 500, or 0.2%, mentioned earlier. That was based on a sample of 4,111 unselected subjects --- fair enough --- of whom seven had clinical HCM. However, a later study of 3,600 unselected subjects found 22 with known gene mutations that cause HCM, a prevalence of 0.6%. And two huge studies of healthcare systems have found a prevalence of clinical HCM of less than 0.1%. So what should insurers make of genetic test results, even if they were allowed to use them? Perhaps a subject for a later blog; meanwhile, see Haçarız, Kleinow and Macdonald (2021) again.
What lesson do we learn from this? Or rather, what very old lesson do we re-learn and reinforce? Never trust the headline figures; always go back to the original sources.
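The prevalence arithmetic is easy to check, and with so few cases it helps to look at the sampling uncertainty too. The sketch below uses only the two sets of counts quoted above; the `wilson_interval` helper and the choice of a Wilson score interval at roughly the 95% level are my own assumptions for illustration, not something taken from the studies.

```python
from math import sqrt


def wilson_interval(cases, n, z=1.96):
    """Approximate Wilson score interval for a binomial proportion."""
    p = cases / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts as quoted in the text; the two studies measure different things.
studies = {
    "clinical HCM": (7, 4111),
    "known HCM mutations": (22, 3600),
}

for label, (cases, n) in studies.items():
    low, high = wilson_interval(cases, n)
    print(f"{label}: {cases}/{n} = {cases / n:.2%} "
          f"(approx. 95% interval {low:.2%} to {high:.2%})")
```

Seven cases in 4,111 is about 0.17%, which is where '1 in 500' comes from, but an estimate resting on seven cases carries a wide interval. The two studies also count different things (clinical disease versus carrying a mutation), so the figures are not directly comparable.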
References:
Elliott, P. M., Gimeno, J. R., Thaman, R., Shah, J., Ward, D., Dickie, S., Tome Esteban, M. T. and McKenna, W. J. (2006) Historical trends in reported survival rates in patients with hypertrophic cardiomyopathy, Heart (British Cardiac Society), 92(6), 785-791.
Haçarız, O., Kleinow, T. and Macdonald, A. S. (2021) Genetics, insurance and hypertrophic cardiomyopathy, Scandinavian Actuarial Journal, 2021, 54-81.
Haçarız, O., Kleinow, T. and Macdonald, A. S. (2022) An actuarial model of arrhythmogenic right ventricular cardiomyopathy and life insurance, Scandinavian Actuarial Journal, 2022, 94-114.
Howard, R. C. W. (2014) Report to CIA research committee: Genetic testing model: If the underwriters had no access to known results, Canadian Institute of Actuaries (CIA).
Macdonald, A. S., Richards, S. J. and Currie, I. D. (2018) Modelling Mortality with Actuarial Applications, Cambridge University Press, doi: 10.1017/9781107051386.