M is for Estimation

In earlier blogs I discussed two techniques for handling outliers in mortality forecasting models: those of Chen & Liu (1993) and Galeano, Peña and Tsay (2006).

There is a triple benefit to these procedures:

  1. The identification of outliers is objective, based on statistical tests.

  2. Having identified the outliers, their effect can be co-estimated with the model parameters to reduce bias.

  3. If the most recent years are affected by outlier effects, robust starting points for the forecast can be calculated by simply deducting the outlier effects.

In the case of Chen & Liu (1993), there is a fourth benefit: the nature of the outlier can be classified.

However, during a recent presentation I was asked about the older technique of likelihood robustification.  This has been superseded for time-series work, but it might be of historical interest to some readers; enthusiasts can consult the likes of Martin, Samarov & Vandaele (1982) for robustified likelihoods for ARIMA models.

To illustrate likelihood robustification in general, consider the contribution of a single data point, $x$, to the likelihood for an N(0,1) random variable:

\[L\propto \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}\qquad(1)\]

Multiplicative constants in the likelihood become additive constants in the log-likelihood and don't affect inference, so we can drop them.  The contribution to the log-likelihood is therefore:

\[\ell = \log L = -\frac{1}{2}x^2\qquad(2)\]

In the literature on likelihood robustification it is the convention to minimise the negative log-likelihood, so we work with:

\[-\ell = \frac{1}{2}x^2\qquad(3)\]

The function $-\ell$ is plotted as the solid line in Figure 1.  In this form it is referred to as a loss function.  The contribution to the negative log-likelihood increases quadratically with $|x|$.  We know that 95% of N(0,1) variates lie in (-1.96, 1.96), so an outlier value of 5 (say) will have a large and distorting influence on inference using the standard quadratic loss function.
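
To put numbers on this, the short Python sketch below evaluates equation (3) at the edge of the central 95% interval and at the illustrative outlier of 5.  The function name `quadratic_loss` is a hypothetical label chosen for illustration, not part of any library.

```python
def quadratic_loss(x):
    """Negative log-likelihood contribution of x under N(0,1),
    dropping the additive constant: equation (3)."""
    return 0.5 * x * x

# Edge of the central 95% interval versus the illustrative outlier from the text
typical = quadratic_loss(1.96)
outlier = quadratic_loss(5.0)

print(f"loss at x=1.96: {typical:.4f}")
print(f"loss at x=5.00: {outlier:.4f}")
print(f"ratio: {outlier / typical:.2f}")
```

A single observation at 5 contributes several times as much to the objective as one at 1.96, which is why it can dominate the fit.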

Figure 1. Quadratic loss function, $-\ell$, and pseudo-Huber loss function, $-\ell^*$.


Huber (1964) introduced the idea of a loss function that behaves like equation (3) for non-outliers, but which restricts the contribution made by more extreme observations.  The term used for inference here is M-estimation, of which maximum-likelihood estimation is a special case.  Specifically, to counter the distorting influence of outliers we replace the quadratic loss function in equation (3) with a function that grows more slowly than quadratically for large $|x|$.  There are many examples in the literature, but one is:

\[-\ell^* = \delta^2\left(\sqrt{1+(x/\delta)^2}-1\right)\qquad(4)\]

which is also plotted in Figure 1 for $\delta=1$.  The robustified function $-\ell^*$ behaves much like $-\ell$ in the region (-1, 1), but outliers outside (-2, 2) clearly have much less influence under $-\ell^*$ than under $-\ell$.  Thus, a robustified alternative to the log-likelihood can be constructed with an appropriate choice of loss function and any accompanying parameter (like $\delta$ in equation (4)).  Such methodologies still have their place in statistics, but the methodologies of Chen & Liu (1993) and Galeano, Peña and Tsay (2006) are preferred for time-series work for the reasons stated at the start.
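
As a rough illustration of M-estimation with such a loss, the sketch below estimates a location parameter $\mu$ by minimising the sum of pseudo-Huber losses of the residuals $x_i-\mu$.  It is a minimal sketch under stated assumptions, not a production method: the function names and the six-point sample are hypothetical, the loss uses the commonly quoted $\delta^2$ scaling (which coincides with equation (4) for the plotted case $\delta=1$), and the minimisation uses iteratively reweighted means with weights $1/\sqrt{1+(r/\delta)^2}$ for residual $r$, obtained by dividing the derivative of the loss by $r$.

```python
import math

def pseudo_huber_loss(x, delta=1.0):
    """Pseudo-Huber loss with the delta**2 scaling (an assumption;
    identical to equation (4) when delta = 1)."""
    return delta ** 2 * (math.sqrt(1.0 + (x / delta) ** 2) - 1.0)

def m_estimate_location(data, delta=1.0, tol=1e-10, max_iter=200):
    """M-estimate of location under pseudo-Huber loss,
    computed by iteratively reweighted means."""
    mu = sorted(data)[len(data) // 2]  # start from a central order statistic
    for _ in range(max_iter):
        # Weight = (derivative of loss) / residual; shrinks for large residuals
        weights = [1.0 / math.sqrt(1.0 + ((x - mu) / delta) ** 2) for x in data]
        new_mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

# Hypothetical sample: five unremarkable values plus one outlier at 5
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 5.0]

mean = sum(sample) / len(sample)       # minimiser of the quadratic loss
robust = m_estimate_location(sample)   # minimiser of the pseudo-Huber loss

print(f"sample mean (quadratic loss): {mean:.3f}")
print(f"M-estimate (pseudo-Huber):    {robust:.3f}")
```

The sample mean is the M-estimate under the quadratic loss, so the two printed values show directly how much less the outlier pulls the estimate under the robustified loss.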

References:

Chen, C. and Liu, L-M. (1993) Joint Estimation of Model Parameters and Outlier Effects in Time Series, Journal of the American Statistical Association, March 1993, Vol. 88, No. 421, pages 284–297.

Galeano, P., Peña, D. and Tsay, R. S. (2006) Outlier Detection in Multivariate Time Series by Projection Pursuit, Journal of the American Statistical Association, June 2006, Vol. 101, No. 474, pages 654–669.

Huber, P. (1964) Robust Estimation of a Location Parameter, The Annals of Mathematical Statistics, March 1964, Vol. 35, No. 1, pages 73–101.

Martin, R. D., Samarov, A. and Vandaele, W. (1982) Robust methods for ARIMA models, Department of Statistics, University of Washington, Seattle, March 1982, Technical Report No. 21.
