1 Introduction

Claims reserving, pricing and capital modelling are core to actuarial functions. The assumptions used in the underlying actuarial models play a key role in the management of any insurance company.

Knowing when those underlying assumptions are no longer valid is critical for the business to initiate change. Transparent models that clearly state their underlying assumptions are easier to test and challenge, and hence can speed up the process of change.

Unfortunately, many of the underlying risk factors in insurance are not directly measurable and are latent in nature. Although prices are set for all policies, only a fraction of them will have losses. Reserving by its very nature is based on relatively sparse data to make predictions about future payments, potentially over long time horizons.

Combining expertise about future developments with historical data is therefore common practice for many reserving teams, particularly when entering a new product, line of business or geography, or when changes to products and business processes make past data a less credible predictor. Modern Bayesian modelling provides a rich toolkit for bringing together the expertise and business insight of the actuary, and augmenting and updating it with data.

In situations where the actuary has access to large volumes of data, non-parametric machine learning techniques might provide a better approach. Some of these are based on enhancement of traditional approaches such as the chain-ladder (Wüthrich (2018), Carrato and Visintin (2019)), with others using neural networks (Kuo (2018), Gabrielli, Richman, and Wüthrich (2018)) and Gaussian processes (Lally and Hartman (2018)).

With small and sparse data, parametric models such as growth curves can help the actuary to capture key claims development features while not overfitting, but may require expertise and judgement in the selection of the growth curve and its parametrisation (Sherman (1984), Clark (2003), Guszcza (2008)).
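To make the idea of a parametric growth curve concrete, the following minimal sketch evaluates two curve shapes of the kind discussed in Clark (2003), the loglogistic and the Weibull. The function names, the parameter names `omega` (shape) and `theta` (scale), and the values used are illustrative assumptions, and Python is used purely for compactness.

```python
import math


def loglogistic_growth(t, omega, theta):
    """Share of ultimate losses emerged by time t: t^omega / (t^omega + theta^omega)."""
    return t ** omega / (t ** omega + theta ** omega)


def weibull_growth(t, omega, theta):
    """Share of ultimate losses emerged by time t: 1 - exp(-(t / theta)^omega)."""
    return 1.0 - math.exp(-((t / theta) ** omega))


# Hypothetical development times (in years) and parameter values.
dev_years = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
omega, theta = 1.5, 2.0

loglogistic_path = [loglogistic_growth(t, omega, theta) for t in dev_years]
weibull_path = [weibull_growth(t, omega, theta) for t in dev_years]

for t, g_ll, g_wb in zip(dev_years, loglogistic_path, weibull_path):
    print(f"t={t:5.1f}  loglogistic={g_ll:.3f}  weibull={g_wb:.3f}")
```

Both curves rise from 0 towards 1, but the loglogistic has a much heavier tail. The choice between them is exactly the kind of judgement call the paragraph above refers to.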

Hierarchical compartmental reserving models provide a parametric framework for describing the high-level business processes driving claims development in insurance using differential equations (Morris (2016)). Rather than selecting a growth curve, the experienced modeller can build loss emergence patterns from first principles. In addition, the models can be constructed in a way that allows outstanding and paid data to be described simultaneously (see Figure 1.1).

Figure 1.1: Comparison of reserving methods and models.
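As a rough illustration of building a loss emergence pattern from first principles, the sketch below integrates a three-compartment system (exposure, outstanding, paid) with a simple Euler scheme. The compartment layout is assumed here to follow the spirit of Morris (2016). The function `simulate_compartments`, the parameter names (`k_er` for the rate at which exposure turns into reported claims, `k_p` for the payment rate, `rlr` for a reported loss ratio, `rrf` for a reserve robustness factor) and all values are illustrative assumptions. Python is used only for compactness, not because it is the tooling of this document.

```python
def simulate_compartments(k_er, k_p, rlr, rrf, years=40.0, dt=0.01):
    """Euler integration of an assumed three-compartment claims process.

    d(EX)/dt = -k_er * EX
    d(OS)/dt =  k_er * rlr * EX - k_p * OS
    d(PD)/dt =  k_p * rrf * OS

    Exposure starts at 1 (premium normalised to one unit), so OS and PD
    can be read as loss ratios.
    """
    ex, os_, pd = 1.0, 0.0, 0.0
    path = [(0.0, ex, os_, pd)]
    steps = int(round(years / dt))
    for i in range(1, steps + 1):
        d_ex = -k_er * ex
        d_os = k_er * rlr * ex - k_p * os_
        d_pd = k_p * rrf * os_
        ex, os_, pd = ex + d_ex * dt, os_ + d_os * dt, pd + d_pd * dt
        path.append((i * dt, ex, os_, pd))
    return path


# Hypothetical parameter values, chosen only to produce a plausible shape.
path = simulate_compartments(k_er=1.7, k_p=0.5, rlr=0.8, rrf=0.95)
t_end, ex_end, os_end, pd_end = path[-1]
os_peak = max(os_ for _, _, os_, _ in path)
print(f"peak outstanding: {os_peak:.3f}, paid at t={t_end:.0f}: {pd_end:.3f}")
```

In this sketch every unit of exposure eventually flows through outstanding into paid, so the paid compartment approaches `rlr * rrf` for large `t`. That gives a quick sanity check on the integration. Outstanding and paid paths come out of the same system, which is what allows both data types to be described at once.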

The starting point mirrors that of a scientist trying to describe a particular process in the real world using a mathematical model. The model will, by its very nature, only be able to approximate the real world. We derive a ‘small world’ view that makes simplifying assumptions about the real world, but which may allow us to improve our understanding of key processes. In turn, we can attempt to address our real-world questions by testing various ideas about how the real world functions.

Compared to many machine learning methods, which are sometimes described as black boxes, hierarchical compartmental reserving models can be viewed as transparent boxes. All modelling assumptions must be articulated by the practitioner, which has the benefit that expert knowledge can be incorporated and each modelling assumption can be challenged more easily by other experts.

Only once we can simulate artificial data that resembles our expected observations do we proceed to fit any model.
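Simulating before fitting can be sketched as a prior predictive check: draw parameters from candidate prior distributions, generate artificial data from them, and inspect whether the results look like data we would expect to see. The example below is a deliberately simplified stand-in, not the model developed in this document. It assumes a single exponential payment curve, and the priors, the function name `prior_predictive_paid` and all values are hypothetical.

```python
import math
import random
import statistics


def prior_predictive_paid(n_sims=2000, dev_years=(1, 2, 3, 5, 10), seed=1):
    """Simulate cumulative paid loss ratios from assumed priors (no data used)."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        # Assumed priors: ultimate loss ratio, payment speed, noise scale.
        ulr = rng.lognormvariate(math.log(0.7), 0.3)
        k = rng.lognormvariate(math.log(0.5), 0.5)
        sigma = abs(rng.gauss(0.0, 0.1))  # half-normal noise scale
        row = [
            ulr * (1.0 - math.exp(-k * t)) * rng.lognormvariate(0.0, sigma)
            for t in dev_years
        ]
        sims.append(row)
    return sims


sims = prior_predictive_paid()
paid_at_10 = [row[-1] for row in sims]
median_paid = statistics.median(paid_at_10)
share_implausible = sum(x > 2.0 for x in paid_at_10) / len(paid_at_10)
print(f"median paid loss ratio at year 10: {median_paid:.2f}")
print(f"share of simulations above 200%: {share_implausible:.3f}")
```

If a large share of the simulated paths were implausible, for example paid loss ratios far above anything seen in practice, that would suggest revisiting the priors or the model structure before any real data is used.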

1.1 Outline of the document

This document builds on the original paper by Jake Morris (Morris (2016)). It provides a practical introduction to hierarchical compartmental reserving in a Bayesian framework and is outlined as follows:

  • In section 2 we develop the original ordinary differential equation (ODE) model and demonstrate how the model can be modified to allow for different claims processes, e.g. different settlement speeds for standard vs. dispute claims and different exposure to reporting processes.
  • In section 3 we build the stochastic part of the model and provide guidance on how to parametrise prior parameter distributions to optimise model convergence. Furthermore, we discuss why one should model incremental paid data in the context of underlying statistical assumptions and previously published methodologies.
  • In section 4 we add hierarchical structure to the model, which links compartmental models back to credibility theory and regularisation. The ‘GenIns’ dataset is used to illustrate these concepts as we fit the model to actual claims data, and we highlight the conceptual differences between expected and ultimate loss ratios when interpreting model outputs.
  • Section 5 concludes with a case study demonstrating how such models can be implemented in R/Stan using the ‘brms’ package. Models of varying complexity are tested against each other, with add-ons such as parameter variation by both origin and development period and market cycle sub-models. Model selection and validation are demonstrated using posterior predictive checks and hold-out sample methods.
  • Section 6 summarises the document and provides an outlook for future research.

We assume the reader is somewhat familiar with Bayesian modelling concepts. Good introductory textbooks on Bayesian data analysis are McElreath (2015), Kruschke (2014) and Gelman et al. (2014). For hierarchical models we recommend Gelman and Hill (2007), and for best practices on a Bayesian workflow see Betancourt (2018).

In this document we will demonstrate practical examples using the brms (Bürkner (2017)) interface to the probabilistic programming language Stan (Stan Development Team (2019)) from R (R Core Team (2019)).

The brm function – short for ‘Bayesian regression model’ – in brms allows us to write our models in a similar way to a GLM or multilevel model with the popular glm or lme4::lmer (Bates et al. (2015)) R functions. The Stan code is generated and executed by brm. Experienced users can access all underlying Stan code from brms as required.

Stan is a C++ library for Bayesian inference using the No-U-Turn sampler (a variant of Hamiltonian Monte Carlo – ‘HMC’) or frequentist inference via L-BFGS optimization (Carpenter et al. (2017)). For an introduction to HMC see Betancourt (2017).

The Stan language is similar to BUGS (Lunn et al. (2000)) and JAGS (Plummer (2003)), which use Gibbs sampling instead of Hamiltonian Monte Carlo. BUGS was used in Morris (2016) and has been used for Bayesian reserving models by others (Scollnik (2001), Verrall (2004), Zhang, Dukic, and Guszcza (2012)), while Schmid (2010) and Meyers (2015) use JAGS. Examples of reserving models built in Stan can be found in Cooney (2017) and Gao (2018).


Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.

Betancourt, Michael. 2017. “A Conceptual Introduction to Hamiltonian Monte Carlo.” https://arxiv.org/abs/1701.02434.

Betancourt, Michael. 2018. “Towards a Principled Bayesian Workflow (RStan).” https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html.

Bürkner, Paul-Christian. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. https://doi.org/10.18637/jss.v080.i01.

Carpenter, Bob, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. “Stan: A Probabilistic Programming Language.” Journal of Statistical Software, Articles 76 (1): 1–32. https://doi.org/10.18637/jss.v076.i01.

Carrato, Alessandro, and Michele Visintin. 2019. From the Chain Ladder to Individual Claims Reserving Using Machine Learning Techniques. ASTIN Colloquium. https://www.colloquium2019.org.za/wp-content/uploads/2019/04/Alessandro-Carrato-From-Chain-Ladder-to-Individual-Claims-Reserving-using-Machine-Learning-ASTIN-Colloquium.pdf.

Clark, David R. 2003. LDF Curve-Fitting and Stochastic Reserving: A Maximum Likelihood Approach. Casualty Actuarial Society. http://www.casact.org/pubs/forum/03fforum/03ff041.pdf.

Cooney, Mick. 2017. “Modelling Loss Curves in Insurance with RStan.” Stan Case Studies 4. https://mc-stan.org/users/documentation/case-studies/losscurves_casestudy.html.

Gabrielli, Andrea, Ronald Richman, and Mario V Wüthrich. 2018. “Neural Network Embedding of the Over-Dispersed Poisson Reserving Model.” Available at SSRN: https://ssrn.com/abstract=3288454.

Gao, Guangyuan. 2018. Bayesian Claims Reserving Methods in Non-Life Insurance with Stan: An Introduction. Springer. https://doi.org/10.1007/978-981-13-3609-6.

Gelman, A., B. Carlin, H. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. 2014. Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Chapman and Hall/CRC.

Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Analytical Methods for Social Research. United Kingdom: Cambridge University Press.

Guszcza, James. 2008. “Hierarchical Growth Curve Models for Loss Reserving.” In Casualty Actuarial Society E-Forum, Fall 2008, 146–73. https://www.casact.org/pubs/forum/08fforum/7Guszcza.pdf.

Kruschke, J. 2014. Doing Bayesian Data Analysis: A Tutorial with R, Jags, and Stan. Elsevier Science. https://books.google.co.uk/books?id=FzvLAwAAQBAJ.

Kuo, Kevin. 2018. “DeepTriangle: A Deep Learning Approach to Loss Reserving.” arXiv Preprint arXiv:1804.09253.

Lally, Nathan, and Brian Hartman. 2018. “Estimating Loss Reserves Using Hierarchical Bayesian Gaussian Process Regression with Input Warping.” Insurance: Mathematics and Economics 82: 124–40. https://doi.org/10.1016/j.insmatheco.2018.06.008.

Lunn, David J, Andrew Thomas, Nicky Best, and David Spiegelhalter. 2000. “WinBUGS - A Bayesian Modelling Framework: Concepts, Structure, and Extensibility.” Statistics and Computing 10 (4): 325–37.

McElreath, R. 2015. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Chapman & Hall/CRC Texts in Statistical Science. CRC Press. https://www.crcpress.com/Statistical-Rethinking-A-Bayesian-Course-with-Examples-in-R-and-Stan/McElreath/p/book/9781482253443.

Meyers, Glenn. 2015. Stochastic Loss Reserving Using Bayesian MCMC Models. CAS Monograph Series. Casualty Actuarial Society. http://www.casact.org/pubs/monographs/papers/01-Meyers.PDF.

Morris, Jake. 2016. Hierarchical Compartmental Models for Loss Reserving. Casualty Actuarial Society Summer E-Forum. https://www.casact.org/pubs/forum/16sforum/Morris.pdf.

Plummer, Martyn. 2003. “JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling.”

R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Schmid, Frank. 2010. “Robust Loss Development Using MCMC.” SSRN Electronic Journal, February. https://doi.org/10.2139/ssrn.1501706.

Scollnik, D. P. M. 2001. “Actuarial Modeling with MCMC and BUGS.” North American Actuarial Journal 5 (2): 96–124.

Sherman, Richard E. 1984. “Extrapolating, Smoothing, and Interpolating Development Factors.” Proceedings of the Casualty Actuarial Society LXXI (135,136): 122–55.

Stan Development Team. 2019. “RStan: The R Interface to Stan.” http://mc-stan.org/.

Verrall, Richard. 2004. “A Bayesian Generalized Linear Model for the Bornhuetter-Ferguson Method of Claims Reserving.” North American Actuarial Journal 8 (July). https://doi.org/10.1080/10920277.2004.10596152.

Wüthrich, Mario V. 2018. “Machine Learning in Individual Claims Reserving.” Scandinavian Actuarial Journal 2018 (6): 465–80.

Zhang, Yanwei, Vanja Dukic, and James Guszcza. 2012. “A Bayesian Nonlinear Model for Forecasting Insurance Loss Payments.” Journal of the Royal Statistical Society, Series A 175: 637–56.