Machine learning was the hot topic at GIRO 2018. What advantages does machine learning offer for pricing? How can insurers bring these models to production?
Cytora’s actuarial and data science teams tackled these questions during a workshop at the GIRO conference last week. Run by the Institute and Faculty of Actuaries, GIRO attracts more than 800 delegates, who meet to discuss best practice and emerging trends in general insurance pricing and modelling.
The workshop ‘Machine Learning and Fairness for Commercial Insurance’ was run by Cytora’s lead actuary Paul Bassan and senior data scientist Oliver Laslett.
Paul’s expertise lies in building machine learning models to optimise rating and pricing. Oliver spends most of his time building and testing machine learning models, and developing methods to explain their outputs.
Together Paul and Oliver took the audience on a tour of machine learning, covering pricing advantage, model explainability, and strategies for fairness.
Finding pricing advantage with machine learning models
Machine learning uses statistical techniques to give computer systems the ability to ‘learn’ from data without being explicitly programmed. It offers the next generation of regression methods for pricing, making it easier to handle complex data with underlying structure.
“The GLM has worked really well for decades now, and powers most of the pricing in the industry. But there are definitely areas where GLMs underperform for pricing.”
Paul’s part of the workshop focused on two key advantages of machine learning models:
- Nonlinear responses. Generalised linear models (GLMs) often leave localised regions of underpricing and overpricing. By employing nonlinear machine learning techniques, a model can move closer to adequate pricing across the full range of inputs.
- Complex interactions. As datasets scale to include hundreds of rating factors, it becomes increasingly infeasible to assess every combination of interaction terms by hand. Machine learning methods can learn these interactions directly from the data, and handle them more gracefully as data grows.
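The first point above can be illustrated with a small sketch, assuming a synthetic rating factor with a U-shaped response (the data, factor name, and models here are hypothetical stand-ins, not Cytora's actual pricing models): a purely linear term averages over the curve, while a nonlinear learner such as gradient boosting tracks it.

```python
# Hypothetical illustration: a linear fit (stand-in for a single GLM term)
# vs a gradient-boosted model on a synthetic rating factor whose true
# response is U-shaped.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
age = rng.uniform(18, 80, size=2000)            # synthetic rating factor
risk = 0.002 * (age - 45) ** 2 + 0.5            # U-shaped true relativity
y = risk + rng.normal(0, 0.1, size=age.shape)   # noisy observed cost proxy

X = age.reshape(-1, 1)
linear = LinearRegression().fit(X, y)
gbm = GradientBoostingRegressor(random_state=0).fit(X, y)

mse_linear = mean_squared_error(y, linear.predict(X))
mse_gbm = mean_squared_error(y, gbm.predict(X))

# The linear term underprices the extremes of the age range and overprices
# the middle; the nonlinear model follows the U-shape closely.
print(mse_linear > mse_gbm)  # True
```

The same mechanism extends to interactions: tree-based learners split on combinations of factors automatically, rather than requiring each interaction term to be specified up front.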
Diving into model explainability
Among the open questions with the introduction of machine learning into commercial insurance is how to build these models to support stakeholder communication and professional requirements.
Oliver’s section of the workshop highlighted that machine learning need not be a black box. Established techniques exist for investigating both machine learning models and their outputs. At times, insurers will face a trade-off between predictive power and explainability.
As new techniques for investigating machine learning models are developed, we expect insurers to grow more confident in these methods and shift towards models with higher predictive power.
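One of the established techniques alluded to above is permutation importance: shuffle a single rating factor and measure how much the model's score degrades. A minimal sketch on synthetic data (the features and model here are illustrative assumptions, not from the workshop):

```python
# Permutation importance: how much does model performance drop when one
# input column is randomly shuffled? Large drops indicate factors the
# model relies on heavily.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))   # column 0 drives the target; column 1 is noise
y = 2.0 * X[:, 0] + rng.normal(0, 0.1, size=1000)

model = GradientBoostingRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# Shuffling the informative factor hurts the model far more than
# shuffling the noise column.
print(result.importances_mean[0] > result.importances_mean[1])  # True
```

Model-agnostic diagnostics like this give stakeholders a factor-level view of what the model uses, without requiring the model itself to be linear.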
Designing for fairness
In contrast to worries about computers making unsupervised decisions, machine learning can actually be a tool to improve fairness in insurance pricing. Fairness is achievable if practitioners actively design for it. The workshop outlined three strategies:
- Observe relevant rating factors. For example, data gathered through telematics is a direct measure of driving aggression in motor insurance.
- Adjust premiums to optimise metrics of fairness. A range of different definitions of fairness may be applied depending on the situation (e.g. equality of opportunity, demographic parity, equalised odds).
- Design and train algorithms with fairness baked in. Once an appropriate metric of fairness has been determined, practitioners can build algorithms that optimise for these measures.
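The simplest of the fairness definitions listed above, demographic parity, asks whether a favourable decision is made at the same rate across groups. A toy check on synthetic data (the group labels, scores, and 0.5 decision threshold are all hypothetical):

```python
# Toy demographic-parity check: compare the rate of a favourable outcome
# (e.g. being offered a discount) between two groups. Real use would
# substitute actual model scores and protected-attribute labels.
import numpy as np

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=10_000)   # protected attribute (0 or 1)
score = rng.uniform(size=10_000)          # model output
offer = score > 0.5                       # favourable decision rule

rate_0 = offer[group == 0].mean()
rate_1 = offer[group == 1].mean()
parity_gap = abs(rate_0 - rate_1)

# Demographic parity asks this gap to be (approximately) zero; once it is
# measured, it can be monitored or optimised alongside pricing accuracy.
print(parity_gap)
```

Equality of opportunity and equalised odds follow the same pattern but condition on the true outcome, comparing true-positive (and, for equalised odds, false-positive) rates across groups instead of raw decision rates.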
Delivering machine learning models into production
Throughout the GIRO conference, data science and machine learning were among the hottest topics. We spoke to a number of practitioners who were keen to learn about how they might benefit from the introduction of these techniques.
Notably, there is no silver bullet here. Developing and testing machine learning models takes experience and skill, and starting to work with these methods will not magically transform insurance pricing.
Instead, machine learning provides an extension of the modelling toolbox that enables insurers to process and learn from more data, and to find regions of pricing opportunity where GLMs underprice or overprice.