Achieving digitisation and automation in risk processing: Part 2

By Aeneas Wiener, CTO, Cytora

In the first part of “Achieving digitisation and automation in risk processing”, we explored why it’s crucial for insurers to improve performance in risk processing. We also looked at what digital risk flows would look like in an ideal world, and the results of achieving them.

In this second part, we explore what it would take, across engineering resources, time and cost, to build a digital risk processing system in-house with an insurer’s own IT resources. This post focuses on the components a digital risk processing platform must have, as well as the tasks and implications for any insurer who decides to build such a platform on their own.

The four building blocks of a digital risk processing platform 

There are at least four technical building blocks of a digital risk processing platform. When delivered using an agile methodology, each of these needs its own independent product squad to build and maintain it. To give a sense of what needs to be built, here is a rough outline of the scope of these four areas:

  1. Multi-step workflow builder

    This is a component that can create and execute multi-step digital workflows. The workflow builder needs to support the execution of different steps across digital workflows to process new business submissions, renewals, adjustments and other request types. Some steps are about fetching data and attaching it to work items (e.g. insurance submissions). Others are about showing the work items to human operators to complete missing data based on confidence thresholds.

    For capturing missing data from human operators, the requirement is for a dynamic and visually rich user interface that can render data forms, files and external data (including map data). Because the availability of external data changes rapidly, this risk console should be configuration driven, so that it dynamically shows new data fields as they are integrated into the platform, without requiring a costly IT-driven change every time a new data source or extraction target field is added. The interface should also be machine learning optimised, so that every action taken by human operators in the console is captured as training data to continuously improve automation levels.

    It should include a business rules engine to respond quickly to changing market environments, e.g. enabling management to easily evolve the triage rules applied to incoming submissions. Usually, the workflows finish with output connectors that are responsible for inserting the enriched and extracted data into destination systems such as the CRM, Underwriting Workbench, Policy Admin System and Data Platform.
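To make the confidence-threshold routing concrete, here is a minimal Python sketch of how a workflow step might decide whether a work item can proceed automatically or needs a human operator. The class, field names and threshold are illustrative assumptions, not a real platform API.

```python
from dataclasses import dataclass

# Illustrative sketch only: WorkItem, its fields and the threshold
# are assumptions for this example, not a real platform API.

@dataclass
class WorkItem:
    data: dict          # extracted fields attached to the submission
    confidences: dict   # field name -> extraction confidence (0..1)

def needs_human_review(item: WorkItem, required_fields, threshold=0.9) -> bool:
    """Route the work item to a human operator if any required field
    is missing or was extracted below the confidence threshold."""
    return any(
        field not in item.data or item.confidences.get(field, 0.0) < threshold
        for field in required_fields
    )

item = WorkItem(
    data={"insured_name": "Acme Ltd", "sum_insured": 1_200_000},
    confidences={"insured_name": 0.98, "sum_insured": 0.72},
)
print(needs_human_review(item, ["insured_name", "sum_insured"]))  # True
print(needs_human_review(item, ["insured_name"]))                 # False
```

In a real platform the same check would be driven by configuration, so that newly integrated fields and thresholds can be changed without code deployments.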

  2. Data integration layer

    The data integration layer flexibly connects data flows between upstream and downstream systems (including internal and external data sources, but also policy administration systems, email inboxes, CRMs, data warehouses, information extraction models, and risk scoring models maintained by data science teams). This flexible integration pattern enables neighbouring systems to be used in a digital risk processing flow. Bonus points if it comes with a data library of pre-integrated data sources.

    Realistically speaking, this is not something that an insurer is able to easily build in-house. Even though an iPaaS (integration platform as a service, such as MuleSoft, SnapLogic etc.) is a good starting point, it is not enough on its own. This data platform must include advanced machine learning building blocks to perform entity resolution (e.g. to turn address strings into resolved address IDs, and to turn company names and descriptions into universal company IDs). These indexing and query resolution problems are sometimes implemented as learning-to-rank tasks in machine learning, similar to the way Google turns a search query into the best search result.
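As a toy illustration of the entity resolution involved, the sketch below normalises company names against a small hand-built index. A production system would replace the dictionary lookup with a learned ranking model over a large entity index; every name and ID here is made up.

```python
import re
from typing import Optional

# Toy company-name resolver. Real systems use learned rankers over
# large entity indexes; all names and IDs below are invented.

SUFFIXES = re.compile(r"\b(ltd|limited|plc|inc|llc)$")

COMPANY_INDEX = {           # normalised name -> universal company ID
    "acme trading": "CMP-00017",
    "globex": "CMP-00342",
}

def normalise(name: str) -> str:
    """Lowercase, strip punctuation and common legal suffixes."""
    name = name.lower().strip().rstrip(".")
    return SUFFIXES.sub("", name).strip(" ,")

def resolve_company(name: str) -> Optional[str]:
    """Map a free-text company name to a canonical ID, if known."""
    return COMPANY_INDEX.get(normalise(name))

print(resolve_company("Acme Trading Ltd."))  # CMP-00017
print(resolve_company("Unknown Co"))         # None
```

The hard part in practice is not the normalisation but ranking candidate entities when names are ambiguous, which is where the learning-to-rank framing comes in.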

  3. Document extraction pre-trained to the insurance domain

    The document extraction building block turns emails, PDFs, Excel sheets and other attachments into standardised digital risk records suitable for automated decision making.

    The extraction system should be pre-trained, and continuously re-trained. The initial insurance-domain training should come from years of exposure to a wide range of submissions and other document types from the insurance domain. This enables the system to achieve the best possible performance when detecting claims tables, schedules of values, and broker branch identifiers, which can come from the email headers and body, attached PDFs, or large Excel sheets.

    Simply extracting content from PDFs is not enough: this component needs to include insurance domain optimized entity resolvers (e.g. to resolve human written sum insured currency amounts into machine-readable values that a business rules engine can operate against).
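For example, a minimal (and deliberately naive) resolver for human-written sum insured amounts might look like the sketch below; a production resolver would need to handle many more currencies, locales, formats and edge cases.

```python
import re
from typing import Optional

# Naive sketch of an insurance-domain entity resolver that turns
# human-written amounts like "£1.2m" into machine-readable numbers.
# The pattern and multipliers are illustrative, not exhaustive.

MULTIPLIERS = {"k": 1_000, "m": 1_000_000, "bn": 1_000_000_000}
PATTERN = re.compile(r"([£$€]?)\s*([\d,.]+)\s*(bn|[km])?", re.IGNORECASE)

def resolve_amount(text: str) -> Optional[float]:
    """Turn strings like '£1.2m' or 'USD 350,000' into numeric values."""
    match = PATTERN.search(text.replace("USD", "$"))
    if not match:
        return None
    value = float(match.group(2).replace(",", ""))
    suffix = (match.group(3) or "").lower()
    return value * MULTIPLIERS.get(suffix, 1)

print(resolve_amount("£1.2m"))        # 1200000.0
print(resolve_amount("USD 350,000"))  # 350000.0
```

Once amounts are resolved to numbers like these, a business rules engine can apply triage thresholds against them directly.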

    The system should learn over time: an exception handling user interface should handle missing data, and capture training data for the document extraction learning loops.

  4. Learning loops infrastructure

    Extracting and storing the data is not enough. To close the learning loop, insurers must also have a way to operationalise insight back into the workflow. Currently, a lot of data is lost inside insurance companies, e.g. whenever a submission is rejected and the data is not keyed into any systems. Additionally, when the data does exist, those data points are often not captured in a machine learning ready way, so analytics efforts remain one-off efforts. A learning loop infrastructure should close the loop between:

    • collecting the analytics and ML ready data in a data platform 
    • deriving insights from it (e.g. suggested rules for the business rules engine, a trained conversion model) 
    • serving these model artefacts back into the live flow of risks.
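As a rough sketch of what closing this loop can look like, the toy example below mines logged operator decisions for a candidate triage rule that could be served back into the live flow. The data, rule format and support threshold are all illustrative assumptions.

```python
from collections import Counter

# Illustrative learning-loop sketch: operator actions captured in the
# console are mined for a candidate triage rule. Broker names, the
# rule format and the support threshold are invented for this example.

decisions = [  # (broker, human triage decision) logged from the console
    ("BrokerA", "decline"), ("BrokerA", "decline"),
    ("BrokerA", "decline"), ("BrokerB", "accept"),
]

def suggest_rules(log, min_support=3):
    """Suggest auto-decline rules where human decisions for a broker
    were consistent and frequent enough to trust."""
    counts = Counter(log)
    return [
        {"if_broker": broker, "then": action}
        for (broker, action), n in counts.items()
        if n >= min_support and action == "decline"
    ]

print(suggest_rules(decisions))  # [{'if_broker': 'BrokerA', 'then': 'decline'}]
```

In a real learning loop, suggestions like this would be reviewed by management before being promoted into the business rules engine, and model artefacts (e.g. a conversion model) would be versioned and deployed the same way.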

The skills and teams needed to deliver

In terms of required skills, the configurable workflow builder needs significant front-end and user experience design resources, whilst the data integration layer, the document extraction system, and the machine learning platform require backend and machine learning engineers. In addition to these software engineers, each of the independent squads will be led by a product manager, and have access to a (possibly shared) cloud platform engineering team.

The initial build phase is even more taxing on resources and relies on experienced product directors and senior cloud architects with document extraction, machine learning, and insurance experience.

In addition to the four core product development and engineering teams, there are three operational areas that will require staffing: 

  1. Systems integrations (to insert the extracted and enriched data into CRMs, workbench, PAS and data warehouse) 
  2. Configuration management (to encode and continuously update the workflows and business rules, and decide which internal data sources should be used, which requires data research expertise)
  3. Underwriter adoption and technical support

The challenges of taking the ‘build’ approach

The above summary gives a sense of why this type of platform is complex to build, involving many interconnected parts that only work optimally if they are compatible with each other.

In addition, because of the specialised skills involved in building a digital risk processing platform, growing and maintaining a world-class cross-disciplinary product team in this area is a significant responsibility and entails ongoing risks. One reason for this is the importance of human-in-the-loop workflows and interfaces. These require advanced product design and UX capabilities (e.g. see the research of Ge Wang), which are not core areas for insurers, so it will be harder to grow and maintain a team of experts in this area.

Here are the implications of building an in-house solution:

Time to value

Time to value is slow with an in-house option. This is because even if an insurer has the technical leadership and product vision in place to guide the development of a digital risk processing platform, the time to an MVP would still be at least a year, assuming no adverse unforeseen events like key employee churn or slow adoption.

We also find that with complex software projects it often takes multiple attempts until a system is developed that truly addresses the use cases, which makes time to value less predictable for insurers that decide to bootstrap something starting from zero.

We have also observed that when insurers take on cutting edge technology projects like this one their key person risk increases significantly. This often happens because of time and budget pressures, which mean that especially complex machine learning and data systems built in-house are often dependent on the early team members who first built them.

Delivery certainty

Control is high with the in-house option, as the direction of the product is fully within the insurer’s control. However, internal know-how is likely to be lower: how many full-time experts on data, machine learning and cloud-native software development does the organisation have who could drive the development of a large-scale project like this? Ideally, the insurer would place project leadership in the hands of someone who has built a platform like this before. Without a very strong vision for the architecture and the deliverables, an insurer runs the risk of going through a lengthy and costly research and development phase before the actual engineering project can begin.

Run cost

The total cost of ownership is very high for the in-house option because, unlike dedicated software vendors, the insurer will not be able to spread their development costs across tens to hundreds of customers. The dominant expense will be personnel costs, across both the build and maintenance phases.

This means the insurer will have to carry the cost of the initial build, and of possible failed projects along the way, on their own, which makes the business case harder to justify than working with a software provider.

Once live with a build option, insurers will need to develop and extend the roadmap themselves, shipping new features continuously to compete and differentiate against other software platforms that over time will develop attractive unit economics. 

Personnel requirements

Personnel requirements are high, starting with the need for four cross-disciplinary product squads to create and maintain the technology building blocks. The need for experienced senior technology leaders is particularly high during the initial build phase, which creates additional project risk due to fierce competition for some of the specialised skills involved.

In addition to the development teams, there is a need for business-facing teams that handle systems integrations, the configuration of workflows and triage rules, deciding on and configuring data integrations, as well as a team to support underwriters through the initial adoption of the new processes.


Building a platform that supports multiple workflows for multiple lines, across potentially multiple geographies, is no easy feat. It may seem easy to begin with, when the scope is limited and addresses a small use case. However, the scope will naturally grow as requirements are gathered and other lines of business join in, wanting to improve their processes. With that in mind, it’s important to analyse the potential entirety of the project and the implications of taking the path of building an in-house solution versus partnering with a dedicated provider.