Hi and welcome. In this video, we’ll explore the complexity of digital data extraction in commercial insurance. We’ll discuss the various challenges that arise in this process and how to address them effectively.
Digital insurance data extraction is not a straightforward task. Underwriting risks and processing claims is not just a matter of extracting data; it involves several layers of complexity that must be managed to keep data processing accurate and efficient.
Let’s take a look at these layers more closely:
Diversity of Data Sources
Insurance submissions often take the form of a broker submission, which collates a number of documents. The data within these documents needs to be evaluated, structured into the desired format and processed before an underwriter or a claims handler can start their work. At Cytora, submissions can arrive by email, through broker portals or via an API.
The complexity at this stage is that each request could be a submission, a First Notice of Loss or a claim report, and each may come in different formats and structures, such as complex Excel files, PDFs, unstructured documents and even handwritten forms. This diversity poses a significant challenge for digital data extraction systems, which work best with order and consistency.
Incomplete Information
Different insurers may require different pieces of information from submissions and claims reports in order to fully assess a risk. Submissions and claims often arrive incomplete, with key documents missing, meaning that there isn’t enough data to make an informed decision.
This gap between the information provided and the data needed adds to the complexity, demanding more time and resources to resolve the case.
Traditionally, with manual data extraction, insurance professionals or outsourced teams spend a lot of their time manually validating documents from internal and external sources. This usually takes days, or even weeks, adding further delays.
Handwritten Submissions
Handwritten documents are often found in broker submissions or claims. A police report, for example, is often handwritten. These documents can be challenging to decipher for humans and machines alike.
Addressing Complexity
To effectively manage the complexity of data extraction, there are several strategies that can be employed:
Advanced LLMs
As we explored in Module 2, using advanced machine learning models, such as LLMs, can greatly improve data extraction accuracy from diverse and unstructured sources.
Schema Development
Developing a comprehensive schema that defines all the data points required for risk assessment also helps to ensure that the necessary information is extracted consistently. It also shows the user which mandatory fields are still missing before an action can be performed, which helps communication with brokers and speeds up the review process.
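To make the schema idea concrete, here is a minimal sketch in Python. It is only an illustration: the field names, the `FieldSpec` class and the `missing_mandatory_fields` helper are all hypothetical and are not taken from Cytora's platform. It shows how a schema that marks some fields as mandatory can be used to list what is still missing from an extracted submission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One data point in the schema (hypothetical structure)."""
    name: str
    mandatory: bool = False


# Assumed example schema; a real one would be defined per insurer and line of business.
SUBMISSION_SCHEMA = [
    FieldSpec("insured_name", mandatory=True),
    FieldSpec("policy_start_date", mandatory=True),
    FieldSpec("total_insured_value", mandatory=True),
    FieldSpec("broker_reference"),
]


def missing_mandatory_fields(extracted, schema=SUBMISSION_SCHEMA):
    """Return the names of mandatory fields that are absent or blank."""
    missing = []
    for spec in schema:
        value = extracted.get(spec.name)
        if spec.mandatory and (value is None or str(value).strip() == ""):
            missing.append(spec.name)
    return missing


# Example: the start date was extracted as blank and the insured value is absent.
extracted = {
    "insured_name": "Acme Logistics Ltd",
    "policy_start_date": "",
    "broker_reference": "BR-1029",
}
print(missing_mandatory_fields(extracted))
# -> ['policy_start_date', 'total_insured_value']
```

A list like this could be shown to the underwriter or sent back to the broker as a request for the outstanding information.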
Data Validation
Implementing robust data validation processes, such as clearance checks, is also key to ensuring that the extracted data is accurate and complete. This involves cross-referencing the extracted data with internal systems to identify duplicates and validate information. We’ll explore validation in greater detail in Module 4, looking at how continuously training and refining the extraction models can help to improve accuracy and efficiency over time.
Digital insurance data extraction is inherently complex due to several factors, but insurers can overcome these challenges and achieve accurate and efficient data extraction by leveraging advanced technologies and implementing robust strategies.
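As a rough illustration of a clearance check, the Python sketch below cross-references an incoming submission against records already held internally. Everything here is an assumption made for the example: the record fields (`insured_name`, `postcode`, `id`), the helper names, and the choice of exact matching on normalised values. A production check would likely use richer identifiers and fuzzy matching.

```python
import re


def normalise(text):
    """Lower-case, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(cleaned.split())


def clearance_key(record):
    """Build a comparison key from the (assumed) identifying fields."""
    return (normalise(record["insured_name"]), normalise(record["postcode"]))


def find_duplicates(extracted, existing_records):
    """Return existing records whose key matches the incoming submission."""
    key = clearance_key(extracted)
    return [r for r in existing_records if clearance_key(r) == key]


# Hypothetical records already held in an internal system.
existing = [
    {"id": "SUB-001", "insured_name": "Acme Logistics Ltd.", "postcode": "EC1A 1BB"},
    {"id": "SUB-002", "insured_name": "Northwind Traders", "postcode": "M1 2AB"},
]

# Incoming submission with different casing and punctuation.
incoming = {"insured_name": "ACME Logistics Ltd", "postcode": "ec1a 1bb"}

matches = find_duplicates(incoming, existing)
print([r["id"] for r in matches])
# -> ['SUB-001']
```

Normalising before comparing means small differences in casing or punctuation between a broker's document and the internal record do not hide a duplicate.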
In the next video, we will discuss the strategies and methodologies for effectively deploying data extraction when underwriting risks or handling claims.