Duration:
5
minutes
Summary:
This lesson will explain how to define taxonomy and the three key levers that are used to achieve successful outcomes.
Module
2
:
Classification
← Back to Module
2

2

.

5

Defining a Taxonomy

Transcript

Hi and welcome. In this video, we're going to explore how to define taxonomy and the three key levers that are used to achieve successful outcomes.

Understanding the balance between these levers is crucial for creating an effective and efficient classification system.

If classification is organising incoming files and documents into categories, then a taxonomy is simply the list of categories or ‘classes’ that you want to organise files and documents into. The number of classes in the taxonomy is what influences its complexity.

The higher the number of classes created, the more complex the modelling problem becomes, which in turn reduces the chance of achieving high accuracy. Conversely, a low number of classes can lead to misclassification as the model struggles to find an appropriate category for certain items. Let's now look at three different levels of class granularity. Here are some examples of broad classes you might see used in a taxonomy:

The advantages of these classes is that they’re simple, easy to use, less complex to classify and suitable for high level overviews and general reporting. However, there’s potential for lower accuracy in classification due to the lack of granularity in the categories.

Now let’s look at Intermediate classes.

These intermediate classes provide more granularity than the broad classes which gives improved classification accuracy and are more suited to specific reporting and analytics. However, these are slightly more complex to apply and require more detailed classification guidelines and maintenance.

Finally let’s take a look at the third lever - detailed classes 

These classes provide high granularity and specificity, enhanced accuracy for detailed reporting and analytics and are better aligned with specific policy features and customer needs. However, they are highly complex and require more maintenance as well as comprehensive classification guidelines and training. Because of this they could potentially appear overwhelming for newer users without experience in this area.

Taking into account all of the pros and cons of each of these classes shows us why it’s important to apply a balancing act between these three levers. 

When designing a taxonomy, it's crucial to balance granularity with usability. The right number of classes depends on the specific needs of the business, the complexity of the products and the intended use of the taxonomy. Iterative refinement and stakeholder feedback are key to finding the right balance. We’ll explore this in more detail in the next video.

Let's now explore the concept of semantic distinct class naming. This means that the more distinct the names of individual classes, the lower the chance of errors. 

Let’s see this in practice with an example:

Semantic distinct class naming as shown in this example minimises errors by reducing ambiguity. It enhances accuracy and reliability in assigning documents to the appropriate classes. It’s worth remembering that achieving semantic distinct class naming might require a higher number of classes, which can sometimes complicate the taxonomy.

Finally, let's look at the semantic relationship between the class name and content of the file. The more representative and unambiguous the class name is to the content, the higher the chance of the model selecting the correct class. Let’s see this in practice with an example:

As shown here, a strong semantic relationship between class names and file content improves classification accuracy by clearly guiding the expected content of each category. While semantic distinct class naming, together with a strong relationship between class names and file content enhances classification accuracy, they may lead to higher complexity. Balancing this specificity with manageability is crucial for developing an effective taxonomy for document classification. 

We will go on to explore the key strategy for achieving an optimal taxonomy in the next video.

Previous lesson
Next lesson
Previous lesson
No previous lesson
Next lesson
No next lesson
By using this website you agree to our cookie policy
Okay