Tutorials
November 26, 2020
·
3 MIN READ

Bottom-up NLU with HumanFirst



Let your data drive.

Learn how to apply the tried and tested divide-and-conquer approach to labeling large datasets using HumanFirst.

Heads Up: If you’re new to bottom-up labeling, please read “A bottom-up approach to NLU”.

What is bottom-up labeling?

Bottom-up labeling applies the tried-and-tested divide-and-conquer approach to the problem of labeling large datasets. Instead of expecting a human or an unsupervised algorithm to correctly "predict" what intents and abstractions exist in the data, it provides a simple framework for iteratively discovering this information. [1]

Consider a simple example of what bottom-up labeling looks like: starting from unlabeled utterances and moving toward intents of increasing specificity. This specificity is achieved using a bottom-up approach to labeling, and we'll show you how to put it into practice using HumanFirst!
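To make the idea concrete, here is a toy sketch in Python (the utterances and intent names are made up for illustration, not HumanFirst output): the same utterances are labeled in successive passes, each pass producing intents of increasing specificity.

```python
# Each pass refines the previous labels without starting over.
passes = [
    {"how do I reset my password?": None,            # pass 0: unlabeled
     "where do I change my email?": None},
    {"how do I reset my password?": "has a question",
     "where do I change my email?": "has a question"},
    {"how do I reset my password?": "has a question > about account",
     "where do I change my email?": "has a question > about settings"},
]

# Collect the distinct intents discovered at each pass.
intents_per_pass = [sorted(set(p.values()) - {None}) for p in passes]
for i, intents in enumerate(intents_per_pass):
    print(f"pass {i}: {intents or 'unlabeled'}")
```

The point of the sketch is that specificity is earned one pass at a time, rather than guessed up front.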

Part 1: Setting things up!

This article is part 1 in a series that will show you how to apply a bottom-up approach to labeling and intent discovery with HumanFirst. In this article, we’ll focus on getting started with HumanFirst and how to set up the bottom-up labeling process.

Getting started is simple.

Step 1: Upload your raw conversational data to HumanFirst. You can upload utterances as TXT files or 2-way conversations as CSV files.
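As an illustration of preparing that upload, the snippet below writes raw utterances to a one-column CSV. The file name, utterances, and single-column layout are assumptions for the example, not HumanFirst's required schema.

```python
import csv

# Example raw utterances to label (made up for illustration).
utterances = [
    "how do I reset my password?",
    "where do I change my email?",
    "my app keeps crashing",
]

# Write one utterance per row; the resulting file can then be
# uploaded to HumanFirst.
with open("utterances.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for u in utterances:
        writer.writerow([u])
```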

Step 2: Head to the Unlabeled Data section of HumanFirst and begin selecting utterances that are related at a high level of abstraction (e.g. questions, problems, requests).

Step 3: Once a decent number of related utterances have been selected, label them into an intent. In our example the initial level of abstraction is questions, so we select utterances that relate to this label and call the resulting intent "has a question".

The outcome of these steps is valuable, as it provides high-quality and domain-specific training data to classify users who “have a question”.

Step 4: We’ll now want to look at our intent “has a question” and begin selecting some of its training data.

Selecting an utterance within an intent triggers a semantic re-rank of the intent's training data, which speeds up the selection and refactoring process.
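HumanFirst's re-ranking uses its own semantic model; as a rough sketch of the behaviour, the snippet below re-orders candidate utterances by word overlap with the current selection (Jaccard similarity standing in for real embeddings, and the utterances invented for the example).

```python
def similarity(a, b):
    """Jaccard word overlap: a crude stand-in for semantic similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def rerank(selected, candidates):
    """Order candidate utterances by similarity to any selected one."""
    return sorted(candidates,
                  key=lambda u: max(similarity(u, s) for s in selected),
                  reverse=True)

training_data = [
    "how do I delete my account",
    "what are your opening hours",
    "can I delete my account permanently",
]
print(rerank(["I want to delete my account"], training_data))
```

After the re-rank, the utterances most similar to the selection float to the top, which is what makes bulk selection within an intent fast.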

Step 5: Assign the selected utterances to a more specific sub-intent of your choosing.

We end up with two sub-intents: has a question > about account and has a question > about settings.

Step 6: Repeat steps 4 & 5 within your new sub-intents (to classify labeled utterances into further sub-intents) until the desired level of granularity is achieved.

Every step produces training data for classifiers that can recognize increasingly specific intents: this is one of the major advantages of this approach.

Repeating this process yields an intent hierarchy that reflects your domain. After a few minutes we've generated an intent structure containing trained classifiers at every level of abstraction, making it easier to understand the corpus and identify long-tail intents.
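The "training data at every level" advantage can be sketched in code: because sub-intents nest under parents, every labeled utterance contributes training data to each ancestor of its intent. The intent names and the "> "-separated path convention below are illustrative, not HumanFirst's storage format.

```python
from collections import defaultdict

# Utterances labeled at the most specific level discovered so far.
labeled = [
    ("how do I close my account", "has a question > about account"),
    ("where is dark mode", "has a question > about settings"),
    ("the app keeps freezing", "has a problem"),
]

# Expand each label into its ancestors, so every level of the
# hierarchy accumulates its own training set.
training_sets = defaultdict(list)
for text, intent in labeled:
    parts = [p.strip() for p in intent.split(">")]
    for depth in range(1, len(parts) + 1):
        training_sets[" > ".join(parts[:depth])].append(text)

for intent in sorted(training_sets):
    print(intent, "->", len(training_sets[intent]), "examples")
```

A classifier trained on "has a question" sees both account and settings examples, while each sub-intent keeps only its own, which is exactly the hierarchy the bottom-up process builds.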

References

[1] A bottom-up approach to intent discovery and training
