

Articles

How Do Large Language Models Use Long Contexts?

COBUS GREYLING
September 14, 2023
·
4 min read

And how to manage the performance and cost of large context input to LLMs.

TL;DR

  1. There is a degradation in LLM performance when large context windows are leveraged.
  2. Offloading complexity to the LLM provider turns the application into a black box, without the ability to granularly manage cost, input and output token use, model performance, and context.
  3. A simplistic approach incurs technical debt which will have to be addressed later in the application lifecycle.
  4. Offloading complexity and data management to the LLM also ties the Generative App closely to a specific LLM. Generative Apps can be LLM agnostic by following a RAG approach.
  5. The ideal scenario is one where the LLM is a utility that does not manage data or hold application complexity.
  6. Via a RAG implementation, use-cases demanding large context windows can be managed outside the ambit of the LLM.
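The RAG approach in point 6 can be sketched in a few lines. This is a minimal, illustrative sketch only: the bag-of-words scoring and the `retrieve` and `build_prompt` helpers are hypothetical stand-ins for a real embedding model and vector store.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a vector model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=2):
    # Only the retrieved passages enter the context window,
    # keeping data management outside the LLM.
    context = "\n\n".join(retrieve(query, documents, k))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

docs = [
    "HumanFirst helps teams explore and label conversational data.",
    "RAG retrieves relevant passages and passes them to the LLM as context.",
    "Long context windows degrade accuracy for mid-context information.",
]
prompt = build_prompt("How does RAG work?", docs, k=1)
```

Because the prompt is assembled outside the model, the same code can target any LLM, which is what makes the Generative App LLM agnostic.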

As seen in the chart below, the context size of Large Language Models (LLMs) is growing and currently ranges between 4,000 and 100,000 tokens. Hence there is a temptation to oversimplify LLM enterprise implementations and directly, natively leverage the large context window of LLMs.

Source

This avenue is very attractive in the short term, in terms of favourable time-to-market, cost, and solution complexity.

The disadvantages include the fact that the LLM becomes a black box, with no operational insight past the LLM input and output points.

Model performance substantially decreases as input contexts grow longer. — Source

Considering the graph below, cost is also important in terms of token use during text input and output. It is clear from the token use/cost breakdown that output token use can be exorbitant.

Hence there are cost incentives to truncate the text input and shorten the LLM output. This illustrates that such truncation will necessitate introducing complexity if implementers do not want to be completely at the mercy of LLM suppliers.
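Truncation and cost estimation can be sketched as follows. The characters-per-token heuristic and the per-1K-token prices are illustrative assumptions only; a real implementation would use the provider's tokenizer and current pricing.

```python
def truncate_to_budget(text, max_tokens, chars_per_token=4):
    # Rough heuristic: ~4 characters per token for English text.
    # A production system would count tokens with the provider's tokenizer.
    budget = max_tokens * chars_per_token
    return text if len(text) <= budget else text[:budget]

def estimate_cost(prompt_tokens, output_tokens,
                  price_in=0.0005, price_out=0.0015):
    # Prices are illustrative $/1K-token rates; output tokens often
    # cost several times more than input tokens.
    return (prompt_tokens / 1000) * price_in + (output_tokens / 1000) * price_out
```

Capping the model's maximum output tokens alongside truncating the input is what keeps both sides of the token bill under the implementer's control rather than the supplier's.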

Added to these considerations, long-context accuracy has come under scrutiny.

A recent study found that LLM performance is best when the relevant information is present at the start or end of the input context.

In contrast, performance degrades when the data relevant to the user query sits in the middle of a long context.
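One practical response to this finding is to reorder retrieved passages so that the most relevant ones sit at the edges of the context and the least relevant drift toward the middle. The helper below is a hypothetical sketch of that idea, taking passages already ranked from most to least relevant.

```python
def order_for_long_context(chunks_ranked):
    """Given chunks ranked most-relevant first, place the strongest
    chunks at the start and end of the context, pushing the weakest
    toward the middle, where the study suggests accuracy is lowest."""
    front, back = [], []
    for i, chunk in enumerate(chunks_ranked):
        # Alternate: even ranks go to the front, odd ranks to the back.
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]
```

For five chunks ranked 1 (best) to 5 (worst), this yields the order 1, 3, 5, 4, 2, so the two strongest chunks occupy the start and end positions.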

Source

The graph below illustrates how accuracy improves when the relevant information appears at the beginning or end of the input.

The performance degradation when referencing data in the middle is also visible.

Source

Added to this, models with extended context windows do not generally perform better than models with smaller context windows.

Source

The graphs above show different scenarios in terms of the number of documents, contrasted against accuracy and the position of the document holding the answer. Again, performance is generally highest when relevant information is positioned at the very start or very end of the context, and rapidly degrades when models must reason over information in the middle of their input context.

I’m currently the Chief Evangelist @ HumanFirst. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.
