RAG & LLM Context Size

September 5, 2023

In this article I consider the growing context windows of various Large Language Models (LLMs), to what extent they can be used, and how a principle like Retrieval-Augmented Generation (RAG) applies.

A split is taking place in LLM usage into two main streams: individual end-users and enterprise implementations.

For end-users there are polished UIs available like ChatGPT and HuggingChat, and Cohere has also launched a personal assistant called Coral. There is also growing playground functionality from companies like Vellum and Vercel.

All of these are no-code, web-based UIs which allow individuals to use LLMs as productivity tools, as I describe in this article.

16k context means the model can now support ~20 pages of text in a single request.

- OpenAI

For instance, via the 16k context window I could submit a 14-page document to the OpenAI LLM. For personal use the document can be checked and curated manually, and the output can be visually inspected prior to use.
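Before submitting a document, it helps to estimate whether it fits the model's context window. The sketch below uses the common rule of thumb of roughly 4 characters per token; it is a heuristic only, and exact counts require the model's own tokenizer. The function names and the 1,000-token output reserve are illustrative assumptions.

```python
# Rough check of whether a document fits a model's context window.
# Assumes the ~4-characters-per-token heuristic; exact counts require
# the model's own tokenizer.

def estimate_tokens(text: str) -> int:
    """Estimate token count using the ~4-characters-per-token rule of thumb."""
    return max(1, len(text) // 4)

def fits_context(text: str, context_size: int = 16_000,
                 reserve_for_output: int = 1_000) -> bool:
    """Leave headroom for the model's response within the context window."""
    return estimate_tokens(text) <= context_size - reserve_for_output

doc = "word " * 7000  # roughly a 14-page document
print(estimate_tokens(doc), fits_context(doc))  # → 8750 True
```

The same check applies to larger windows simply by changing `context_size`, e.g. 100,000 for Anthropic's long-context models.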

Being able to submit large documents via Anthropic's 100,000-token context window is a significant productivity boost for searching and manipulating documents.

And there is a huge drive from model providers to reach the end-user with a conversational UI and functionality. This has led to products and companies within the LLM and Foundation Model product landscape being superseded.

The image below shows 23 large language models, with their model name, the model supplier and the current context size of the model.

The chasm between personal use and enterprise use is huge. And while LLM suppliers are aiming for end-users as part of their product offering, the real challenge and complexity lie with enterprise implementations.

Enterprise implementations need to consider elements like data privacy, PII, inference latency, token use, redundancy, the geographic location of models, and more.

And while the context window matters for the sheer amount of data that can be submitted, cost, latency, and accuracy matter just as much.

Secondly, transparency is impeded. It is in LLM providers' interest for users to submit large chunks of text and leave the heavy lifting to the LLM.

The user has no control over how the data is processed and chunked by the LLM. The more astute approach is to have a process where the data is chunked beforehand.

The chunks can then be loaded into a document store or vector database, from which text snippets of about 200 words can be injected into the prompt.
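The chunk-then-retrieve process can be sketched as below. This is a minimal illustration: real systems use embeddings and a vector database for retrieval, while here plain word-overlap scoring stands in for semantic search, and all function names are my own.

```python
# Minimal sketch of chunking a document into ~200-word snippets and
# retrieving the most relevant ones for prompt injection. Word-overlap
# scoring is a stand-in for embedding-based vector search.

def chunk_text(text: str, chunk_words: int = 200) -> list[str]:
    """Split a document into snippets of roughly chunk_words words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_words])
            for i in range(0, len(words), chunk_words)]

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the query; return the top_k."""
    q = set(query.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return ranked[:top_k]

doc = "The refund policy allows returns within 30 days. " * 50
snippets = retrieve("What is the refund policy?", chunk_text(doc))
```

The retrieved `snippets` are what gets injected into the prompt, keeping the context concise and relevant rather than submitting the whole document.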

Hallucination is mitigated by presenting highly concise and contextually relevant reference data at inference time. The threat of hallucination resurfaces when large amounts of data are submitted for the LLM to interpret.

Lastly, LLM model independence can be achieved by way of RAG, as average LLMs perform on par with state-of-the-art models when presented with well-formed prompts.
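A "well-formed" RAG prompt makes this model independence concrete: retrieved snippets are injected as explicit context, and the model is instructed to answer only from them. The template wording below is illustrative, not a prescribed format.

```python
# Sketch of a grounded RAG prompt: retrieved snippets become explicit
# context, and the instruction constrains the model to that context.

PROMPT_TEMPLATE = """Answer the question using only the context below.
If the answer is not in the context, say "I don't know."

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(snippets: list[str], question: str) -> str:
    """Join retrieved snippets and fill the grounded prompt template."""
    context = "\n---\n".join(snippets)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt(["Returns are accepted within 30 days."],
                      "What is the return window?")
```

Because the reference data travels inside the prompt, the same prompt can be sent to different models with little or no change, which is what decouples the application from any single LLM.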

I’m currently the Chief Evangelist @ HumanFirst. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.
