Prompt Tuning, Hard Prompts & Soft Prompts

COBUS GREYLING
August 8, 2023 · 6 min read

Prompt Engineering is the primary method of accessing Large Language Models (LLMs). LLM-based implementations such as Pipelines, Agents, Prompt Chaining & more are therefore all premised on some form of Prompt Engineering.

Prompt Engineering is a simple and intuitive way to interact and interface with a powerful system like a Large Language Model (LLM).

This explains the extent to which Prompt Engineering has democratised access to, and general use of, LLMs.

As seen in the updated image below, a number of LLM-based Generative AI application architecture approaches have taken shape, all with the notion of Prompting at their centre.

Updated (Source)

Hard Prompts

A Hard Prompt is a defined prompt which is static or, at best, a template. A generative AI application can also have multiple prompt templates at its disposal.

Hard prompts are manually handcrafted text prompts with discrete input tokens. ~ HuggingFace

Prompt templating allows prompts to be stored, re-used, shared, and programmed, so that generative prompts can be incorporated into programs.

Prompt Decomposition & Templating (Source)

Even though templating brings a level of flexibility, the prompt is still very much set; in other words, it remains a hard prompt.
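To make the idea of a stored, re-usable hard prompt concrete, here is a minimal sketch using only Python's standard library. The template names, wording, and the `render_prompt` helper are assumptions made for illustration; they do not come from any particular framework.

```python
from string import Template

# Hypothetical template store: each hard prompt is fixed text with named slots.
PROMPT_TEMPLATES = {
    "summarise": Template(
        "Summarise the following text in $max_words words or fewer:\n\n$text"
    ),
    "translate": Template(
        "Translate the following text into $language:\n\n$text"
    ),
}


def render_prompt(name, **values):
    """Fill the named template; raises KeyError if a slot is left unfilled."""
    return PROMPT_TEMPLATES[name].substitute(**values)


prompt = render_prompt("translate", language="French", text="Good morning")
print(prompt)
```

Only the slot values change between calls. The instruction text around them stays fixed, which is what makes the prompt "hard".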

Consider the LLM-based Agent example below from LangChain. The prompt template is to a large degree static and instructs the agent on what to do. Generally, the template incorporates:

  • tools: defines which tools the agent has access to, and when the tools should be called.
  • input: the generic user input.
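The two slots listed above can be sketched in plain Python as follows. This is not LangChain's actual template or API; the template wording, the tool names, and the `render_agent_prompt` helper are hypothetical and only illustrate how a largely static agent prompt is filled in.

```python
# Hypothetical agent prompt: the instructions are static, only {tools} and {input} vary.
AGENT_TEMPLATE = """Answer the question as best you can. You have access to the following tools:

{tools}

Use a tool only when its description matches what the question needs.

Question: {input}"""


def render_agent_prompt(tools, user_input):
    """Render the tool descriptions and the user input into the fixed template."""
    tool_lines = "\n".join(
        f"{name}: {description}" for name, description in tools.items()
    )
    return AGENT_TEMPLATE.format(tools=tool_lines, input=user_input)


example_tools = {
    "search": "look up current facts on the web",
    "calculator": "evaluate arithmetic expressions",
}
print(render_agent_prompt(example_tools, "What is 17% of 2,340?"))
```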

Soft Prompts

Soft prompts are created during the process of prompt tuning.

Unlike hard prompts, soft prompts cannot be viewed and edited in text. A soft prompt consists of an embedding, a string of numbers, that derives knowledge from the larger model.

A clear disadvantage is the lack of interpretability of soft prompts. The AI discovers prompts relevant for a specific task but can’t explain why it chose those embeddings. Like deep learning models themselves, soft prompts are opaque.

Soft prompts act as a substitute for additional training data. Researchers recently estimated that a good language classifier prompt is worth hundreds to thousands of extra data points.

Source

NVIDIA describes the process of prompt tuning as follows.

Prompt tuning involves using a small trainable model before using the LLM. The small model is used to encode the text prompt and generate task-specific virtual tokens.

“soft” prompts designed by an AI that outperformed human-engineered “hard” prompts. ~ Source

These virtual tokens are prepended to the prompt and passed to the LLM. When the tuning process is complete, the virtual tokens are stored in a lookup table and used during inference, replacing the smaller model.
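The inference-time step described here can be sketched in pure Python. Everything in the block is a toy assumption: the embedding size, the vocabulary, the random values standing in for learned virtual tokens, and the `build_model_input` helper. It only shows the mechanics of looking up a task's virtual tokens and placing them in front of the embedded text.

```python
import random

random.seed(0)
EMBED_DIM = 4            # toy embedding size (assumption)
NUM_VIRTUAL_TOKENS = 3   # length of each soft prompt (assumption)


def random_vector():
    return [random.uniform(-1, 1) for _ in range(EMBED_DIM)]


# Frozen token embeddings of a toy "LLM"; these are never changed.
frozen_embeddings = {tok: random_vector() for tok in ["great", "movie", "bad"]}

# Lookup table of virtual tokens, one soft prompt per task.
# In practice the values are learned; random numbers stand in for them here.
soft_prompt_table = {
    "sentiment": [random_vector() for _ in range(NUM_VIRTUAL_TOKENS)],
}


def build_model_input(task, tokens):
    """Prepend the task's virtual tokens to the embedded text tokens."""
    virtual = soft_prompt_table[task]
    embedded = [frozen_embeddings[t] for t in tokens]
    return virtual + embedded


model_input = build_model_input("sentiment", ["great", "movie"])
print(len(model_input), "vectors of size", len(model_input[0]))
```

The virtual tokens exist only as vectors. There is no text that maps back to them, which is why a soft prompt cannot be read or edited like a hard prompt.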

Prompt tuning creates a small, lightweight model which sits in front of the frozen pre-trained model. Hence, soft prompts via prompt tuning are an additive method: only the prompts are trained and added, while the pre-trained model itself is left unchanged.

This process involves training and updating a smaller set of prompt parameters for each downstream task instead of fully fine-tuning a separate model.
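As a rough illustration of training only the prompt parameters, the sketch below uses a deliberately tiny stand-in for the frozen model: a fixed weight vector applied to the mean of the input vectors. The weights, targets, learning rate, and function names are all assumptions chosen so the example runs with the standard library alone. A real setup would backpropagate through an actual LLM.

```python
# Toy stand-ins (assumptions): the frozen model's weights and a fixed embedded input.
FROZEN_WEIGHTS = [0.5, -0.25, 0.75]
INPUT_EMBEDDINGS = [[0.2, 0.1, -0.3], [0.4, -0.6, 0.1]]


def frozen_model(vectors):
    """Dot product of the fixed weights with the mean of all input vectors."""
    n = len(vectors)
    mean = [sum(v[d] for v in vectors) / n for d in range(len(FROZEN_WEIGHTS))]
    return sum(w * m for w, m in zip(FROZEN_WEIGHTS, mean))


def tune_soft_prompt(target, num_virtual_tokens=2, steps=200, lr=0.5):
    """Gradient descent on the soft prompt only; the model weights are never updated."""
    soft_prompt = [[0.0] * len(FROZEN_WEIGHTS) for _ in range(num_virtual_tokens)]
    n = num_virtual_tokens + len(INPUT_EMBEDDINGS)
    for _ in range(steps):
        error = frozen_model(soft_prompt + INPUT_EMBEDDINGS) - target
        # Gradient of the squared error w.r.t. each soft prompt value: 2 * error * w / n
        for token in soft_prompt:
            for d, w in enumerate(FROZEN_WEIGHTS):
                token[d] -= lr * 2 * error * w / n
    return soft_prompt


# One small soft prompt per downstream task, all sharing the same frozen model.
task_targets = {"task_a": 1.0, "task_b": -1.0}
soft_prompts = {task: tune_soft_prompt(t) for task, t in task_targets.items()}
for task, prompt in soft_prompts.items():
    print(task, round(frozen_model(prompt + INPUT_EMBEDDINGS), 4))
```

Each task ends up with its own handful of numbers while `FROZEN_WEIGHTS` is shared and untouched. That is the sense in which the method is additive rather than a separate fine-tuned model per task.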

As models grow larger, prompt tuning becomes more efficient relative to full fine-tuning, and its results improve as model parameters scale.

NVIDIA Prompt Engineering and Prompt-Tuning (Source)

In Conclusion

Using prompt tuning to create soft prompts that interact with a static pre-trained LLM is an efficient and streamlined process.

LLMs perform much better when context is supplied, and prompt tuning is a fast and efficient way of creating that much-needed context on the fly, in an automated fashion which is not static.

However, as IBM noted, this process is opaque. The abstract nature of soft prompts can make it harder to benchmark and test model performance, especially when only small tweaks are required.

Vector databases, Agents and prompt pipelines have been used as avenues to supply LLMs with relevant contextual data at the right juncture of a conversation.

Even though these approaches are less efficient than prompt tuning, their transparency and human interpretability are attractive, especially from an organisational perspective where fine-tuning and scaling are important.

For a complete step-by-step tutorial on prompt tuning and soft prompts, take a look at this HuggingFace post.

I’m currently the Chief Evangelist @ HumanFirst. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.
