
By combining Haystack and Open Assistant, you can create a HuggingChat- or ChatGPT-like application.
HuggingChat and ChatGPT are both general-purpose chatbot interfaces. Users can hold open-ended conversations with the chatbots via a web-based GUI. The idea is that both of these services will act as the ultimate personal assistant.

These interfaces are made up of four key layers:
- Large Language Model
- API Call
- Conversation memory/context management
- Graphical User Interface

As use of these chat interfaces grows, users naturally want to develop or build on them. The LLMs on which these chat interfaces are built are now available.
In the case of ChatGPT, the LLMs used are gpt-4, gpt-3.5-turbo, gpt-4-0314 and gpt-3.5-turbo-0301.
In the case of HuggingChat, the current model is OpenAssistant/oasst-sft-6-llama-30b-xor.
However, API access alone does not cover conversation context and memory management. This layer has to be developed separately.
Integrating memory enables human-like conversations with Large Language Models (LLMs). Because the conversation context is managed, users can ask follow-up questions that implicitly reference earlier turns. This element is vital for longer multi-turn conversations.
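To make the idea concrete, here is a minimal sketch of conversation memory in plain Python. It is not Haystack's implementation: `SimpleConversationMemory`, `build_prompt` and the `Human:`/`AI:` prefixes are hypothetical names chosen for illustration. The point is only that earlier turns are prepended to each new question, so the model can resolve implicit references.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleConversationMemory:
    """Hypothetical stand-in for a conversation memory (not a Haystack class)."""

    turns: list = field(default_factory=list)

    def save(self, user: str, assistant: str) -> None:
        # Store one completed exchange.
        self.turns.append((user, assistant))

    def load(self) -> str:
        # Render the stored exchanges as a plain-text transcript.
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)

    def clear(self) -> None:
        self.turns.clear()


def build_prompt(memory: SimpleConversationMemory, question: str) -> str:
    # Prepend the transcript so a follow-up can refer back to earlier turns.
    history = memory.load()
    prefix = f"{history}\n" if history else ""
    return f"{prefix}Human: {question}\nAI:"


memory = SimpleConversationMemory()
memory.save("Name one city in Turkey.", "Istanbul.")
prompt = build_prompt(memory, "What is it known for?")
print(prompt)
```

Without the stored first exchange, the model would have no way of knowing what "it" refers to in the second question.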
Haystack recently released a notebook that guides users step by step through building their own HuggingChat-style interface based on an Open Assistant LLM, like the one HuggingChat is currently using.
The notebook also generates a GUI through which you can have a conversation with the chatbot, and view or clear the memory.

Here follows a complete guide on how to have your own personal assistant running in a notebook.
First, install Haystack.
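The install command itself is not reproduced here. As a hedged sketch, the snippet below assumes the Haystack 1.x distribution published on PyPI as `farm-haystack` (the package name is an assumption, as is the helper `installed_version`) and simply checks whether it is present in the current environment.

```python
from importlib import metadata


def installed_version(package: str = "farm-haystack"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
# If nothing is installed, one option is running: pip install farm-haystack
print(version if version else "Not installed; try: pip install farm-haystack")
```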
To access Hugging Face's hosted Inference API, you need to provide a Hugging Face API key.
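One way to supply the key without hard-coding it is sketched below. The environment variable name `HF_API_KEY` and the helper `load_hf_api_key` are assumptions made for this example; the key is read from the environment and an interactive prompt is used only when the variable is missing.

```python
import os
from getpass import getpass


def load_hf_api_key(env_var: str = "HF_API_KEY") -> str:
    """Read the API key from the environment, prompting only as a fallback."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed key, which is useful in shared notebooks.
        key = getpass("Enter Hugging Face API key: ")
    return key
```

In a notebook, `model_api_key = load_hf_api_key()` would then hold the key for the following steps.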
A PromptNode is initialised with the model name, the API key and a maximum length setting. We will reference an Open Assistant model, as HuggingChat does:
OpenAssistant/oasst-sft-1-pythia-12b
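A sketch of this step, assuming Haystack 1.x (`farm-haystack`) where `PromptNode` is importable from `haystack.nodes`. The wrapper `build_prompt_node` is a hypothetical helper, and `max_length=256` is an assumed value, since the text above does not state one. The import sits inside the function so the definition itself does not require Haystack to be installed.

```python
def build_prompt_node(
    api_key: str,
    model_name: str = "OpenAssistant/oasst-sft-1-pythia-12b",
    max_length: int = 256,  # assumed value; tune to the answer length you need
):
    """Create a PromptNode backed by the hosted Open Assistant model."""
    # Imported lazily; requires Haystack 1.x to be installed when called.
    from haystack.nodes import PromptNode

    return PromptNode(
        model_name_or_path=model_name,
        api_key=api_key,
        max_length=max_length,
    )


# Example call (needs a valid key and network access):
# prompt_node = build_prompt_node(api_key="<your Hugging Face API key>")
```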
In order to make the conversational interface more humanlike, memory needs to be created.
There are two types of memory options in Haystack:
- ConversationMemory: stores the conversation history (default).
- ConversationSummaryMemory: stores the conversation history and periodically generates summaries.
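The two options can be sketched as follows, assuming Haystack 1.x, where both classes are importable from `haystack.agents.memory`. The helper `build_memory` is hypothetical. The summarising variant needs a PromptNode because it uses the LLM to write the summaries.

```python
def build_memory(prompt_node=None):
    """Return plain history memory, or summarising memory if a PromptNode is given."""
    # Imported lazily; requires Haystack 1.x to be installed when called.
    from haystack.agents.memory import ConversationMemory, ConversationSummaryMemory

    if prompt_node is None:
        # Keeps the conversation history as is.
        return ConversationMemory()
    # Keeps the history and periodically condenses it with the given PromptNode.
    return ConversationSummaryMemory(prompt_node)


# Example calls:
# memory = build_memory()                      # plain history
# summary_memory = build_memory(prompt_node)   # history plus summaries
```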
The ConversationalAgent can now be initialised. By default, ConversationalAgent uses the conversational-agent PromptTemplate.
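A hedged sketch of the initialisation, again assuming Haystack 1.x, where `ConversationalAgent` lives in `haystack.agents.conversational`. `build_agent` is a hypothetical wrapper. Leaving `memory` as `None` lets the agent fall back to its default memory.

```python
def build_agent(prompt_node, memory=None):
    """Create a ConversationalAgent from a PromptNode and an optional memory."""
    # Imported lazily; requires Haystack 1.x to be installed when called.
    from haystack.agents.conversational import ConversationalAgent

    # No prompt_template is passed, so the agent's default template is used.
    return ConversationalAgent(prompt_node=prompt_node, memory=memory)


# Example call, with prompt_node and summary_memory created beforehand:
# conversational_agent = build_agent(prompt_node, memory=summary_memory)
```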
The use of memory can be illustrated by asking a question and then a second question that implicitly references the context of the first:
Question One:
Tell me three most interesting things about Istanbul, Turkey
Question Two:
Can you elaborate on the second item?
The first question is posted to the agent, followed by the implicit follow-up question.
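Posting the two questions can be sketched as below. `ask_in_sequence` is a hypothetical helper; it only assumes the agent exposes a `run(query)` method, which Haystack 1.x agents do. Since the same agent instance handles both calls, the second question is answered with the first exchange already in memory.

```python
QUESTIONS = [
    "Tell me three most interesting things about Istanbul, Turkey",
    "Can you elaborate on the second item?",
]


def ask_in_sequence(agent, questions):
    """Send each question to the same agent so earlier turns stay in its memory."""
    results = []
    for question in questions:
        # Each call goes through the agent, which records the turn in its memory.
        results.append(agent.run(question))
    return results


# Example call, with conversational_agent created beforehand:
# for result in ask_in_sequence(conversational_agent, QUESTIONS):
#     print(result)
```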
Take your chat experience to the next level with the example application below…
Here is the interactive chat window right within Colab which the code will generate:

Execute the code cell below and use the text area to exchange messages with the conversational agent. Use the buttons on the right to load or delete the chat history.
