This guide shows how to implement a simple user memory system that derives and stores facts about users that
are then referenced later on.
A fully working example can be found on GitHub.
It’s set up as a Discord bot, so see our Discord guide for more details on how to set that up.
Initial Setup
First we’ll have some setup code to initialize our LLM, Honcho client, and prompts.
from honcho import Honcho, NotFoundError
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage,
)
from langchain_core.output_parsers import NumberedListOutputParser

llm: ChatOpenAI = ChatOpenAI(model_name="gpt-4")

app_name = "Fact-Memory"
honcho = Honcho(environment="demo")
# register the app with Honcho; the later snippets reference this `app` object
app = honcho.apps.get_or_create(name=app_name)
The 3 steps to this project are:
- Derive facts about the user.
- Store those facts in a per-user vector database (Collection).
- Retrieve those facts later on to improve our response.
Step 1 - Deriving Facts
For this part we will be leveraging LangChain and GPT-4 to derive facts about the user based on their recent input.
async def derive_facts(user_input):
    # `prompt` is the fact-derivation prompt loaded from the YAML template shown below
    fact_derivation = ChatPromptTemplate.from_messages([
        SystemMessagePromptTemplate(prompt=prompt)
    ])
    chain = fact_derivation | llm
    response = await chain.ainvoke({
        "user_input": user_input
    })
    output_parser = NumberedListOutputParser()
    facts = output_parser.parse(response.content)
    return facts
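Called from async code, this might look like the following (the message and the derived facts here are purely illustrative):
facts = await derive_facts("I just moved to Berlin and I work as a nurse.")
# e.g. ["The user moved to Berlin.", "The user works as a nurse."]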
Breaking down this function, we start by using LCEL to make a chain that takes user_input as a variable. The prompt used here is below:
_type: prompt
input_variables:
["user_input"]
template: >
You are tasked with deriving discrete facts about the user based on their input. The goal is to only extract absolute facts from the message, do not make inferences beyond the text provided.
```{user_input}```
Output the facts as a numbered list.
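If you save that template to a YAML file, LangChain's load_prompt helper can read it back in as the prompt object referenced in derive_facts above. A minimal sketch, assuming the file is named fact_derivation.yaml (the filename is hypothetical):
from langchain.prompts import load_prompt

# load the fact-derivation template shown above; the filename is illustrative
prompt = load_prompt("fact_derivation.yaml")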
We then invoke the chain and parse the output to get our list of facts. We take advantage of one of LangChain’s built-in output parsers for this.
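The NumberedListOutputParser simply splits a numbered list back into a Python list of strings, for example:
parser = NumberedListOutputParser()
parser.parse("1. The user lives in Berlin.\n2. The user is a nurse.")
# -> ["The user lives in Berlin.", "The user is a nurse."]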
Step 2 - Storing Facts
This is where Honcho comes into play. With Honcho we can initialize Collections for each user and store facts as vector embeddings. You can use multiple collections if you want to segment different types of facts or data, but for now we just need one.
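If you did want to segment facts, you could simply create one collection per category. A quick sketch (the collection names are illustrative, and `user` is assumed to come from honcho.apps.users.get_or_create):
# hypothetical topical collections, one per kind of fact
preferences = honcho.apps.users.collections.create(app_id=app.id, user_id=user.id, name="preferences")
biography = honcho.apps.users.collections.create(app_id=app.id, user_id=user.id, name="biography")
For this guide, though, a single collection per user is enough: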
def store_facts(app_id, user_id, facts):
    # get or create the Honcho user, then their "discord" collection
    user = honcho.apps.users.get_or_create(name=user_id, app_id=app_id)
    try:
        collection = honcho.apps.users.collections.get_by_name(app_id=app_id, user_id=user.id, name="discord")
    except NotFoundError:
        collection = honcho.apps.users.collections.create(app_id=app_id, user_id=user.id, name="discord")
    # store each fact as a document; Honcho embeds it for us
    for fact in facts:
        honcho.apps.users.collections.documents.create(
            app_id=app_id, user_id=user.id, collection_id=collection.id, content=fact
        )
That’s it. We just make the collection if it doesn’t already exist for the user and add the facts.
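Wired together, steps 1 and 2 run whenever a new message arrives, something like this (user_name here is an illustrative stand-in for however you identify the user, e.g. their Discord ID):
# derive facts from the incoming message, then persist them for this user
facts = await derive_facts(user_input)
store_facts(app.id, user_name, facts)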
Step 3 - Retrieving Facts
This is where we actually make use of the facts. To do this we need to first determine a query to the collection. This query is natural language that is compared against the facts in the collection using cosine similarity.
The trick we use below is to have the LLM determine the query and then use it in a later step.
async def introspect(chat_history, input):
    # `system_introspection` is the introspection prompt loaded from the YAML template shown below
    introspection_prompt = ChatPromptTemplate.from_messages([
        system_introspection
    ])
    chain = introspection_prompt | llm
    response = await chain.ainvoke({
        "chat_history": chat_history,
        "user_input": input
    })
    output_parser = NumberedListOutputParser()
    questions = output_parser.parse(response.content)
    return questions
async def respond(chat_history, input):
    # `system_response` is the response prompt shown below; the chat history is appended as messages
    response_prompt = ChatPromptTemplate.from_messages([
        system_response,
        chat_history
    ])
    # `collection` is the user's collection from Step 2
    retrieved_facts = collection.query(query=input, top_k=10)
    chain = response_prompt | llm
    response = await chain.ainvoke({
        "facts": retrieved_facts,
    })
    return response.content
In the introspect method we use the user’s input to derive the query. The prompt used is below:
_type: prompt
input_variables:
["chat_history", "user_input"]
template: >
Given the conversation history and user input, use your theory of mind skills to list out questions you'd like to know about the user in order to best respond to them.
Chat history: ```{chat_history}```
User input: ```{user_input}```
Output the questions as a numbered list.
Then we use the questions it generated as the query in our collection.query() method. The top_k parameter allows you to control how many retrieved facts to return, with a max of 50.
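That glue might look something like the following sketch, assuming we join the generated questions into a single query string (one reasonable choice, not the only one):
questions = await introspect(chat_history, user_input)
query = " ".join(questions)
retrieved_facts = collection.query(query=query, top_k=10)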
The prompt used there is below as well:
_type: prompt
input_variables:
["facts"]
template: >
You are a helpful assistant. Craft a useful response based on the context provided in the conversation history and the facts we know about the user:
```{facts}```
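Putting the three steps together, a message handler might look roughly like the sketch below. The handler and variable names are illustrative, not the exact code from the example repo:
async def handle_message(user_name, chat_history, user_input):
    # Steps 1 & 2: derive new facts from this message and store them for the user
    facts = await derive_facts(user_input)
    store_facts(app.id, user_name, facts)

    # Step 3: work out what we'd like to know, then answer using the retrieved facts
    questions = await introspect(chat_history, user_input)
    return await respond(chat_history, " ".join(questions))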
This is a very simple method of using Honcho to hold user context. For further reading on its limits, read about violation of expectation.