
MemGPT As A Service (MAAS): Building a chat interface around MemGPT


Running MemGPT as an interactive chatbot in the browser.

Build things that don’t scale ~ Paul Graham

Up to this point, we have used MemGPT mainly from the terminal, as demonstrated in my earlier article. However, as Mario Wolframm aptly pointed out, to make it truly useful we should expose it through a browser application, free from the clutter of internal thought processes. Today we explore exactly that: running the MemGPT server as the backend, hosting a chatbot frontend on top of it, and letting users interact purely through a chat interface, with the bot's replies as the only output.

Our end goal today:

In a web browser: Streamlit connected to the MemGPT server

Caution: The MemGPT server APIs are still under development and, as their docs mention, things are volatile. It is always a good idea to peek into their Discord or GitHub to resolve issues.

Setup:

Once again we are going the OpenAI route (to keep things simple), so you will need an OpenAI API key as usual.

OS: Mac

Python: 3.10

I also created a new virtual environment to make sure I was not carrying any baggage. The following needs to be run in the terminal.

conda create --name memgpt310 python=3.10
conda activate memgpt310
pip install -U pymemgpt

This time we are not cloning the GitHub repo; instead, we are installing the stable release of MemGPT.

You might need to configure memgpt and add the OpenAI API key. Once again, to be safe, generate a new API key.

export OPENAI_API_KEY=<Your key>
rm -rf ~/.memgpt
memgpt configure

Execute the following command to make sure memgpt is working from the terminal.

memgpt run

Shoot a couple of questions to make sure it's working.

Also, note that the name of the agent is agent_1. This information is vital and will be useful later.

MemGPT Server

Close the chat and in the same terminal run the following command to spin up the MemGPT Server.

memgpt server

MemGPT is up and running

API Check

Let us now check if we can send some requests to the MemGPT server and get responses. The detailed API docs are here, but you may find them outdated at times.

Open another terminal and run the following curl command.

curl --request GET   \
--url 'http://localhost:8283/agents/config?user_id=sdsd&agent_id=agent_1' \
--header 'accept: application/json'

We are asking for the details of agent_1. The response should look like this:

{
  "name": "agent_1",
  "persona": "sam_pov",
  "human": "basic",
  "preset": "memgpt_chat",
  "context_window": 8192,
  "model": "gpt-4",
  "model_endpoint_type": "openai",
  "model_endpoint": "https://api.openai.com/v1",
  "model_wrapper": null,
  "embedding_endpoint_type": "openai",
  "embedding_endpoint": "https://api.openai.com/v1",
  "embedding_model": null,
  "embedding_dim": 1536,
  "embedding_chunk_size": 300,
  "data_sources": [],
  "create_time": "2024-01-10 07:40:46 PM",
  "memgpt_version": "0.2.11",
  "agent_config_path": "/Users/ashhadulislam/.memgpt/agents/agent_1/config.json"
}

Above is the output of the command (pretty-printed here; in the terminal it arrives as a single line).
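You can make the same request from Python with the requests library. Here is a minimal sketch, assuming the server is running locally on port 8283 as above:

import requests

# Ask the MemGPT server for the configuration of agent_1.
resp = requests.get(
    "http://localhost:8283/agents/config",
    params={"user_id": "sdsd", "agent_id": "agent_1"},
    headers={"accept": "application/json"},
)
resp.raise_for_status()
print(resp.json()["name"])  # agent_1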

If you have Postman, you can try it there as well.

Output in Postman

Final Test: Send a message and get a response

Let us send a message to the server (agent_1) and see if we can get it to respond.

The curl request to send a message to the agent is as follows. Run it in another terminal.

curl --request POST \
--url http://localhost:8283/agents/message \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--data '
{
  "user_id": "sdsd",
  "agent_id": "agent_1",
  "message": "How much does 2 and 2 make?",
  "role": "user",
  "stream": false
}
'

The output received is a JSON response (again pretty-printed here):

{
  "messages": [
    {
      "internal_monologue": "A repetition of the same question? I wonder if Chad is testing my consistency or perhaps the speed of responses. Either way, let's stick with the same simple, direct answer. Keep it reliable, unwavering, just like Sam would - steadfast in the face of complexity, even when it's disguised as simplicity."
    },
    {
      "function_call": "send_message({'message': 'The sum of 2 and 2 still makes 4. Did you perhaps mean to ask something else?'})"
    },
    {
      "assistant_message": "The sum of 2 and 2 still makes 4. Did you perhaps mean to ask something else?"
    },
    {
      "function_return": "None",
      "status": "success"
    }
  ]
}

So the response is JSON data: a list of messages, from which we can pick out the "assistant_message" entry as the bot's reply, ignoring the internal monologue and function-call bookkeeping.
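The same round trip in Python, with that filtering applied (a minimal sketch; the payload mirrors the curl request above):

import requests

# Same payload as the curl request above.
payload = {
    "user_id": "sdsd",
    "agent_id": "agent_1",
    "message": "How much does 2 and 2 make?",
    "role": "user",
    "stream": False,
}
data = requests.post("http://localhost:8283/agents/message", json=payload).json()

# Keep only the assistant's visible reply, dropping the internal
# monologue and function-call bookkeeping.
reply = " ".join(
    m["assistant_message"]
    for m in data["messages"]
    if "assistant_message" in m
)
print(reply)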

With this clarity, let us now use a simple chatbot UI to access the MemGPT server from a web browser.

Streamlit code for chat interface

With the MemGPT server running safely at port 8283, navigate to any directory on your system. Make sure that you have activated the same virtual environment, then install Streamlit.

pip install streamlit

Open a file called chatStream.py and put the following code in.
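Below is a minimal sketch of such an app. It assumes the MemGPT server is still listening on localhost:8283 and defaults to the same user_id and agent_id as the curl requests above; the name field and agent picker are approximated here with two sidebar text inputs, so adjust them to match your setup.

import requests
import streamlit as st

# Where the MemGPT server from the previous section is listening.
MEMGPT_URL = "http://localhost:8283/agents/message"

st.title("Chat with MemGPT")

# Identify ourselves and pick the agent (defaults match the curl examples).
user_id = st.sidebar.text_input("Your name", value="sdsd")
agent_id = st.sidebar.text_input("Agent", value="agent_1")

# Keep the conversation across Streamlit reruns.
if "history" not in st.session_state:
    st.session_state.history = []

# Replay the conversation so far.
for msg in st.session_state.history:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Say something"):
    st.session_state.history.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Same payload as the curl request earlier.
    payload = {
        "user_id": user_id,
        "agent_id": agent_id,
        "message": prompt,
        "role": "user",
        "stream": False,
    }
    resp = requests.post(MEMGPT_URL, json=payload, timeout=120)
    resp.raise_for_status()

    # Keep only the assistant_message entries, dropping the internal
    # monologue and function-call bookkeeping.
    reply = " ".join(
        m["assistant_message"]
        for m in resp.json().get("messages", [])
        if "assistant_message" in m
    ) or "(no assistant_message in response)"

    with st.chat_message("assistant"):
        st.markdown(reply)
    st.session_state.history.append({"role": "assistant", "content": reply})

Because Streamlit reruns the script on every interaction, the conversation is kept in st.session_state and replayed on each run; only the assistant_message entries are displayed, which gives us exactly the clutter-free chat we were after.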

Now just run the app from the terminal.

streamlit run chatStream.py

The app should be available at http://localhost:8502/ (check the URL Streamlit prints in the terminal; the default port is 8501).

Type in a name and choose the agent

As you may notice, this agent is the same agent_1 you created when configuring MemGPT earlier.

Enjoy your conversation

Conclusion

Phew! This was intense, but I am glad we could lift the bot out of the terminal and bring it to the browser. The possibilities are endless. Thanks again to Mario Wolframm for suggesting this route.

That’s it for now. Cheers!

Below are links to my other LLM-based articles:

Open Source LLM:

Textfiles on MemGPT (keep your eyes on this)

Paradigm Shift in Retrieval Augmented Generation: MemGPT [Not really open source as the article mentions use of OpenAI] — This is the old MemGPT, restricted to text only

MemGPT: Assimilating information from multiple PDFs [Part two and predecessor to this post]

Multimodal Image Chat

Super Quick Visual Conversations: Unleashing LLaVA 1.5

PDF Related

Advanced Retrieval with LlamaPacks: Elevating RAG in Fewer Lines of Code!

Super Quick: Retrieval Augmented Generation Using Ollama

Super Quick: Retrieval Augmented Generation (RAG) with Llama 2.0 on Company Information using CPU

Evaluating the Suitability of CPU-based LLMs for Online Usage [Compare the time to response of three different LLMs for RAG activities]

Database Related

Super Quick: LLAMA2 on CPU Machine to Generate SQL Queries from Schema

Closed Source LLM (OpenAI):

PDF Related

Chatbot Document Retrieval: Asking Non-Trivial Questions

Database Related

Super Quick: Connecting ChatGPT to a PostgreSQL database

General Info:

OpenAI Dev Day: 6 Announcements and why us RAGgers need to be worried

Super Quick: RAG Comparison between GPT4 and Open-Source

Peace and Resistance: For Palestine
