Paradigm Shift in Retrieval Augmented Generation: MemGPT
Answering complex queries spanning multiple documents
A challenge encountered in Retrieval Augmented Generation (RAG) with vector embeddings is answering questions whose evidence spans multiple documents. The limitation arises because the Large Language Model (LLM) only sees the discrete chunks retrieved from separate vector embeddings, each coming from a different file.
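To make this failure mode concrete, here is a toy sketch of how top-k vector retrieval scores chunks independently. It uses a bag-of-words vector as a stand-in for real dense embeddings, so the numbers are illustrative only; the point is that a top-1 query surfaces a single document's chunk and the other document's content never reaches the LLM.

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": bag-of-words counts; real RAG uses dense neural embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "Ismail Khuda Bukhsh is the student on the scamming website detection project",
    "Ashhad is working on the project Machine Learning: Scamming Website detection",
]

query = "who works on the scamming website detection project"
q = embed(query)
ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)

# Top-1 retrieval hands the LLM only the single best-matching chunk,
# so the other person's name never enters the prompt.
top_1 = ranked[:1]
print(top_1)
```

Each chunk is scored against the query in isolation; no retrieval step ever combines the two chunks, which is exactly the gap MemGPT addresses below.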
Thus, if we need to infer something from two different files, the model may fail. Here is an example. We have two documents: the first is a project details document of roughly 3,147 words; the second contains just one line.
Document 1:
Project Details Document
Machine Learning: Scamming Website detection
1. Basic details
Student name: Ismail Khuda Bukhsh
…
The expectations of stakeholders must be controlled by providing frequent updates and openly addressing any problems.
Above is an excerpt from a project document on scamming website detection by a student named Ismail Khuda Bukhsh; it is just a sample. The second document contains a single line, as follows.
Document 2:
Ashhad is working on the project Machine Learning: Scamming Website detection
Test on ChatGPT
from llama_index import (
    SimpleDirectoryReader,
    GPTVectorStoreIndex,
    LLMPredictor,
    PromptHelper,
    ServiceContext,
    StorageContext,
    load_index_from_storage,
)
from langchain import OpenAI
import os

os.environ["OPENAI_API_KEY"] = '<Your Key here>'

# ../data/pdfs_1-10_text/ holds all the files you want the chatbot to index
documents = SimpleDirectoryReader('../data/pdfs_1-10_text/').load_data()

llm_predictor = LLMPredictor(llm=OpenAI(temperature=0,
                                        model_name="text-davinci-003"))

max_input_size = 4096
num_output = 256
max_chunk_overlap = 0.7  # now takes a ratio between 0 and 1; earlier versions took an int
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor,
                                               prompt_helper=prompt_helper)
index = GPTVectorStoreIndex.from_documents(
    documents, service_context=service_context
)

# save the index
index.storage_context.persist("../data/indices/pdfs_1-10_text/")

# rebuild the storage context and reload the index
storage_context = StorageContext.from_defaults(persist_dir='../data/indices/pdfs_1-10_text')
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

response = query_engine.query("Who is ismail?")
print("A:", response)

response = query_engine.query("Who is Ashhad?")
print("A:", response)

# replace the question below with your own
response = query_engine.query("How many people are working on the Machine Learning: Phishing Websites Detection project, who are they?")
print("A:", response)
Output by ChatGPT:
Q: Who is ismail?
A: Ismail Khuda Bukhsh is the student working on the project Machine Learning: Scamming Website detection.
Q: Who is Ashhad?
A: Ashhad is a student studying Cybersecurity and Forensic Computing who is working on a project to create an anti-phishing website.
Q: How many people are working on the Machine Learning: Phishing Websites Detection project, who are they?
A: There is only one person working on the Machine Learning: Phishing Websites Detection project, and that is Ismail Khuda Bukhsh.
As you can see, the chatbot fails to recognize that Ismail and Ashhad are working on the project together. Although it identifies Ismail and Ashhad individually, it cannot consolidate the information across the two documents.
Enter MemGPT
Let us see how MemGPT performs.
Enter your message: Who is ismail?
A: The Ismail I found in my records is named Ismail Khuda Bukhsh. He is a student who is involved in a project concerning Machine Learning and scamming website detection.
Enter your message: Who is Ashhad?
A: The Ashhad I found in my records is associated with the project ‘Machine Learning: Scamming Website detection’.
Enter your message: How many people are working on the Machine Learning: Phishing Websites Detection project, who are they?
A: From my records, there are two individuals, Ismail Khuda Bukhsh and Ashhad, associated with a project titled ‘Machine Learning: Scamming Website detection’. Could this be the same project you’re referring to as ‘Machine Learning: Phishing Websites Detection’? If so, I am aware of these two individuals working on it.
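Why does this work? MemGPT gives the LLM agent a memory hierarchy: a bounded main context plus external archival storage that the agent can query through function calls, paging relevant passages in as needed. Here is a minimal sketch of the idea; the class and method names are illustrative, not MemGPT's actual API, and the keyword match stands in for MemGPT's embedding-based search.

```python
# Toy sketch of MemGPT's idea: the agent pages information in from
# archival storage via search calls, then synthesizes across results.
class ArchivalMemory:
    def __init__(self):
        self.passages = []

    def insert(self, text):
        self.passages.append(text)

    def search(self, keyword):
        # Real MemGPT uses embedding search; keyword match keeps the sketch simple.
        return [p for p in self.passages if keyword.lower() in p.lower()]

memory = ArchivalMemory()
memory.insert("Ismail Khuda Bukhsh is the student on Machine Learning: Scamming Website detection")
memory.insert("Ashhad is working on the project Machine Learning: Scamming Website detection")

# Unlike single-shot top-k retrieval, the agent can keep searching and
# accumulate every relevant passage before answering.
hits = memory.search("Scamming Website detection")
print(len(hits))  # 2 — both people surface, so the answer can name them together
```

Because the agent controls the search loop, it can gather all matching passages rather than being limited to whatever a single retrieval call returned.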
Steps to set up MemGPT and test as above
I installed PyTorch separately with CUDA enabled.
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
First, clone the MemGPT repository and install the dependencies.
git clone https://github.com/cpacker/MemGPT.git
cd MemGPT
pip install -r requirements.txt
export OPENAI_API_KEY=<your-openai-api-key>
Next, we add two data files to the folder data/text_files: AnnexDoc.txt and ProjectDetails.txt. These are files I created to test knowledge assimilation by MemGPT; you can download them from the links.
Run as follows.
python3 main.py --archival_storage_files="data/text_files/*.txt" --persona=memgpt_doc --human=basic
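The --archival_storage_files flag globs the matching files and loads their contents into MemGPT's archival memory. Conceptually it behaves something like this simplified loader; the real MemGPT pipeline also embeds each passage, and the function below is an illustrative stand-in, not MemGPT's code.

```python
import glob

def load_archival(pattern, chunk_size=1000):
    # Read every file matching the glob and split it into fixed-size
    # passages, the unit MemGPT stores in archival memory.
    passages = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            text = f.read()
        passages += [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return passages

passages = load_archival("data/text_files/*.txt")
print(f"Loaded {len(passages)} passages")
```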
Questioning Time:
This clearly illustrates the remarkable capability of MemGPT to gather and synthesize information from diverse sources, enabling it to provide comprehensive and well-informed responses.
That is it for now. Next, we need to see how to replace ChatGPT with other LLMs like LLaMA, Falcon, or Mistral. Let's explore exciting experiments and practical uses for this platform, and maybe work together to create something amazing. Take care.
Following are my other LLM-related articles:
Open Source LLM:
Multimodal Image Chat
Super Quick Visual Conversations: Unleashing LLaVA 1.5
PDF Related
Chatbot Document Retrieval: Asking Non-Trivial Questions
Super Quick: Retrieval Augmented Generation (RAG) with Llama 2.0 on Company Information using CPU
Super Quick: Fine-tuning LLAMA 2.0 on CPU with personal data
Database Related
Super Quick: LLAMA2 on CPU Machine to Generate SQL Queries from Schema
Close Source LLM (OpenAI):
PDF Related
Super Quick PDF-based ChatGPT Tutorial in Python
Chatbot Document Retrieval: Asking Non-Trivial Questions
Database Related
Super Quick: Connecting ChatGPT to a PostgreSQL database