Build a chatbot for your application using RAG and ChatGPT
- Published on
- Author: Abhishek D M
Introduction
Imagine you have an application that users typically interact with through a UI. Now you want to give those users a chatbot they can talk to instead.
Fine-tuning a large language model (LLM) on your application's data is impractical: the data changes with every user interaction, so the model would need constant retraining. Instead, we can use Retrieval-Augmented Generation (RAG) to supply the LLM with private or proprietary data not available on the public internet, enabling it to provide useful insights specific to your application.
Architecture
The high-level architecture for RAG includes the following workflows:
Text generation workflow - The user interacts with the chatbot via text. Each user message is converted into embeddings using an embedding model compatible with the chosen LLM; in this example, we use OpenAI's text-embedding-3-small.
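As a concrete illustration, here is a minimal sketch of embedding a user message with LangChain's OpenAI integration. The query text is a made-up example, and the package layout (`langchain-openai`) assumes a recent LangChain release:

```python
# Minimal sketch: turn a user message into an embedding vector.
# Assumes `pip install langchain-openai` and OPENAI_API_KEY in the environment.
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

user_message = "How do I reset my password?"  # hypothetical user input
vector = embeddings.embed_query(user_message)

print(len(vector))  # text-embedding-3-small produces 1536-dimensional vectors
```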
Data ingestion workflow - To supplement the LLM with application-specific knowledge, we use RAG. This involves the following (a sketch of the full ingestion flow follows this list):
- Data source - This is the proprietary data to be used for the chatbot's responses.
- Vector store - Data from the source is broken into smaller chunks, converted into embeddings, and stored in a vector database. Here, we'll use Chroma as our vector store.
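A minimal ingestion sketch under these assumptions: the application exposes a hypothetical REST endpoint returning JSON records with a `body` field, and the `langchain-openai`, `langchain-chroma`, and `langchain-text-splitters` packages are installed:

```python
# Minimal ingestion sketch: fetch data, chunk it, embed it, store it in Chroma.
# The API endpoint and field names are hypothetical placeholders for your app.
import requests
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# 1. Pull proprietary data from the application's API (hypothetical endpoint).
records = requests.get("https://your-app.example.com/api/articles").json()
texts = [record["body"] for record in records]

# 2. Break the documents into smaller, overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.create_documents(texts)

# 3. Embed the chunks and persist them in a local Chroma collection.
vector_store = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    collection_name="app-knowledge",
    persist_directory="./chroma_db",
)
```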
GenAI inference workflow - The chunks retrieved from the vector store are passed to the LLM as context alongside the user's query, and the LLM uses that context to generate its response.
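Putting retrieval and generation together, the inference step might look like the sketch below. It reopens the Chroma collection created above; the model name, `k` value, and prompt wording are illustrative choices, not requirements:

```python
# Minimal retrieval + generation sketch, continuing from the ingestion step.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_name="app-knowledge",
    embedding_function=OpenAIEmbeddings(model="text-embedding-3-small"),
    persist_directory="./chroma_db",
)

def answer(question: str) -> str:
    # Retrieve the chunks most similar to the user's question.
    docs = vector_store.similarity_search(question, k=4)
    context = "\n\n".join(doc.page_content for doc in docs)

    # Pass the retrieved chunks to the LLM as grounding context.
    llm = ChatOpenAI(model="gpt-4")
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.invoke(prompt).content

print(answer("How do I reset my password?"))
```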
Tech Stack
Our stack includes:
- Language: Python
- LLM: OpenAI GPT-4
- Embedding model: text-embedding-3-small
- Vector store: Chroma
- SDK: LangChain
- UI: Streamlit
- Application: any application that exposes API endpoints
Setup
```python
print("Hello AI")
```
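Before wiring up the pipeline, install the dependencies with `pip install langchain-openai langchain-chroma langchain-text-splitters streamlit requests` (package names assume current LangChain releases) and export your OpenAI API key. A quick check that the key is visible to Python, assuming it is exported as the standard `OPENAI_API_KEY` variable:

```python
# Verify the OpenAI API key is available before running the app.
import os

assert os.environ.get("OPENAI_API_KEY"), "Export OPENAI_API_KEY first"
print("Environment is ready")
```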