Site RAG Project Description
What is the project about?
Site RAG is a Chrome extension that allows users to ask questions about websites using a Retrieval-Augmented Generation (RAG) approach.
What problem does it solve?
It enables users to quickly get answers to questions based on the content of a website, either a single page or an entire site, without manually searching through the information. It also allows for persistent indexing of websites for repeated querying.
What are the features of the project?
- One-off queries on the current page.
- Indexing of the current page and persisting documents in a vector store for RAG.
- Indexing of an entire site and persisting documents in a vector store for RAG.
- 100% local operation within the browser, storing secrets in browser storage.
- Optional connection to a locally running Ollama instance for local LLM inference.
- "Multi query mode" generates multiple queries for a more comprehensive search.
- Support for follow-up questions, maintaining context from previous interactions.
- "Context stuff mode" includes the entire content of the current page in the system prompt.
- Support for multiple LLM providers.
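
The "Multi query mode" feature above can be illustrated with a minimal sketch: several phrasings of one question are each sent to a retriever, and the results are merged with duplicates removed. Everything here is hypothetical and for illustration only. The `Doc` and `Retriever` types, the `multiQueryRetrieve` function, and the toy keyword retriever are assumptions, not names from the Site RAG codebase. In the real extension the query variants would presumably come from an LLM and retrieval would go through the vector store.

```typescript
// Hypothetical types; names are illustrative, not taken from the Site RAG codebase.
interface Doc {
  id: string;
  text: string;
}

type Retriever = (query: string) => Doc[];

// Runs every query variant through the retriever and merges the results,
// keeping the first occurrence of each document id.
function multiQueryRetrieve(
  queries: string[],
  retrieve: Retriever,
  limit: number,
): Doc[] {
  const seen = new Set<string>();
  const merged: Doc[] = [];
  for (const query of queries) {
    for (const doc of retrieve(query)) {
      if (seen.has(doc.id)) continue; // skip documents found by an earlier variant
      seen.add(doc.id);
      merged.push(doc);
    }
  }
  return merged.slice(0, limit);
}

// Toy corpus and keyword retriever standing in for a vector store lookup.
const corpus: Doc[] = [
  { id: "a", text: "Pricing starts at ten dollars per month." },
  { id: "b", text: "The free tier has a monthly usage limit." },
  { id: "c", text: "Contact support by email." },
];

const keywordRetriever: Retriever = (query) => {
  const terms = query.toLowerCase().split(/\s+/);
  return corpus.filter((doc) =>
    terms.some((term) => doc.text.toLowerCase().includes(term)),
  );
};

// The variants are hard-coded here; an LLM would normally generate them.
const results = multiQueryRetrieve(
  ["pricing", "monthly cost", "free tier"],
  keywordRetriever,
  5,
);
console.log(results.map((doc) => doc.id));
```

Merging by document id keeps the context passed to the model free of repeated chunks when several variants hit the same passage.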
What are the technologies used in the project?
- Language: Likely TypeScript or JavaScript (based on `yarn install`, `yarn build`, and file extensions like `.ts`).
- Framework/Libraries: LangChain (implied by `langchain.chat_models_universal.initChatModel`), and other dependencies managed by `yarn`.
- Database: Supabase (PostgreSQL with pgvector extension) for vector storage.
- LLM Providers: Anthropic, OpenAI, Google GenAI, Together AI, and potentially others (extensible).
- Web Scraping: FireCrawl API.
- Chrome Extension API: For building the browser extension.
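
To make the vector storage item above concrete, here is a minimal in-memory sketch of similarity search over stored embeddings. It is a stand-in under stated assumptions, not the project's implementation: the `StoredChunk` type, the `cosineSimilarity` and `topK` functions, and the three-dimensional example embeddings are all hypothetical. In a pgvector-backed store the ranking would run in the database (pgvector provides a cosine distance operator, `<=>`), and real embeddings have far more dimensions.

```typescript
// Hypothetical record shape; the real table schema is not described in the source.
interface StoredChunk {
  url: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k chunks whose embeddings are most similar to the query embedding.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Made-up embeddings for illustration only.
const store: StoredChunk[] = [
  { url: "/pricing", text: "Plans and prices", embedding: [0.9, 0.1, 0] },
  { url: "/support", text: "Contact support", embedding: [0, 1, 0] },
  { url: "/docs", text: "Getting started", embedding: [0.5, 0.5, 0] },
];

const nearest = topK([1, 0, 0], store, 2);
console.log(nearest.map((chunk) => chunk.url));
```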
What are the benefits of the project?
- Efficient Information Retrieval: Quickly find answers within websites.
- Local Processing: Data and processing stay within the user's browser.
- Flexibility: Supports various LLMs and indexing options.
- Persistence: Indexed data can be stored for later use.
- Contextual Awareness: Maintains context in follow-up questions.
What are the use cases of the project?
- Researching information on a specific website.
- Quickly finding answers to questions while browsing.
- Creating a knowledge base from a website for repeated querying.
- Comparing information across multiple pages or websites.
- Summarizing the content of web pages or entire sites.
