Site RAG Project Description

What is the project about?

Site RAG is a Chrome extension that allows users to ask questions about websites using a Retrieval-Augmented Generation (RAG) approach.

What problem does it solve?

It enables users to quickly get answers to questions based on the content of a website, either a single page or an entire site, without manually searching through the information. It also allows for persistent indexing of websites for repeated querying.

What are the features of the project?

One-off queries on the current page.
Indexing of the current page and persisting documents in a vector store for RAG.
Indexing of an entire site and persisting documents in a vector store for RAG.
Runs 100% locally within the browser, with secrets kept in browser storage.
Optional connection to a locally running Ollama instance for local LLM inference.
"Multi query mode" generates multiple queries for a more comprehensive search.
Support for follow-up questions, maintaining context from previous interactions.
"Context stuff mode" includes the entire content of the current page in the system prompt.
Support for multiple LLM providers.
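
As an illustration of "multi query mode", the sketch below runs several query variants against a retriever and merges the results, keeping the best score for each document. It is a minimal sketch under stated assumptions, not Site RAG's actual implementation: the RetrievedDoc shape, the Retriever type, and the mergeResults and multiQueryRetrieve helpers are all hypothetical names introduced here.

```typescript
// Hypothetical document shape; the extension's real types are not shown in the source.
interface RetrievedDoc {
  id: string;
  content: string;
  score: number; // higher means more relevant
}

// Assumption: a retriever is any async function from a query string to scored documents.
type Retriever = (query: string) => Promise<RetrievedDoc[]>;

// Merge several result sets, keeping the highest-scoring copy of each document id,
// then return the top `limit` documents in descending score order.
function mergeResults(resultSets: RetrievedDoc[][], limit: number): RetrievedDoc[] {
  const best = new Map<string, RetrievedDoc>();
  for (const results of resultSets) {
    for (const doc of results) {
      const seen = best.get(doc.id);
      if (seen === undefined || doc.score > seen.score) {
        best.set(doc.id, doc);
      }
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Run every query variant in parallel and merge what comes back.
async function multiQueryRetrieve(
  queries: string[],
  retrieve: Retriever,
  limit: number = 5,
): Promise<RetrievedDoc[]> {
  const resultSets = await Promise.all(queries.map((q) => retrieve(q)));
  return mergeResults(resultSets, limit);
}
```

In practice the query variants would come from an LLM rephrasing the user's question. Deduplicating by document id keeps a chunk that matches several variants from crowding out other results.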

What are the technologies used in the project?

Language: Likely TypeScript (inferred from .ts file extensions), with yarn used to install dependencies and build (yarn install, yarn build).
Framework/Libraries: LangChain (implied by the use of langchain.chat_models_universal.initChatModel), plus other dependencies managed with yarn.
Database: Supabase (PostgreSQL with pgvector extension) for vector storage.
LLM Providers: Anthropic, OpenAI, Google GenAI, Together AI, and potentially others (extensible).
Web Scraping: FireCrawl API.
Chrome Extension API: For building the browser extension.
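
To make the role of the vector store concrete, the sketch below shows the kind of similarity search a store such as pgvector performs, done here in memory with cosine similarity. It is an illustrative sketch only: the source does not say which distance metric Site RAG uses, and StoredChunk, cosineSimilarity, and topK are hypothetical names introduced for this example.

```typescript
// Hypothetical stored record: a text chunk plus its embedding vector.
interface StoredChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are most similar to the query embedding.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

A database-backed store does the same ranking server-side, typically with an index, so the extension would only send the query embedding and receive the closest chunks.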

What are the benefits of the project?

Efficient Information Retrieval: Quickly find answers within websites.
Local Processing: The extension runs within the user's browser, and LLM inference can also stay local through Ollama.
Flexibility: Supports various LLMs and indexing options.
Persistence: Indexed data can be stored for later use.
Contextual Awareness: Maintains context in follow-up questions.

What are the use cases of the project?

Researching information on a specific website.
Quickly finding answers to questions while browsing.
Creating a knowledge base from a website for repeated querying.
Comparing information across multiple pages or websites.
Summarizing the content of web pages or entire sites.