Guidance
What is the project about?
Guidance is a programming paradigm and Python library for controlling and steering large language models (LLMs) more effectively and efficiently than traditional prompting or fine-tuning. It allows for structured output generation, constrained generation, and seamless interleaving of control flow and generation.
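For orientation, here is a minimal sketch of the core idea, assuming the current `guidance` API (`models`, `gen`, `select`); the model name and prompt are illustrative, and any local Transformers model would do.

```python
from guidance import models, gen, select

# Load a local model (illustrative choice; any Transformers model works).
lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")

# Plain strings, constraints, and open-ended generation compose with `+`.
lm = lm + "Do you want a joke or a poem? A " + select(["joke", "poem"], name="choice")
lm = lm + ", here it is: " + gen(name="text", stop="\n", max_tokens=50)

# Captured values are indexed by the `name` given to each constraint.
print(lm["choice"], lm["text"])
```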
What problem does it solve?
- Lack of Control: Traditional prompting gives limited control over the structure and content of LLM output.
- Inefficiency: Standard prompting and chaining can be slow and expensive, requiring multiple LLM calls and intermediate parsing.
- Tokenization Issues: Standard tokenization can lead to unexpected behavior and biases, especially at prompt boundaries.
- Complex Tool Integration: Integrating tools (like calculators or search engines) with LLMs traditionally requires complex parsing and handling of intermediate outputs.
- Unstructured Output: Plain prompting offers no guarantee that output conforms to a required format (e.g., valid JSON).
What are the features of the project?
- Pure Python Syntax: Write generation logic using familiar Python constructs (conditionals, loops) with added LLM-specific functionality.
- Constrained Generation: Force the model to generate output that adheres to specific constraints (see the first sketch after this list):
- Selection: Choose from a predefined set of options.
- Regular Expressions: Match generated text against regular expressions.
- Context-Free Grammars: Define complex output structures using CFGs.
- Pre-built Components: Use built-in functions such as `substring` and `json`.
- Stateful Control + Generation: Create functions that combine control flow (if/else, loops) with generation, eliminating the need for external parsers. The whole function executes as a single stateful LLM call, improving speed (a sketch follows this list).
- Tool Use: Easily integrate external tools (like calculators) by defining trigger grammars and tool functions. The model automatically stops generation, calls the tool, and resumes (a hedged sketch follows this list).
- Token Healing: Automatically handles token boundary issues, allowing users to work with text instead of worrying about tokenization artifacts.
- Rich Templating: Use f-string-like syntax for easy template creation.
- Chat Abstraction: Provides a clean interface for interacting with chat models via the `system()`, `user()`, and `assistant()` context managers, handling special tokens automatically.
- Reusable Components: Create and reuse custom guidance functions.
- Streaming Support: Supports streaming output, integrated with Jupyter notebooks.
- Multi-modal Support: Works with images, demonstrated with Gemini.
- Backend Compatibility: Supports various backends, including Transformers, llama.cpp, AzureAI, VertexAI, and OpenAI.
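The constraint primitives can be sketched as follows, assuming the `guidance` API above; the model name and prompts are illustrative.

```python
from guidance import models, gen, select, substring

lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")  # illustrative

# Selection: the model must pick exactly one of the listed options.
lm = lm + "The sentiment is " + select(["positive", "negative", "neutral"], name="sentiment")

# Regular expressions: the generated text must match the pattern.
lm = lm + "\nConfidence (0-100): " + gen(name="confidence", regex=r"\d{1,3}")

# substring: the output must be a literal span of the given source text.
lm = lm + "\nSupporting quote: " + substring("The product arrived late but works well.")
```

Stateful control plus generation lives inside a single decorated function; the ticket-triage logic below is invented for illustration.

```python
import guidance
from guidance import models, gen, select

@guidance
def triage(lm, ticket_text):
    # Ordinary Python control flow interleaves with generation.
    lm += f"Ticket: {ticket_text}\nCategory: "
    lm += select(["bug", "feature", "question"], name="category")
    if lm["category"] == "bug":
        # Branch on what the model just generated; no external parser needed.
        lm += "\nSeverity: "
        lm += select(["low", "medium", "high"], name="severity")
    lm += "\nSummary: "
    lm += gen(name="summary", stop="\n", max_tokens=40)
    return lm

lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")  # illustrative
lm += triage("App crashes when I click save.")
```

Tool use follows the same pattern: each tool is itself a guidance function and, assuming `gen` accepts a `tools=` list as in recent releases, the model pauses when it emits a matching call, the function runs, and generation resumes.

```python
import guidance
from guidance import models, gen

@guidance
def add(lm, input1, input2):
    # Invoked automatically when the model writes e.g. "add(20, 42)".
    lm += f" = {int(input1) + int(input2)}"
    return lm

lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")  # illustrative
lm = lm + "1 + 2 = add(1, 2) = 3\n20 + 42 = " + gen(max_tokens=20, tools=[add])
```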
What are the technologies used in the project?
- Python: The primary programming language.
- Language Models: Supports various LLMs through different backends (a portability sketch follows this list):
- Transformers: (Hugging Face Transformers library)
- llama.cpp: For local execution of Llama-family and other GGUF-format models.
- OpenAI: (GPT-3.5, etc.)
- VertexAI: (Google's Vertex AI platform, including PaLM 2 and Gemini)
- AzureAI: (Microsoft Azure AI)
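Because the same guidance program runs against different backends, switching models is mostly a one-line change. A minimal sketch, assuming the `models` namespace and the chat role context managers; the model names and file path are placeholders.

```python
from guidance import models, gen, system, user, assistant

# Pick a backend; the rest of the program stays the same.
lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")  # local Hugging Face model
# lm = models.LlamaCpp("/path/to/model.gguf")                 # local llama.cpp model
# lm = models.OpenAI("gpt-3.5-turbo")                         # remote OpenAI model

# Chat models use role context managers; special tokens are handled automatically.
with system():
    lm += "You are a concise assistant."
with user():
    lm += "Name one benefit of constrained generation."
with assistant():
    lm += gen(name="answer", max_tokens=60)

print(lm["answer"])
```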
What are the benefits of the project?
- Increased Control: Precise control over LLM output structure and content.
- Improved Efficiency: Faster and more cost-effective than traditional prompt chaining, because prompt text and generation are batched into a single stateful LLM call rather than many separate calls.
- Simplified Development: Easier to write and maintain complex LLM interactions.
- Reduced Errors: Token healing and constrained generation minimize unexpected behavior.
- Seamless Tool Integration: Simplified tool use without complex parsing.
- Higher Quality Output: Constrained generation and structured output reduce malformed or off-format responses.
- Portability: Write once, run on multiple backends.
What are the use cases of the project?
- Chatbots: Building chatbots with complex conversation flows and tool integration.
- Structured Data Extraction: Extracting information from text into structured formats such as JSON (see the sketch after this list).
- Code Generation: Generating code that adheres to specific syntax and constraints.
- Content Creation: Creating content with specific formatting and style requirements.
- Reasoning Tasks: Implementing reasoning frameworks like ReAct.
- Question Answering: Building question-answering systems with controlled responses.
- Any task requiring precise control over LLM output.
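As a concrete illustration of structured extraction, constraints can be embedded directly in an f-string template so the output is well-formed JSON by construction. A sketch assuming the API above; the schema, field names, and example text are invented.

```python
import guidance
from guidance import models, gen

@guidance
def extract_person(lm, text):
    # Each gen() fills one field; the surrounding JSON skeleton is fixed text.
    lm += f"""\
Extract the person mentioned below as JSON.
Text: {text}
{{
    "name": "{gen('name', stop='"')}",
    "age": {gen('age', regex='[0-9]+', stop=',')},
    "occupation": "{gen('occupation', stop='"')}"
}}"""
    return lm

lm = models.Transformers("microsoft/Phi-3-mini-4k-instruct")  # illustrative
lm += extract_person("Ada Lovelace, age 36, was a mathematician.")
print(lm["name"], lm["age"], lm["occupation"])
```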
