LitServe

A Lightning-fast serving engine for AI models built on FastAPI.

What is the project about?

LitServe is a serving engine designed for deploying AI models. It enhances FastAPI with features tailored for AI workloads, such as batching, streaming, and GPU autoscaling.
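
To make the batching idea concrete, here is a minimal sketch in plain Python. It is not LitServe's implementation: the function name `run_in_batches`, the `max_batch_size` parameter, and the toy model `square_batch` are all assumptions made for illustration. It only shows why grouping requests reduces the number of model calls.

```python
from typing import Callable, List, Sequence


def run_in_batches(
    requests: Sequence[float],
    predict_batch: Callable[[List[float]], List[float]],
    max_batch_size: int = 4,
) -> List[float]:
    """Group pending requests into batches and call the model once per batch."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    results: List[float] = []
    for start in range(0, len(requests), max_batch_size):
        # Slice out up to max_batch_size requests for a single model call.
        batch = list(requests[start:start + max_batch_size])
        results.extend(predict_batch(batch))
    return results


calls: List[int] = []  # records the size of each batch the toy model sees


def square_batch(batch: List[float]) -> List[float]:
    """Hypothetical stand-in for a model that processes a whole batch at once."""
    calls.append(len(batch))
    return [x * x for x in batch]


outputs = run_in_batches([1.0, 2.0, 3.0, 4.0, 5.0], square_batch, max_batch_size=2)
print(outputs)  # one result per request, in request order
print(calls)    # how many requests each model call handled
```

Five requests are served with three model calls instead of five. A real serving engine would also bound how long a request waits for its batch to fill.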

What problem does it solve?

It simplifies deploying and serving AI models by removing the need to rebuild a FastAPI server for each model. It also serves at least 2x faster than plain FastAPI and provides enterprise-scale features such as authentication and autoscaling.

What are the features of the project?

  • At least 2x faster than plain FastAPI
  • Bring your own model
  • Build compound systems (1+ models)
  • GPU autoscaling
  • Batching
  • Streaming
  • Worker autoscaling
  • Self-host on your machines or fully managed on Lightning AI
  • Serve all models (LLMs, vision, etc.)
  • Scale to zero (serverless)
  • Supports PyTorch, JAX, TensorFlow, etc.
  • OpenAPI compliant
  • OpenAI compatibility
  • Authentication
  • Dockerization
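
Streaming, listed above, means returning output in chunks as they are produced rather than as one final response. The sketch below illustrates the idea with a plain Python generator. It does not use LitServe's API; `stream_tokens` and its word-by-word "model" are hypothetical names chosen for illustration.

```python
from typing import Iterator


def stream_tokens(prompt: str) -> Iterator[str]:
    """Yield one output chunk at a time instead of building the full response."""
    for word in prompt.split():
        # A real model would generate tokens here; upper-casing is a toy stand-in.
        yield word.upper()


# A client can start consuming chunks before generation has finished.
chunks = []
for chunk in stream_tokens("hello streaming world"):
    chunks.append(chunk)
    print(chunk)
```

The generator form lets a server forward each chunk to the client as soon as it exists, which matters most for long LLM outputs.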

What are the technologies used in the project?

  • FastAPI
  • Python
  • Support for PyTorch, JAX, TensorFlow, and other ML frameworks.
  • Optional: vLLM, LitGPT

What are the benefits of the project?

  • Faster serving: At least 2x faster than plain FastAPI due to AI-specific multi-worker handling.
  • Easy to use: Simple API for defining and deploying models.
  • Flexibility: Supports various models and frameworks, as well as compound AI systems.
  • Scalability: Features like batching, GPU autoscaling, and worker autoscaling.
  • Hosting Options: Self-host or use Lightning Studios for managed deployment.
  • Enterprise ready: Features like authentication and autoscaling.
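
As a rough illustration of the "simple API" point above, the sketch below mirrors the four-hook request lifecycle that LitServe's `LitAPI` class is understood to use (`setup`, `decode_request`, `predict`, `encode_response`). It is written in plain Python so it runs without LitServe installed. The class name `SquareAPI`, the `"input"` and `"output"` payload keys, and the toy model are assumptions; with the library installed, the class would subclass `litserve.LitAPI` and be handed to `litserve.LitServer`. Verify the exact signatures against the project's current documentation.

```python
class SquareAPI:
    """Plain-Python stand-in that mirrors the assumed LitAPI hook names."""

    def setup(self, device):
        # Called once per worker: load the model here. A toy function stands in.
        self.device = device
        self.model = lambda x: x ** 2

    def decode_request(self, request):
        # Turn the incoming JSON payload into model input (hypothetical key).
        return request["input"]

    def predict(self, x):
        # Run inference on the decoded input.
        return self.model(x)

    def encode_response(self, output):
        # Turn the model output back into a JSON-serializable payload.
        return {"output": output}


# Drive the hooks by hand in the order a server would call them per request.
api = SquareAPI()
api.setup(device="cpu")
response = api.encode_response(api.predict(api.decode_request({"input": 4.0})))
print(response)
```

Keeping decoding, inference, and encoding in separate hooks is what lets a serving engine insert batching or streaming between them without changes to the model code.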

What are the use cases of the project?

  • Deploying any type of AI model (LLMs, vision, audio, NLP, etc.).
  • Building compound AI systems with multiple models.
  • Creating APIs for AI-powered applications.
  • Serving models for real-time inference.
  • High-performance LLM serving (with integrations like vLLM or LitGPT).
  • RAG applications.
  • Proxy servers.