Transformers.js: State-of-the-art Machine Learning for the Web

What is the project about?

Transformers.js is a JavaScript library that brings Hugging Face's Transformers to the web browser. It lets you run pre-trained machine learning models directly in the browser, with no server required, and is designed as a JavaScript equivalent of the Python transformers library.
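
For example, a sentiment-analysis pipeline runs entirely client-side in a few lines. A minimal sketch using the @huggingface/transformers npm package (the first call downloads the model from the Hugging Face Hub and caches it):

```js
import { pipeline } from '@huggingface/transformers';

// Build a sentiment-analysis pipeline; the model is fetched and
// cached by the browser on first use.
const classifier = await pipeline('sentiment-analysis');

// Inference happens locally, with no server round trip.
const result = await classifier('I love Transformers.js!');
// e.g. [{ label: 'POSITIVE', score: 0.9998 }]
```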

What problem does it solve?

  • Serverless Inference: Eliminates the need for a dedicated backend server to perform inference with Transformer models. This reduces latency, improves user privacy (data doesn't leave the user's device), and lowers infrastructure costs.
  • Accessibility: Makes state-of-the-art machine learning models accessible to web developers without requiring deep expertise in Python or server-side infrastructure.
  • Offline Capability: Since processing happens in the browser, applications can potentially work offline (after initial model download).
  • Cross-Platform: Works in any modern web browser, including mobile browsers, and also in server-side runtimes such as Node.js.

What are the features of the project?

  • pipeline API: Provides a simple and familiar API (similar to the Python library) for running common tasks, as in the sketch above. Pipelines handle preprocessing, model execution, and postprocessing.
  • Wide Range of Tasks: Supports a broad spectrum of tasks across different modalities:
    • Natural Language Processing (NLP): Sentiment analysis, text classification, question answering, summarization, translation, text generation, named entity recognition, and more.
    • Computer Vision: Image classification, object detection, image segmentation, depth estimation.
    • Audio: Automatic speech recognition, audio classification, text-to-speech.
    • Multimodal: Zero-shot image classification, zero-shot audio classification, zero-shot object detection, image-to-text, and document question answering.
  • Model Support: Works with a large number of pre-trained models from the Hugging Face Hub, covering various architectures (BERT, GPT-2, ViT, Whisper, and many more).
  • ONNX Runtime Integration: Uses ONNX Runtime for efficient model execution in the browser (both WebAssembly/CPU and WebGPU).
  • Model Quantization: Supports quantized models (e.g., q4, q8) to reduce model size and improve performance, especially in resource-constrained environments like web browsers.
  • WebGPU Support: Allows leveraging the GPU for accelerated inference (experimental). Quantization level and device are both pipeline options, as sketched after this list.
  • Customizable: Allows specifying custom model locations, disabling remote model loading, and configuring WASM paths (see the configuration sketch after this list).
  • Easy Model Conversion: Provides a script to convert PyTorch, TensorFlow, or JAX models to ONNX format for use with Transformers.js.
  • Examples and Templates: Offers various example applications and templates (React, Next.js, Node.js, browser extensions, etc.) to help developers get started.
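
Device and quantization settings are passed as pipeline options. A minimal sketch, assuming the v3-style `device` and `dtype` options; the model ID is illustrative:

```js
import { pipeline } from '@huggingface/transformers';

// Run Whisper with 4-bit quantized weights on the GPU.
// WebGPU is experimental; browsers without it will throw here.
const transcriber = await pipeline(
  'automatic-speech-recognition',
  'onnx-community/whisper-tiny.en', // illustrative model ID
  { device: 'webgpu', dtype: 'q4' },
);

const output = await transcriber('https://example.com/audio.wav');
// e.g. { text: '...' }
```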
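
The customization hooks mentioned above live on the exported `env` object. A sketch of serving models and the ONNX Runtime WASM binaries from your own origin (the paths are illustrative):

```js
import { env } from '@huggingface/transformers';

// Only load models bundled with the app; never contact the Hugging Face Hub.
env.allowRemoteModels = false;
env.localModelPath = '/models/'; // illustrative self-hosted path

// Serve the ONNX Runtime WASM binaries from your own server as well.
env.backends.onnx.wasm.wasmPaths = '/wasm/';
```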

What are the technologies used in the project?

  • JavaScript: The primary language of the library.
  • ONNX Runtime: A cross-platform machine learning inference engine. Transformers.js uses the WebAssembly (WASM) and WebGPU builds of ONNX Runtime.
  • WebAssembly (WASM): Provides near-native performance for CPU-based inference.
  • WebGPU: An emerging web standard for GPU computation (used for GPU-accelerated inference).
  • Hugging Face Hub: Used for accessing pre-trained models and (optionally) datasets.
  • 🤗 Optimum: Used by the conversion script to convert and quantize models to ONNX.
  • Node.js/npm: Used for package management and development.
  • CDN (jsDelivr): Provides an alternative way to include the library in web projects without a bundler.
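
For the CDN route, the library can be imported straight into a page as an ES module. A sketch (pin a specific version in production):

```html
<script type="module">
  // Load Transformers.js from jsDelivr, no build step required.
  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';

  const classifier = await pipeline('sentiment-analysis');
  console.log(await classifier('No bundler required!'));
</script>
```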

What are the benefits of the project?

  • Reduced Latency: Inference happens locally, eliminating network round trips to a server.
  • Enhanced Privacy: User data remains on the client-side.
  • Lower Costs: No need for server infrastructure to run models.
  • Offline Functionality: Applications can work offline after the initial model download.
  • Simplified Development: Easy-to-use API makes integrating ML models into web apps straightforward.
  • Scalability: Client-side processing scales naturally with the number of users.
  • Democratization of AI: Makes advanced ML models more accessible to web developers.

What are the use cases of the project?

  • Real-time Language Translation: Translate text in a web page or application instantly.
  • Sentiment Analysis: Analyze the sentiment of user input in real-time (e.g., in a chat application).
  • Text Summarization: Summarize articles or documents within a browser extension.
  • Image Classification: Classify images uploaded by users directly in the browser.
  • Object Detection: Detect objects in images or video streams in real-time (e.g., for accessibility features).
  • Speech Recognition: Transcribe audio in the browser (e.g., for voice-controlled applications).
  • Text-to-Speech: Generate speech from text within a web application.
  • Interactive Demos: Create interactive demos of ML models that run entirely in the browser.
  • Educational Tools: Build educational applications that teach about machine learning.
  • Client-side Semantic Search: Search images or text based on meaning, not just keywords.
  • Code Completion: Provide AI-powered code completion in a web-based code editor.
  • Gaming: Create real-time ML-powered games, like sketch recognition.
  • Browser Extensions: Enhance browsing experience with ML-powered features.
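
As a concrete instance of the translation use case, an in-browser translator takes only a few lines. A sketch using a many-to-many NLLB model from the Hub (the model ID and language codes follow Hub conventions):

```js
import { pipeline } from '@huggingface/transformers';

// Many-to-many translation, entirely client-side.
const translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');

const output = await translator('Machine learning in the browser is here.', {
  src_lang: 'eng_Latn', // source: English
  tgt_lang: 'fra_Latn', // target: French
});
// e.g. [{ translation_text: '...' }]
```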