Project Description: Surya

What is the project about?

Surya is a document Optical Character Recognition (OCR) toolkit for extracting text and structural information from many types of documents.

What problem does it solve?

Surya addresses the challenge of accurately extracting text and structural information from documents, including documents written in multiple languages, documents with complex layouts, and documents that contain tables or mathematical formulas. It aims to provide a high-quality, open-source alternative to commercial OCR services.

What are the features of the project?

  • Multilingual OCR: Supports OCR in 90+ languages, with performance comparable to cloud-based services.
  • Text Line Detection: Identifies text lines in any language.
  • Layout Analysis: Detects document elements like tables, images, headers, and more.
  • Reading Order Detection: Determines the correct reading order of text in complex layouts.
  • Table Recognition: Identifies table structures, including rows, columns, and cells.
  • LaTeX OCR: Extracts mathematical formulas in LaTeX format.
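
To make the reading-order feature above more concrete, here is a minimal sketch of the problem it addresses. It does not use Surya's API or its model: `TextLine`, `naive_reading_order`, and `row_tolerance` are hypothetical names introduced for illustration, and the heuristic shown (sort detected lines top to bottom, then left to right) is a deliberately simple stand-in.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    """A detected text line (hypothetical structure, not Surya's output type)."""
    text: str
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), origin at top-left


def naive_reading_order(lines, row_tolerance=10.0):
    """Order lines top to bottom, then left to right within each row.

    Lines whose top edges lie within `row_tolerance` pixels of the first
    line in the current row are treated as belonging to that row.
    """
    by_position = sorted(lines, key=lambda l: (l.bbox[1], l.bbox[0]))
    rows = []
    for line in by_position:
        if rows and abs(line.bbox[1] - rows[-1][0].bbox[1]) <= row_tolerance:
            rows[-1].append(line)  # same visual row as the previous line
        else:
            rows.append([line])    # start a new row
    return [l for row in rows for l in sorted(row, key=lambda l: l.bbox[0])]


if __name__ == "__main__":
    detected = [
        TextLine("World", (120, 12, 200, 30)),
        TextLine("Hello", (10, 10, 100, 30)),
        TextLine("Second", (10, 50, 100, 70)),
    ]
    print([l.text for l in naive_reading_order(detected)])
    # prints ['Hello', 'World', 'Second']
```

A heuristic like this interleaves columns on multi-column pages, which illustrates why reading order in complex layouts is treated as a separate detection task rather than a simple sort.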

What are the technologies used in the project?

  • Python 3.10+
  • PyTorch
  • Streamlit (for the interactive app)
  • Other libraries and models credited in the "Thanks" section of the project's README, such as Segformer, EfficientViT, timm, Donut, transformers, and CRAFT.

What are the benefits of the project?

  • High Accuracy: Benchmarks favorably against commercial OCR services.
  • Versatility: Handles a wide range of document types and languages.
  • Open Source: Freely available for personal and research use, with commercial use options.
  • Customizable: Settings can be adjusted via environment variables.
  • Performance Optimization: Supports batch processing and model compilation for speed.
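
The last two points (settings via environment variables and batch processing) can be sketched as follows. This is an illustration under stated assumptions, not Surya's configuration code: the variable names `OCR_BATCH_SIZE` and `OCR_COMPILE` and the helpers `env_int`, `env_flag`, and `batched` are hypothetical, so check the project's documentation for the variable names it actually reads.

```python
import os
from itertools import islice


def env_int(name, default):
    """Read an integer setting from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return int(raw) if raw else default


def env_flag(name, default=False):
    """Read a boolean setting; '1', 'true', and 'yes' (any case) count as on."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes"}


def batched(items, batch_size):
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(items)
    while batch := list(islice(it, batch_size)):
        yield batch


if __name__ == "__main__":
    # Hypothetical variable names, used only for this sketch.
    batch_size = env_int("OCR_BATCH_SIZE", 8)
    compile_model = env_flag("OCR_COMPILE")
    pages = [f"page_{i}.png" for i in range(20)]
    for batch in batched(pages, batch_size):
        # A real pipeline would pass each batch to the OCR model here.
        print(len(batch), "pages, compile =", compile_model)
```

Larger batches generally trade memory for throughput, so exposing the batch size as an environment variable lets the same script run on machines with different amounts of GPU or CPU memory.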

What are the use cases of the project?

  • Digitizing documents for archiving or analysis.
  • Extracting data from scanned documents, forms, and tables.
  • Creating accessible versions of documents for visually impaired users.
  • Building document understanding pipelines for research or commercial applications.
  • OCR for scientific papers, including mathematical formulas.