
What is the project about?

WebUI is a user-friendly interface built on Gradio for interacting with AI agents that can access websites. It extends the functionality of the browser-use project, enabling easier control and expanded capabilities.

What problem does it solve?

It simplifies interaction with AI-powered web browsing and makes it more accessible. It removes the need for complex configuration and provides a visual way to manage and observe an AI agent's actions on the web. It also addresses the common problem of having to log in to websites again by letting users run the agent in their own browser.

What are the features of the project?

  • User-Friendly Interface: A Gradio-based UI for easy interaction with the browser agent.
  • Expanded LLM Support: Integrates with various Large Language Models (LLMs), including Google, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama.
  • Custom Browser Support: Allows users to use their own browser, avoiding re-authentication and supporting high-definition screen recording.
  • Persistent Browser Sessions: Option to keep the browser window open between AI tasks, preserving history and state.
  • Docker Support: Easy deployment and management using Docker and Docker Compose.
  • Configurable Resolution: Set custom screen resolutions for the browser.
  • VNC Integration: View browser interactions in real time via a VNC viewer.
  • Theming: Customizable UI themes.
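
As a rough illustration of how the provider choice, the persistent-session option, and the resolution setting listed above might be grouped into a single configuration object, here is a minimal sketch. Every name in it (`AgentSettings`, `parse_resolution`, `SUPPORTED_PROVIDERS`, the provider identifiers, and the default values) is hypothetical and is not taken from the project's code.

```python
from dataclasses import dataclass

# Provider labels follow the feature list; the identifiers themselves are illustrative.
SUPPORTED_PROVIDERS = (
    "google", "openai", "azure_openai", "anthropic", "deepseek", "ollama",
)


def parse_resolution(text: str) -> tuple[int, int]:
    """Parse a 'WIDTHxHEIGHT' string such as '1920x1080' into two integers."""
    width_str, sep, height_str = text.lower().partition("x")
    if not sep:
        raise ValueError(f"expected WIDTHxHEIGHT, got {text!r}")
    width, height = int(width_str), int(height_str)  # int() raises ValueError on bad input
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width, height


@dataclass(frozen=True)
class AgentSettings:
    """Hypothetical bundle of the user-facing options described above."""
    provider: str = "openai"
    keep_browser_open: bool = False          # persistent browser session between tasks
    resolution: tuple[int, int] = (1280, 720)

    def __post_init__(self) -> None:
        # Reject provider names that are not in the assumed list.
        if self.provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"unsupported provider: {self.provider!r}")
```

For example, `AgentSettings(provider="ollama", keep_browser_open=True, resolution=parse_resolution("1920x1080"))` describes a local-model run in a browser window that stays open between tasks.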

What are the technologies used in the project?

  • Python: The primary programming language.
  • Gradio: For building the web UI.
  • Playwright: For browser automation.
  • Large Language Models (LLMs): Google, OpenAI, Azure OpenAI, Anthropic, DeepSeek, Ollama.
  • Docker/Docker Compose: For containerization and deployment.
  • noVNC: For remote viewing of the browser.
  • uv: For Python environment management.
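
To give a feel for how a Gradio front end can wrap an agent call, the sketch below wires a provider dropdown and a task box to a Python function. It is not the project's actual interface: `run_task` and `build_ui` are hypothetical names, and `run_task` only echoes its input where a real application would invoke the browser agent.

```python
def run_task(task: str, provider: str) -> str:
    """Placeholder for the agent call; it only reports what would be run."""
    task = task.strip()
    if not task:
        return "Please enter a task."
    return f"[{provider}] would run: {task}"


def build_ui():
    """Build the Gradio interface (assumes the `gradio` package is installed)."""
    import gradio as gr  # imported lazily so run_task stays usable without Gradio

    with gr.Blocks(title="Browser agent sketch") as demo:
        provider = gr.Dropdown(
            choices=["google", "openai", "azure_openai", "anthropic", "deepseek", "ollama"],
            value="openai",
            label="LLM provider",
        )
        task = gr.Textbox(label="Task", placeholder="e.g. open example.com and read the heading")
        output = gr.Textbox(label="Result")
        # Clicking the button passes both inputs to run_task and shows its return value.
        gr.Button("Run").click(fn=run_task, inputs=[task, provider], outputs=output)
    return demo
```

Calling `build_ui().launch()` would start a local web server for the form; swapping the body of `run_task` for a real agent invocation is where the browser automation would plug in.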

What are the benefits of the project?

  • Simplified Interaction: Makes it easier for users to interact with AI-powered web browsing.
  • Increased Accessibility: Lowers the barrier to entry for using AI agents on the web.
  • Flexibility: Supports various LLMs and custom browser configurations.
  • Transparency: Persistent sessions and VNC viewing allow users to see the AI's actions.
  • Easy Deployment: Docker support simplifies installation and management.

What are the use cases of the project?

  • Automated Web Tasks: Automating tasks on websites that require complex interactions.
  • Web Scraping and Data Extraction: Using AI to gather information from websites.
  • AI Agent Testing and Development: Providing a platform for testing and developing web-based AI agents.
  • Research: Studying how AI agents interact with the web.
  • Accessibility: Potentially assisting users with disabilities in navigating the web.