What is the project about?
WebUI is a Gradio-based interface for running AI agents that browse websites. It builds on the browser-use project, making the agent easier to control and adding capabilities on top of it.
What problem does it solve?
It makes AI-powered web browsing accessible to users who do not want to configure an agent by hand: settings are exposed in a visual interface, and the agent's actions on the web can be managed and observed from the same place. It also avoids repeated logins, because the agent can run in the user's own browser, where existing sessions are already authenticated.
What are the features of the project?
- User-Friendly Interface: A Gradio-based UI for easy interaction with the browser agent.
- Expanded LLM Support: Works with Large Language Models (LLMs) from several providers, including Google, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama.
- Custom Browser Support: Allows users to use their own browser, avoiding re-authentication and supporting high-definition screen recording.
- Persistent Browser Sessions: Option to keep the browser window open between AI tasks, preserving history and state.
- Docker Support: Easy deployment and management using Docker and Docker Compose.
- Configurable Resolution: Set custom screen resolutions for the browser.
- VNC Integration: View browser interactions in real-time via a VNC viewer.
- Theming: Customizable UI themes.
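
Several of the features above (configurable resolution, persistent sessions, custom browser) come down to a handful of settings. The following is a minimal sketch of how such settings could be read and validated. It is not the project's actual configuration code: the `BrowserSettings` class and the `DEMO_*` variable names are hypothetical and chosen only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BrowserSettings:
    """Hypothetical settings container; field names are illustrative only."""

    width: int = 1280
    height: int = 720
    keep_open: bool = False             # keep the browser open between tasks
    browser_path: Optional[str] = None  # None means "no custom browser set"

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "BrowserSettings":
        """Build settings from a mapping (defaults to the process environment)."""
        env = os.environ if env is None else env

        # Resolution is expected as "WIDTHxHEIGHT", for example "1920x1080".
        raw = env.get("DEMO_RESOLUTION", "1280x720")
        width_text, sep, height_text = raw.lower().partition("x")
        if not sep:
            raise ValueError(f"expected WIDTHxHEIGHT, got {raw!r}")
        width, height = int(width_text), int(height_text)
        if width <= 0 or height <= 0:
            raise ValueError(f"resolution must be positive, got {raw!r}")

        # Accept a few common spellings of "true" for the boolean flag.
        flag = env.get("DEMO_KEEP_BROWSER_OPEN", "false").strip().lower()
        keep_open = flag in {"1", "true", "yes"}

        # An empty string is treated the same as an unset variable.
        browser_path = env.get("DEMO_BROWSER_PATH") or None

        return cls(width, height, keep_open, browser_path)


if __name__ == "__main__":
    # Passing a plain dict keeps the example independent of the real environment.
    settings = BrowserSettings.from_env(
        {"DEMO_RESOLUTION": "1920x1080", "DEMO_KEEP_BROWSER_OPEN": "true"}
    )
    print(settings)
```

Accepting a mapping instead of reading `os.environ` directly keeps the parsing easy to test, and validating the resolution up front means a malformed value fails before any browser is launched.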
What are the technologies used in the project?
- Python: The primary programming language.
- Gradio: For building the web UI.
- Playwright: For browser automation.
- Large Language Models (LLMs): Accessed through Google, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama.
- Docker/Docker Compose: For containerization and deployment.
- noVNC: For remote viewing of the browser.
- uv: Python environment management.
What are the benefits of the project?
- Simplified Interaction: Makes it easier for users to interact with AI-powered web browsing.
- Increased Accessibility: Lowers the barrier to entry for using AI agents on the web.
- Flexibility: Supports various LLMs and custom browser configurations.
- Transparency: VNC viewing lets users watch the agent's actions as they happen, and persistent sessions keep the browser's history and state available for inspection afterward.
- Easy Deployment: Docker support simplifies installation and management.
What are the use cases of the project?
- Automated Web Tasks: Automating multi-step tasks on websites, including ones that involve complex interactions.
- Web Scraping and Data Extraction: Using AI to gather information from websites.
- AI Agent Testing and Development: Providing a platform for testing and developing web-based AI agents.
- Research: Studying how AI agents interact with the web.
- Accessibility: Potentially assisting users with disabilities in navigating the web.
