
nanobrowser
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
Stars: 4652

Nanobrowser is an open-source AI web automation tool that runs in your browser, offered as a free alternative to OpenAI Operator with flexible LLM options and a multi-agent system. It delivers premium web automation capabilities while keeping users in complete control, with features such as an interactive side panel, task automation, follow-up questions, conversation history, and support for multiple LLM providers. Users can install Nanobrowser as a Chrome extension, configure which model each agent uses, and accomplish tasks such as news summaries, GitHub research, and shopping research with a single sentence. Under the hood, a specialized multi-agent system powered by large language models understands and executes complex web tasks. Nanobrowser is actively developed, with plans to expand LLM support, implement security measures, optimize memory usage, enable session replay, and develop specialized agents for domain-specific tasks. Community contributions are welcome to improve Nanobrowser and build the future of web automation.
README:
Nanobrowser is an open-source AI web automation tool that runs in your browser. A free alternative to OpenAI Operator with flexible LLM options and a multi-agent system.
⬇️ Get Nanobrowser from Chrome Web Store for free
Join the community in Discord | X
❤️ Loving Nanobrowser? Give us a star and help spread the word!
Nanobrowser's multi-agent system analyzing HuggingFace in real-time, with the Planner intelligently self-correcting when encountering obstacles and dynamically instructing the Navigator to adjust its approach, all running locally in your browser.
Looking for a powerful AI web agent without the $200/month price tag of OpenAI Operator? Nanobrowser, a Chrome extension, delivers premium web automation capabilities while keeping you in complete control:
- 100% Free - No subscription fees or hidden costs. Just install it, add your own API keys, and pay only for what you use.
- Privacy-Focused - Everything runs in your local browser. Your credentials stay with you, never shared with any cloud service.
- Flexible LLM Options - Connect to your preferred LLM providers with the freedom to choose different models for different agents.
- Fully Open Source - Complete transparency in how your browser is automated. No black boxes or hidden processes.
Note: We currently support OpenAI, Anthropic, Gemini, Ollama, and custom OpenAI-compatible providers, with more to come.
- Multi-agent System: Specialized AI agents collaborate to accomplish complex web workflows (sketched after this list)
- Interactive Side Panel: Intuitive chat interface with real-time status updates
- Task Automation: Seamlessly automate repetitive tasks across websites
- Follow-up Questions: Ask contextual follow-up questions about completed tasks
- Conversation History: Easily access and manage your AI agent interaction history
- Multiple LLM Support: Connect your preferred LLM providers and assign different models to different agents
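To picture how the Planner, Navigator, and Validator agents divide the work, here is a minimal, hypothetical TypeScript sketch. The interfaces and names are illustrative assumptions for this README, not Nanobrowser's actual internal API.

```typescript
// Hypothetical sketch of the Planner / Navigator / Validator loop.
// Names and interfaces are illustrative, not Nanobrowser's real code.

interface Step { action: string; target?: string; value?: string; }

interface Agent {
  model: string; // each agent can be backed by a different LLM
}

interface Planner extends Agent {
  plan(task: string, history: string[]): Promise<Step[]>;
}

interface Navigator extends Agent {
  execute(step: Step): Promise<string>; // returns an observation of the page
}

interface Validator extends Agent {
  isDone(task: string, history: string[]): Promise<boolean>;
}

async function runTask(
  task: string,
  planner: Planner,
  navigator: Navigator,
  validator: Validator,
  maxIterations = 10,
): Promise<string[]> {
  const history: string[] = [];
  for (let i = 0; i < maxIterations; i++) {
    // The Planner proposes next steps, self-correcting based on history.
    const steps = await planner.plan(task, history);
    for (const step of steps) {
      // The Navigator interacts with the page and records what it observed.
      const observation = await navigator.execute(step);
      history.push(`${step.action}: ${observation}`);
    }
    // The Validator decides whether the task is complete.
    if (await validator.isDone(task, history)) break;
  }
  return history;
}
```

The point of the split is that each role can use a model suited to it: heavier reasoning for planning and validation, a lighter, cheaper model for navigation.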
Install from Chrome Web Store (Stable Version):
- Visit the Nanobrowser Chrome Web Store page
- Click the "Add to Chrome" button
- Confirm the installation when prompted

Important Note: For the latest features, install from "Manually Install Latest Version" below, as the Chrome Web Store version may be delayed by the review process.

Configure Agent Models:
- Click the Nanobrowser icon in your toolbar to open the sidebar
- Click the Settings icon (top right)
- Add your LLM API keys
- Choose which model to use for different agents (Navigator, Planner, Validator)
Manually Install Latest Version:
To get the most recent version with all the latest features:

Download:
- Download the latest nanobrowser.zip file from the official GitHub release page.

Install:
- Unzip nanobrowser.zip.
- Open chrome://extensions/ in Chrome.
- Enable Developer mode (top right).
- Click Load unpacked (top left).
- Select the unzipped nanobrowser folder.

Configure Agent Models:
- Click the Nanobrowser icon in your toolbar to open the sidebar.
- Click the Settings icon (top right).
- Add your LLM API keys.
- Choose which model to use for different agents (Navigator, Planner, Validator).

Upgrading:
- Download the latest nanobrowser.zip file from the release page.
- Unzip and replace your existing Nanobrowser files with the new ones.
- Go to chrome://extensions/ in Chrome and click the refresh icon on the Nanobrowser card.
If you prefer to build Nanobrowser yourself, follow these steps:
Prerequisites:
- Node.js and pnpm (the build commands below use pnpm)
Clone the Repository:
git clone https://github.com/nanobrowser/nanobrowser.git
cd nanobrowser

Install Dependencies:
pnpm install

Build the Extension:
pnpm build

Load the Extension:
- The built extension will be in the dist directory
- Follow the installation steps from the "Manually Install Latest Version" section above to load the extension into your browser

Development Mode (optional):
pnpm dev
Nanobrowser allows you to configure different LLM models for each agent to balance performance and cost. Here are recommended configurations (see the illustrative mapping after this list):

Better performance:
- Planner & Validator: Claude 3.7 Sonnet
  - Better reasoning and planning capabilities
  - More reliable task validation
- Navigator: Claude 3.5 Haiku
  - Efficient for web navigation tasks
  - Good balance of performance and cost

Cost-effective:
- Planner & Validator: Claude Haiku or GPT-4o
  - Reasonable performance at lower cost
  - May require more iterations for complex tasks
- Navigator: Gemini 2.0 Flash or GPT-4o-mini
  - Lightweight and cost-efficient
  - Suitable for basic navigation tasks
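As a purely illustrative aid, the recommendations above can be written down as a simple agent-to-model mapping. This hypothetical TypeScript snippet is not Nanobrowser's settings format; in the extension you pick these models through the Settings panel, and the exact model identifiers depend on your provider.

```typescript
// Hypothetical representation of the recommended assignments above.
// Model identifiers are indicative; use the exact IDs your provider expects.
type AgentName = "planner" | "navigator" | "validator";

const betterPerformance: Record<AgentName, string> = {
  planner:   "claude-3-7-sonnet", // better reasoning and planning
  validator: "claude-3-7-sonnet", // more reliable task validation
  navigator: "claude-3-5-haiku",  // efficient navigation at lower cost
};

const costEffective: Record<AgentName, string> = {
  planner:   "gpt-4o",            // or a Claude Haiku model
  validator: "gpt-4o",
  navigator: "gpt-4o-mini",       // or Gemini 2.0 Flash
};
```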
Local Models:

Setup Options:
- Use Ollama or another custom OpenAI-compatible provider to run models locally (see the sketch after the prompt engineering tips below)
- Zero API costs and complete privacy, with no data ever leaving your machine

Recommended Models:
- Falcon3 10B
- Qwen 2.5 Coder 14B
- Mistral Small 24B
- We welcome community experience sharing with other local models in our Discord

Prompt Engineering:
- Local models require more specific and cleaner prompts
- Avoid high-level, ambiguous commands
- Break complex tasks into clear, detailed steps
- Provide explicit context and constraints
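To make "custom OpenAI-compatible provider" concrete, here is a hedged sketch of talking to a local Ollama server through its OpenAI-compatible chat endpoint (by default at http://localhost:11434/v1). The model tag and prompt are examples only; pull whatever local model you prefer and point Nanobrowser's custom provider settings at the same base URL.

```typescript
// Minimal sketch: call a local Ollama model through its OpenAI-compatible
// chat endpoint. Assumes you have already run something like
// `ollama pull qwen2.5-coder:14b` (the exact tag may differ on your setup).
async function askLocalModel(prompt: string): Promise<string> {
  const response = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "qwen2.5-coder:14b",
      messages: [
        // Local models work best with specific, step-by-step instructions.
        { role: "system", content: "You are a web navigation assistant. Follow the steps exactly." },
        { role: "user", content: prompt },
      ],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}

// Example: a clear, explicit prompt rather than a vague high-level command.
askLocalModel(
  "Open techcrunch.com, read the headlines on the front page, and list the first 10 as bullet points.",
).then(console.log);
```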
Note: The cost-effective configuration may produce less stable outputs and require more iterations for complex tasks.
Tip: Feel free to experiment with your own model configurations! Found a great combination? Share it with the community in our Discord to help others optimize their setup.
Here are some powerful tasks you can accomplish with just a sentence:

- News Summary: "Go to TechCrunch and extract top 10 headlines from the last 24 hours"
- GitHub Research: "Look for the trending Python repositories on GitHub with most stars"
- Shopping Research: "Find a portable Bluetooth speaker on Amazon with a water-resistant design, under $50. It should have a minimum battery life of 10 hours"
We're actively developing Nanobrowser with exciting features on the horizon. You're welcome to join us!
Check out our detailed roadmap and upcoming features in our GitHub Discussions.
We need your help to make Nanobrowser even better! Contributions of all kinds are welcome:
- Share Prompts & Use Cases: Join our Discord server and share how you're using Nanobrowser. Help us build a library of useful prompts and real-world use cases.
- Provide Feedback: Try Nanobrowser and give us feedback on its performance, or suggest improvements in our Discord server.
- Contribute Code: Check out our CONTRIBUTING.md for guidelines on how to contribute, and submit pull requests for bug fixes, features, or documentation improvements.
We believe in the power of open source and community collaboration. Join us in building the future of web automation!
If you discover a security vulnerability, please DO NOT disclose it publicly through issues, pull requests, or discussions.
Instead, please create a GitHub Security Advisory to report the vulnerability responsibly. This allows us to address the issue before it's publicly disclosed.
We appreciate your help in keeping Nanobrowser and its users safe!
Join our growing community of developers and users:
- Discord - Chat with team and community
- Twitter - Follow for updates and announcements
- GitHub Discussions - Share ideas and ask questions
Nanobrowser builds on top of other awesome open-source projects. Huge thanks to their creators and contributors!
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Made with ❤️ by the Nanobrowser Team.
Like Nanobrowser? Give us a star and join us in Discord | X
Alternative AI tools for nanobrowser
Similar Open Source Tools


Ollama-Colab-Integration
Ollama Colab Integration V4 is a tool designed to enhance the interaction and management of large language models. It allows users to quantize models within their notebook environment, access a variety of models through a user-friendly interface, and manage public endpoints efficiently. The tool also provides features like LiteLLM proxy control, model insights, and customizable model file templating. Users can troubleshoot model loading issues, CPU fallback strategies, and manage VRAM and RAM effectively. Additionally, the tool offers functionalities for downloading model files from Hugging Face, model conversion with high precision, model quantization using Q and Kquants, and securely uploading converted models to Hugging Face.

ai_automation_suggester
An integration for Home Assistant that leverages AI models to understand your unique home environment and propose intelligent automations. By analyzing your entities, devices, areas, and existing automations, the AI Automation Suggester helps you discover new, context-aware use cases you might not have considered, ultimately streamlining your home management and improving efficiency, comfort, and convenience. The tool acts as a personal automation consultant, providing actionable YAML-based automations that can save energy, improve security, enhance comfort, and reduce manual intervention. It turns the complexity of a large Home Assistant environment into actionable insights and tangible benefits.

obsidian-smart-composer
Smart Composer is an Obsidian plugin that enhances note-taking and content creation by integrating AI capabilities. It allows users to efficiently write by referencing their vault content, providing contextual chat with precise context selection, multimedia context support for website links and images, document edit suggestions, and vault search for relevant notes. The plugin also offers features like custom model selection, local model support, custom system prompts, and prompt templates. Users can set up the plugin by installing it through the Obsidian community plugins, enabling it, and configuring API keys for supported providers like OpenAI, Anthropic, and Gemini. Smart Composer aims to streamline the writing process by leveraging AI technology within the Obsidian platform.

Simplifine
Simplifine is an open-source library designed for easy LLM finetuning, enabling users to perform tasks such as supervised fine tuning, question-answer finetuning, contrastive loss for embedding tasks, multi-label classification finetuning, and more. It provides features like WandB logging, in-built evaluation tools, automated finetuning parameters, and state-of-the-art optimization techniques. The library offers bug fixes, new features, and documentation updates in its latest version. Users can install Simplifine via pip or directly from GitHub. The project welcomes contributors and provides comprehensive documentation and support for users.

logicstudio.ai
LogicStudio.ai is a powerful visual canvas-based tool for building, managing, and visualizing complex logic flows involving AI agents, data inputs, and outputs. It provides an intuitive interface to streamline development processes by offering features like drag-and-drop canvas design, dynamic components, real-time connections, import/export capabilities, zoom & pan controls, file management, AI integration, editable views, and various output formats. Users can easily add, connect, configure, and manage components to create interactive systems and workflows.

refact
This repository contains Refact WebUI for fine-tuning and self-hosting of code models, which can be used inside Refact plugins for code completion and chat. Users can fine-tune open-source code models, self-host them, download and upload Lloras, use models for code completion and chat inside Refact plugins, shard models, host multiple small models on one GPU, and connect GPT-models for chat using OpenAI and Anthropic keys. The repository provides a Docker container for running the self-hosted server and supports various models for completion, chat, and fine-tuning. Refact is free for individuals and small teams under the BSD-3-Clause license, with custom installation options available for GPU support. The community and support include contributing guidelines, GitHub issues for bugs, a community forum, Discord for chatting, and Twitter for product news and updates.

kollektiv
Kollektiv is a Retrieval-Augmented Generation (RAG) system designed to enable users to chat with their favorite documentation easily. It aims to provide LLMs with access to the most up-to-date knowledge, reducing inaccuracies and improving productivity. The system utilizes intelligent web crawling, advanced document processing, vector search, multi-query expansion, smart re-ranking, AI-powered responses, and dynamic system prompts. The technical stack includes Python/FastAPI for backend, Supabase, ChromaDB, and Redis for storage, OpenAI and Anthropic Claude 3.5 Sonnet for AI/ML, and Chainlit for UI. Kollektiv is licensed under a modified version of the Apache License 2.0, allowing free use for non-commercial purposes.

Riona-AI-Agent
Riona-AI-Agent is a versatile AI chatbot designed to assist users in various tasks. It utilizes natural language processing and machine learning algorithms to understand user queries and provide accurate responses. The chatbot can be integrated into websites, applications, and messaging platforms to enhance user experience and streamline communication. With its customizable features and easy deployment, Riona-AI-Agent is suitable for businesses, developers, and individuals looking to automate customer support, provide information, and engage with users in a conversational manner.

ai-driven-dev-community
AI Driven Dev Community is a repository aimed at helping developers become more efficient by utilizing AI tools in their daily coding tasks. It provides a collection of tools, prompts, snippets, and agents for developers to integrate AI into their workflow. The repository is regularly updated with new resources and focuses on best practices for using AI in development work. Users can find tools like Espanso, ChatGPT, GitHub Copilot, and VSCode recommended for enhancing their coding experience. Additionally, the repository offers guidance on customizing AI for developers, installing AI toolbox for software engineers, and contributing to the community through easy steps.

easydiffusion
Easy Diffusion 3.0 is a user-friendly tool for installing and using Stable Diffusion on your computer. It offers hassle-free installation, clutter-free UI, task queue, intelligent model detection, live preview, image modifiers, multiple prompts file, saving generated images, UI themes, searchable models dropdown, and supports various image generation tasks like 'Text to Image', 'Image to Image', and 'InPainting'. The tool also provides advanced features such as custom models, merge models, custom VAE models, multi-GPU support, auto-updater, developer console, and more. It is designed for both new users and advanced users looking for powerful AI image generation capabilities.

omniscient
Omniscient is an advanced AI Platform offered as a SaaS, empowering projects with cutting-edge artificial intelligence capabilities. Seamlessly integrating with Next.js 14, React, Typescript, and APIs like OpenAI and Replicate, it provides solutions for code generation, conversation simulation, image creation, music composition, and video generation.

ChatFAQ
ChatFAQ is an open-source comprehensive platform for creating a wide variety of chatbots: generic ones, business-trained, or even capable of redirecting requests to human operators. It includes a specialized NLP/NLG engine based on a RAG architecture and customized chat widgets, ensuring a tailored experience for users and avoiding vendor lock-in.

AmigaGPT
AmigaGPT is a versatile ChatGPT client for AmigaOS 3.x, 4.1, and MorphOS. It brings the capabilities of OpenAI's GPT to Amiga systems, enabling text generation, question answering, and creative exploration. AmigaGPT can generate images using DALL-E, supports speech output, and seamlessly integrates with AmigaOS. Users can customize the UI, choose fonts and colors, and enjoy a native user experience. The tool has specific system requirements and offers features like state-of-the-art language models, AI image generation, speech capability, and UI customization.
For similar tasks

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

danswer
Danswer is an open-source Gen-AI Chat and Unified Search tool that connects to your company's docs, apps, and people. It provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud. Since you own the deployment, your user data and chats are fully in your own control. Danswer is MIT licensed and designed to be modular and easily extensible. The system also comes fully ready for production usage with user authentication, role management (admin/basic users), chat persistence, and a UI for configuring Personas (AI Assistants) and their Prompts. Danswer also serves as a Unified Search across all common workplace tools such as Slack, Google Drive, Confluence, etc. By combining LLMs and team specific knowledge, Danswer becomes a subject matter expert for the team. Imagine ChatGPT if it had access to your team's unique knowledge! It enables questions such as "A customer wants feature X, is this already supported?" or "Where's the pull request for feature Y?"

semantic-kernel
Semantic Kernel is an SDK that integrates Large Language Models (LLMs) like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. Semantic Kernel achieves this by allowing you to define plugins that can be chained together in just a few lines of code. What makes Semantic Kernel _special_ , however, is its ability to _automatically_ orchestrate plugins with AI. With Semantic Kernel planners, you can ask an LLM to generate a plan that achieves a user's unique goal. Afterwards, Semantic Kernel will execute the plan for the user.

floneum
Floneum is a graph editor that makes it easy to develop your own AI workflows. It uses large language models (LLMs) to run AI models locally, without any external dependencies or even a GPU. This makes it easy to use LLMs with your own data, without worrying about privacy. Floneum also has a plugin system that allows you to improve the performance of LLMs and make them work better for your specific use case. Plugins can be used in any language that supports web assembly, and they can control the output of LLMs with a process similar to JSONformer or guidance.

mindsdb
MindsDB is a platform for customizing AI from enterprise data. You can create, serve, and fine-tune models in real-time from your database, vector store, and application data. MindsDB "enhances" SQL syntax with AI capabilities to make it accessible for developers worldwide. With MindsDB's nearly 200 integrations, any developer can create AI customized for their purpose, faster and more securely. Their AI systems will constantly improve themselves, using companies' own data, in real-time.

aiscript
AiScript is a lightweight scripting language that runs on JavaScript. It supports arrays, objects, and functions as first-class citizens, and is easy to write without the need for semicolons or commas. AiScript runs in a secure sandbox environment, preventing infinite loops from freezing the host. It also allows for easy provision of variables and functions from the host.

activepieces
Activepieces is an open source replacement for Zapier, designed to be extensible through a type-safe pieces framework written in Typescript. It features a user-friendly Workflow Builder with support for Branches, Loops, and Drag and Drop. Activepieces integrates with Google Sheets, OpenAI, Discord, and RSS, along with 80+ other integrations. The list of supported integrations continues to grow rapidly, thanks to valuable contributions from the community. Activepieces is an open ecosystem; all piece source code is available in the repository, and they are versioned and published directly to npmjs.com upon contributions. If you cannot find a specific piece on the pieces roadmap, please submit a request by visiting the following link: Request Piece Alternatively, if you are a developer, you can quickly build your own piece using our TypeScript framework. For guidance, please refer to the following guide: Contributor's Guide

superagent-js
Superagent is an open source framework that enables any developer to integrate production ready AI Assistants into any application in a matter of minutes.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.