browser
Lightpanda: the headless browser designed for AI and automation
Stars: 1445
Lightpanda Browser is an open-source headless browser designed for fast web automation, AI agents, LLM training, scraping, and testing. It features ultra-low memory footprint, exceptionally fast execution, and compatibility with Playwright and Puppeteer through CDP. Built for performance, Lightpanda offers Javascript execution, support for Web APIs, and is optimized for minimal memory usage. It is a modern solution for web scraping and automation tasks, providing a lightweight alternative to traditional browsers like Chrome.
README:
Lightpanda is the open-source browser made for headless usage:
- Javascript execution
- Support of Web APIs (partial, WIP)
- Compatible with Playwright, Puppeteer through CDP (WIP)
Fast web automation for AI agents, LLM training, scraping and testing with minimal memory footprint:
- Ultra-low memory footprint (9x less than Chrome)
- Exceptionally fast execution (11x faster than Chrome) & instant startup
See benchmark details.
You can download the last binary from the nightly builds for Linux x86_64 and MacOS aarch64.
# Download the binary
$ wget https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-x86_64-linux
$ chmod a+x ./lightpanda-x86_64-linux
$ ./lightpanda-x86_64-linux -h
usage: ./lightpanda-x86_64-linux [options] [URL]
start Lightpanda browser
* if an url is provided the browser will fetch the page and exit
* otherwhise the browser starts a CDP server
-h, --help Print this help message and exit.
--host Host of the CDP server (default "127.0.0.1")
--port Port of the CDP server (default "9222")
--timeout Timeout for incoming connections of the CDP server (in seconds, default "3")
--dump Dump document in stdout (fetch mode only)
$ ./lightpanda-x86_64-linux --dump https://lightpanda.io
info(browser): GET https://lightpanda.io/ http.Status.ok
info(browser): fetch script https://api.website.lightpanda.io/js/script.js: http.Status.ok
info(browser): eval remote https://api.website.lightpanda.io/js/script.js: TypeError: Cannot read properties of undefined (reading 'pushState')
<!DOCTYPE html>
$ ./lightpanda-x86_64-linux --host 127.0.0.1 --port 9222
info(websocket): starting blocking worker to listen on 127.0.0.1:9222
info(server): accepting new conn...
Once the CDP server started, you can run a Puppeteer script by configuring the
browserWSEndpoint
.
'use scrict'
import puppeteer from 'puppeteer-core';
// use browserWSEndpoint to pass the Lightpanda's CDP server address.
const browser = await puppeteer.connect({
browserWSEndpoint: "ws://127.0.0.1:9222",
});
// The rest of your script remains the same.
const context = await browser.createBrowserContext();
const page = await context.newPage();
await page.goto('https://wikipedia.com/');
await page.close();
await context.close();
Lightpanda is written with Zig 0.13.0
. You have to
install it with the right version in order to build the project.
Lightpanda also depends on zig-js-runtime (with v8), Netsurf libs and Mimalloc.
To be able to build the v8 engine for zig-js-runtime, you have to install some libs:
For Debian/Ubuntu based Linux:
sudo apt install xz-utils \
python3 ca-certificates git \
pkg-config libglib2.0-dev \
gperf libexpat1-dev \
cmake clang
For MacOS, you only need cmake:
brew install cmake
You can run make install
to install deps all in one (or make install-dev
if you need the development versions).
Be aware that the build task is very long and cpu consuming, as you will build from sources all dependancies, including the v8 Javascript engine.
The project uses git submodules for dependencies.
To init or update the submodules in the vendor/
directory:
make install-submodule
Netsurf libs
Netsurf libs are used for HTML parsing and DOM tree generation.
make install-netsurf
For dev env, use make install-netsurf-dev
.
Mimalloc
Mimalloc is used as a C memory allocator.
make install-mimalloc
For dev env, use make install-mimalloc-dev
.
Note: when Mimalloc is built in dev mode, you can dump memory stats with the
env var MIMALLOC_SHOW_STATS=1
. See
https://microsoft.github.io/mimalloc/environment.html.
zig-js-runtime
Our own Zig/Javascript runtime, which includes the v8 Javascript engine.
This build task is very long and cpu consuming, as you will build v8 from sources.
make install-zig-js-runtime
For dev env, use make iinstall-zig-js-runtime-dev
.
You can test Lightpanda by running make test
.
Lightpanda is tested against the standardized Web Platform Tests.
The relevant tests cases are committed in a dedicated repository which is fetched by the make install-submodule
command.
All the tests cases executed are located in the tests/wpt
sub-directory.
For reference, you can easily execute a WPT test case with your browser via wpt.live.
To run all the tests:
make wpt
Or one specific test:
make wpt Node-childNodes.html
We add new relevant tests cases files when we implemented changes in Lightpanda.
To add a new test, copy the file you want from the WPT
repo into the tests/wpt
directory.
tests/wpt
.
Lightpanda accepts pull requests through GitHub.
You have to sign our CLA during the pull request process otherwise we're not able to accept your contributions.
In the good old days, scraping a webpage was as easy as making an HTTP request, cURL-like. It’s not possible anymore, because Javascript is everywhere, like it or not:
- Ajax, Single Page App, infinite loading, “click to display”, instant search, etc.
- JS web frameworks: React, Vue, Angular & others
If we need Javascript, why not use a real web browser? Take a huge desktop application, hack it, and run it on the server. Hundreds or thousands of instances of Chrome if you use it at scale. Are you sure it’s such a good idea?
- Heavy on RAM and CPU, expensive to run
- Hard to package, deploy and maintain at scale
- Bloated, lots of features are not useful in headless usage
If we want both Javascript and performance in a true headless browser, we need to start from scratch. Not another iteration of Chromium, really from a blank page. Crazy right? But that’s we did:
- Not based on Chromium, Blink or WebKit
- Low-level system programming language (Zig) with optimisations in mind
- Opinionated: without graphical rendering
Lightpanda is still a work in progress and is currently at a Beta stage.
Here are the key features we have implemented:
- [x] HTTP loader
- [x] HTML parser and DOM tree (based on Netsurf libs)
- [x] Javascript support (v8)
- [x] Basic DOM APIs
- [x] Ajax
- [x] XHR API
- [x] Fetch API
- [x] DOM dump
- [x] Basic CDP/websockets server
NOTE: There are hundreds of Web APIs. Developing a browser (even just for headless mode) is a huge task. Coverage will increase over time.
You can also follow the progress of our Javascript support in our dedicated zig-js-runtime project.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for browser
Similar Open Source Tools
browser
Lightpanda Browser is an open-source headless browser designed for fast web automation, AI agents, LLM training, scraping, and testing. It features ultra-low memory footprint, exceptionally fast execution, and compatibility with Playwright and Puppeteer through CDP. Built for performance, Lightpanda offers Javascript execution, support for Web APIs, and is optimized for minimal memory usage. It is a modern solution for web scraping and automation tasks, providing a lightweight alternative to traditional browsers like Chrome.
depthai
This repository contains a demo application for DepthAI, a tool that can load different networks, create pipelines, record video, and more. It provides documentation for installation and usage, including running programs through Docker. Users can explore DepthAI features via command line arguments or a clickable QT interface. Supported models include various AI models for tasks like face detection, human pose estimation, and object detection. The tool collects anonymous usage statistics by default, which can be disabled. Users can report issues to the development team for support and troubleshooting.
metavoice-src
MetaVoice-1B is a 1.2B parameter base model trained on 100K hours of speech for TTS (text-to-speech). It has been built with the following priorities: * Emotional speech rhythm and tone in English. * Zero-shot cloning for American & British voices, with 30s reference audio. * Support for (cross-lingual) voice cloning with finetuning. * We have had success with as little as 1 minute training data for Indian speakers. * Synthesis of arbitrary length text
gpt-engineer
GPT-Engineer is a tool that allows you to specify a software in natural language, sit back and watch as an AI writes and executes the code, and ask the AI to implement improvements.
UltraSinger
UltraSinger is a tool under development that automatically creates UltraStar.txt, midi, and notes from music. It pitches UltraStar files, adds text and tapping, creates separate UltraStar karaoke files, re-pitches current UltraStar files, and calculates in-game score. It uses multiple AI models to extract text from voice and determine pitch. Users should mention UltraSinger in UltraStar.txt files and only use it on Creative Commons licensed songs.
bedrock-claude-chat
This repository is a sample chatbot using the Anthropic company's LLM Claude, one of the foundational models provided by Amazon Bedrock for generative AI. It allows users to have basic conversations with the chatbot, personalize it with their own instructions and external knowledge, and analyze usage for each user/bot on the administrator dashboard. The chatbot supports various languages, including English, Japanese, Korean, Chinese, French, German, and Spanish. Deployment is straightforward and can be done via the command line or by using AWS CDK. The architecture is built on AWS managed services, eliminating the need for infrastructure management and ensuring scalability, reliability, and security.
linkedin-api
The Linkedin API for Python allows users to programmatically search profiles, send messages, and find jobs using a regular Linkedin user account. It does not require 'official' API access, just a valid Linkedin account. However, it is important to note that this library is not officially supported by LinkedIn and using it may violate LinkedIn's Terms of Service. Users can authenticate using any Linkedin account credentials and access features like getting profiles, profile contact info, and connections. The library also provides commercial alternatives for extracting data, scraping public profiles, and accessing a full LinkedIn API. It is not endorsed or supported by LinkedIn and is intended for educational purposes and personal use only.
pytest-evals
pytest-evals is a minimalistic pytest plugin designed to help evaluate the performance of Language Model (LLM) outputs against test cases. It allows users to test and evaluate LLM prompts against multiple cases, track metrics, and integrate easily with pytest, Jupyter notebooks, and CI/CD pipelines. Users can scale up by running tests in parallel with pytest-xdist and asynchronously with pytest-asyncio. The tool focuses on simplifying evaluation processes without the need for complex frameworks, keeping tests and evaluations together, and emphasizing logic over infrastructure.
lmql
LMQL is a programming language designed for large language models (LLMs) that offers a unique way of integrating traditional programming with LLM interaction. It allows users to write programs that combine algorithmic logic with LLM calls, enabling model reasoning capabilities within the context of the program. LMQL provides features such as Python syntax integration, rich control-flow options, advanced decoding techniques, powerful constraints via logit masking, runtime optimization, sync and async API support, multi-model compatibility, and extensive applications like JSON decoding and interactive chat interfaces. The tool also offers library integration, flexible tooling, and output streaming options for easy model output handling.
web-llm
WebLLM is a modular and customizable javascript package that directly brings language model chats directly onto web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU. WebLLM is fully compatible with OpenAI API. That is, you can use the same OpenAI API on any open source models locally, with functionalities including json-mode, function-calling, streaming, etc. We can bring a lot of fun opportunities to build AI assistants for everyone and enable privacy while enjoying GPU acceleration.
SecureAI-Tools
SecureAI Tools is a private and secure AI tool that allows users to chat with AI models, chat with documents (PDFs), and run AI models locally. It comes with built-in authentication and user management, making it suitable for family members or coworkers. The tool is self-hosting optimized and provides necessary scripts and docker-compose files for easy setup in under 5 minutes. Users can customize the tool by editing the .env file and enabling GPU support for faster inference. SecureAI Tools also supports remote OpenAI-compatible APIs, with lower hardware requirements for using remote APIs only. The tool's features wishlist includes chat sharing, mobile-friendly UI, and support for more file types and markdown rendering.
torchchat
torchchat is a codebase showcasing the ability to run large language models (LLMs) seamlessly. It allows running LLMs using Python in various environments such as desktop, server, iOS, and Android. The tool supports running models via PyTorch, chatting, generating text, running chat in the browser, and running models on desktop/server without Python. It also provides features like AOT Inductor for faster execution, running in C++ using the runner, and deploying and running on iOS and Android. The tool supports popular hardware and OS including Linux, Mac OS, Android, and iOS, with various data types and execution modes available.
telemetry-airflow
This repository codifies the Airflow cluster that is deployed at workflow.telemetry.mozilla.org (behind SSO) and commonly referred to as "WTMO" or simply "Airflow". Some links relevant to users and developers of WTMO: * The `dags` directory in this repository contains some custom DAG definitions * Many of the DAGs registered with WTMO don't live in this repository, but are instead generated from ETL task definitions in bigquery-etl * The Data SRE team maintains a WTMO Developer Guide (behind SSO)
ComfyUI-mnemic-nodes
ComfyUI-mnemic-nodes is a repository hosting a collection of nodes developed for ComfyUI, providing useful components to enhance project functionality. The nodes include features like returning file paths, saving text files, downloading images from URLs, tokenizing text, cleaning strings, querying Groq language models, generating negative prompts, and more. Some nodes are experimental and marked with a 'Caution' label. Installation instructions and setup details are provided for each node, along with examples and presets for different tasks.
VoiceStreamAI
VoiceStreamAI is a Python 3-based server and JavaScript client solution for near-realtime audio streaming and transcription using WebSocket. It employs Huggingface's Voice Activity Detection (VAD) and OpenAI's Whisper model for accurate speech recognition. The system features real-time audio streaming, modular design for easy integration of VAD and ASR technologies, customizable audio chunk processing strategies, support for multilingual transcription, and secure sockets support. It uses a factory and strategy pattern implementation for flexible component management and provides a unit testing framework for robust development.
Bard-API
The Bard API is a Python package that returns responses from Google Bard through the value of a cookie. It is an unofficial API that operates through reverse-engineering, utilizing cookie values to interact with Google Bard for users struggling with frequent authentication problems or unable to authenticate via Google Authentication. The Bard API is not a free service, but rather a tool provided to assist developers with testing certain functionalities due to the delayed development and release of Google Bard's API. It has been designed with a lightweight structure that can easily adapt to the emergence of an official API. Therefore, using it for any other purposes is strongly discouraged. If you have access to a reliable official PaLM-2 API or Google Generative AI API, replace the provided response with the corresponding official code. Check out https://github.com/dsdanielpark/Bard-API/issues/262.
For similar tasks
clickolas-cage
Clickolas-cage is a Chrome extension designed to autonomously perform web browsing actions to achieve specific goals using LLM as a brain. Users can interact with the extension by setting goals, which triggers a series of actions including navigation, element extraction, and step generation. The extension is developed using Node.js and can be locally run for testing and development purposes before packing it for submission to the Chrome Web Store.
scylla
Scylla is an intelligent proxy pool tool designed for humanities, enabling users to extract content from the internet and build their own Large Language Models in the AI era. It features automatic proxy IP crawling and validation, an easy-to-use JSON API, a simple web-based user interface, HTTP forward proxy server, Scrapy and requests integration, and headless browser crawling. Users can start using Scylla with just one command, making it a versatile tool for various web scraping and content extraction tasks.
browser
Lightpanda Browser is an open-source headless browser designed for fast web automation, AI agents, LLM training, scraping, and testing. It features ultra-low memory footprint, exceptionally fast execution, and compatibility with Playwright and Puppeteer through CDP. Built for performance, Lightpanda offers Javascript execution, support for Web APIs, and is optimized for minimal memory usage. It is a modern solution for web scraping and automation tasks, providing a lightweight alternative to traditional browsers like Chrome.
PythonPark
PythonPark is a paradise for learning Python, providing babysitter-level tutorials on AI labs, treasure videos, data structures, study guides, machine learning practicals, deep learning practicals, Python basics, web scraping, big company interview experiences, programming life, and resource sharing. Original articles are published at least twice a week, with the latest articles being first released on WeChat and videos on Bilibili. Join the WeChat group for technical discussions or to provide feedback. Continuously improving and outputting content!
Fyin
Fyin is an open-source tool that serves as an alternative to Perplexity AI, allowing users to run it locally for faster answers. It features the ability to run locally using ollama or OpenAI API, a local VectorDB for fast search, quick searching, scraping & answering due to parallelism, configurable number of search results to parse, and local scraping of websites. The tool aims to provide a more efficient and customizable solution for obtaining answers through search and scraping functionalities.
ArcadiaScriptPublic
ArcadiaScriptPublic is a repository containing various scripts for learning and practicing JavaScript, Python, and Shell scripting. It is intended for testing and educational purposes only, and not for commercial use. The repository does not guarantee the legality, accuracy, completeness, or effectiveness of the scripts, and users are advised to use them at their own discretion. No resources from the repository are allowed to be republished or redistributed by any public account or self-media. The repository owner disclaims any responsibility for script-related issues, including losses or damages resulting from script errors. Users indirectly utilizing the scripts, such as setting up VPS or engaging in activities that violate national/regional laws or regulations, are solely responsible for any privacy leaks or consequences. If any entity or individual believes that the scripts in the project may infringe upon their rights, they should promptly notify and provide proof of identity and ownership, upon which the relevant scripts will be removed after verification. Anyone viewing or using the scripts in this project should carefully read and accept the disclaimer provided by zjk2017/ArcadiaScriptPublic, as the repository reserves the right to change or supplement the disclaimer at any time. Users must completely delete the downloaded content from their computers or phones within 24 hours of downloading, and any form of profit chain generation is strictly prohibited.
notte
Notte is a web browser designed specifically for LLM agents, providing a language-first web navigation experience without the need for DOM/HTML parsing. It transforms websites into structured, navigable maps described in natural language, enabling users to interact with the web using natural language commands. By simplifying browser complexity, Notte allows LLM policies to focus on conversational reasoning and planning, reducing token usage, costs, and latency. The tool supports various language model providers and offers a reinforcement learning style action space and controls for full navigation control.
zep-python
Zep is an open-source platform for building and deploying large language model (LLM) applications. It provides a suite of tools and services that make it easy to integrate LLMs into your applications, including chat history memory, embedding, vector search, and data enrichment. Zep is designed to be scalable, reliable, and easy to use, making it a great choice for developers who want to build LLM-powered applications quickly and easily.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.