midscene

Automate browser actions, extract data, and perform assertions using AI. It offers a JavaScript SDK, a Chrome extension, and support for scripting in YAML.

Stars: 2441

Midscene.js is an AI-powered automation SDK that lets users control web pages, perform assertions, and extract data in JSON format using natural language. Its features include natural language interaction, UI understanding with responses in JSON, intuitive assertions based on AI understanding, compatibility with public multimodal LLMs such as GPT-4o, and a visualized report for easier debugging.

README:

Midscene.js

Joyful UI Automation

Midscene.js is an AI-powered automation SDK that can control the page, perform assertions, and extract data in JSON format using natural language.

Features ✨

  • Natural Language Interaction 👆: Describe the steps, and let Midscene plan and control the user interface for you
  • Understand UI, Answer in JSON 🔍: Provide prompts regarding the desired data format, and then receive the expected response in JSON format.
  • Intuitive Assertion 🤔: Make assertions in natural language; it’s all based on AI understanding.
  • Experience by Chrome Extension 🖥️: Start immediately with the Chrome Extension. No code is needed while exploring.
  • Visualized Report 🎞️: With our visualized report file, you can easily understand and debug the whole process.
  • Out-of-box LLM 🪓: It is fine to use public multimodal LLMs like GPT-4o. There is no need for any custom training.
  • Brand New Experience! 🔥: Experience a whole new world of automation development. Enjoy!
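The "Understand UI, Answer in JSON" and "Intuitive Assertion" features above can be pictured from the caller's side with a small sketch. It only illustrates the general pattern: describe the desired data format in a prompt, receive JSON, validate it, then check a condition on it. The helpers `buildQueryPrompt`, `parseTodoReply`, and `allDone`, the `TodoItem` shape, and the sample reply are all hypothetical and are not part of the Midscene.js API.

```typescript
// Hypothetical helpers for illustration only; NOT the Midscene.js API.

// The data shape the prompt asks the model to return (an assumption).
interface TodoItem {
  title: string;
  done: boolean;
}

// Build a natural-language prompt that also states the desired JSON format.
function buildQueryPrompt(description: string, shape: string): string {
  return `${description}. Respond only with JSON matching: ${shape}`;
}

// Parse a raw model reply and verify it is a TodoItem[] before using it.
function parseTodoReply(raw: string): TodoItem[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) {
    throw new Error("expected a JSON array");
  }
  return data.map((entry, index) => {
    // Treat null/undefined entries as empty objects so the check below rejects them.
    const { title, done } = (entry ?? {}) as { title?: unknown; done?: unknown };
    if (typeof title !== "string" || typeof done !== "boolean") {
      throw new Error(`item ${index} does not match {title: string, done: boolean}`);
    }
    return { title, done };
  });
}

// An assertion such as "all todo items are done", reduced to a plain check.
function allDone(items: TodoItem[]): boolean {
  return items.every((item) => item.done);
}

const prompt = buildQueryPrompt(
  "List the todo items on the page",
  "{title: string, done: boolean}[]",
);
// Made-up reply standing in for what a multimodal LLM might return.
const sampleReply =
  '[{"title": "Write tests", "done": true}, {"title": "Ship", "done": false}]';
const items = parseTodoReply(sampleReply);

console.log(prompt);
console.log(items.length, allDone(items));
```

Validating the reply before use is a design choice of this sketch: model output is untyped text, so a shape check keeps malformed replies from leaking into later steps.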

License

Midscene.js is MIT licensed.
