
playword
Automate browsers with AI, bringing productivity and fun to your testing!
Stars: 52

PlayWord is a tool designed to supercharge the web test automation experience with AI. Its core features are browser operations and validations driven by natural language inputs, and a monitoring interface to record and dry-run test steps. PlayWord supports multiple AI services, including Anthropic, Google, and OpenAI, allowing users to select the appropriate provider based on their requirements. The tool also offers assertion handling, frame handling, custom variables, test recordings, and an Observer module that tracks user interactions on web pages. With PlayWord, users interact with web pages through natural language commands, which reduces the need to worry about element locators and lets the AI adapt to UI changes.
README:
Supercharge your web test automation experience with AI.
Choose the package that best suits your needs.
The @playword/core package provides the core features of PlayWord and can be used as a Node.js module.
It includes the following modules:
- PlayWord: Enables browser operations and validations using natural language inputs to interact with web pages.
- Observer: Mounts a monitoring interface on the browser to record and dry-run captured test steps.
# Install with any package manager you prefer
npm install @playword/core --save-dev
The @playword/cli package enables you to use the features of PlayWord directly through the command line.
For ease of use, I recommend running this package with npx.
# Run a PlayWord test
npx @playword/cli test --headed --verbose -b webkit
# Start the Observer
npx @playword/cli observe -b chromium -v
See documentation for usage examples and options.
PlayWord supports multiple AI services, including Anthropic, Google, and OpenAI. You can select the appropriate provider based on your requirements.
There are two ways to provide the required API key to PlayWord. For OpenAI:
1. Export the API key as an environment variable:
export OPENAI_API_KEY="sk-..."
2. Pass the API key as a parameter during initialization:
import { chromium } from 'playwright'
import { PlayWord } from '@playword/core'
const browser = await chromium.launch()
const context = await browser.newContext()
const playword = new PlayWord(context, {
aiOptions: {
baseURL: 'https://...', // Custom API endpoint (If applicable)
openAIApiKey: 'sk-...',
model: 'gpt-4o' // If not specified, the default model is gpt-4o-mini.
}
})
For Google:
1. Export the API key as an environment variable:
export GOOGLE_API_KEY="AI..."
2. Pass the API key as a parameter during initialization:
const playword = new PlayWord(context, {
aiOptions: {
googleApiKey: 'AI...',
model: 'gemini-2.0-flash' // If not specified, the default model is gemini-2.0-flash-lite.
}
})
Since Anthropic does not offer its own embeddings model, integrating Anthropic requires an additional API key for embeddings.
Currently, PlayWord supports the following providers for embeddings:
- VoyageAI (officially recommended by Anthropic)
- OpenAI
1. Export API keys as environment variables:
export ANTHROPIC_API_KEY="sk-..."
export VOYAGEAI_API_KEY="pa-..."
2. Pass the API keys as parameters during initialization:
const playword = new PlayWord(context, {
aiOptions: {
anthropicApiKey: 'sk-...',
voyageAIApiKey: 'pa-...',
model: 'claude-3-7-sonnet-latest' // If not specified, the default model is claude-3-5-haiku-latest.
}
})
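The three provider setups above can be folded into one helper that reads keys from the environment and builds the `aiOptions` object. This is a minimal sketch, not part of PlayWord: `buildAiOptions`, `requireKey`, and the `Provider` type are hypothetical names, and only the option names (`openAIApiKey`, `googleApiKey`, `anthropicApiKey`, `voyageAIApiKey`, `model`) come from the snippets above. For Anthropic it assumes VoyageAI embeddings; the OpenAI embeddings variant is not covered.

```typescript
// Hypothetical helper: maps the documented environment variables to aiOptions.
type Provider = 'openai' | 'google' | 'anthropic'
type Env = Record<string, string | undefined>

interface AiOptions {
  openAIApiKey?: string
  googleApiKey?: string
  anthropicApiKey?: string
  voyageAIApiKey?: string
  model?: string
}

// Fail early with a clear message when a required key is not set.
function requireKey(env: Env, name: string): string {
  const value = env[name]
  if (!value) throw new Error(`Missing environment variable ${name}`)
  return value
}

function buildAiOptions(provider: Provider, env: Env, model?: string): AiOptions {
  // Leave `model` out entirely so the provider's default model applies.
  const base: AiOptions = model ? { model } : {}
  if (provider === 'openai') {
    return { ...base, openAIApiKey: requireKey(env, 'OPENAI_API_KEY') }
  }
  if (provider === 'google') {
    return { ...base, googleApiKey: requireKey(env, 'GOOGLE_API_KEY') }
  }
  // Anthropic needs a second key for embeddings (VoyageAI assumed here).
  return {
    ...base,
    anthropicApiKey: requireKey(env, 'ANTHROPIC_API_KEY'),
    voyageAIApiKey: requireKey(env, 'VOYAGEAI_API_KEY')
  }
}
```

In a Node.js project the result could be passed along as `new PlayWord(context, { aiOptions: buildAiOptions('google', process.env) })`.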
Name | Type | Default | Description |
---|---|---|---|
aiOptions | object | {} | Configuration options for the AI instance. |
debug | boolean | false | Whether to enable debug mode. |
delay | number | 250 | Delay between each step in milliseconds. |
record | boolean \| string | false | Whether to record actions performed and where to save the recordings. |
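The defaults in the table can be read as a small normalization step. The sketch below is an assumption about how such options might be resolved, not PlayWord's actual code: `resolveOptions` and `ResolvedOptions` are hypothetical names. The default recording path and the `.json` requirement are taken from the recordings section of this README.

```typescript
// Hypothetical normalization of the documented constructor options.
interface PlayWordOptions {
  aiOptions?: Record<string, unknown>
  debug?: boolean
  delay?: number
  record?: boolean | string
}

interface ResolvedOptions {
  aiOptions: Record<string, unknown>
  debug: boolean
  delay: number
  recordPath: string | null // null means recording is disabled
}

const DEFAULT_RECORD_PATH = '.playword/recordings.json'

function resolveOptions(options: PlayWordOptions = {}): ResolvedOptions {
  // Defaults mirror the table: {}, false, 250, false.
  const { aiOptions = {}, debug = false, delay = 250, record = false } = options

  let recordPath: string | null = null
  if (record === true) {
    recordPath = DEFAULT_RECORD_PATH
  } else if (typeof record === 'string') {
    if (!record.endsWith('.json')) {
      throw new Error(`Recording path must end with .json: ${record}`)
    }
    recordPath = record
  }
  return { aiOptions, debug, delay, recordPath }
}
```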
In its basic usage, you can use the say method to interact with the page.
No need to worry about locating elements or performing interactions: PlayWord handles all of that for you.
await playword.say('Navigate to https://www.google.com')
await playword.say('Type "Hello, World!" in the search bar')
await playword.say('Press enter')
PlayWord uses keywords to identify whether a step is an assertion. This approach ensures more stable results compared to relying solely on AI judgment.
Using PlayWord within Playwright Test
import { PlayWord } from '@playword/core'
import { expect, test } from '@playwright/test'
test('get started link', async ({ context }) => {
const playword = new PlayWord(context, { debug: true, record: 'recordings/getStartLink.json' })
await playword.say('go to https://playwright.dev/')
await playword.say('click the link "Get started"')
expect(await playword.say('Verify if the installation heading is visible')).toBe(true)
})
Any input starting with one of the following case-insensitive keywords is recognized as an assertion:
- are
- assert
- assure
- can
- check
- compare
- confirm
- could
- did
- do
- does
- ensure
- expect
- guarantee
- has
- have
- is
- match
- satisfy
- shall
- should
- test
- then
- was
- were
- validate
- verify
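The keyword rule above can be sketched as a plain first-word lookup. This is an illustration under assumptions, not PlayWord's implementation: `isAssertion` is a hypothetical name, the sketch assumes whole-word matching (so "Checkout" does not count as "check"), and it assumes a leading `[AI]` marker is ignored before the keyword check.

```typescript
// Keywords copied from the list above; matching is case-insensitive.
const ASSERTION_KEYWORDS = new Set([
  'are', 'assert', 'assure', 'can', 'check', 'compare', 'confirm', 'could',
  'did', 'do', 'does', 'ensure', 'expect', 'guarantee', 'has', 'have', 'is',
  'match', 'satisfy', 'shall', 'should', 'test', 'then', 'was', 'were',
  'validate', 'verify'
])

function isAssertion(input: string): boolean {
  // Assumption: an optional leading "[AI]" marker is skipped first.
  const text = input.trim().replace(/^\[AI\]\s*/, '')
  // Take the leading run of letters as the first word.
  const match = /^[a-z]+/i.exec(text)
  const firstWord = match ? match[0].toLowerCase() : ''
  return ASSERTION_KEYWORDS.has(firstWord)
}
```

A deterministic check like this is cheap and repeatable, which fits the stated goal of more stable results than relying on AI judgment alone.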
To interact with elements inside frames, simply instruct PlayWord to switch to the desired frame.
await playword.say('Go to https://iframetester.com')
await playword.say('Type "https://www.saucedemo.com" in the URL field')
await playword.say('Click the render button')
await playword.say('Switch to the frame with the url "https://www.saucedemo.com"')
// Perform actions inside the frame
await playword.say('Type standard_user into the username field')
Hardcoding sensitive information in your test cases is not a good practice.
Instead, use custom variables with the syntax {VARIABLE_NAME} and define them in your environment settings.
Assume the following environment variables are set in .env:
# .env
USERNAME=standard_user
PASSWORD=secret_sauce
// Load environment variables
import 'dotenv/config'
// {USERNAME} and {PASSWORD} will be replaced with the values from the environment
await playword.say('Input {USERNAME} in the username field')
await playword.say('Input {PASSWORD} in the password field')
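The placeholder replacement can be pictured as a simple template pass over the step text. The sketch below is an assumption about the mechanism, not PlayWord's source: `substituteVariables` is a hypothetical name, and leaving unknown placeholders untouched is a choice made for this example.

```typescript
type Env = Record<string, string | undefined>

// Replace every {VARIABLE_NAME} with the matching value from `env`.
function substituteVariables(step: string, env: Env): string {
  return step.replace(
    /\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (placeholder: string, name: string) => {
      const value = env[name]
      // Assumption: placeholders without a value are left as written.
      return value === undefined ? placeholder : value
    }
  )
}
```

With the `.env` file above loaded, `substituteVariables('Input {USERNAME} in the username field', process.env)` would yield the step with `standard_user` filled in, while the secret itself stays out of the test source.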
PlayWord supports recording test executions and replaying them later for efficient and consistent testing.
// Save recordings to the default path (.playword/recordings.json)
const playword = new PlayWord(context, { record: true })
// Save recordings to a custom path (Must be `.json`)
const playword = new PlayWord(context, { record: 'spec/test-shopping-cart.json' })
If recordings are available, PlayWord prioritizes using them to execute tests, reducing the need to consume API tokens.
If a recorded action fails, PlayWord automatically retries it using AI.
To ensure PlayWord uses AI for specific steps during playback, start the input with [AI].
await playword.say('[AI] click the "Login" button')
await playword.say('[AI] verify the URL matches "https://www.saucedemo.com/inventory.html"')
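The playback rules above (prefer a recording, retry with AI on failure, always use AI after `[AI]`) can be sketched as follows. Everything here is hypothetical: `RecordedAction`, `planStep`, `runStep`, and the idea of keying recordings by step text are assumptions made for illustration, since the recording file format is not described in this README.

```typescript
// Hypothetical shape of one recorded action.
type RecordedAction = { name: string; params?: Record<string, unknown> }

interface StepPlan {
  step: string
  recorded?: RecordedAction
}

const AI_PREFIX = '[AI]'

// Decide whether a step can be replayed from a recording.
function planStep(input: string, recordings: Map<string, RecordedAction>): StepPlan {
  const trimmed = input.trim()
  if (trimmed.startsWith(AI_PREFIX)) {
    // "[AI]" forces the AI path, so the recording is never looked up.
    return { step: trimmed.slice(AI_PREFIX.length).trim() }
  }
  const recorded = recordings.get(trimmed)
  return recorded ? { step: trimmed, recorded } : { step: trimmed }
}

// Replay when possible; fall back to AI if there is no recording or replay fails.
async function runStep(
  input: string,
  recordings: Map<string, RecordedAction>,
  replay: (action: RecordedAction) => Promise<unknown>,
  askAI: (step: string) => Promise<unknown>
): Promise<{ via: 'recording' | 'ai'; result: unknown }> {
  const plan = planStep(input, recordings)
  if (plan.recorded) {
    try {
      return { via: 'recording', result: await replay(plan.recorded) }
    } catch {
      // The recorded action failed, for example after a UI change.
    }
  }
  return { via: 'ai', result: await askAI(plan.step) }
}
```

The point of this ordering is cost: replayed steps consume no API tokens, and the AI is only consulted when a recording is missing, stale, or explicitly bypassed.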
The Observer module tracks user interactions on web pages and swiftly generates accurate test steps using AI.
Upon activation, Playwright injects the Observer UI into every page opened in the browser. As you manually interact with the page, the AI interprets your actions, generates corresponding test steps, and records action details.
The Observer provides several controls to manage and interact with your test recordings:
- Accept: Add test steps to the recording. (Can also be invoked by pressing the a key)
- Cancel: Skip test steps without adding them to the recording. (Can also be invoked by pressing the c key)
- Preview: View the test steps recorded so far.
- Clear: Delete recorded test steps.
- Dry Run: Trial-run the recorded test steps. (Press the esc key to stop the dry-run process)
The Observer captures the following user interactions on the webpage:
- Click: Triggered when an element on the webpage is clicked.
- Hover: Triggered when hovering over an element for more than three seconds.
- Input: Triggered after entering content into an input field and then clicking the input field again.
- Navigate: Triggered when the page navigates to a new URL or is refreshed.
- Select: Triggered after selecting an option from a dropdown menu.
For complex actions and assertions that the Observer cannot directly record, you can manually edit the step descriptions, enabling the AI to accurately capture your intentions.
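The review flow described above (a captured step is proposed, optionally edited, then accepted or cancelled) can be modeled as a small state holder. This is a conceptual sketch only: `StepReview`, `CapturedStep`, and `isHoverCaptured` are hypothetical names and do not reflect the Observer's internals. Only the a and c shortcuts and the three-second hover rule come from the text.

```typescript
type Interaction = 'click' | 'hover' | 'input' | 'navigate' | 'select'

interface CapturedStep {
  interaction: Interaction
  description: string
}

const HOVER_THRESHOLD_MS = 3000

// A hover only counts once the pointer has stayed for more than three seconds.
function isHoverCaptured(enteredAt: number, leftAt: number): boolean {
  return leftAt - enteredAt > HOVER_THRESHOLD_MS
}

class StepReview {
  private pending: CapturedStep | null = null
  private readonly recorded: CapturedStep[] = []

  // A newly captured step waits for Accept or Cancel.
  propose(step: CapturedStep): void {
    this.pending = step
  }

  // Manually reword the pending step before accepting it.
  edit(description: string): void {
    if (this.pending) this.pending = { ...this.pending, description }
  }

  accept(): void {
    if (this.pending) this.recorded.push(this.pending)
    this.pending = null
  }

  cancel(): void {
    this.pending = null
  }

  // Keyboard shortcuts: "a" accepts, "c" cancels; other keys are ignored.
  handleKey(key: string): void {
    if (key === 'a') this.accept()
    else if (key === 'c') this.cancel()
  }

  // Preview returns a copy so callers cannot mutate the recording.
  preview(): CapturedStep[] {
    return [...this.recorded]
  }

  clear(): void {
    this.recorded.length = 0
  }
}
```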
To start using the Observer, create a PlayWord instance in headed mode, pass it to the Observer, and initiate observation with Playwright.
import { chromium } from 'playwright'
import { Observer, PlayWord } from '@playword/core'
const browser = await chromium.launch({ headless: false /** Enable headed mode */ })
const context = await browser.newContext()
const playword = new PlayWord(context)
const observer = new Observer(playword, {
delay: 500,
recordPath: 'spec/test-login.json'
})
// Start the Observer
await observer.observe()
// Open a new page to observe
await context.newPage()
Name | Type | Default | Description |
---|---|---|---|
delay | number | 250 | Delay between each step in milliseconds during the dry-run process. |
recordPath | string | .playword/recordings.json | Where to save the recordings. (Must be .json) |
Aspect | Traditional Testing | PlayWord |
---|---|---|
Dev Experience | Locating elements is very frustrating | AI takes care of locating elements. Say goodbye to locators |
Dev Speed | Time is needed for writing both test cases and code | Test cases serve both as documentation and executable tests |
Maintenance | High maintenance cost due to UI changes | AI-powered adaptation to UI changes |
Learning Curve | Requires knowledge of testing frameworks and tools | Just use natural language to execute tests |
Supported actions:
- Click on an element
- Go to a specific URL
- Hover over an element
- Press a key or keys
- Scroll in a specific direction (top, bottom, up, down)
- Select an option from a select element
- Sleep for a specific duration in milliseconds
- Switch to a specific frame
- Switch to other pages
- Type text into an input field or textarea
- Wait for text to appear on the page
Supported assertions:
- Check if an element contains specific text
- Check if an element does not contain specific text
- Check if an element content is equal to specific text
- Check if an element content is not equal to specific text
- Check if an element is visible
- Check if an element is not visible
- Check if the page contains specific text
- Check if the page does not contain specific text
- Check if the page title is equal to specific text
- Check if the page URL matches specific RegExp patterns
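The actions and assertions listed above can be chained as plain strings, so a test case is essentially a list of sentences. The sketch below shows one way to run such a list; `runSteps` and the `Sayer` interface are hypothetical, and the only assumption about PlayWord is the one shown earlier, namely that `say` resolves to `true` or `false` for assertions.

```typescript
// Anything with a say() method fits, including a PlayWord instance.
interface Sayer {
  say(step: string): Promise<unknown>
}

interface StepResult {
  step: string
  result: unknown
}

// Run steps in order and stop at the first assertion that resolves to false.
async function runSteps(
  playword: Sayer,
  steps: string[]
): Promise<{ passed: boolean; results: StepResult[] }> {
  const results: StepResult[] = []
  for (const step of steps) {
    const result = await playword.say(step)
    results.push({ step, result })
    if (result === false) return { passed: false, results }
  }
  return { passed: true, results }
}
```

A call might look like `await runSteps(playword, ['Go to https://playwright.dev/', 'Click the link "Get started"', 'Check if the page title is equal to "Installation"'])`, where the page title is a placeholder value.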