
shortest
QA via natural language AI tests
Stars: 4425

Shortest is an AI-powered natural language end-to-end testing framework built on Playwright. It provides a seamless testing experience by allowing users to write tests in natural language and execute them using Anthropic Claude API. The framework also offers GitHub integration with 2FA support, making it suitable for testing web applications with complex authentication flows. Shortest simplifies the testing process by enabling users to run tests locally or in CI/CD pipelines, ensuring the reliability and efficiency of web applications.
README:
AI-powered natural language end-to-end testing framework.
Your browser does not support the video tag.- Natural language E2E testing framework
- AI-powered test execution using Anthropic Claude API
- Built on Playwright
- GitHub integration with 2FA support
- Email validation with Mailosaur
If helpful, here's a short video!
Use the shortest init
command to streamline the setup process in a new or existing project.
The shortest init
command will:
npx @antiwork/shortest init
This will:
- Automatically install the
@antiwork/shortest
package as a dev dependency if it is not already installed - Create a default
shortest.config.ts
file with boilerplate configuration - Generate a
.env.local
file (unless present) with placeholders for required environment variables, such asANTHROPIC_API_KEY
- Add
.env.local
and.shortest/
to.gitignore
- Determine your test entry and add your Anthropic API key in config file:
shortest.config.ts
import type { ShortestConfig } from "@antiwork/shortest";
export default {
headless: false,
baseUrl: "http://localhost:3000",
testPattern: "**/*.test.ts",
ai: {
provider: "anthropic",
},
} satisfies ShortestConfig;
Anthropic API key will default to SHORTEST_ANTHROPIC_API_KEY
/ ANTHROPIC_API_KEY
environment variables. Can be overwritten via ai.config.apiKey
.
- Create test files using the pattern specified in the config:
app/login.test.ts
import { shortest } from "@antiwork/shortest";
shortest("Login to the app using email and password", {
username: process.env.GITHUB_USERNAME,
password: process.env.GITHUB_PASSWORD,
});
You can also use callback functions to add additional assertions and other logic. AI will execute the callback function after the test execution in browser is completed.
import { shortest } from "@antiwork/shortest";
import { db } from "@/lib/db/drizzle";
import { users } from "@/lib/db/schema";
import { eq } from "drizzle-orm";
shortest("Login to the app using username and password", {
username: process.env.USERNAME,
password: process.env.PASSWORD,
}).after(async ({ page }) => {
// Get current user's clerk ID from the page
const clerkId = await page.evaluate(() => {
return window.localStorage.getItem("clerk-user");
});
if (!clerkId) {
throw new Error("User not found in database");
}
// Query the database
const [user] = await db
.select()
.from(users)
.where(eq(users.clerkId, clerkId))
.limit(1);
expect(user).toBeDefined();
});
You can use lifecycle hooks to run code before and after the test.
import { shortest } from "@antiwork/shortest";
shortest.beforeAll(async ({ page }) => {
await clerkSetup({
frontendApiUrl:
process.env.PLAYWRIGHT_TEST_BASE_URL ?? "http://localhost:3000",
});
});
shortest.beforeEach(async ({ page }) => {
await clerk.signIn({
page,
signInParams: {
strategy: "email_code",
identifier: "[email protected]",
},
});
});
shortest.afterEach(async ({ page }) => {
await page.close();
});
shortest.afterAll(async ({ page }) => {
await clerk.signOut({ page });
});
Shortest supports flexible test chaining patterns:
// Sequential test chain
shortest([
"user can login with email and password",
"user can modify their account-level refund policy",
]);
// Reusable test flows
const loginAsLawyer = "login as lawyer with valid credentials";
const loginAsContractor = "login as contractor with valid credentials";
const allAppActions = ["send invoice to company", "view invoices"];
// Combine flows with spread operator
shortest([loginAsLawyer, ...allAppActions]);
shortest([loginAsContractor, ...allAppActions]);
Test API endpoints using natural language
const req = new APIRequest({
baseURL: API_BASE_URI,
});
shortest(
"Ensure the response contains only active users",
req.fetch({
url: "/users",
method: "GET",
params: new URLSearchParams({
active: true,
}),
}),
);
Or simply:
shortest(`
Test the API GET endpoint ${API_BASE_URI}/users with query parameter { "active": true }
Expect the response to contain only active users
`);
pnpm shortest # Run all tests
pnpm shortest __tests__/login.test.ts # Run specific test
pnpm shortest --headless # Run in headless mode using CLI
You can find example tests in the examples
directory.
You can run Shortest in your CI/CD pipeline by running tests in headless mode. Make sure to add your Anthropic API key to your CI/CD pipeline secrets.
Shortest supports login using GitHub 2FA. For GitHub authentication tests:
- Go to your repository settings
- Navigate to "Password and Authentication"
- Click on "Authenticator App"
- Select "Use your authenticator app"
- Click "Setup key" to obtain the OTP secret
- Add the OTP secret to your
.env.local
file or use the Shortest CLI to add it - Enter the 2FA code displayed in your terminal into Github's Authenticator setup page to complete the process
shortest --github-code --secret=<OTP_SECRET>
Required in .env.local
:
ANTHROPIC_API_KEY=your_api_key
GITHUB_TOTP_SECRET=your_secret # Only for GitHub auth tests
The NPM package is located in packages/shortest/
. See CONTRIBUTING guide.
This guide will help you set up the Shortest web app for local development.
- React >=19.0.0 (if using with Next.js 14+ or Server Actions)
- Next.js >=14.0.0 (if using Server Components/Actions)
[!WARNING] Using this package with React 18 in Next.js 14+ projects may cause type conflicts with Server Actions and
useFormStatus
If you encounter type errors with form actions or React hooks, ensure you're using React 19
- Clone the repository:
git clone https://github.com/anti-work/shortest.git
cd shortest
- Install dependencies:
npm install -g pnpm
pnpm install
Pull Vercel env vars:
pnpm i -g vercel
vercel link
vercel env pull
- Run
pnpm run setup
to configure the environment variables. - The setup wizard will ask you for information. Refer to "Services Configuration" section below for more details.
pnpm drizzle-kit generate
pnpm db:migrate
pnpm db:seed # creates stripe products, currently unused
You'll need to set up the following services for local development. If you're not a Anti-Work Vercel team member, you'll need to either run the setup wizard pnpm run setup
or manually configure each of these services and add the corresponding environment variables to your .env.local
file:
Clerk
- Go to clerk.com and create a new app.
- Name it whatever you like and disable all login methods except GitHub.
- Once created, copy the environment variables to your
.env.local
file. - In the Clerk dashboard, disable the "Require the same device and browser" setting to ensure tests with Mailosaur work properly.
Vercel Postgres
- Go to your dashboard at vercel.com.
- Navigate to the Storage tab and click the
Create Database
button. - Choose
Postgres
from theBrowse Storage
menu. - Copy your environment variables from the
Quickstart
.env.local
tab.
Anthropic
- Go to your dashboard at anthropic.com and grab your API Key.
Stripe
- Go to your
Developers
dashboard at stripe.com. - Turn on
Test mode
. - Go to the
API Keys
tab and copy yourSecret key
. - Go to the terminal of your project and type
pnpm run stripe:webhooks
. It will prompt you to login with a code then give you yourSTRIPE_WEBHOOK_SECRET
.
GitHub OAuth
-
Create a GitHub OAuth App:
- Go to your GitHub account settings.
- Navigate to
Developer settings
>OAuth Apps
>New OAuth App
. - Fill in the application details:
-
Configure Clerk with GitHub OAuth:
Mailosaur
- Sign up for an account with Mailosaur.
- Create a new Inbox/Server.
- Go to API Keys and create a standard key.
- Update the environment variables:
-
MAILOSAUR_API_KEY
: Your API key -
MAILOSAUR_SERVER_ID
: Your server ID
-
The email used to test the login flow will have the format shortest@<MAILOSAUR_SERVER_ID>.mailosaur.net
, where
MAILOSAUR_SERVER_ID
is your server ID.
Make sure to add the email as a new user under the Clerk app.
Run the development server:
pnpm dev
Open http://localhost:3000 in your browser to see the app in action.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for shortest
Similar Open Source Tools

shortest
Shortest is an AI-powered natural language end-to-end testing framework built on Playwright. It provides a seamless testing experience by allowing users to write tests in natural language and execute them using Anthropic Claude API. The framework also offers GitHub integration with 2FA support, making it suitable for testing web applications with complex authentication flows. Shortest simplifies the testing process by enabling users to run tests locally or in CI/CD pipelines, ensuring the reliability and efficiency of web applications.

mcpdoc
The MCP LLMS-TXT Documentation Server is an open-source server that provides developers full control over tools used by applications like Cursor, Windsurf, and Claude Code/Desktop. It allows users to create a user-defined list of `llms.txt` files and use a `fetch_docs` tool to read URLs within these files, enabling auditing of tool calls and context returned. The server supports various applications and provides a way to connect to them, configure rules, and test tool calls for tasks related to documentation retrieval and processing.

suno-api
Suno AI API is an open-source project that allows developers to integrate the music generation capabilities of Suno.ai into their own applications. The API provides a simple and convenient way to generate music, lyrics, and other audio content using Suno.ai's powerful AI models. With Suno AI API, developers can easily add music generation functionality to their apps, websites, and other projects.

python-tgpt
Python-tgpt is a Python package that enables seamless interaction with over 45 free LLM providers without requiring an API key. It also provides image generation capabilities. The name _python-tgpt_ draws inspiration from its parent project tgpt, which operates on Golang. Through this Python adaptation, users can effortlessly engage with a number of free LLMs available, fostering a smoother AI interaction experience.

deepgram-js-sdk
Deepgram JavaScript SDK. Power your apps with world-class speech and Language AI models.

langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

well-architected-iac-analyzer
Well-Architected Infrastructure as Code (IaC) Analyzer is a project demonstrating how generative AI can evaluate infrastructure code for alignment with best practices. It features a modern web application allowing users to upload IaC documents, complete IaC projects, or architecture diagrams for assessment. The tool provides insights into infrastructure code alignment with AWS best practices, offers suggestions for improving cloud architecture designs, and can generate IaC templates from architecture diagrams. Users can analyze CloudFormation, Terraform, or AWS CDK templates, architecture diagrams in PNG or JPEG format, and complete IaC projects with supporting documents. Real-time analysis against Well-Architected best practices, integration with AWS Well-Architected Tool, and export of analysis results and recommendations are included.

hayhooks
Hayhooks is a tool that simplifies the deployment and serving of Haystack pipelines as REST APIs. It allows users to wrap their pipelines with custom logic and expose them via HTTP endpoints, including OpenAI-compatible chat completion endpoints. With Hayhooks, users can easily convert their Haystack pipelines into API services with minimal boilerplate code.

react-native-fast-tflite
A high-performance TensorFlow Lite library for React Native that utilizes JSI for power, zero-copy ArrayBuffers for efficiency, and low-level C/C++ TensorFlow Lite core API for direct memory access. It supports swapping out TensorFlow Models at runtime and GPU-accelerated delegates like CoreML/Metal/OpenGL. Easy VisionCamera integration allows for seamless usage. Users can load TensorFlow Lite models, interpret input and output data, and utilize GPU Delegates for faster computation. The library is suitable for real-time object detection, image classification, and other machine learning tasks in React Native applications.

Gemini-API
Gemini-API is a reverse-engineered asynchronous Python wrapper for Google Gemini web app (formerly Bard). It provides features like persistent cookies, ImageFx support, extension support, classified outputs, official flavor, and asynchronous operation. The tool allows users to generate contents from text or images, have conversations across multiple turns, retrieve images in response, generate images with ImageFx, save images to local files, use Gemini extensions, check and switch reply candidates, and control log level.

lollms
LoLLMs Server is a text generation server based on large language models. It provides a Flask-based API for generating text using various pre-trained language models. This server is designed to be easy to install and use, allowing developers to integrate powerful text generation capabilities into their applications.

pentagi
PentAGI is an innovative tool for automated security testing that leverages cutting-edge artificial intelligence technologies. It is designed for information security professionals, researchers, and enthusiasts who need a powerful and flexible solution for conducting penetration tests. The tool provides secure and isolated operations in a sandboxed Docker environment, fully autonomous AI-powered agent for penetration testing steps, a suite of 20+ professional security tools, smart memory system for storing research results, web intelligence for gathering information, integration with external search systems, team delegation system, comprehensive monitoring and reporting, modern interface, API integration, persistent storage, scalable architecture, self-hosted solution, flexible authentication, and quick deployment through Docker Compose.

docker-cups-airprint
This repository provides a Docker image that acts as an AirPrint bridge for local printers, allowing them to be exposed to iOS/macOS devices. It runs a container with CUPS and Avahi to facilitate this functionality. Users must have CUPS drivers available for their printers. The tool requires a Linux host and a dedicated IP for the container to avoid interference with other services. It supports setting up printers through environment variables and offers options for automated configuration via command line, web interface, or files. The repository includes detailed instructions on setting up and testing the AirPrint bridge.

ppt2desc
ppt2desc is a command-line tool that converts PowerPoint presentations into detailed textual descriptions using vision language models. It interprets and describes visual elements, capturing the full semantic meaning of each slide in a machine-readable format. The tool supports various model providers and offers features like converting PPT/PPTX files to semantic descriptions, processing individual files or directories, visual elements interpretation, rate limiting for API calls, customizable prompts, and JSON output format for easy integration.

js-genai
The Google Gen AI JavaScript SDK is an experimental SDK for TypeScript and JavaScript developers to build applications powered by Gemini. It supports both the Gemini Developer API and Vertex AI. The SDK is designed to work with Gemini 2.0 features. Users can access API features through the GoogleGenAI classes, which provide submodules for querying models, managing caches, creating chats, uploading files, and starting live sessions. The SDK also allows for function calling to interact with external systems. Users can find more samples in the GitHub samples directory.

Flowise
Flowise is a tool that allows users to build customized LLM flows with a drag-and-drop UI. It is open-source and self-hostable, and it supports various deployments, including AWS, Azure, Digital Ocean, GCP, Railway, Render, HuggingFace Spaces, Elestio, Sealos, and RepoCloud. Flowise has three different modules in a single mono repository: server, ui, and components. The server module is a Node backend that serves API logics, the ui module is a React frontend, and the components module contains third-party node integrations. Flowise supports different environment variables to configure your instance, and you can specify these variables in the .env file inside the packages/server folder.
For similar tasks

LLMstudio
LLMstudio by TensorOps is a platform that offers prompt engineering tools for accessing models from providers like OpenAI, VertexAI, and Bedrock. It provides features such as Python Client Gateway, Prompt Editing UI, History Management, and Context Limit Adaptability. Users can track past runs, log costs and latency, and export history to CSV. The tool also supports automatic switching to larger-context models when needed. Coming soon features include side-by-side comparison of LLMs, automated testing, API key administration, project organization, and resilience against rate limits. LLMstudio aims to streamline prompt engineering, provide execution history tracking, and enable effortless data export, offering an evolving environment for teams to experiment with advanced language models.

kaizen
Kaizen is an open-source project that helps teams ensure quality in their software delivery by providing a suite of tools for code review, test generation, and end-to-end testing. It integrates with your existing code repositories and workflows, allowing you to streamline your software development process. Kaizen generates comprehensive end-to-end tests, provides UI testing and review, and automates code review with insightful feedback. The file structure includes components for API server, logic, actors, generators, LLM integrations, documentation, and sample code. Getting started involves installing the Kaizen package, generating tests for websites, and executing tests. The tool also runs an API server for GitHub App actions. Contributions are welcome under the AGPL License.

flux-fine-tuner
This is a Cog training model that creates LoRA-based fine-tunes for the FLUX.1 family of image generation models. It includes features such as automatic image captioning during training, image generation using LoRA, uploading fine-tuned weights to Hugging Face, automated test suite for continuous deployment, and Weights and biases integration. The tool is designed for users to fine-tune Flux models on Replicate for image generation tasks.

shortest
Shortest is an AI-powered natural language end-to-end testing framework built on Playwright. It provides a seamless testing experience by allowing users to write tests in natural language and execute them using Anthropic Claude API. The framework also offers GitHub integration with 2FA support, making it suitable for testing web applications with complex authentication flows. Shortest simplifies the testing process by enabling users to run tests locally or in CI/CD pipelines, ensuring the reliability and efficiency of web applications.

lmstudio-python
LM Studio Python SDK provides a convenient API for interacting with LM Studio instance, including text completion and chat response functionalities. The SDK allows users to manage websocket connections and chat history easily. It also offers tools for code consistency checks, automated testing, and expanding the API.

mastering-github-copilot-for-dotnet-csharp-developers
Enhance coding efficiency with expert-led GitHub Copilot course for C#/.NET developers. Learn to integrate AI-powered coding assistance, automate testing, and boost collaboration using Visual Studio Code and Copilot Chat. From autocompletion to unit testing, cover essential techniques for cleaner, faster, smarter code.

agentql
AgentQL is a suite of tools for extracting data and automating workflows on live web sites featuring an AI-powered query language, Python and JavaScript SDKs, a browser-based debugger, and a REST API endpoint. It uses natural language queries to pinpoint data and elements on any web page, including authenticated and dynamically generated content. Users can define structured data output and apply transforms within queries. AgentQL's natural language selectors find elements intuitively based on the content of the web page and work across similar web sites, self-healing as UI changes over time.

cursor-tools
cursor-tools is a CLI tool designed to enhance AI agents with advanced skills, such as web search, repository context, documentation generation, GitHub integration, Xcode tools, and browser automation. It provides features like Perplexity for web search, Gemini 2.0 for codebase context, and Stagehand for browser operations. The tool requires API keys for Perplexity AI and Google Gemini, and supports global installation for system-wide access. It offers various commands for different tasks and integrates with Cursor Composer for AI agent usage.
For similar jobs

aiscript
AiScript is a lightweight scripting language that runs on JavaScript. It supports arrays, objects, and functions as first-class citizens, and is easy to write without the need for semicolons or commas. AiScript runs in a secure sandbox environment, preventing infinite loops from freezing the host. It also allows for easy provision of variables and functions from the host.

askui
AskUI is a reliable, automated end-to-end automation tool that only depends on what is shown on your screen instead of the technology or platform you are running on.

bots
The 'bots' repository is a collection of guides, tools, and example bots for programming bots to play video games. It provides resources on running bots live, installing the BotLab client, debugging bots, testing bots in simulated environments, and more. The repository also includes example bots for games like EVE Online, Tribal Wars 2, and Elvenar. Users can learn about developing bots for specific games, syntax of the Elm programming language, and tools for memory reading development. Additionally, there are guides on bot programming, contributing to BotLab, and exploring Elm syntax and core library.

ain
Ain is a terminal HTTP API client designed for scripting input and processing output via pipes. It allows flexible organization of APIs using files and folders, supports shell-scripts and executables for common tasks, handles url-encoding, and enables sharing the resulting curl, wget, or httpie command-line. Users can put things that change in environment variables or .env-files, and pipe the API output for further processing. Ain targets users who work with many APIs using a simple file format and uses curl, wget, or httpie to make the actual calls.

LaVague
LaVague is an open-source Large Action Model framework that uses advanced AI techniques to compile natural language instructions into browser automation code. It leverages Selenium or Playwright for browser actions. Users can interact with LaVague through an interactive Gradio interface to automate web interactions. The tool requires an OpenAI API key for default examples and offers a Playwright integration guide. Contributors can help by working on outlined tasks, submitting PRs, and engaging with the community on Discord. The project roadmap is available to track progress, but users should exercise caution when executing LLM-generated code using 'exec'.

robocorp
Robocorp is a platform that allows users to create, deploy, and operate Python automations and AI actions. It provides an easy way to extend the capabilities of AI agents, assistants, and copilots with custom actions written in Python. Users can create and deploy tools, skills, loaders, and plugins that securely connect any AI Assistant platform to their data and applications. The Robocorp Action Server makes Python scripts compatible with ChatGPT and LangChain by automatically creating and exposing an API based on function declaration, type hints, and docstrings. It simplifies the process of developing and deploying AI actions, enabling users to interact with AI frameworks effortlessly.

Open-Interface
Open Interface is a self-driving software that automates computer tasks by sending user requests to a language model backend (e.g., GPT-4V) and simulating keyboard and mouse inputs to execute the steps. It course-corrects by sending current screenshots to the language models. The tool supports MacOS, Linux, and Windows, and requires setting up the OpenAI API key for access to GPT-4V. It can automate tasks like creating meal plans, setting up custom language model backends, and more. Open Interface is currently not efficient in accurate spatial reasoning, tracking itself in tabular contexts, and navigating complex GUI-rich applications. Future improvements aim to enhance the tool's capabilities with better models trained on video walkthroughs. The tool is cost-effective, with user requests priced between $0.05 - $0.20, and offers features like interrupting the app and primary display visibility in multi-monitor setups.

AI-Case-Sorter-CS7.1
AI-Case-Sorter-CS7.1 is a project focused on building a case sorter using machine vision and machine learning AI to sort cases by headstamp. The repository includes Arduino code and 3D models necessary for the project.