
shortest
QA via natural language AI tests
Stars: 4425

Shortest is an AI-powered natural language end-to-end testing framework built on Playwright. It provides a seamless testing experience by allowing users to write tests in natural language and execute them using Anthropic Claude API. The framework also offers GitHub integration with 2FA support, making it suitable for testing web applications with complex authentication flows. Shortest simplifies the testing process by enabling users to run tests locally or in CI/CD pipelines, ensuring the reliability and efficiency of web applications.
README:
AI-powered natural language end-to-end testing framework.
Your browser does not support the video tag.- Natural language E2E testing framework
- AI-powered test execution using Anthropic Claude API
- Built on Playwright
- GitHub integration with 2FA support
- Email validation with Mailosaur
If helpful, here's a short video!
Use the shortest init
command to streamline the setup process in a new or existing project.
The shortest init
command will:
npx @antiwork/shortest init
This will:
- Automatically install the
@antiwork/shortest
package as a dev dependency if it is not already installed - Create a default
shortest.config.ts
file with boilerplate configuration - Generate a
.env.local
file (unless present) with placeholders for required environment variables, such asANTHROPIC_API_KEY
- Add
.env.local
and.shortest/
to.gitignore
- Determine your test entry and add your Anthropic API key in config file:
shortest.config.ts
import type { ShortestConfig } from "@antiwork/shortest";
export default {
headless: false,
baseUrl: "http://localhost:3000",
testPattern: "**/*.test.ts",
ai: {
provider: "anthropic",
},
} satisfies ShortestConfig;
Anthropic API key will default to SHORTEST_ANTHROPIC_API_KEY
/ ANTHROPIC_API_KEY
environment variables. Can be overwritten via ai.config.apiKey
.
- Create test files using the pattern specified in the config:
app/login.test.ts
import { shortest } from "@antiwork/shortest";
shortest("Login to the app using email and password", {
username: process.env.GITHUB_USERNAME,
password: process.env.GITHUB_PASSWORD,
});
You can also use callback functions to add additional assertions and other logic. AI will execute the callback function after the test execution in browser is completed.
import { shortest } from "@antiwork/shortest";
import { db } from "@/lib/db/drizzle";
import { users } from "@/lib/db/schema";
import { eq } from "drizzle-orm";
shortest("Login to the app using username and password", {
username: process.env.USERNAME,
password: process.env.PASSWORD,
}).after(async ({ page }) => {
// Get current user's clerk ID from the page
const clerkId = await page.evaluate(() => {
return window.localStorage.getItem("clerk-user");
});
if (!clerkId) {
throw new Error("User not found in database");
}
// Query the database
const [user] = await db
.select()
.from(users)
.where(eq(users.clerkId, clerkId))
.limit(1);
expect(user).toBeDefined();
});
You can use lifecycle hooks to run code before and after the test.
import { shortest } from "@antiwork/shortest";
shortest.beforeAll(async ({ page }) => {
await clerkSetup({
frontendApiUrl:
process.env.PLAYWRIGHT_TEST_BASE_URL ?? "http://localhost:3000",
});
});
shortest.beforeEach(async ({ page }) => {
await clerk.signIn({
page,
signInParams: {
strategy: "email_code",
identifier: "[email protected]",
},
});
});
shortest.afterEach(async ({ page }) => {
await page.close();
});
shortest.afterAll(async ({ page }) => {
await clerk.signOut({ page });
});
Shortest supports flexible test chaining patterns:
// Sequential test chain
shortest([
"user can login with email and password",
"user can modify their account-level refund policy",
]);
// Reusable test flows
const loginAsLawyer = "login as lawyer with valid credentials";
const loginAsContractor = "login as contractor with valid credentials";
const allAppActions = ["send invoice to company", "view invoices"];
// Combine flows with spread operator
shortest([loginAsLawyer, ...allAppActions]);
shortest([loginAsContractor, ...allAppActions]);
Test API endpoints using natural language
const req = new APIRequest({
baseURL: API_BASE_URI,
});
shortest(
"Ensure the response contains only active users",
req.fetch({
url: "/users",
method: "GET",
params: new URLSearchParams({
active: true,
}),
}),
);
Or simply:
shortest(`
Test the API GET endpoint ${API_BASE_URI}/users with query parameter { "active": true }
Expect the response to contain only active users
`);
pnpm shortest # Run all tests
pnpm shortest __tests__/login.test.ts # Run specific test
pnpm shortest --headless # Run in headless mode using CLI
You can find example tests in the examples
directory.
You can run Shortest in your CI/CD pipeline by running tests in headless mode. Make sure to add your Anthropic API key to your CI/CD pipeline secrets.
Shortest supports login using GitHub 2FA. For GitHub authentication tests:
- Go to your repository settings
- Navigate to "Password and Authentication"
- Click on "Authenticator App"
- Select "Use your authenticator app"
- Click "Setup key" to obtain the OTP secret
- Add the OTP secret to your
.env.local
file or use the Shortest CLI to add it - Enter the 2FA code displayed in your terminal into Github's Authenticator setup page to complete the process
shortest --github-code --secret=<OTP_SECRET>
Required in .env.local
:
ANTHROPIC_API_KEY=your_api_key
GITHUB_TOTP_SECRET=your_secret # Only for GitHub auth tests
The NPM package is located in packages/shortest/
. See CONTRIBUTING guide.
This guide will help you set up the Shortest web app for local development.
- React >=19.0.0 (if using with Next.js 14+ or Server Actions)
- Next.js >=14.0.0 (if using Server Components/Actions)
[!WARNING] Using this package with React 18 in Next.js 14+ projects may cause type conflicts with Server Actions and
useFormStatus
If you encounter type errors with form actions or React hooks, ensure you're using React 19
- Clone the repository:
git clone https://github.com/anti-work/shortest.git
cd shortest
- Install dependencies:
npm install -g pnpm
pnpm install
Pull Vercel env vars:
pnpm i -g vercel
vercel link
vercel env pull
- Run
pnpm run setup
to configure the environment variables. - The setup wizard will ask you for information. Refer to "Services Configuration" section below for more details.
pnpm drizzle-kit generate
pnpm db:migrate
pnpm db:seed # creates stripe products, currently unused
You'll need to set up the following services for local development. If you're not a Anti-Work Vercel team member, you'll need to either run the setup wizard pnpm run setup
or manually configure each of these services and add the corresponding environment variables to your .env.local
file:
Clerk
- Go to clerk.com and create a new app.
- Name it whatever you like and disable all login methods except GitHub.
- Once created, copy the environment variables to your
.env.local
file. - In the Clerk dashboard, disable the "Require the same device and browser" setting to ensure tests with Mailosaur work properly.
Vercel Postgres
- Go to your dashboard at vercel.com.
- Navigate to the Storage tab and click the
Create Database
button. - Choose
Postgres
from theBrowse Storage
menu. - Copy your environment variables from the
Quickstart
.env.local
tab.
Anthropic
- Go to your dashboard at anthropic.com and grab your API Key.
Stripe
- Go to your
Developers
dashboard at stripe.com. - Turn on
Test mode
. - Go to the
API Keys
tab and copy yourSecret key
. - Go to the terminal of your project and type
pnpm run stripe:webhooks
. It will prompt you to login with a code then give you yourSTRIPE_WEBHOOK_SECRET
.
GitHub OAuth
-
Create a GitHub OAuth App:
- Go to your GitHub account settings.
- Navigate to
Developer settings
>OAuth Apps
>New OAuth App
. - Fill in the application details:
-
Configure Clerk with GitHub OAuth:
Mailosaur
- Sign up for an account with Mailosaur.
- Create a new Inbox/Server.
- Go to API Keys and create a standard key.
- Update the environment variables:
-
MAILOSAUR_API_KEY
: Your API key -
MAILOSAUR_SERVER_ID
: Your server ID
-
The email used to test the login flow will have the format shortest@<MAILOSAUR_SERVER_ID>.mailosaur.net
, where
MAILOSAUR_SERVER_ID
is your server ID.
Make sure to add the email as a new user under the Clerk app.
Run the development server:
pnpm dev
Open http://localhost:3000 in your browser to see the app in action.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for shortest
Similar Open Source Tools

shortest
Shortest is an AI-powered natural language end-to-end testing framework built on Playwright. It provides a seamless testing experience by allowing users to write tests in natural language and execute them using Anthropic Claude API. The framework also offers GitHub integration with 2FA support, making it suitable for testing web applications with complex authentication flows. Shortest simplifies the testing process by enabling users to run tests locally or in CI/CD pipelines, ensuring the reliability and efficiency of web applications.

suno-api
Suno AI API is an open-source project that allows developers to integrate the music generation capabilities of Suno.ai into their own applications. The API provides a simple and convenient way to generate music, lyrics, and other audio content using Suno.ai's powerful AI models. With Suno AI API, developers can easily add music generation functionality to their apps, websites, and other projects.

python-tgpt
Python-tgpt is a Python package that enables seamless interaction with over 45 free LLM providers without requiring an API key. It also provides image generation capabilities. The name _python-tgpt_ draws inspiration from its parent project tgpt, which operates on Golang. Through this Python adaptation, users can effortlessly engage with a number of free LLMs available, fostering a smoother AI interaction experience.

deepgram-js-sdk
Deepgram JavaScript SDK. Power your apps with world-class speech and Language AI models.

llm-vscode
llm-vscode is an extension designed for all things LLM, utilizing llm-ls as its backend. It offers features such as code completion with 'ghost-text' suggestions, the ability to choose models for code generation via HTTP requests, ensuring prompt size fits within the context window, and code attribution checks. Users can configure the backend, suggestion behavior, keybindings, llm-ls settings, and tokenization options. Additionally, the extension supports testing models like Code Llama 13B, Phind/Phind-CodeLlama-34B-v2, and WizardLM/WizardCoder-Python-34B-V1.0. Development involves cloning llm-ls, building it, and setting up the llm-vscode extension for use.

langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

hayhooks
Hayhooks is a tool that simplifies the deployment and serving of Haystack pipelines as REST APIs. It allows users to wrap their pipelines with custom logic and expose them via HTTP endpoints, including OpenAI-compatible chat completion endpoints. With Hayhooks, users can easily convert their Haystack pipelines into API services with minimal boilerplate code.

react-native-fast-tflite
A high-performance TensorFlow Lite library for React Native that utilizes JSI for power, zero-copy ArrayBuffers for efficiency, and low-level C/C++ TensorFlow Lite core API for direct memory access. It supports swapping out TensorFlow Models at runtime and GPU-accelerated delegates like CoreML/Metal/OpenGL. Easy VisionCamera integration allows for seamless usage. Users can load TensorFlow Lite models, interpret input and output data, and utilize GPU Delegates for faster computation. The library is suitable for real-time object detection, image classification, and other machine learning tasks in React Native applications.

Gemini-API
Gemini-API is a reverse-engineered asynchronous Python wrapper for Google Gemini web app (formerly Bard). It provides features like persistent cookies, ImageFx support, extension support, classified outputs, official flavor, and asynchronous operation. The tool allows users to generate contents from text or images, have conversations across multiple turns, retrieve images in response, generate images with ImageFx, save images to local files, use Gemini extensions, check and switch reply candidates, and control log level.

aiavatarkit
AIAvatarKit is a tool for building AI-based conversational avatars quickly. It supports various platforms like VRChat and cluster, along with real-world devices. The tool is extensible, allowing unlimited capabilities based on user needs. It requires VOICEVOX API, Google or Azure Speech Services API keys, and Python 3.10. Users can start conversations out of the box and enjoy seamless interactions with the avatars.

ppt2desc
ppt2desc is a command-line tool that converts PowerPoint presentations into detailed textual descriptions using vision language models. It interprets and describes visual elements, capturing the full semantic meaning of each slide in a machine-readable format. The tool supports various model providers and offers features like converting PPT/PPTX files to semantic descriptions, processing individual files or directories, visual elements interpretation, rate limiting for API calls, customizable prompts, and JSON output format for easy integration.

raycast_api_proxy
The Raycast AI Proxy is a tool that acts as a proxy for the Raycast AI application, allowing users to utilize the application without subscribing. It intercepts and forwards Raycast requests to various AI APIs, then reformats the responses for Raycast. The tool supports multiple AI providers and allows for custom model configurations. Users can generate self-signed certificates, add them to the system keychain, and modify DNS settings to redirect requests to the proxy. The tool is designed to work with providers like OpenAI, Azure OpenAI, Google, and more, enabling tasks such as AI chat completions, translations, and image generation.

upgini
Upgini is an intelligent data search engine with a Python library that helps users find and add relevant features to their ML pipeline from various public, community, and premium external data sources. It automates the optimization of connected data sources by generating an optimal set of machine learning features using large language models, GraphNNs, and recurrent neural networks. The tool aims to simplify feature search and enrichment for external data to make it a standard approach in machine learning pipelines. It democratizes access to data sources for the data science community.

aim
Aim is a command-line tool for downloading and uploading files with resume support. It supports various protocols including HTTP, FTP, SFTP, SSH, and S3. Aim features an interactive mode for easy navigation and selection of files, as well as the ability to share folders over HTTP for easy access from other devices. Additionally, it offers customizable progress indicators and output formats, and can be integrated with other commands through piping. Aim can be installed via pre-built binaries or by compiling from source, and is also available as a Docker image for platform-independent usage.

minja
Minja is a minimalistic C++ Jinja templating engine designed specifically for integration with C++ LLM projects, such as llama.cpp or gemma.cpp. It is not a general-purpose tool but focuses on providing a limited set of filters, tests, and language features tailored for chat templates. The library is header-only, requires C++17, and depends only on nlohmann::json. Minja aims to keep the codebase small, easy to understand, and offers decent performance compared to Python. Users should be cautious when using Minja due to potential security risks, and it is not intended for producing HTML or JavaScript output.

airbadge
Airbadge is a Stripe addon for Auth.js that simplifies the process of creating a SaaS site by integrating payment, authentication, gating, self-service account management, webhook handling, trials & free plans, session data, and more. It allows users to launch a SaaS app without writing any authentication or payment code. The project is open source and free to use with optional paid features under the BSL License.
For similar tasks

LLMstudio
LLMstudio by TensorOps is a platform that offers prompt engineering tools for accessing models from providers like OpenAI, VertexAI, and Bedrock. It provides features such as Python Client Gateway, Prompt Editing UI, History Management, and Context Limit Adaptability. Users can track past runs, log costs and latency, and export history to CSV. The tool also supports automatic switching to larger-context models when needed. Coming soon features include side-by-side comparison of LLMs, automated testing, API key administration, project organization, and resilience against rate limits. LLMstudio aims to streamline prompt engineering, provide execution history tracking, and enable effortless data export, offering an evolving environment for teams to experiment with advanced language models.

kaizen
Kaizen is an open-source project that helps teams ensure quality in their software delivery by providing a suite of tools for code review, test generation, and end-to-end testing. It integrates with your existing code repositories and workflows, allowing you to streamline your software development process. Kaizen generates comprehensive end-to-end tests, provides UI testing and review, and automates code review with insightful feedback. The file structure includes components for API server, logic, actors, generators, LLM integrations, documentation, and sample code. Getting started involves installing the Kaizen package, generating tests for websites, and executing tests. The tool also runs an API server for GitHub App actions. Contributions are welcome under the AGPL License.

flux-fine-tuner
This is a Cog training model that creates LoRA-based fine-tunes for the FLUX.1 family of image generation models. It includes features such as automatic image captioning during training, image generation using LoRA, uploading fine-tuned weights to Hugging Face, automated test suite for continuous deployment, and Weights and biases integration. The tool is designed for users to fine-tune Flux models on Replicate for image generation tasks.

shortest
Shortest is an AI-powered natural language end-to-end testing framework built on Playwright. It provides a seamless testing experience by allowing users to write tests in natural language and execute them using Anthropic Claude API. The framework also offers GitHub integration with 2FA support, making it suitable for testing web applications with complex authentication flows. Shortest simplifies the testing process by enabling users to run tests locally or in CI/CD pipelines, ensuring the reliability and efficiency of web applications.

lmstudio-python
LM Studio Python SDK provides a convenient API for interacting with LM Studio instance, including text completion and chat response functionalities. The SDK allows users to manage websocket connections and chat history easily. It also offers tools for code consistency checks, automated testing, and expanding the API.

cursor-tools
cursor-tools is a CLI tool designed to enhance AI agents with advanced skills, such as web search, repository context, documentation generation, GitHub integration, Xcode tools, and browser automation. It provides features like Perplexity for web search, Gemini 2.0 for codebase context, and Stagehand for browser operations. The tool requires API keys for Perplexity AI and Google Gemini, and supports global installation for system-wide access. It offers various commands for different tasks and integrates with Cursor Composer for AI agent usage.

commanddash
Dash AI is an open-source coding assistant for Flutter developers. It is designed to not only write code but also run and debug it, allowing it to assist beyond code completion and automate routine tasks. Dash AI is powered by Gemini, integrated with the Dart Analyzer, and specifically tailored for Flutter engineers. The vision for Dash AI is to create a single-command assistant that can automate tedious development tasks, enabling developers to focus on creativity and innovation. It aims to assist with the entire process of engineering a feature for an app, from breaking down the task into steps to generating exploratory tests and iterating on the code until the feature is complete. To achieve this vision, Dash AI is working on providing LLMs with the same access and information that human developers have, including full contextual knowledge, the latest syntax and dependencies data, and the ability to write, run, and debug code. Dash AI welcomes contributions from the community, including feature requests, issue fixes, and participation in discussions. The project is committed to building a coding assistant that empowers all Flutter developers.

ollama4j
Ollama4j is a Java library that serves as a wrapper or binding for the Ollama server. It facilitates communication with the Ollama server and provides models for deployment. The tool requires Java 11 or higher and can be installed locally or via Docker. Users can integrate Ollama4j into Maven projects by adding the specified dependency. The tool offers API specifications and supports various development tasks such as building, running unit tests, and integration tests. Releases are automated through GitHub Actions CI workflow. Areas of improvement include adhering to Java naming conventions, updating deprecated code, implementing logging, using lombok, and enhancing request body creation. Contributions to the project are encouraged, whether reporting bugs, suggesting enhancements, or contributing code.
For similar jobs

aiscript
AiScript is a lightweight scripting language that runs on JavaScript. It supports arrays, objects, and functions as first-class citizens, and is easy to write without the need for semicolons or commas. AiScript runs in a secure sandbox environment, preventing infinite loops from freezing the host. It also allows for easy provision of variables and functions from the host.

askui
AskUI is a reliable, automated end-to-end automation tool that only depends on what is shown on your screen instead of the technology or platform you are running on.

bots
The 'bots' repository is a collection of guides, tools, and example bots for programming bots to play video games. It provides resources on running bots live, installing the BotLab client, debugging bots, testing bots in simulated environments, and more. The repository also includes example bots for games like EVE Online, Tribal Wars 2, and Elvenar. Users can learn about developing bots for specific games, syntax of the Elm programming language, and tools for memory reading development. Additionally, there are guides on bot programming, contributing to BotLab, and exploring Elm syntax and core library.

ain
Ain is a terminal HTTP API client designed for scripting input and processing output via pipes. It allows flexible organization of APIs using files and folders, supports shell-scripts and executables for common tasks, handles url-encoding, and enables sharing the resulting curl, wget, or httpie command-line. Users can put things that change in environment variables or .env-files, and pipe the API output for further processing. Ain targets users who work with many APIs using a simple file format and uses curl, wget, or httpie to make the actual calls.

LaVague
LaVague is an open-source Large Action Model framework that uses advanced AI techniques to compile natural language instructions into browser automation code. It leverages Selenium or Playwright for browser actions. Users can interact with LaVague through an interactive Gradio interface to automate web interactions. The tool requires an OpenAI API key for default examples and offers a Playwright integration guide. Contributors can help by working on outlined tasks, submitting PRs, and engaging with the community on Discord. The project roadmap is available to track progress, but users should exercise caution when executing LLM-generated code using 'exec'.

robocorp
Robocorp is a platform that allows users to create, deploy, and operate Python automations and AI actions. It provides an easy way to extend the capabilities of AI agents, assistants, and copilots with custom actions written in Python. Users can create and deploy tools, skills, loaders, and plugins that securely connect any AI Assistant platform to their data and applications. The Robocorp Action Server makes Python scripts compatible with ChatGPT and LangChain by automatically creating and exposing an API based on function declaration, type hints, and docstrings. It simplifies the process of developing and deploying AI actions, enabling users to interact with AI frameworks effortlessly.

Open-Interface
Open Interface is a self-driving software that automates computer tasks by sending user requests to a language model backend (e.g., GPT-4V) and simulating keyboard and mouse inputs to execute the steps. It course-corrects by sending current screenshots to the language models. The tool supports MacOS, Linux, and Windows, and requires setting up the OpenAI API key for access to GPT-4V. It can automate tasks like creating meal plans, setting up custom language model backends, and more. Open Interface is currently not efficient in accurate spatial reasoning, tracking itself in tabular contexts, and navigating complex GUI-rich applications. Future improvements aim to enhance the tool's capabilities with better models trained on video walkthroughs. The tool is cost-effective, with user requests priced between $0.05 - $0.20, and offers features like interrupting the app and primary display visibility in multi-monitor setups.

AI-Case-Sorter-CS7.1
AI-Case-Sorter-CS7.1 is a project focused on building a case sorter using machine vision and machine learning AI to sort cases by headstamp. The repository includes Arduino code and 3D models necessary for the project.