auto-playwright
Automating Playwright steps using ChatGPT.
Stars: 298
Auto Playwright is a tool that allows users to run Playwright tests using AI. It eliminates the need for selectors by determining actions at runtime based on plain-text instructions. Users can automate complex scenarios, write tests concurrently with or before functionality development, and benefit from rapid test creation. The tool supports various Playwright actions and offers additional options for debugging and customization. It uses HTML sanitization to reduce costs and improve text quality when interacting with the OpenAI API.
README:
Run Playwright tests using AI.
- Install the auto-playwright dependency:
npm install auto-playwright -D
- This package relies on the OpenAI API (https://openai.com/). You must export the API key as an environment variable or add it to your .env file:
export OPENAI_API_KEY='sk-...'
- Import and use the auto function:
import { test, expect } from "@playwright/test";
import { auto } from "auto-playwright";
test("auto Playwright example", async ({ page }) => {
await page.goto("/");
// `auto` can query data
// In this case, the result is plain-text contents of the header
const headerText = await auto("get the header text", { page, test });
// `auto` can perform actions
// In this case, auto will find and fill in the search text input
await auto(`Type "${headerText}" in the search box`, { page, test });
// `auto` can assert the state of the website
// In this case, the result is a boolean outcome
const searchInputHasHeaderText = await auto(`Is the contents of the search box equal to "${headerText}"?`, { page, test });
expect(searchInputHasHeaderText).toBe(true);
});
To connect to Azure OpenAI, build a StepOptions object with the values for your Azure deployment and pass it as the third argument:
import { test, expect } from "@playwright/test";
import { auto } from "auto-playwright";
import { StepOptions } from "../src/types";
const apiKey = "apikey";
const resource = "azure-resource-name";
const model = "model-deployment-name";
const options: StepOptions = {
  model: model,
  openaiApiKey: apiKey,
  openaiBaseUrl: `https://${resource}.openai.azure.com/openai/deployments/${model}`,
  openaiDefaultQuery: { 'api-version': "2023-07-01-preview" },
  openaiDefaultHeaders: { 'api-key': apiKey }
};
test("auto Playwright example", async ({ page }) => {
  await page.goto("/");
  // `auto` can query data
  // In this case, the result is plain-text contents of the header
  const headerText = await auto("get the header text", { page, test }, options);
  // `auto` can perform actions
  // In this case, auto will find and fill in the search text input
  await auto(`Type "${headerText}" in the search box`, { page, test }, options);
  // `auto` can assert the state of the website
  // In this case, the result is a boolean outcome
  const searchInputHasHeaderText = await auto(`Is the contents of the search box equal to "${headerText}"?`, { page, test }, options);
  expect(searchInputHasHeaderText).toBe(true);
});
At minimum, the auto function requires a plain-text prompt and an argument that contains your page and test (optional) objects.
auto("<your prompt>", { page, test });
Running without the test parameter:
import { chromium } from "playwright";
import { auto } from "auto-playwright";
(async () => {
  const browser = await chromium.launch({ headless: true });
  const context = await browser.newContext();
  const page = await context.newPage();
  // Navigate to a website
  await page.goto("https://www.example.com");
  // `auto` can query data
  // In this case, the result is plain-text contents of the header
  const res = await auto("get the header text", { page });
  // use res.query to get a query result.
  console.log(res);
  await page.close();
})();
You may pass a debug attribute as the third parameter to the auto function. This will print the prompt and the commands executed by OpenAI.
await auto("get the header text", { page, test }, { debug: true });
You may also set the environment variable AUTO_PLAYWRIGHT_DEBUG=true, which will enable debugging for all auto calls.
export AUTO_PLAYWRIGHT_DEBUG=true
Supported browsers: every browser that Playwright supports.
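For example, a minimal playwright.config.ts (a sketch, not part of this repository) could run the tests above against the three bundled browser engines; the baseURL shown here is only a placeholder so that the relative page.goto("/") calls resolve:

import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  // Placeholder base URL so relative page.goto("/") calls resolve.
  use: { baseURL: "https://www.example.com" },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
  ],
});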
There are additional options you can pass as a third argument:
const options = {
  // If true, debugging information is printed in the console.
  debug: boolean,
  // The OpenAI model (https://platform.openai.com/docs/models/overview)
  model: "gpt-4-1106-preview",
  // The OpenAI API key
  openaiApiKey: 'sk-...',
};
auto("<your prompt>", { page, test }, options);
Depending on the type of action (inferred by the auto function), there are different behaviors and return types.
An action (e.g. "click") is some simulated user interaction with the page, e.g. a click on a link. Actions will return undefined if they were successful and will throw an error if they failed, e.g.
try {
  await auto("click the link", { page, test });
} catch (e) {
  console.error("failed to click the link");
}
A query will return requested data from the page as a string, e.g.
const linkText = await auto("Get the text of the first link", { page, test });
console.log("The link text is", linkText);
An assertion is a question that will return true or false, e.g.
const thereAreThreeLinks = await auto("Are there 3 links on the page?", {
  page,
  test,
});
console.log(`"There are 3 links" is a ${thereAreThreeLinks} statement`);
Aspect | Conventional Approach | Testing with Auto Playwright |
---|---|---|
Coupling with Markup | Strongly linked to the application's markup. | Eliminates the use of selectors; actions are determined by the AI assistant at runtime. |
Speed of Implementation | Slower implementation due to the need for precise code translation for each action. | Rapid test creation using simple, plain text instructions for actions and assertions. |
Handling Complex Scenarios | Automating complex scenarios is challenging and prone to frequent failures. | Facilitates testing of complex scenarios by focusing on the intended test outcomes. |
Test Writing Timing | Can only write tests after the complete development of the functionality. | Enables a Test-Driven Development (TDD) approach, allowing test writing concurrent with or before functionality development. |
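To make the contrast concrete, here is a sketch of the search-box step from the first example written both ways; the data-testid selector is hypothetical, and the variables (page, test, headerText) come from that example:

// Conventional approach: coupled to the markup through a selector (hypothetical data-testid).
await page.getByTestId("search-input").fill(headerText);

// Auto Playwright: the same intent as plain text, resolved by the AI assistant at runtime.
await auto(`Type "${headerText}" in the search box`, { page, test });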
Supported Playwright actions:
locator.blur
locator.boundingBox
locator.check
locator.clear
locator.click
locator.count
locator.fill
locator.getAttribute
locator.innerHTML
locator.innerText
locator.inputValue
locator.isChecked
locator.isEditable
locator.isEnabled
locator.isVisible
locator.textContent
locator.uncheck
page.goto
Adding new actions is easy: just update the functions in src/completeTask.ts.
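As a purely illustrative sketch (the actual structure of src/completeTask.ts may differ), a new action wrapping locator.hover would pair an OpenAI function definition with a handler that calls the corresponding Playwright method; the name locator_hover and the cssSelector argument are hypothetical:

// Hypothetical function definition in the general OpenAI function-calling shape;
// not the actual code in src/completeTask.ts.
const locatorHover = {
  name: "locator_hover",
  description: "Hovers over the element matching the given CSS selector.",
  parameters: {
    type: "object",
    properties: {
      cssSelector: { type: "string", description: "CSS selector of the target element" },
    },
    required: ["cssSelector"],
  },
};

// A matching handler would call the Playwright API, e.g.:
// await page.locator(args.cssSelector).hover();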
This library is free. However, there are costs associated with using OpenAI. You can find more information about pricing here: https://openai.com/pricing/.
Example
Using https://ray.run/ as an example, the cost of running a test step is approximately $0.01 using GPT-4 Turbo (and $0.001 using GPT-3.5 Turbo).
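For rough budgeting (an estimate, assuming pages of similar size to ray.run), a 100-step suite would therefore cost on the order of $1.00 per run with GPT-4 Turbo, or about $0.10 with GPT-3.5 Turbo.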
The low cost is in part because auto-playwright uses HTML sanitization to reduce the payload size. For example, what follows is the payload that would be submitted for https://ray.run/.
Naturally, the price will vary dramatically depending on the payload.
<div class="cYdhWw dKnOgO geGbZz bGoBgk jkEels">
<div class="kSmiQp fPSBzf bnYmbW dXscgu xJzwH jTWvec gzBMzy">
<h1 class="fwYeZS fwlORb pdjVK bccLBY fsAQjR fyszFl WNJim fzozfU">
Learn Playwright
</h1>
<h2 class="cakMWc ptfck bBmAxp hSiiri xJzwS gnfYng jTWvec fzozfU">
Resources for learning end-to-end testing using Playwright automation
framework
</h2>
<div
class="bLTbYS gvHvKe cHEBuD ddgODW jsxhGC kdTEUJ ilCTXp iQHbtH yuxBn ilIXfy gPeiPq ivcdqp isDTsq jyZWmS ivdkBK cERSkX hdAwi ezvbLT jNrAaV jsxhGJ fzozCb"
></div>
</div>
<div class="cYdhWw dpjphg cqUdSC fasMpP">
<a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/blog"
><div class="plfYl bccLBY hSiiri fNBpvX">Blog</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Learn in depth subjects about end-to-end testing.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/ask"
><div class="plfYl bccLBY hSiiri fNBpvX">Ask AI</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Ask ChatGPT Playwright questions.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/tools"
><div class="plfYl bccLBY hSiiri fNBpvX">Dev Tools</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>All-in-one toolbox for QA engineers.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/jobs"
><div class="plfYl bccLBY hSiiri fNBpvX">QA Jobs</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Handpicked QA and Automation opportunities.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/questions"
><div class="plfYl bccLBY hSiiri fNBpvX">Questions</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Ask AI answered questions about Playwright.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/discord-forum"
><div class="plfYl bccLBY hSiiri fNBpvX">Discord Forum</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Archive of Discord Forum posts about Playwright.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/videos"
><div class="plfYl bccLBY hSiiri fNBpvX">Videos</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Tutorials, conference talks, and release videos.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/browser-extension"
><div class="plfYl bccLBY hSiiri fNBpvX">Browser Extension</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>GUI for generating Playwright locators.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/wiki"
><div class="plfYl bccLBY hSiiri fNBpvX">QA Wiki</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Definitions of common end-to-end testing terms.</p>
</div></a
>
</div>
<div
class="kSmiQp fPSBzf pKTba eTDpsp legDhJ hSiiri hdaZLM jTWvec gzBMzy bGySga fzoybr"
>
<p class="dXhlDK leOtqz glpWRZ fNCcFz">
Use <kbd class="bWhrAL XAzZz cakMWc bUyOMB bmOrOm fyszFl dTmriP">⌘</kbd> +
<kbd>k</kbd> + "Tools" to quickly access all tools.
</p>
</div>
</div>
The auto function uses sanitize-html to sanitize the HTML of the page before sending it to OpenAI. This is done to reduce cost and improve the quality of the generated text.
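As a rough illustration of this kind of sanitization (the exact options auto-playwright passes to sanitize-html may differ), a call could keep only structural tags and locator-relevant attributes:

import sanitizeHtml from "sanitize-html";

// `page` is a Playwright Page, as in the earlier examples.
const html = await page.content();
const sanitized = sanitizeHtml(html, {
  // Keep text-bearing and structural tags; drop scripts, styles, svg, etc.
  allowedTags: ["div", "span", "p", "a", "h1", "h2", "h3", "ul", "li", "button", "input", "label", "kbd"],
  // Keep only attributes that help locate or describe elements (this allow-list is illustrative).
  allowedAttributes: { a: ["href"], input: ["type", "name", "placeholder"], "*": ["id", "class", "data-testid"] },
});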
This project draws its inspiration from ZeroStep. ZeroStep offers a similar API but with a more robust implementation through its proprietary backend. Auto Playwright was created with the aim of exploring the underlying technology of ZeroStep and establishing a basis for an open-source version of their software. For production environments, I suggest opting for ZeroStep.
Here's a side-by-side comparison of Auto Playwright and ZeroStep:
Criteria | Auto Playwright | ZeroStep |
---|---|---|
Uses OpenAI API | Yes | No1 |
Uses plain-text prompts | Yes | No |
Uses functions SDK | Yes | No |
Uses HTML sanitization | Yes | No |
Uses Playwright API | Yes | No2 |
Uses screenshots | No | Yes |
Uses queue | No | Yes |
Uses WebSockets | No | Yes |
Snapshots | HTML | DOM |
Implements parallelism | No | Yes |
Allows scrolling | No | Yes |
Provides fixtures | No | Yes |
License | MIT | MIT |
ZeroStep License
MIT License
Copyright (c) 2023 Reflect Software Inc
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Alternative AI tools for auto-playwright
Similar Open Source Tools
monacopilot
Monacopilot is a powerful and customizable AI auto-completion plugin for the Monaco Editor. It supports multiple AI providers such as Anthropic, OpenAI, Groq, and Google, providing real-time code completions with an efficient caching system. The plugin offers context-aware suggestions, customizable completion behavior, and framework agnostic features. Users can also customize the model support and trigger completions manually. Monacopilot is designed to enhance coding productivity by providing accurate and contextually appropriate completions in daily spoken language.
magentic
Easily integrate Large Language Models into your Python code. Simply use the `@prompt` and `@chatprompt` decorators to create functions that return structured output from the LLM. Mix LLM queries and function calling with regular Python code to create complex logic.
gen.nvim
gen.nvim is a tool that allows users to generate text using Language Models (LLMs) with customizable prompts. It requires Ollama with models like `llama3`, `mistral`, or `zephyr`, along with Curl for installation. Users can use the `Gen` command to generate text based on predefined or custom prompts. The tool provides key maps for easy invocation and allows for follow-up questions during conversations. Additionally, users can select a model from a list of installed models and customize prompts as needed.
SpeziLLM
The Spezi LLM Swift Package includes modules that help integrate LLM-related functionality in applications. It provides tools for local LLM execution, usage of remote OpenAI-based LLMs, and LLMs running on Fog node resources within the local network. The package contains targets like SpeziLLM, SpeziLLMLocal, SpeziLLMLocalDownload, SpeziLLMOpenAI, and SpeziLLMFog for different LLM functionalities. Users can configure and interact with local LLMs, OpenAI LLMs, and Fog LLMs using the provided APIs and platforms within the Spezi ecosystem.
neocodeium
NeoCodeium is a free AI completion plugin powered by Codeium, designed for Neovim users. It aims to provide a smoother experience by eliminating flickering suggestions and allowing for repeatable completions using the `.` key. The plugin offers performance improvements through cache techniques, displays suggestion count labels, and supports Lua scripting. Users can customize keymaps, manage suggestions, and interact with the AI chat feature. NeoCodeium enhances code completion in Neovim, making it a valuable tool for developers seeking efficient coding assistance.
datadreamer
DataDreamer is an advanced toolkit designed to facilitate the development of edge AI models by enabling synthetic data generation, knowledge extraction from pre-trained models, and creation of efficient and potent models. It eliminates the need for extensive datasets by generating synthetic datasets, leverages latent knowledge from pre-trained models, and focuses on creating compact models suitable for integration into any device and performance for specialized tasks. The toolkit offers features like prompt generation, image generation, dataset annotation, and tools for training small-scale neural networks for edge deployment. It provides hardware requirements, usage instructions, available models, and limitations to consider while using the library.
stagehand
Stagehand is an AI web browsing framework that simplifies and extends web automation using three simple APIs: act, extract, and observe. It aims to provide a lightweight, configurable framework without complex abstractions, allowing users to automate web tasks reliably. The tool generates Playwright code based on atomic instructions provided by the user, enabling natural language-driven web automation. Stagehand is open source, maintained by the Browserbase team, and supports different models and model providers for flexibility in automation tasks.
extractor
Extractor is an AI-powered data extraction library for Laravel that leverages OpenAI's capabilities to effortlessly extract structured data from various sources, including images, PDFs, and emails. It features a convenient wrapper around OpenAI Chat and Completion endpoints, supports multiple input formats, includes a flexible Field Extractor for arbitrary data extraction, and integrates with Textract for OCR functionality. Extractor utilizes JSON Mode from the latest GPT-3.5 and GPT-4 models, providing accurate and efficient data extraction.
crb
CRB (Composable Runtime Blocks) is a unique framework that implements hybrid workloads by seamlessly combining synchronous and asynchronous activities, state machines, routines, the actor model, and supervisors. It is ideal for building massive applications and serves as a low-level framework for creating custom frameworks, such as AI-agents. The core idea is to ensure high compatibility among all blocks, enabling significant code reuse. The framework allows for the implementation of algorithms with complex branching, making it suitable for building large-scale applications or implementing complex workflows, such as AI pipelines. It provides flexibility in defining structures, implementing traits, and managing execution flow, allowing users to create robust and nonlinear algorithms easily.
aire
Aire is a modern Laravel form builder with a focus on expressive and beautiful code. It allows easy configuration of form components using fluent method calls or Blade components. Aire supports customization through config files and custom views, data binding with Eloquent models or arrays, method spoofing, CSRF token injection, server-side and client-side validation, and translations. It is designed to run on Laravel 5.8.28 and higher, with support for PHP 7.1 and higher. Aire is actively maintained and under consideration for additional features like read-only plain text, cross-browser support for custom checkboxes and radio buttons, support for Choices.js or similar libraries, improved file input handling, and better support for content prepending or appending to inputs.
ice-score
ICE-Score is a tool designed to instruct large language models to evaluate code. It provides a minimum viable product (MVP) for evaluating generated code snippets using inputs such as problem, output, task, aspect, and model. Users can also evaluate with reference code and enable zero-shot chain-of-thought evaluation. The tool is built on codegen-metrics and code-bert-score repositories and includes datasets like CoNaLa and HumanEval. ICE-Score has been accepted to EACL 2024.
mflux
MFLUX is a line-by-line port of the FLUX implementation in the Huggingface Diffusers library to Apple MLX. It aims to run powerful FLUX models from Black Forest Labs locally on Mac machines. The codebase is minimal and explicit, prioritizing readability over generality and performance. Models are implemented from scratch in MLX, with tokenizers from the Huggingface Transformers library. Dependencies include Numpy and Pillow for image post-processing. Installation can be done using `uv tool` or classic virtual environment setup. Command-line arguments allow for image generation with specified models, prompts, and optional parameters. Quantization options for speed and memory reduction are available. LoRA adapters can be loaded for fine-tuning image generation. Controlnet support provides more control over image generation with reference images. Current limitations include generating images one by one, lack of support for negative prompts, and some LoRA adapters not working.
phidata
Phidata is a framework for building AI Assistants with memory, knowledge, and tools. It enables LLMs to have long-term conversations by storing chat history in a database, provides them with business context by storing information in a vector database, and enables them to take actions like pulling data from an API, sending emails, or querying a database. Memory and knowledge make LLMs smarter, while tools make them autonomous.
deepgram-js-sdk
Deepgram JavaScript SDK. Power your apps with world-class speech and Language AI models.
raft
RAFT (Reusable Accelerated Functions and Tools) is a C++ header-only template library with an optional shared library that contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
For similar tasks
awesome-mcp-servers
Awesome MCP Servers is a curated list of Model Context Protocol (MCP) servers that enable AI models to securely interact with local and remote resources through standardized server implementations. The list includes production-ready and experimental servers that extend AI capabilities through file access, database connections, API integrations, and other contextual services.
For similar jobs
CodebaseToPrompt
CodebaseToPrompt is a simple tool that converts a local directory into a structured prompt for Large Language Models (LLMs). It allows users to select specific files for code review, analysis, or documentation by exploring and filtering through the file tree in a browser-based interface. The tool generates a formatted output that can be directly used with AI tools, provides token count estimates, and supports local storage for saving selections. Users can easily copy the selected files in the desired format for further use.
PromptFuzz
**Description:** PromptFuzz is an automated tool that generates high-quality fuzz drivers for libraries via a fuzz loop constructed on mutating LLMs' prompts. The fuzz loop of PromptFuzz aims to guide the mutation of LLMs' prompts to generate programs that cover more reachable code and explore complex API interrelationships, which are effective for fuzzing. **Features:** * **Multiply LLM support** : Supports the general LLMs: Codex, Inocder, ChatGPT, and GPT4 (Currently tested on ChatGPT). * **Context-based Prompt** : Construct LLM prompts with the automatically extracted library context. * **Powerful Sanitization** : The program's syntax, semantics, behavior, and coverage are thoroughly analyzed to sanitize the problematic programs. * **Prioritized Mutation** : Prioritizes mutating the library API combinations within LLM's prompts to explore complex interrelationships, guided by code coverage. * **Fuzz Driver Exploitation** : Infers API constraints using statistics and extends fixed API arguments to receive random bytes from fuzzers. * **Fuzz engine integration** : Integrates with grey-box fuzz engine: LibFuzzer. **Benefits:** * **High branch coverage:** The fuzz drivers generated by PromptFuzz achieved a branch coverage of 40.12% on the tested libraries, which is 1.61x greater than _OSS-Fuzz_ and 1.67x greater than _Hopper_. * **Bug detection:** PromptFuzz detected 33 valid security bugs from 49 unique crashes. * **Wide range of bugs:** The fuzz drivers generated by PromptFuzz can detect a wide range of bugs, most of which are security bugs. * **Unique bugs:** PromptFuzz detects uniquely interesting bugs that other fuzzers may miss. **Usage:** 1. Build the library using the provided build scripts. 2. Export the LLM API KEY if using ChatGPT or GPT4. 3. Generate fuzz drivers using the `fuzzer` command. 4. Run the fuzz drivers using the `harness` command. 5. Deduplicate and analyze the reported crashes. **Future Works:** * **Custom LLMs suport:** Support custom LLMs. * **Close-source libraries:** Apply PromptFuzz to close-source libraries by fine tuning LLMs on private code corpus. * **Performance** : Reduce the huge time cost required in erroneous program elimination.
code-review-gpt
Code Review GPT uses Large Language Models to review code in your CI/CD pipeline. It helps streamline the code review process by providing feedback on code that may have issues or areas for improvement. It should pick up on common issues such as exposed secrets, slow or inefficient code, and unreadable code. It can also be run locally in your command line to review staged files. Code Review GPT is in alpha and should be used for fun only. It may provide useful feedback but please check any suggestions thoroughly.
aiverify
AI Verify is an AI governance testing framework and software toolkit that validates the performance of AI systems against a set of internationally recognised principles through standardised tests. AI Verify is consistent with international AI governance frameworks such as those from European Union, OECD and Singapore. It is a single integrated toolkit that operates within an enterprise environment. It can perform technical tests on common supervised learning classification and regression models for most tabular and image datasets. It however does not define AI ethical standards and does not guarantee that any AI system tested will be free from risks or biases or is completely safe.
cover-agent
CodiumAI Cover Agent is a tool designed to help increase code coverage by automatically generating qualified tests to enhance existing test suites. It utilizes Generative AI to streamline development workflows and is part of a suite of utilities aimed at automating the creation of unit tests for software projects. The system includes components like Test Runner, Coverage Parser, Prompt Builder, and AI Caller to simplify and expedite the testing process, ensuring high-quality software development. Cover Agent can be run via a terminal and is planned to be integrated into popular CI platforms. The tool outputs debug files locally, such as generated_prompt.md, run.log, and test_results.html, providing detailed information on generated tests and their status. It supports multiple LLMs and allows users to specify the model to use for test generation.
momentum-core
Momentum is an open-source behavioral auditor for backend code that helps developers generate powerful insights into their codebase. It analyzes code behavior, tests it at every git push, and ensures readiness for production. Momentum understands backend code, visualizes dependencies, identifies behaviors, generates test code, runs code in the local environment, and provides debugging solutions. It aims to improve code quality, streamline testing processes, and enhance developer productivity.
mutahunter
Mutahunter is an open-source language-agnostic mutation testing tool maintained by CodeIntegrity. It leverages LLM models to inject context-aware faults into codebase, ensuring comprehensive testing. The tool aims to empower companies and developers to enhance test suites and improve software quality by verifying the effectiveness of test cases through creating mutants in the code and checking if the test cases can catch these changes. Mutahunter provides detailed reports on mutation coverage, killed mutants, and survived mutants, enabling users to identify potential weaknesses in their test suites.