auto-playwright
Automating Playwright steps using ChatGPT.
Stars: 298
Auto Playwright is a tool that allows users to run Playwright tests using AI. It eliminates the need for selectors by determining actions at runtime based on plain-text instructions. Users can automate complex scenarios, write tests concurrently with or before functionality development, and benefit from rapid test creation. The tool supports various Playwright actions and offers additional options for debugging and customization. It uses HTML sanitization to reduce costs and improve text quality when interacting with the OpenAI API.
README:
Run Playwright tests using AI.
- Install the auto-playwright dependency:
npm install auto-playwright -D
- This package relies on the OpenAI API (https://openai.com/). You must export the API key as an environment variable or add it to your .env file:
export OPENAI_API_KEY='sk-...'
- Import and use the auto function:
import { test, expect } from "@playwright/test";
import { auto } from "auto-playwright";
test("auto Playwright example", async ({ page }) => {
await page.goto("/");
// `auto` can query data
// In this case, the result is plain-text contents of the header
const headerText = await auto("get the header text", { page, test });
// `auto` can perform actions
// In this case, auto will find and fill in the search text input
await auto(`Type "${headerText}" in the search box`, { page, test });
// `auto` can assert the state of the website
// In this case, the result is a boolean outcome
const searchInputHasHeaderText = await auto(`Is the contents of the search box equal to "${headerText}"?`, { page, test });
expect(searchInputHasHeaderText).toBe(true);
});
To connect to Azure OpenAI, build a StepOptions object with the values for your Azure deployment and pass it as the third argument:
import { test, expect } from "@playwright/test";
import { auto } from "auto-playwright";
import { StepOptions } from "../src/types";
const apiKey = "apikey";
const resource = "azure-resource-name";
const model = "model-deployment-name";
const options: StepOptions = {
  model: model,
  openaiApiKey: apiKey,
  openaiBaseUrl: `https://${resource}.openai.azure.com/openai/deployments/${model}`,
  openaiDefaultQuery: { 'api-version': "2023-07-01-preview" },
  openaiDefaultHeaders: { 'api-key': apiKey }
};
test("auto Playwright example", async ({ page }) => {
  await page.goto("/");
  // `auto` can query data
  // In this case, the result is plain-text contents of the header
  const headerText = await auto("get the header text", { page, test }, options);
  // `auto` can perform actions
  // In this case, auto will find and fill in the search text input
  await auto(`Type "${headerText}" in the search box`, { page, test }, options);
  // `auto` can assert the state of the website
  // In this case, the result is a boolean outcome
  const searchInputHasHeaderText = await auto(`Is the contents of the search box equal to "${headerText}"?`, { page, test }, options);
  expect(searchInputHasHeaderText).toBe(true);
});
At minimum, the auto function requires a plain-text prompt and an argument that contains your page and test (optional) objects.
auto("<your prompt>", { page, test });
Running without the test parameter:
import { chromium } from "playwright";
import { auto } from "auto-playwright";
(async () => {
  const browser = await chromium.launch({ headless: true });
  const context = await browser.newContext();
  const page = await context.newPage();
  // Navigate to a website
  await page.goto("https://www.example.com");
  // `auto` can query data
  // In this case, the result is plain-text contents of the header
  const res = await auto("get the header text", { page });
  // use res.query to get a query result.
  console.log(res);
  await page.close();
})();
You may pass a debug attribute as the third parameter to the auto function. This will print the prompt and the commands executed by OpenAI.
await auto("get the header text", { page, test }, { debug: true });
You may also set the environment variable AUTO_PLAYWRIGHT_DEBUG=true, which will enable debugging for all auto calls.
export AUTO_PLAYWRIGHT_DEBUG=true
Supported browsers: every browser that Playwright supports.
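For example, a minimal playwright.config.ts (a sketch, not part of this repository) could run the tests above against the three bundled browser engines; the baseURL shown here is only a placeholder so that the relative page.goto("/") calls resolve:

import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  // Placeholder base URL so relative page.goto("/") calls resolve.
  use: { baseURL: "https://www.example.com" },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
  ],
});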
There are additional options you can pass as a third argument:
const options = {
  // If true, debugging information is printed in the console.
  debug: boolean,
  // The OpenAI model (https://platform.openai.com/docs/models/overview)
  model: "gpt-4-1106-preview",
  // The OpenAI API key
  openaiApiKey: 'sk-...',
};
auto("<your prompt>", { page, test }, options);
Depending on the type of action (inferred by the auto function), there are different behaviors and return types.
An action (e.g. "click") is some simulated user interaction with the page, e.g. a click on a link. Actions will return undefined if they were successful and will throw an error if they failed, e.g.
try {
  await auto("click the link", { page, test });
} catch (e) {
  console.error("failed to click the link");
}
A query will return requested data from the page as a string, e.g.
const linkText = await auto("Get the text of the first link", { page, test });
console.log("The link text is", linkText);
An assertion is a question that will return true or false, e.g.
const thereAreThreeLinks = await auto("Are there 3 links on the page?", {
  page,
  test,
});
console.log(`"There are 3 links" is a ${thereAreThreeLinks} statement`);
Aspect | Conventional Approach | Testing with Auto Playwright |
---|---|---|
Coupling with Markup | Strongly linked to the application's markup. | Eliminates the use of selectors; actions are determined by the AI assistant at runtime. |
Speed of Implementation | Slower implementation due to the need for precise code translation for each action. | Rapid test creation using simple, plain text instructions for actions and assertions. |
Handling Complex Scenarios | Automating complex scenarios is challenging and prone to frequent failures. | Facilitates testing of complex scenarios by focusing on the intended test outcomes. |
Test Writing Timing | Can only write tests after the complete development of the functionality. | Enables a Test-Driven Development (TDD) approach, allowing test writing concurrent with or before functionality development. |
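To make the contrast concrete, here is a sketch of the search-box step from the first example written both ways; the data-testid selector is hypothetical, and the variables (page, test, headerText) come from that example:

// Conventional approach: coupled to the markup through a selector (hypothetical data-testid).
await page.getByTestId("search-input").fill(headerText);

// Auto Playwright: the same intent as plain text, resolved by the AI assistant at runtime.
await auto(`Type "${headerText}" in the search box`, { page, test });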
Supported Playwright actions:
locator.blur
locator.boundingBox
locator.check
locator.clear
locator.click
locator.count
locator.fill
locator.getAttribute
locator.innerHTML
locator.innerText
locator.inputValue
locator.isChecked
locator.isEditable
locator.isEnabled
locator.isVisible
locator.textContent
locator.uncheck
page.goto
Adding new actions is easy: just update the functions in src/completeTask.ts.
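As a purely illustrative sketch (the actual structure of src/completeTask.ts may differ), a new action wrapping locator.hover would pair an OpenAI function definition with a handler that calls the corresponding Playwright method; the name locator_hover and the cssSelector argument are hypothetical:

// Hypothetical function definition in the general OpenAI function-calling shape;
// not the actual code in src/completeTask.ts.
const locatorHover = {
  name: "locator_hover",
  description: "Hovers over the element matching the given CSS selector.",
  parameters: {
    type: "object",
    properties: {
      cssSelector: { type: "string", description: "CSS selector of the target element" },
    },
    required: ["cssSelector"],
  },
};

// A matching handler would call the Playwright API, e.g.:
// await page.locator(args.cssSelector).hover();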
This library is free. However, there are costs associated with using OpenAI. You can find more information about pricing here: https://openai.com/pricing/.
Example
Using https://ray.run/ as an example, the cost of running a test step is approximately $0.01 using GPT-4 Turbo (and $0.001 using GPT-3.5 Turbo).
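For rough budgeting (an estimate, assuming pages of similar size to ray.run), a 100-step suite would therefore cost on the order of $1.00 per run with GPT-4 Turbo, or about $0.10 with GPT-3.5 Turbo.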
The low cost is in part because auto-playwright uses HTML sanitization to reduce the payload size. For example, what follows is the payload that would be submitted for https://ray.run/.
Naturally, the price will vary dramatically depending on the payload.
<div class="cYdhWw dKnOgO geGbZz bGoBgk jkEels">
<div class="kSmiQp fPSBzf bnYmbW dXscgu xJzwH jTWvec gzBMzy">
<h1 class="fwYeZS fwlORb pdjVK bccLBY fsAQjR fyszFl WNJim fzozfU">
Learn Playwright
</h1>
<h2 class="cakMWc ptfck bBmAxp hSiiri xJzwS gnfYng jTWvec fzozfU">
Resources for learning end-to-end testing using Playwright automation
framework
</h2>
<div
class="bLTbYS gvHvKe cHEBuD ddgODW jsxhGC kdTEUJ ilCTXp iQHbtH yuxBn ilIXfy gPeiPq ivcdqp isDTsq jyZWmS ivdkBK cERSkX hdAwi ezvbLT jNrAaV jsxhGJ fzozCb"
></div>
</div>
<div class="cYdhWw dpjphg cqUdSC fasMpP">
<a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/blog"
><div class="plfYl bccLBY hSiiri fNBpvX">Blog</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Learn in depth subjects about end-to-end testing.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/ask"
><div class="plfYl bccLBY hSiiri fNBpvX">Ask AI</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Ask ChatGPT Playwright questions.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/tools"
><div class="plfYl bccLBY hSiiri fNBpvX">Dev Tools</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>All-in-one toolbox for QA engineers.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/jobs"
><div class="plfYl bccLBY hSiiri fNBpvX">QA Jobs</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Handpicked QA and Automation opportunities.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/questions"
><div class="plfYl bccLBY hSiiri fNBpvX">Questions</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Ask AI answered questions about Playwright.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/discord-forum"
><div class="plfYl bccLBY hSiiri fNBpvX">Discord Forum</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Archive of Discord Forum posts about Playwright.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/videos"
><div class="plfYl bccLBY hSiiri fNBpvX">Videos</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Tutorials, conference talks, and release videos.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/browser-extension"
><div class="plfYl bccLBY hSiiri fNBpvX">Browser Extension</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>GUI for generating Playwright locators.</p>
</div></a
><a
class="gacSWM dCgFix conipm knkqUc bddCnd dTKJOB leOtqz hEzNkW fNBBKe jTWvec fIMbrO fzozfU group"
href="/wiki"
><div class="plfYl bccLBY hSiiri fNBpvX">QA Wiki</div>
<div class="jqqjPD fWDXZB pKTba bBmAxp hSiiri evbPEu">
<p>Definitions of common end-to-end testing terms.</p>
</div></a
>
</div>
<div
class="kSmiQp fPSBzf pKTba eTDpsp legDhJ hSiiri hdaZLM jTWvec gzBMzy bGySga fzoybr"
>
<p class="dXhlDK leOtqz glpWRZ fNCcFz">
Use <kbd class="bWhrAL XAzZz cakMWc bUyOMB bmOrOm fyszFl dTmriP">⌘</kbd> +
<kbd>k</kbd> + "Tools" to quickly access all tools.
</p>
</div>
</div>
The auto function uses sanitize-html to sanitize the HTML of the page before sending it to OpenAI. This is done to reduce cost and improve the quality of the generated text.
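As a rough illustration of this kind of sanitization (the exact options auto-playwright passes to sanitize-html may differ), a call could keep only structural tags and locator-relevant attributes:

import sanitizeHtml from "sanitize-html";

// `page` is a Playwright Page, as in the earlier examples.
const html = await page.content();
const sanitized = sanitizeHtml(html, {
  // Keep text-bearing and structural tags; drop scripts, styles, svg, etc.
  allowedTags: ["div", "span", "p", "a", "h1", "h2", "h3", "ul", "li", "button", "input", "label", "kbd"],
  // Keep only attributes that help locate or describe elements (this allow-list is illustrative).
  allowedAttributes: { a: ["href"], input: ["type", "name", "placeholder"], "*": ["id", "class", "data-testid"] },
});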
This project draws its inspiration from ZeroStep. ZeroStep offers a similar API but with a more robust implementation through its proprietary backend. Auto Playwright was created with the aim of exploring the underlying technology of ZeroStep and establishing a basis for an open-source version of their software. For production environments, I suggest opting for ZeroStep.
Here's a side-by-side comparison of Auto Playwright and ZeroStep:
Criteria | Auto Playwright | ZeroStep |
---|---|---|
Uses OpenAI API | Yes | No1 |
Uses plain-text prompts | Yes | No |
Uses functions SDK | Yes | No |
Uses HTML sanitization | Yes | No |
Uses Playwright API | Yes | No2 |
Uses screenshots | No | Yes |
Uses queue | No | Yes |
Uses WebSockets | No | Yes |
Snapshots | HTML | DOM |
Implements parallelism | No | Yes |
Allows scrolling | No | Yes |
Provides fixtures | No | Yes |
License | MIT | MIT |
ZeroStep License
MIT License
Copyright (c) 2023 Reflect Software Inc
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Alternative AI tools for auto-playwright
Similar Open Source Tools
monacopilot
Monacopilot is a powerful and customizable AI auto-completion plugin for the Monaco Editor. It supports multiple AI providers such as Anthropic, OpenAI, Groq, and Google, providing real-time code completions with an efficient caching system. The plugin offers context-aware suggestions, customizable completion behavior, and framework agnostic features. Users can also customize the model support and trigger completions manually. Monacopilot is designed to enhance coding productivity by providing accurate and contextually appropriate completions in daily spoken language.
magentic
Easily integrate Large Language Models into your Python code. Simply use the `@prompt` and `@chatprompt` decorators to create functions that return structured output from the LLM. Mix LLM queries and function calling with regular Python code to create complex logic.
gen.nvim
gen.nvim is a tool that allows users to generate text using Language Models (LLMs) with customizable prompts. It requires Ollama with models like `llama3`, `mistral`, or `zephyr`, along with Curl for installation. Users can use the `Gen` command to generate text based on predefined or custom prompts. The tool provides key maps for easy invocation and allows for follow-up questions during conversations. Additionally, users can select a model from a list of installed models and customize prompts as needed.
SpeziLLM
The Spezi LLM Swift Package includes modules that help integrate LLM-related functionality in applications. It provides tools for local LLM execution, usage of remote OpenAI-based LLMs, and LLMs running on Fog node resources within the local network. The package contains targets like SpeziLLM, SpeziLLMLocal, SpeziLLMLocalDownload, SpeziLLMOpenAI, and SpeziLLMFog for different LLM functionalities. Users can configure and interact with local LLMs, OpenAI LLMs, and Fog LLMs using the provided APIs and platforms within the Spezi ecosystem.
neocodeium
NeoCodeium is a free AI completion plugin powered by Codeium, designed for Neovim users. It aims to provide a smoother experience by eliminating flickering suggestions and allowing for repeatable completions using the `.` key. The plugin offers performance improvements through cache techniques, displays suggestion count labels, and supports Lua scripting. Users can customize keymaps, manage suggestions, and interact with the AI chat feature. NeoCodeium enhances code completion in Neovim, making it a valuable tool for developers seeking efficient coding assistance.
datadreamer
DataDreamer is an advanced toolkit designed to facilitate the development of edge AI models by enabling synthetic data generation, knowledge extraction from pre-trained models, and creation of efficient and potent models. It eliminates the need for extensive datasets by generating synthetic datasets, leverages latent knowledge from pre-trained models, and focuses on creating compact models suitable for integration into any device and performance for specialized tasks. The toolkit offers features like prompt generation, image generation, dataset annotation, and tools for training small-scale neural networks for edge deployment. It provides hardware requirements, usage instructions, available models, and limitations to consider while using the library.
stagehand
Stagehand is an AI web browsing framework that simplifies and extends web automation using three simple APIs: act, extract, and observe. It aims to provide a lightweight, configurable framework without complex abstractions, allowing users to automate web tasks reliably. The tool generates Playwright code based on atomic instructions provided by the user, enabling natural language-driven web automation. Stagehand is open source, maintained by the Browserbase team, and supports different models and model providers for flexibility in automation tasks.
extractor
Extractor is an AI-powered data extraction library for Laravel that leverages OpenAI's capabilities to effortlessly extract structured data from various sources, including images, PDFs, and emails. It features a convenient wrapper around OpenAI Chat and Completion endpoints, supports multiple input formats, includes a flexible Field Extractor for arbitrary data extraction, and integrates with Textract for OCR functionality. Extractor utilizes JSON Mode from the latest GPT-3.5 and GPT-4 models, providing accurate and efficient data extraction.
crb
CRB (Composable Runtime Blocks) is a unique framework that implements hybrid workloads by seamlessly combining synchronous and asynchronous activities, state machines, routines, the actor model, and supervisors. It is ideal for building massive applications and serves as a low-level framework for creating custom frameworks, such as AI-agents. The core idea is to ensure high compatibility among all blocks, enabling significant code reuse. The framework allows for the implementation of algorithms with complex branching, making it suitable for building large-scale applications or implementing complex workflows, such as AI pipelines. It provides flexibility in defining structures, implementing traits, and managing execution flow, allowing users to create robust and nonlinear algorithms easily.
aire
Aire is a modern Laravel form builder with a focus on expressive and beautiful code. It allows easy configuration of form components using fluent method calls or Blade components. Aire supports customization through config files and custom views, data binding with Eloquent models or arrays, method spoofing, CSRF token injection, server-side and client-side validation, and translations. It is designed to run on Laravel 5.8.28 and higher, with support for PHP 7.1 and higher. Aire is actively maintained and under consideration for additional features like read-only plain text, cross-browser support for custom checkboxes and radio buttons, support for Choices.js or similar libraries, improved file input handling, and better support for content prepending or appending to inputs.
ice-score
ICE-Score is a tool designed to instruct large language models to evaluate code. It provides a minimum viable product (MVP) for evaluating generated code snippets using inputs such as problem, output, task, aspect, and model. Users can also evaluate with reference code and enable zero-shot chain-of-thought evaluation. The tool is built on codegen-metrics and code-bert-score repositories and includes datasets like CoNaLa and HumanEval. ICE-Score has been accepted to EACL 2024.
mflux
MFLUX is a line-by-line port of the FLUX implementation in the Huggingface Diffusers library to Apple MLX. It aims to run powerful FLUX models from Black Forest Labs locally on Mac machines. The codebase is minimal and explicit, prioritizing readability over generality and performance. Models are implemented from scratch in MLX, with tokenizers from the Huggingface Transformers library. Dependencies include Numpy and Pillow for image post-processing. Installation can be done using `uv tool` or classic virtual environment setup. Command-line arguments allow for image generation with specified models, prompts, and optional parameters. Quantization options for speed and memory reduction are available. LoRA adapters can be loaded for fine-tuning image generation. Controlnet support provides more control over image generation with reference images. Current limitations include generating images one by one, lack of support for negative prompts, and some LoRA adapters not working.
phidata
Phidata is a framework for building AI Assistants with memory, knowledge, and tools. It enables LLMs to have long-term conversations by storing chat history in a database, provides them with business context by storing information in a vector database, and enables them to take actions like pulling data from an API, sending emails, or querying a database. Memory and knowledge make LLMs smarter, while tools make them autonomous.
deepgram-js-sdk
Deepgram JavaScript SDK. Power your apps with world-class speech and Language AI models.
raft
RAFT (Reusable Accelerated Functions and Tools) is a C++ header-only template library with an optional shared library that contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
For similar tasks
awesome-mcp-servers
Awesome MCP Servers is a curated list of Model Context Protocol (MCP) servers that enable AI models to securely interact with local and remote resources through standardized server implementations. The list includes production-ready and experimental servers that extend AI capabilities through file access, database connections, API integrations, and other contextual services.
For similar jobs
CodebaseToPrompt
CodebaseToPrompt is a simple tool that converts a local directory into a structured prompt for Large Language Models (LLMs). It allows users to select specific files for code review, analysis, or documentation by exploring and filtering through the file tree in a browser-based interface. The tool generates a formatted output that can be directly used with AI tools, provides token count estimates, and supports local storage for saving selections. Users can easily copy the selected files in the desired format for further use.
PromptFuzz
**Description:** PromptFuzz is an automated tool that generates high-quality fuzz drivers for libraries via a fuzz loop constructed on mutating LLMs' prompts. The fuzz loop of PromptFuzz aims to guide the mutation of LLMs' prompts to generate programs that cover more reachable code and explore complex API interrelationships, which are effective for fuzzing. **Features:** * **Multiply LLM support** : Supports the general LLMs: Codex, Inocder, ChatGPT, and GPT4 (Currently tested on ChatGPT). * **Context-based Prompt** : Construct LLM prompts with the automatically extracted library context. * **Powerful Sanitization** : The program's syntax, semantics, behavior, and coverage are thoroughly analyzed to sanitize the problematic programs. * **Prioritized Mutation** : Prioritizes mutating the library API combinations within LLM's prompts to explore complex interrelationships, guided by code coverage. * **Fuzz Driver Exploitation** : Infers API constraints using statistics and extends fixed API arguments to receive random bytes from fuzzers. * **Fuzz engine integration** : Integrates with grey-box fuzz engine: LibFuzzer. **Benefits:** * **High branch coverage:** The fuzz drivers generated by PromptFuzz achieved a branch coverage of 40.12% on the tested libraries, which is 1.61x greater than _OSS-Fuzz_ and 1.67x greater than _Hopper_. * **Bug detection:** PromptFuzz detected 33 valid security bugs from 49 unique crashes. * **Wide range of bugs:** The fuzz drivers generated by PromptFuzz can detect a wide range of bugs, most of which are security bugs. * **Unique bugs:** PromptFuzz detects uniquely interesting bugs that other fuzzers may miss. **Usage:** 1. Build the library using the provided build scripts. 2. Export the LLM API KEY if using ChatGPT or GPT4. 3. Generate fuzz drivers using the `fuzzer` command. 4. Run the fuzz drivers using the `harness` command. 5. Deduplicate and analyze the reported crashes. **Future Works:** * **Custom LLMs suport:** Support custom LLMs. * **Close-source libraries:** Apply PromptFuzz to close-source libraries by fine tuning LLMs on private code corpus. * **Performance** : Reduce the huge time cost required in erroneous program elimination.
code-review-gpt
Code Review GPT uses Large Language Models to review code in your CI/CD pipeline. It helps streamline the code review process by providing feedback on code that may have issues or areas for improvement. It should pick up on common issues such as exposed secrets, slow or inefficient code, and unreadable code. It can also be run locally in your command line to review staged files. Code Review GPT is in alpha and should be used for fun only. It may provide useful feedback but please check any suggestions thoroughly.
aiverify
AI Verify is an AI governance testing framework and software toolkit that validates the performance of AI systems against a set of internationally recognised principles through standardised tests. AI Verify is consistent with international AI governance frameworks such as those from European Union, OECD and Singapore. It is a single integrated toolkit that operates within an enterprise environment. It can perform technical tests on common supervised learning classification and regression models for most tabular and image datasets. It however does not define AI ethical standards and does not guarantee that any AI system tested will be free from risks or biases or is completely safe.
cover-agent
CodiumAI Cover Agent is a tool designed to help increase code coverage by automatically generating qualified tests to enhance existing test suites. It utilizes Generative AI to streamline development workflows and is part of a suite of utilities aimed at automating the creation of unit tests for software projects. The system includes components like Test Runner, Coverage Parser, Prompt Builder, and AI Caller to simplify and expedite the testing process, ensuring high-quality software development. Cover Agent can be run via a terminal and is planned to be integrated into popular CI platforms. The tool outputs debug files locally, such as generated_prompt.md, run.log, and test_results.html, providing detailed information on generated tests and their status. It supports multiple LLMs and allows users to specify the model to use for test generation.
momentum-core
Momentum is an open-source behavioral auditor for backend code that helps developers generate powerful insights into their codebase. It analyzes code behavior, tests it at every git push, and ensures readiness for production. Momentum understands backend code, visualizes dependencies, identifies behaviors, generates test code, runs code in the local environment, and provides debugging solutions. It aims to improve code quality, streamline testing processes, and enhance developer productivity.
mutahunter
Mutahunter is an open-source language-agnostic mutation testing tool maintained by CodeIntegrity. It leverages LLM models to inject context-aware faults into codebase, ensuring comprehensive testing. The tool aims to empower companies and developers to enhance test suites and improve software quality by verifying the effectiveness of test cases through creating mutants in the code and checking if the test cases can catch these changes. Mutahunter provides detailed reports on mutation coverage, killed mutants, and survived mutants, enabling users to identify potential weaknesses in their test suites.