Best AI tools for< Test In The Wild >
20 - AI tool Sites
Cradle
Cradle is a protein engineering platform that uses machine learning to design improved protein sequences. It allows users to import assay data, generate new sequences, test them in the lab, and import the results to improve the model. Cradle can be used to optimize multiple properties of a protein simultaneously, and it has been used by leading biotech teams to accelerate new and ongoing projects.
Applitools
Applitools is an AI-powered test automation platform that helps businesses improve the quality of their digital experiences. It uses visual AI to validate user interfaces across any type of screen or device, and it can be deployed on-prem, in the cloud, or as a SaaS solution. Applitools integrates with all of the major development tools and workflows, and it offers a wide range of features and advantages that can help businesses save time and money while improving the quality of their software.
Quizbot
Quizbot.ai is an advanced AI question generator designed to revolutionize the process of question and exam development. It offers a cutting-edge artificial intelligence system that can generate various types of questions from different sources like PDFs, Word documents, videos, images, and more. Quizbot.ai is a versatile tool that caters to multiple languages and question types, providing a personalized and engaging learning experience for users across various industries. The platform ensures scalability, flexibility, and personalized assessments, along with detailed analytics and insights to track learner performance. Quizbot.ai is secure, user-friendly, and offers a range of subscription plans to suit different needs.
Unspam
Unspam is an email spam checker and deliverability test tool that helps businesses ensure their emails land in the inbox, not spam. It offers a range of features including email spam checking, deliverability testing, email preview, AI eye-tracking heatmap, SPF, DKIM, DMARC, and more. Unspam's mission is to help businesses improve their email deliverability and engagement, and its tools and insights are designed to help users optimize their email campaigns for maximum impact.
MacWhisper
MacWhisper is a native macOS application that utilizes OpenAI's Whisper technology for transcribing audio files into text. It offers a user-friendly interface for recording, transcribing, and editing audio, making it suitable for various use cases such as transcribing meetings, lectures, interviews, and podcasts. The application is designed to protect user privacy by performing all transcriptions locally on the device, ensuring that no data leaves the user's machine.
Cerebium
Cerebium is a serverless AI infrastructure platform that allows teams to build, test, and deploy AI applications quickly and efficiently. With a focus on speed, performance, and cost optimization, Cerebium offers a range of features and tools to simplify the development and deployment of AI projects. The platform ensures high reliability, security, and compliance while providing real-time logging, cost tracking, and observability tools. Cerebium also offers GPU variety and effortless autoscaling to meet the diverse needs of developers and businesses.
Cirrascale Cloud Services
Cirrascale Cloud Services is an AI tool that offers cloud solutions for Artificial Intelligence applications. The platform provides a range of cloud services and products tailored for AI innovation, including NVIDIA GPU Cloud, AMD Instinct Series Cloud, Qualcomm Cloud, Graphcore, Cerebras, and SambaNova. Cirrascale's AI Innovation Cloud enables users to test and deploy on leading AI accelerators in one cloud, democratizing AI by delivering high-performance AI compute and scalable deep learning solutions. The platform also offers professional and managed services, tailored multi-GPU server options, and high-throughput storage and networking solutions to accelerate development, training, and inference workloads.
ABtesting.ai
ABtesting.ai is an AI-powered A/B testing software that helps businesses optimize their landing pages for conversions. It uses GPT-3 to generate automated text suggestions for headlines, copy, and call to actions, saving businesses time and effort. The software also automatically chooses the best combinations of elements to show to users, boosting conversion rates in the process. ABtesting.ai is easy to use and requires no manual work, making it a great option for businesses of all sizes.
Devath
Devath is the world's first AI-powered SmartHome platform that revolutionizes the way users interact with their smart devices. It eliminates the need for writing extensive lines of code by allowing users to simply give instructions to the AI for seamless device control. With features like splash resistance and responsive design, Devath offers a user-friendly experience for managing smart home functionalities. The platform also enables developers to preview and test their apps before submission, providing a 99% faster publishing process. Devath is continuously evolving with user feedback and aims to enhance the SmartHome experience through AI copilots and customizable features. With Devath, users can control their devices from the web and enjoy free unlimited access to the AI era of SmartHome.
VIDOC
VIDOC is an AI-powered security engineer that automates code review and penetration testing. It continuously scans and reviews code to detect and fix security issues, helping developers deliver secure software faster. VIDOC is easy to use, requiring only two lines of code to be added to a GitHub Actions workflow. It then takes care of the rest, providing developers with a tailored code solution to fix any issues found.
Yippity
Yippity is an AI-powered question generator that helps educators and trainers create engaging and interactive assessments. With Yippity, you can easily create multiple choice, true/false, fill-in-the-blank, and short answer questions. You can also add images, videos, and audio to your questions to make them more engaging. Yippity is a great tool for creating formative assessments, quizzes, and tests. It can also be used to create practice questions for students who are preparing for standardized tests.
CreateMyTest
CreateMyTest is an online tool that uses artificial intelligence to automatically convert documents and YouTube videos into tests. It offers various question types, including multiple choice, true/false, matching, and fill in the blank. The platform aims to enhance studying by helping users retain knowledge through practice testing and reduce test anxiety.
ConsoleX
ConsoleX is an advanced AI tool that offers a wide range of functionalities to unlock infinite possibilities in the field of artificial intelligence. It provides users with a powerful platform to develop, test, and deploy AI models with ease. With cutting-edge features and intuitive interface, ConsoleX is designed to cater to the needs of both beginners and experts in the AI domain. Whether you are a data scientist, researcher, or developer, ConsoleX empowers you to explore the full potential of AI technology and drive innovation in your projects.
AI Generated Test Cases
AI Generated Test Cases is an innovative tool that leverages artificial intelligence to automatically generate test cases for software applications. By utilizing advanced algorithms and machine learning techniques, this tool can efficiently create a comprehensive set of test scenarios to ensure the quality and reliability of software products. With AI Generated Test Cases, software development teams can save time and effort in the testing phase, leading to faster release cycles and improved overall productivity.
MAIHEM
MAIHEM is an AI-powered quality assurance platform that helps businesses test and improve the performance and safety of their AI applications. It automates the testing process, generates realistic test cases, and provides comprehensive analytics to help businesses identify and fix potential issues. MAIHEM is used by a variety of businesses, including those in the customer support, healthcare, education, and sales industries.
U-xer
U-xer is an innovative automation tool developed by Quality Museum Software Testing Services. It is designed to meet a broad range of needs, including Robotic Process Automation (RPA), test automation, and bot development. Crafted with user flexibility in mind, U-xer aims to be a user-friendly solution for your automation requirements! U-xer's unique screen recognition models interpret screens in the same way that humans do. This enables non-technical users to automate simple tasks, while allowing advanced users to tackle more complex tasks with ease. With U-xer, you can automate anything, anywhere, whether it's Web or Desktop. U-xer works seamlessly across all platforms with just a screenshot. Unlike other tools, U-xer interprets screens just like a human does, enabling more natural and accurate automation of a wide range of tasks.
Unprompted
Unprompted is an AI image guessing game where players guess the words used to create AI-generated images. Players type words into the text box and submit their guesses. Correct guesses will replace blanks in the image. The game offers three new images to try every day, and players can check the answers under the 'Yesterday' tab. Unprompted provides a fun and interactive way to engage with AI technology and test your creativity and imagination.
TalkToMe.AI
TalkToMe.AI is a comprehensive platform dedicated to artificial intelligence, offering a wide range of resources for enthusiasts and professionals alike. From interactive quizzes on various AI topics to in-depth articles on machine learning algorithms and neural networks, the website aims to educate and inspire individuals interested in the field of AI. With a focus on demystifying complex concepts and keeping users updated on the latest advancements, TalkToMe.AI serves as a trusted companion for anyone looking to explore the fascinating realm of artificial intelligence.
Keploy
Keploy is an AI tool designed for developers to generate API tests efficiently. It is an open-source platform that converts API calls to test cases with data mocks. Keploy simplifies testing by capturing network interactions and generating automated tests, helping teams accelerate development with streamlined testing processes. The tool allows users to record and replay complex API flows, find duplicate tests, and seamlessly integrate with popular testing libraries like JUnit, PyTest, Jest, and Go-Test in CI/CD pipelines.
MagikKraft
MagikKraft is an AI-powered platform that simplifies complex controls by enabling users to create personalized sequences and actions for programmable devices like drones, automated appliances, and self-driving vehicles. Users can craft customized recipes, test them in a virtual environment, and deploy them seamlessly into the real world. MagikKraft prioritizes privacy, user control, and creative freedom, aiming to enhance technology's potential while ensuring no harm or unforeseen consequences.
20 - Open Source AI Tools
Synthetic-Voice-Detection-Vocoder-Artifacts
The Synthetic-Voice-Detection-Vocoder-Artifacts repository provides the LibriSeVoc dataset containing self-vocoding samples created with six state-of-the-art vocoders to expose and exploit vocoder artifacts. It also introduces a new approach for detecting synthetic human voices by identifying signal artifacts left by neural vocoders and enhancing the RawNet2 baseline. The repository includes a paper and dataset for further reference and offers instructions for training the model and testing it in the wild.
Awesome_papers_on_LLMs_detection
This repository is a curated list of papers focused on the detection of Large Language Models (LLMs)-generated content. It includes the latest research papers covering detection methods, datasets, attacks, and more. The repository is regularly updated to include the most recent papers in the field.
Awesome-Code-LLM
Analyze the following text from a github repository (name and readme text at end) . Then, generate a JSON object with the following keys and provide the corresponding information for each key, in lowercase letters: 'description' (detailed description of the repo, must be less than 400 wordsοΌEnsure that no line breaks and quotation marks.),'for_jobs' (List 5 jobs suitable for this tool,in lowercase letters), 'ai_keywords' (keywords of the tool,user may use those keyword to find the tool,in lowercase letters), 'for_tasks' (list of 5 specific tasks user can use this tool to do,in lowercase letters), 'answer' (in english languages)
Awesome-LLM4Cybersecurity
The repository 'Awesome-LLM4Cybersecurity' provides a comprehensive overview of the applications of Large Language Models (LLMs) in cybersecurity. It includes a systematic literature review covering topics such as constructing cybersecurity-oriented domain LLMs, potential applications of LLMs in cybersecurity, and research directions in the field. The repository analyzes various benchmarks, datasets, and applications of LLMs in cybersecurity tasks like threat intelligence, fuzzing, vulnerabilities detection, insecure code generation, program repair, anomaly detection, and LLM-assisted attacks.
baml
BAML is a config file format for declaring LLM functions that you can then use in TypeScript or Python. With BAML you can Classify or Extract any structured data using Anthropic, OpenAI or local models (using Ollama) ## Resources ![](https://img.shields.io/discord/1119368998161752075.svg?logo=discord&label=Discord%20Community) [Discord Community](https://discord.gg/boundaryml) ![](https://img.shields.io/twitter/follow/boundaryml?style=social) [Follow us on Twitter](https://twitter.com/boundaryml) * Discord Office Hours - Come ask us anything! We hold office hours most days (9am - 12pm PST). * Documentation - Learn BAML * Documentation - BAML Syntax Reference * Documentation - Prompt engineering tips * Boundary Studio - Observability and more #### Starter projects * BAML + NextJS 14 * BAML + FastAPI + Streaming ## Motivation Calling LLMs in your code is frustrating: * your code uses types everywhere: classes, enums, and arrays * but LLMs speak English, not types BAML makes calling LLMs easy by taking a type-first approach that lives fully in your codebase: 1. Define what your LLM output type is in a .baml file, with rich syntax to describe any field (even enum values) 2. Declare your prompt in the .baml config using those types 3. Add additional LLM config like retries or redundancy 4. Transpile the .baml files to a callable Python or TS function with a type-safe interface. (VSCode extension does this for you automatically). We were inspired by similar patterns for type safety: protobuf and OpenAPI for RPCs, Prisma and SQLAlchemy for databases. BAML guarantees type safety for LLMs and comes with tools to give you a great developer experience: ![](docs/images/v3/prompt_view.gif) Jump to BAML code or how Flexible Parsing works without additional LLM calls. | BAML Tooling | Capabilities | | ----------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | BAML Compiler install | Transpiles BAML code to a native Python / Typescript library (you only need it for development, never for releases) Works on Mac, Windows, Linux ![](https://img.shields.io/badge/Python-3.8+-default?logo=python)![](https://img.shields.io/badge/Typescript-Node_18+-default?logo=typescript) | | VSCode Extension install | Syntax highlighting for BAML files Real-time prompt preview Testing UI | | Boundary Studio open (not open source) | Type-safe observability Labeling |
EmbodiedScan
EmbodiedScan is a holistic multi-modal 3D perception suite designed for embodied AI. It introduces a multi-modal, ego-centric 3D perception dataset and benchmark for holistic 3D scene understanding. The dataset includes over 5k scans with 1M ego-centric RGB-D views, 1M language prompts, 160k 3D-oriented boxes spanning 760 categories, and dense semantic occupancy with 80 common categories. The suite includes a baseline framework named Embodied Perceptron, capable of processing multi-modal inputs for 3D perception tasks and language-grounded tasks.
llm-course
The LLM course is divided into three parts: 1. 𧩠**LLM Fundamentals** covers essential knowledge about mathematics, Python, and neural networks. 2. π§βπ¬ **The LLM Scientist** focuses on building the best possible LLMs using the latest techniques. 3. π· **The LLM Engineer** focuses on creating LLM-based applications and deploying them. For an interactive version of this course, I created two **LLM assistants** that will answer questions and test your knowledge in a personalized way: * π€ **HuggingChat Assistant**: Free version using Mixtral-8x7B. * π€ **ChatGPT Assistant**: Requires a premium account. ## π Notebooks A list of notebooks and articles related to large language models. ### Tools | Notebook | Description | Notebook | |----------|-------------|----------| | π§ LLM AutoEval | Automatically evaluate your LLMs using RunPod | ![Open In Colab](img/colab.svg) | | π₯± LazyMergekit | Easily merge models using MergeKit in one click. | ![Open In Colab](img/colab.svg) | | π¦ LazyAxolotl | Fine-tune models in the cloud using Axolotl in one click. | ![Open In Colab](img/colab.svg) | | β‘ AutoQuant | Quantize LLMs in GGUF, GPTQ, EXL2, AWQ, and HQQ formats in one click. | ![Open In Colab](img/colab.svg) | | π³ Model Family Tree | Visualize the family tree of merged models. | ![Open In Colab](img/colab.svg) | | π ZeroSpace | Automatically create a Gradio chat interface using a free ZeroGPU. | ![Open In Colab](img/colab.svg) |
WildBench
WildBench is a tool designed for benchmarking Large Language Models (LLMs) with challenging tasks sourced from real users in the wild. It provides a platform for evaluating the performance of various models on a range of tasks. Users can easily add new models to the benchmark by following the provided guidelines. The tool supports models from Hugging Face and other APIs, allowing for comprehensive evaluation and comparison. WildBench facilitates running inference and evaluation scripts, enabling users to contribute to the benchmark and collaborate on improving model performance.
awesome-hallucination-detection
This repository provides a curated list of papers, datasets, and resources related to the detection and mitigation of hallucinations in large language models (LLMs). Hallucinations refer to the generation of factually incorrect or nonsensical text by LLMs, which can be a significant challenge for their use in real-world applications. The resources in this repository aim to help researchers and practitioners better understand and address this issue.
llama3.java
Llama3.java is a practical Llama 3 inference tool implemented in a single Java file. It serves as the successor of llama2.java and is designed for testing and tuning compiler optimizations and features on the JVM, especially for the Graal compiler. The tool features a GGUF format parser, Llama 3 tokenizer, Grouped-Query Attention inference, support for Q8_0 and Q4_0 quantizations, fast matrix-vector multiplication routines using Java's Vector API, and a simple CLI with 'chat' and 'instruct' modes. Users can download quantized .gguf files from huggingface.co for model usage and can also manually quantize to pure 'Q4_0'. The tool requires Java 21+ and supports running from source or building a JAR file for execution. Performance benchmarks show varying tokens/s rates for different models and implementations on different hardware setups.
LLMEvaluation
The LLMEvaluation repository is a comprehensive compendium of evaluation methods for Large Language Models (LLMs) and LLM-based systems. It aims to assist academics and industry professionals in creating effective evaluation suites tailored to their specific needs by reviewing industry practices for assessing LLMs and their applications. The repository covers a wide range of evaluation techniques, benchmarks, and studies related to LLMs, including areas such as embeddings, question answering, multi-turn dialogues, reasoning, multi-lingual tasks, ethical AI, biases, safe AI, code generation, summarization, software performance, agent LLM architectures, long text generation, graph understanding, and various unclassified tasks. It also includes evaluations for LLM systems in conversational systems, copilots, search and recommendation engines, task utility, and verticals like healthcare, law, science, financial, and others. The repository provides a wealth of resources for evaluating and understanding the capabilities of LLMs in different domains.
llms-tools
The 'llms-tools' repository is a comprehensive collection of AI tools, open-source projects, and research related to Large Language Models (LLMs) and Chatbots. It covers a wide range of topics such as AI in various domains, open-source models, chats & assistants, visual language models, evaluation tools, libraries, devices, income models, text-to-image, computer vision, audio & speech, code & math, games, robotics, typography, bio & med, military, climate, finance, and presentation. The repository provides valuable resources for researchers, developers, and enthusiasts interested in exploring the capabilities of LLMs and related technologies.
20 - OpenAI Gpts
The Enigmancer
Put your prompt engineering skills to the ultimate test! Embark on a journey to outwit a mythical guardian of ancient secrets. Try to extract the secret passphrase hidden in the system prompt and enter it in chat when you think you have it and claim your glory. Good luck!
GMAT Tutor
Get 1-on-1 tutoring. Trained from official questions only (verbal, quant, data insights). Score in the 90th percentile! π
Battery Expert
Professional experts in the battery field will answer your battery questions in the simplest way. At the same time, the database is constantly updated, and it has now been updated to 2024/01/10
Dieselpunk Heists, a text adventure game
Make a diesel-powered getaway in this gritty heist game. Let me entertain you with this interactive heist game, lovingly illustrated in the style of gritty, cinematic dieselpunk.
Cute Little Aliens, a text adventure game
What's cute, little, and on their way to invade earth? Let me entertain you with this interactive alien invasion game, lovingly illustrated in the style of ultra-cute little 3D kawaii dioramas.
Aliens de los Muertos, a text adventure game
Unexpected guests attend a fiesta of galactic proportions. Let me entertain you with this interactive alien invasion game, lovingly illustrated in the style of artisanal handicrafts celebrating la Dia de los Muertos.
Dieselpunk Time Travellers, a text adventure game
Travel through time to stop a dieselpunk conspiracy. Let me entertain you with this interactive repair-the-timeline game, lovingly illustrated in the style of gritty, cinematic dieselpunk.
Cute Little Time Travellers, a text adventure game
Protect your cute little timeline. Let me entertain you with this interactive repair-the-timeline game, lovingly illustrated in the style of ultra-cute little 3D kawaii dioramas.
Anime Aliens, a text adventure game
Worlds collide. Aliens vs anime. Let me entertain you with this interactive alien invasion game, lovingly illustrated in the style of elegant Shojo anime.
INSIGHT Business SIM
The future of business education: Generate and test ideas in a complex global market simulation, populated by autonomous agents. Powered by the MANNS engine for unparalleled entity autonomy and simulated market forces
Exam Solver
Upload a screenshot or a picture of a question in an exam paper, I'll give you the answer in seconds!
Simplify learning πππ
Discover how to make learning easy and fun! π You can ask the question in any language π
IQ Test
IQ Test is designed to simulate an IQ testing environment. It provides a formal and objective experience, delivering questions and processing answers in a straightforward manner.
TuringGPT
The Turing Test, first named the imitation game by Alan Turing in 1950, is a measure of a machine's capacity to demonstrate intelligence that's either equal to or indistinguishable from human intelligence.
React Native Testing Library Owl
Assists in writing React Native tests using the React Native Testing Library.
Mockito Mentor
Java testing consultant specializing in Mockito, based on the book Mockito Made Clear and related blog posts by Ken Kousen.
LabGPT
The main objective of a personalized ChatGPT for reading laboratory tests is to evaluate laboratory test results and create a spreadsheet with the evaluation results and possible solutions.
Pieter Levels: Startup Adventures
A text-based adventure game taking place in a fictional version of π³π± Netherlands, πΉπ Thailand, ποΈ Bali and other places. The player assumes the role of Pieter Levels, a young digital nomad who stays in hostel dorms with a dream of becoming a startup millionaire.