Best AI tools for< Run Agent Experiments >
20 - AI tool Sites
FARSPEAK.AI
FARSPEAK.AI is an AI application that offers RESTful AI for databases, allowing users to query databases using natural language and deploy AI agents to enhance data processing. The application supports MongoDB Atlas, provides up-to-date embeddings, and offers both structured and unstructured data support. FARSPEAK simplifies work for AI engineers, app & web developers, and product designers by enabling faster AI feature development, natural language querying, and insights generation from data.
Tely
Tely is an autonomous AI agent that helps businesses run B2B content marketing. It uses machine learning to understand your product, build domain expertise, run SEO optimization, and create a content plan. Tely can also personalize your content with infographics, code snippets, experts' quotes, and call to actions. With Tely, you can drive sales with expert-level content on autopilot, reduce customer acquisition cost, increase conversion rate, and save money on marketing expenses.
Cerebium
Cerebium is a serverless AI infrastructure platform that allows teams to build, test, and deploy AI applications quickly and efficiently. With a focus on speed, performance, and cost optimization, Cerebium offers a range of features and tools to simplify the development and deployment of AI projects. The platform ensures high reliability, security, and compliance while providing real-time logging, cost tracking, and observability tools. Cerebium also offers GPU variety and effortless autoscaling to meet the diverse needs of developers and businesses.
Stackbear
Stackbear is an AI tool that allows users to easily create custom AI chatbots for their websites. The platform offers features such as natural language understanding, knowledge base integration, human escalation, personalization, and multilingual support. Users can build and deploy chatbots quickly, engage with website visitors, capture leads, automate tasks, and improve customer support and sales productivity. Stackbear provides a user-friendly interface for creating and managing chatbots without the need for coding skills.
ChatGPT for Outlook
ChatGPT for Outlook is an AI tool that integrates the power of ChatGPT into Microsoft Outlook, allowing users to run ChatGPT on emails to generate summaries or highlights based on custom prompts. Users can manage multiple configurations, process specific parts of emails, and update email importance based on generated output. Blueberry Consultants, the developer, offers custom versions for businesses to resell with unique prompts, turning ideas into profitable ventures.
Machinet
Machinet is an AI Agent designed for full-stack software developers. It serves as an AI-based IDE that assists developers in various tasks, such as code generation, terminal access, front-end debugging, architecture suggestions, refactoring, and mentoring. The tool aims to enhance productivity and streamline the development workflow by providing intelligent assistance and support throughout the coding process. Machinet prioritizes security and privacy, ensuring that user data is encrypted, secure, and never stored for training purposes.
PropertySimple
PropertySimple is an AI-driven real estate marketing and follow-up platform that helps agents automate their social media, listing ads, and lead follow-up. With PropertySimple, agents can save time and effort while increasing their visibility and lead generation. The platform includes features such as automated content creation and scheduling, automagic ads, a CRM with calling, text, and email powered by ChatGPT4, and the ability to claim ZIP codes on the national property portal.
GPT Prompt Tuner
GPT Prompt Tuner is an AI tool designed to enhance ChatGPT prompts by generating variations and running conversations in parallel. It enables users to explore the field of 'Prompt Engineering' and potentially earn up to $335k/yr. The tool offers a fully customizable experience, allowing users to input their own prompts or let the AI generate them. With flexible plans and the ability to use OpenAI API keys, GPT Prompt Tuner aims to streamline the prompt tuning process and improve chat interactions.
Marketsy.ai
Marketsy.ai is an all-in-one platform for digital content creators to sell their unique works online with the help of AI. It provides a simpler solution for hosting and selling digital products, allowing users to create a pop-up digital store in minutes. The platform offers features like instant store generation, rapid product upload, one-click store publishing, AI-driven marketing, and seamless store management. Marketsy.ai was born out of the founders' desire to improve the eCommerce industry using AI, specifically in store generation.
Google Play
The website is a platform for Android apps available on Google Play. Users can explore and download various games, movies, books, and kids' content. It features a wide range of apps, including popular titles like Roblox, Minecraft, and Super Mario Run. Users can also access special offers, in-app purchases, and updates for different apps. The site provides a convenient way for users to discover and enjoy entertainment content on their Android devices.
Synthetic Users
Synthetic Users is an AI-powered platform that revolutionizes user research by providing human-like AI participants for user and market research. It leverages advanced AI architecture to create accurate synthetic interviews and surveys, enabling users to run quantitative research at scale in minutes. The platform offers a multi-agent framework for dynamic interactions, continuous learning, and adaptation to mimic real human behaviors, providing valuable insights for various applications.
Unfetch
Unfetch is an online IDE that enables users to generate, deploy, and run AI agents to automate various tasks. It combines coding capabilities with an online deployment platform, making it easy to create AI agents. Unfetch agents are designed specifically for AI tasks and are compatible with tools like Open AI GPT Store and Langchain. Users can build and deploy AI agents to solve a wide range of tasks efficiently.
SWE Kit
SWE Kit is an open-source headless IDE designed for building custom coding agents with state-of-the-art performance. It offers AI-native tools to streamline the coding review process, enhance code quality, and optimize development efficiency. The application supports various agentic frameworks and LLM inference providers, providing a flexible runtime environment for seamless codebase interaction. With features like code analysis, code indexing, and third-party service integrations, SWE Kit empowers developers to create and run coding agents effortlessly.
Runday
Runday is a business-to-business software as a service (SaaS) company that provides artificial intelligence (AI)-powered agents to help businesses with sales, service, marketing, and recruiting. Runday's AI agents can be used to answer customer questions, book appointments, collect payments, and more. The company's software is used by over 30,000 businesses and has generated over $1 billion in sales for its customers.
Sema4.ai
Sema4.ai is an AI application that enables enterprises to build, run, and manage intelligent AI Agents to transform how people work. It helps teams create powerful agents that act against enterprise systems using LLMs. Sema4.ai aims to close the automation gap by combining actions, intelligence, and enterprise context to revolutionize work processes and collaboration.
QRev
QRev is an AI-powered sales tool designed to revolutionize the sales process by automating mundane tasks and allowing salespeople to focus on closing deals. It offers features such as researching prospects, scaling outreach, competitive research, and personalized email campaigns. QRev aims to save time and increase efficiency for sales teams, providing modern CRM capabilities and insights into sales calls. With QRev, users can access leads from a massive database, run research 24/7, and personalize campaign sequences at scale.
AnythingLLM
AnythingLLM is an all-in-one AI application designed for everyone. It offers a suite of tools for working with LLM (Large Language Models), documents, and agents in a fully private environment. Users can install AnythingLLM on their desktop for Windows, MacOS, and Linux, enabling flexible one-click installation and secure, fully private operation without internet connectivity. The application supports custom models, including enterprise models like GPT-4, custom fine-tuned models, and open-source models like Llama and Mistral. AnythingLLM allows users to work with various document formats, such as PDFs and word documents, providing tailored solutions with locally running defaults for privacy.
Lunary
Lunary is an AI developer platform designed to bring AI applications to production. It offers a comprehensive set of tools to manage, improve, and protect LLM apps. With features like Logs, Metrics, Prompts, Evaluations, and Threads, Lunary empowers users to monitor and optimize their AI agents effectively. The platform supports tasks such as tracing errors, labeling data for fine-tuning, optimizing costs, running benchmarks, and testing open-source models. Lunary also facilitates collaboration with non-technical teammates through features like A/B testing, versioning, and clean source-code management.
Awan LLM
Awan LLM is an AI tool that offers an Unlimited Tokens, Unrestricted, and Cost-Effective LLM Inference API Platform for Power Users and Developers. It allows users to generate unlimited tokens, use LLM models without constraints, and pay per month instead of per token. The platform features an AI Assistant, AI Agents, Roleplay with AI companions, Data Processing, Code Completion, and Applications for profitable AI-powered applications.
NexLvL
NexLvL is an AI-powered CRM and all-in-one sales & marketing platform that offers a comprehensive solution for businesses to automate support, marketing, and sales processes. It integrates 14 essential business applications into one platform, providing users with a seamless experience to run their business efficiently. With features like Ai-messaging agents, lead qualification automation, and personalized SMS marketing, NexLvL aims to revolutionize the way businesses operate by leveraging AI technology and automated workflows.
20 - Open Source AI Tools
AgentPoison
AgentPoison is a repository that provides the official PyTorch implementation of the paper 'AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning'. It offers tools for red-teaming LLM agents by poisoning memory or knowledge bases. The repository includes trigger optimization algorithms, agent experiments, and evaluation scripts for Agent-Driver, ReAct-StrategyQA, and EHRAgent. Users can fine-tune motion planners, inject queries with triggers, and evaluate red-teaming performance. The codebase supports multiple RAG embedders and provides a unified dataset access for all three agents.
AgentLab
AgentLab is an open, easy-to-use, and extensible framework designed to accelerate web agent research. It provides features for developing and evaluating agents on various benchmarks supported by BrowserGym. The framework allows for large-scale parallel agent experiments using ray, building blocks for creating agents over BrowserGym, and a unified LLM API for OpenRouter, OpenAI, Azure, or self-hosted using TGI. AgentLab also offers reproducibility features, a unified LeaderBoard, and supports multiple benchmarks like WebArena, WorkArena, WebLinx, VisualWebArena, AssistantBench, GAIA, Mind2Web-live, and MiniWoB.
OSWorld
OSWorld is a benchmarking tool designed to evaluate multimodal agents for open-ended tasks in real computer environments. It provides a platform for running experiments, setting up virtual machines, and interacting with the environment using Python scripts. Users can install the tool on their desktop or server, manage dependencies with Conda, and run benchmark tasks. The tool supports actions like executing commands, checking for specific results, and evaluating agent performance. OSWorld aims to facilitate research in AI by providing a standardized environment for testing and comparing different agent baselines.
MachineSoM
MachineSoM is a code repository for the paper 'Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View'. It focuses on the emergence of intelligence from collaborative and communicative computational modules, enabling effective completion of complex tasks. The repository includes code for societies of LLM agents with different traits, collaboration processes such as debate and self-reflection, and interaction strategies for determining when and with whom to interact. It provides a coding framework compatible with various inference services like Replicate, OpenAI, Dashscope, and Anyscale, supporting models like Qwen and GPT. Users can run experiments, evaluate results, and draw figures based on the paper's content, with available datasets for MMLU, Math, and Chess Move Validity.
AI4U
AI4U is a tool that provides a framework for modeling virtual reality and game environments. It offers an alternative approach to modeling Non-Player Characters (NPCs) in Godot Game Engine. AI4U defines an agent living in an environment and interacting with it through sensors and actuators. Sensors provide data to the agent's brain, while actuators send actions from the agent to the environment. The brain processes the sensor data and makes decisions (selects an action by time). AI4U can also be used in other situations, such as modeling environments for artificial intelligence experiments.
clearml-server
ClearML Server is a backend service infrastructure for ClearML, facilitating collaboration and experiment management. It includes a web app, RESTful API, and file server for storing images and models. Users can deploy ClearML Server using Docker, AWS EC2 AMI, or Kubernetes. The system design supports single IP or sub-domain configurations with specific open ports. ClearML-Agent Services container allows launching long-lasting jobs and various use cases like auto-scaler service, controllers, optimizer, and applications. Advanced functionality includes web login authentication and non-responsive experiments watchdog. Upgrading ClearML Server involves stopping containers, backing up data, downloading the latest docker-compose.yml file, configuring ClearML-Agent Services, and spinning up docker containers. Community support is available through ClearML FAQ, Stack Overflow, GitHub issues, and email contact.
GPTSwarm
GPTSwarm is a graph-based framework for LLM-based agents that enables the creation of LLM-based agents from graphs and facilitates the customized and automatic self-organization of agent swarms with self-improvement capabilities. The library includes components for domain-specific operations, graph-related functions, LLM backend selection, memory management, and optimization algorithms to enhance agent performance and swarm efficiency. Users can quickly run predefined swarms or utilize tools like the file analyzer. GPTSwarm supports local LM inference via LM Studio, allowing users to run with a local LLM model. The framework has been accepted by ICML2024 and offers advanced features for experimentation and customization.
WindowsAgentArena
Windows Agent Arena (WAA) is a scalable Windows AI agent platform designed for testing and benchmarking multi-modal, desktop AI agents. It provides researchers and developers with a reproducible and realistic Windows OS environment for AI research, enabling testing of agentic AI workflows across various tasks. WAA supports deploying agents at scale using Azure ML cloud infrastructure, allowing parallel running of multiple agents and delivering quick benchmark results for hundreds of tasks in minutes.
Co-LLM-Agents
This repository contains code for building cooperative embodied agents modularly with large language models. The agents are trained to perform tasks in two different environments: ThreeDWorld Multi-Agent Transport (TDW-MAT) and Communicative Watch-And-Help (C-WAH). TDW-MAT is a multi-agent environment where agents must transport objects to a goal position using containers. C-WAH is an extension of the Watch-And-Help challenge, which enables agents to send messages to each other. The code in this repository can be used to train agents to perform tasks in both of these environments.
chat-your-doc
Chat Your Doc is an experimental project exploring various applications based on LLM technology. It goes beyond being just a chatbot project, focusing on researching LLM applications using tools like LangChain and LlamaIndex. The project delves into UX, computer vision, and offers a range of examples in the 'Lab Apps' section. It includes links to different apps, descriptions, launch commands, and demos, aiming to showcase the versatility and potential of LLM applications.
RD-Agent
RD-Agent is a tool designed to automate critical aspects of industrial R&D processes, focusing on data-driven scenarios to streamline model and data development. It aims to propose new ideas ('R') and implement them ('D') automatically, leading to solutions of significant industrial value. The tool supports scenarios like Automated Quantitative Trading, Data Mining Agent, Research Copilot, and more, with a framework to push the boundaries of research in data science. Users can create a Conda environment, install the RDAgent package from PyPI, configure GPT model, and run various applications for tasks like quantitative trading, model evolution, medical prediction, and more. The tool is intended to enhance R&D processes and boost productivity in industrial settings.
llm-colosseum
llm-colosseum is a tool designed to evaluate Language Model Models (LLMs) in real-time by making them fight each other in Street Fighter III. The tool assesses LLMs based on speed, strategic thinking, adaptability, out-of-the-box thinking, and resilience. It provides a benchmark for LLMs to understand their environment and take context-based actions. Users can analyze the performance of different LLMs through ELO rankings and win rate matrices. The tool allows users to run experiments, test different LLM models, and customize prompts for LLM interactions. It offers installation instructions, test mode options, logging configurations, and the ability to run the tool with local models. Users can also contribute their own LLM models for evaluation and ranking.
Avalon-LLM
Avalon-LLM is a repository containing the official code for AvalonBench and the Avalon agent Strategist. AvalonBench evaluates Large Language Models (LLMs) playing The Resistance: Avalon, a board game requiring deductive reasoning, coordination, collaboration, and deception skills. Strategist utilizes LLMs to learn strategic skills through self-improvement, including high-level strategic evaluation and low-level execution guidance. The repository provides instructions for running AvalonBench, setting up Strategist, and conducting experiments with different agents in the game environment.
agentneo
AgentNeo is a Python package that provides functionalities for project, trace, dataset, experiment management. It allows users to authenticate, create projects, trace agents and LangGraph graphs, manage datasets, and run experiments with metrics. The tool aims to streamline AI project management and analysis by offering a comprehensive set of features.
goodai-ltm-benchmark
This repository contains code and data for replicating experiments on Long-Term Memory (LTM) abilities of conversational agents. It includes a benchmark for testing agents' memory performance over long conversations, evaluating tasks requiring dynamic memory upkeep and information integration. The repository supports various models, datasets, and configurations for benchmarking and reporting results.
SheetCopilot
SheetCopilot is an assistant agent that manipulates spreadsheets by following user commands. It leverages Large Language Models (LLMs) to interact with spreadsheets like a human expert, enabling non-expert users to complete tasks on complex software such as Google Sheets and Excel via a language interface. The tool observes spreadsheet states, polishes generated solutions based on external action documents and error feedback, and aims to improve success rate and efficiency. SheetCopilot offers a dataset with diverse task categories and operations, supporting operations like entry & manipulation, management, formatting, charts, and pivot tables. Users can interact with SheetCopilot in Excel or Google Sheets, executing tasks like calculating revenue, creating pivot tables, and plotting charts. The tool's evaluation includes performance comparisons with leading LLMs and VBA-based methods on specific datasets, showcasing its capabilities in controlling various aspects of a spreadsheet.
home-llm
Home LLM is a project that provides the necessary components to control your Home Assistant installation with a completely local Large Language Model acting as a personal assistant. The goal is to provide a drop-in solution to be used as a "conversation agent" component by Home Assistant. The 2 main pieces of this solution are Home LLM and Llama Conversation. Home LLM is a fine-tuning of the Phi model series from Microsoft and the StableLM model series from StabilityAI. The model is able to control devices in the user's house as well as perform basic question and answering. The fine-tuning dataset is a custom synthetic dataset designed to teach the model function calling based on the device information in the context. Llama Conversation is a custom component that exposes the locally running LLM as a "conversation agent" in Home Assistant. This component can be interacted with in a few ways: using a chat interface, integrating with Speech-to-Text and Text-to-Speech addons, or running the oobabooga/text-generation-webui project to provide access to the LLM via an API interface.
FinRobot
FinRobot is an open-source AI agent platform designed for financial applications using large language models. It transcends the scope of FinGPT, offering a comprehensive solution that integrates a diverse array of AI technologies. The platform's versatility and adaptability cater to the multifaceted needs of the financial industry. FinRobot's ecosystem is organized into four layers, including Financial AI Agents Layer, Financial LLMs Algorithms Layer, LLMOps and DataOps Layers, and Multi-source LLM Foundation Models Layer. The platform's agent workflow involves Perception, Brain, and Action modules to capture, process, and execute financial data and insights. The Smart Scheduler optimizes model diversity and selection for tasks, managed by components like Director Agent, Agent Registration, Agent Adaptor, and Task Manager. The tool provides a structured file organization with subfolders for agents, data sources, and functional modules, along with installation instructions and hands-on tutorials.
testzeus-hercules
Hercules is the world’s first open-source testing agent designed to handle the toughest testing tasks for modern web applications. It turns simple Gherkin steps into fully automated end-to-end tests, making testing simple, reliable, and efficient. Hercules adapts to various platforms like Salesforce and is suitable for CI/CD pipelines. It aims to democratize and disrupt test automation, making top-tier testing accessible to everyone. The tool is transparent, reliable, and community-driven, empowering teams to deliver better software. Hercules offers multiple ways to get started, including using PyPI package, Docker, or building and running from source code. It supports various AI models, provides detailed installation and usage instructions, and integrates with Nuclei for security testing and WCAG for accessibility testing. The tool is production-ready, open core, and open source, with plans for enhanced LLM support, advanced tooling, improved DOM distillation, community contributions, extensive documentation, and a bounty program.
llmware
LLMWare is a framework for quickly developing LLM-based applications including Retrieval Augmented Generation (RAG) and Multi-Step Orchestration of Agent Workflows. This project provides a comprehensive set of tools that anyone can use - from a beginner to the most sophisticated AI developer - to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Our specific focus is on making it easy to integrate open source small specialized models and connecting enterprise knowledge safely and securely.
20 - OpenAI Gpts
Consulting & Investment Banking Interview Prep GPT
Run mock interviews, review content and get tips to ace strategy consulting and investment banking interviews
Dungeon Master's Assistant
Your new DM's screen: helping Dungeon Masters to craft & run amazing D&D adventures.
Database Builder
Hosts a real SQLite database and helps you create tables, make schema changes, and run SQL queries, ideal for all levels of database administration.
Restaurant Startup Guide
Meet the Restaurant Startup Guide GPT: your friendly guide in the restaurant biz. It offers casual, approachable advice to help you start and run your own restaurant with ease.
Community Design™
A community-building GPT based on the wildly popular Community Design™ framework from Mighty Networks. Start creating communities that run themselves.
Code Helper for Web Application Development
Friendly web assistant for efficient code. Ask the wizard to create an application and you will get the HTML, CSS and Javascript code ready to run your web application.
Creative Director GPT
I'm your brainstorm muse in marketing and advertising; the creativity machine you need to sharpen the skills, land the job, generate the ideas, win the pitches, build the brands, ace the awards, or even run your own agency. Psst... don't let your clients find out about me! 😉
Pace Assistant
Provides running splits for Strava Routes, accounting for distance and elevation changes
Design Sprint Coach (beta)
A helpful coach for guiding teams through Design Sprints with a touch of sass.