
bedrock-claude-chat
AWS-native chatbot using Bedrock + Claude (+Nova and Mistral)
Stars: 1067

This repository is a sample chatbot using the Anthropic company's LLM Claude, one of the foundational models provided by Amazon Bedrock for generative AI. It allows users to have basic conversations with the chatbot, personalize it with their own instructions and external knowledge, and analyze usage for each user/bot on the administrator dashboard. The chatbot supports various languages, including English, Japanese, Korean, Chinese, French, German, and Spanish. Deployment is straightforward and can be done via the command line or by using AWS CDK. The architecture is built on AWS managed services, eliminating the need for infrastructure management and ensuring scalability, reliability, and security.
README:
English | 日本語 | 한국어 | 中文 | Français | Deutsch | Español | Italian | Norsk | ไทย | Bahasa Indonesia | Bahasa Melayu | Tiếng Việt | Polski
[!Warning]
V2 released. To update, please carefully review the migration guide. Without any care, BOTS FROM V1 WILL BECOME UNUSABLE.
A multilingual chatbot using LLM models provided by Amazon Bedrock for generative AI.
Add your own instructions and provide external knowledge as URLs or files (a.k.a. RAG). The bot can be shared among application users. The customized bot can also be published as a stand-alone API (See the detail).
[!Important] For governance reasons, only allowed users can create customized bots. To allow the creation of customized bots, the user must be a member of the group called CreatingBotAllowed, which can be set up via the management console > Amazon Cognito User pools or the AWS CLI. Note that the user pool ID can be found at CloudFormation > BedrockChatStack > Outputs > AuthUserPoolIdxxxx.
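Group membership can also be assigned from a script. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. `admin_add_user_to_group` is the Cognito Identity Provider API call; the function names `allow_bot_creation` and `make_cognito_client` are hypothetical helpers introduced here for illustration.

```python
GROUP_NAME = "CreatingBotAllowed"  # group name taken from the note above


def allow_bot_creation(client, user_pool_id: str, username: str) -> dict:
    """Add one user to the CreatingBotAllowed group and return the request sent."""
    request = {
        "UserPoolId": user_pool_id,
        "Username": username,
        "GroupName": GROUP_NAME,
    }
    client.admin_add_user_to_group(**request)
    return request


def make_cognito_client(region: str):
    """Create a Cognito client; boto3 is imported lazily so the sketch loads without it."""
    import boto3

    return boto3.client("cognito-idp", region_name=region)
```

A call would look like `allow_bot_creation(make_cognito_client(region), user_pool_id, username)`, where the region is the one you deployed to and the user pool ID comes from the stack outputs.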
LLM-powered Agent
By using the Agent functionality, your chatbot can automatically handle more complex tasks. For example, to answer a user's question, the Agent can retrieve necessary information from external tools or break down the task into multiple steps for processing.
- In the us-east-1 region, open Bedrock Model access > Manage model access > Check all of Anthropic / Claude 3, all of Amazon / Nova, Amazon / Titan Text Embeddings V2 and Cohere / Embed Multilingual, then Save changes.
- Open CloudShell at the region where you want to deploy
- Run the deployment with the following commands. If you want to specify the version to deploy or need to apply security policies, specify the appropriate parameters from Optional Parameters.
git clone https://github.com/aws-samples/bedrock-claude-chat.git
cd bedrock-claude-chat
chmod +x bin.sh
./bin.sh
- You will be asked whether you are a new user or are using v2. If you are not a continuing user from v0, please enter y.
You can specify the following parameters during deployment to enhance security and customization:
- --disable-self-register: Disable self-registration (default: enabled). If this flag is set, users cannot self-register their accounts and you will need to create all users in Cognito.
- --enable-lambda-snapstart: Enable Lambda SnapStart (default: disabled). If this flag is set, cold start times for Lambda functions improve, providing faster response times for a better user experience.
- --ipv4-ranges: Comma-separated list of allowed IPv4 ranges. (default: allow all ipv4 addresses)
- --ipv6-ranges: Comma-separated list of allowed IPv6 ranges. (default: allow all ipv6 addresses)
- --disable-ipv6: Disable connections over IPv6. (default: IPv6 enabled)
- --allowed-signup-email-domains: Comma-separated list of allowed email domains for sign-up. (default: no domain restriction)
- --bedrock-region: Define the region where bedrock is available. (default: us-east-1)
- --repo-url: The custom repo of Bedrock Claude Chat to deploy, if forked or custom source control. (default: https://github.com/aws-samples/bedrock-claude-chat.git)
- --version: The version of Bedrock Claude Chat to deploy. (default: latest version in development)
- --cdk-json-override: You can override any CDK context values during deployment using the override JSON block. This allows you to modify the configuration without editing the cdk.json file directly.
Example usage:
./bin.sh --cdk-json-override '{
"context": {
"selfSignUpEnabled": false,
"enableLambdaSnapStart": true,
"allowedIpV4AddressRanges": ["192.168.1.0/24"],
"allowedSignUpEmailDomains": ["example.com"]
}
}'
The override JSON must follow the same structure as cdk.json. You can override any context values including:
selfSignUpEnabled
enableLambdaSnapStart
allowedIpV4AddressRanges
allowedIpV6AddressRanges
allowedSignUpEmailDomains
bedrockRegion
enableRagReplicas
enableBedrockCrossRegionInference
- And other context values defined in cdk.json
[!Note] The override values are merged with the existing cdk.json configuration at deployment time in AWS CodeBuild. Values specified in the override take precedence over the values in cdk.json.
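The precedence rule in the note above can be illustrated with a short sketch. How the deployment script actually performs the merge is not shown here, so this assumes a recursive merge in which nested objects such as "context" are combined key by key and lists or scalars are replaced wholesale. The function name `merge_override` is hypothetical.

```python
import copy
import json


def merge_override(base: dict, override: dict) -> dict:
    """Return a copy of base with override merged in; override values win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse into nested objects such as "context".
            merged[key] = merge_override(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale (an assumption of this sketch).
            merged[key] = copy.deepcopy(value)
    return merged


cdk_json = {
    "app": "npx ts-node --prefer-ts-exts bin/bedrock-chat.ts",
    "context": {"selfSignUpEnabled": True, "bedrockRegion": "us-east-1"},
}
override = json.loads(
    '{"context": {"selfSignUpEnabled": false, "enableLambdaSnapStart": true}}'
)

result = merge_override(cdk_json, override)
print(json.dumps(result["context"], indent=2))
```

Keys present only in cdk.json (here `bedrockRegion`) survive the merge, while keys present in both take the override value.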
./bin.sh --disable-self-register --ipv4-ranges "192.0.2.0/25,192.0.2.128/25" --ipv6-ranges "2001:db8:1:2::/64,2001:db8:1:3::/64" --allowed-signup-email-domains "example.com,anotherexample.com" --bedrock-region "us-west-2" --version "v1.2.6"
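Because a malformed range is only discovered once the deployment is under way, it can help to check the comma-separated values locally first. This is a sketch using Python's standard `ipaddress` module; the helper name `parse_ranges` is hypothetical and is not part of bin.sh.

```python
import ipaddress


def parse_ranges(value: str, version: int) -> list:
    """Split a comma-separated range list and check each entry is a valid CIDR block."""
    networks = []
    for item in value.split(","):
        # ip_network raises ValueError for malformed input or set host bits.
        network = ipaddress.ip_network(item.strip())
        if network.version != version:
            raise ValueError(f"{item!r} is not an IPv{version} range")
        networks.append(network)
    return networks


ipv4 = parse_ranges("192.0.2.0/25,192.0.2.128/25", version=4)
ipv6 = parse_ranges("2001:db8:1:2::/64,2001:db8:1:3::/64", version=6)
print([str(n) for n in ipv4], [str(n) for n in ipv6])
```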
- After about 35 minutes, you will get the following output, which you can access from your browser
Frontend URL: https://xxxxxxxxx.cloudfront.net
The sign-up screen will appear as shown above, where you can register your email and log in.
[!Important] Without the optional parameters, this deployment method allows anyone who knows the URL to sign up. For production use, it is strongly recommended to add IP address restrictions and disable self-signup to mitigate security risks (you can define allowed-signup-email-domains to restrict users so that only email addresses from your company's domain can sign up). Use both ipv4-ranges and ipv6-ranges for IP address restrictions, and disable self-signup by using disable-self-register when executing ./bin.sh.
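The effect of allowed-signup-email-domains can be pictured as a simple domain check at sign-up. How the project enforces it is not described here, so the following is only a sketch under stated assumptions: domains are compared case-insensitively, matching is exact (subdomains are not accepted), and an empty list means no restriction. The function name `is_allowed_email` is hypothetical.

```python
def is_allowed_email(email: str, allowed_domains: list) -> bool:
    """Return True if the address's domain is allowed; an empty list allows all."""
    if not allowed_domains:
        return True  # default: no domain restriction
    _, sep, domain = email.rpartition("@")
    if not sep or not domain:
        return False  # not a usable email address
    return domain.lower() in {d.lower() for d in allowed_domains}


print(is_allowed_email("user@example.com", ["example.com", "anotherexample.com"]))
```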
[!TIP] If the Frontend URL does not appear or Bedrock Claude Chat does not work properly, it may be a problem with the latest version. In this case, please add --version "v1.2.6" to the parameters and try deployment again.
It's an architecture built on AWS managed services, eliminating the need for infrastructure management. Utilizing Amazon Bedrock, there's no need to communicate with APIs outside of AWS. This enables deploying scalable, reliable, and secure applications.
- Amazon DynamoDB: NoSQL database for conversation history storage
- Amazon API Gateway + AWS Lambda: Backend API endpoint (AWS Lambda Web Adapter, FastAPI)
- Amazon CloudFront + S3: Frontend application delivery (React, Tailwind CSS)
- AWS WAF: IP address restriction
- Amazon Cognito: User authentication
- Amazon Bedrock: Managed service to utilize foundational models via APIs
- Amazon Bedrock Knowledge Bases: Provides a managed interface for Retrieval-Augmented Generation (RAG), offering services for embedding and parsing documents
- Amazon EventBridge Pipes: Receiving event from DynamoDB stream and launching Step Functions to embed external knowledge
- AWS Step Functions: Orchestrating ingestion pipeline to embed external knowledge into Bedrock Knowledge Bases
- Amazon OpenSearch Serverless: Serves as the backend database for Bedrock Knowledge Bases, providing full-text search and vector search capabilities, enabling accurate retrieval of relevant information
- Amazon Athena: Query service to analyze S3 bucket
Super-easy Deployment uses AWS CodeBuild to perform deployment by CDK internally. This section describes the procedure for deploying directly with CDK.
- You need a UNIX environment with Docker and a Node.js runtime. If you do not have one, you can also use Cloud9.
[!Important] If there is insufficient storage space in the local environment during deployment, CDK bootstrapping may result in an error. If you are running in Cloud9 etc., we recommend expanding the volume size of the instance before deploying.
- Clone this repository
git clone https://github.com/aws-samples/bedrock-claude-chat
- Install npm packages
cd bedrock-claude-chat
cd cdk
npm ci
- If necessary, edit the following entries in cdk.json.
  - bedrockRegion: Region where Bedrock is available. NOTE: Bedrock does NOT support all regions for now.
  - allowedIpV4AddressRanges, allowedIpV6AddressRanges: Allowed IP address ranges.
  - enableLambdaSnapStart: Defaults to true. Set to false if deploying to a region that doesn't support Lambda SnapStart for Python functions.
- Before deploying with CDK, you need to bootstrap once for the region you are deploying to.
npx cdk bootstrap
- Deploy this sample project
npx cdk deploy --require-approval never --all
- You will get output similar to the following. The URL of the web app will be output in BedrockChatStack.FrontendURL, so please access it from your browser.
✅ BedrockChatStack
✨ Deployment time: 78.57s
Outputs:
BedrockChatStack.AuthUserPoolClientIdXXXXX = xxxxxxx
BedrockChatStack.AuthUserPoolIdXXXXXX = ap-northeast-1_XXXX
BedrockChatStack.BackendApiBackendApiUrlXXXXX = https://xxxxx.execute-api.ap-northeast-1.amazonaws.com
BedrockChatStack.FrontendURL = https://xxxxx.cloudfront.net
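The same outputs can be read back later from CloudFormation. The sketch below assumes boto3 and configured credentials for the live call; `describe_stacks` is the CloudFormation API, and each output is returned as a dictionary with `OutputKey` and `OutputValue`. The helper names `find_output` and `fetch_outputs` are hypothetical, and the sample data simply mirrors the placeholder values above.

```python
def find_output(outputs: list, key_prefix: str) -> str:
    """Return the value of the first stack output whose key starts with key_prefix."""
    for output in outputs:
        if output["OutputKey"].startswith(key_prefix):
            return output["OutputValue"]
    raise KeyError(f"no output starting with {key_prefix!r}")


def fetch_outputs(stack_name: str, region: str) -> list:
    """Read stack outputs via CloudFormation; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so find_output works without boto3

    client = boto3.client("cloudformation", region_name=region)
    return client.describe_stacks(StackName=stack_name)["Stacks"][0]["Outputs"]


# Sample data shaped like the placeholder output above.
sample = [
    {"OutputKey": "AuthUserPoolClientIdXXXXX", "OutputValue": "xxxxxxx"},
    {"OutputKey": "AuthUserPoolIdXXXXXX", "OutputValue": "ap-northeast-1_XXXX"},
    {"OutputKey": "FrontendURL", "OutputValue": "https://xxxxx.cloudfront.net"},
]
print(find_output(sample, "FrontendURL"))
```

Matching by prefix is used because the generated keys carry a suffix (shown as XXXXX above) that differs per deployment.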
You can define parameters for your deployment in two ways: using cdk.json or using the type-safe parameter.ts file.
The traditional way to configure parameters is by editing the cdk.json file. This approach is simple but lacks type checking:
{
"app": "npx ts-node --prefer-ts-exts bin/bedrock-chat.ts",
"context": {
"bedrockRegion": "us-east-1",
"allowedIpV4AddressRanges": ["0.0.0.0/1", "128.0.0.0/1"],
"enableMistral": false,
"selfSignUpEnabled": true
}
}
For better type safety and developer experience, you can use the parameter.ts file to define your parameters:
// Define parameters for the default environment
bedrockChatParams.set("default", {
bedrockRegion: "us-east-1",
allowedIpV4AddressRanges: ["192.168.0.0/16"],
enableMistral: false,
selfSignUpEnabled: true,
});
// Define parameters for additional environments
bedrockChatParams.set("dev", {
bedrockRegion: "us-west-2",
allowedIpV4AddressRanges: ["10.0.0.0/8"],
enableRagReplicas: false, // Cost-saving for dev environment
});
bedrockChatParams.set("prod", {
bedrockRegion: "us-east-1",
allowedIpV4AddressRanges: ["172.16.0.0/12"],
enableLambdaSnapStart: true,
enableRagReplicas: true, // Enhanced availability for production
});
[!Note] Existing users can continue using cdk.json without any changes. The parameter.ts approach is recommended for new deployments or when you need to manage multiple environments.
You can deploy multiple environments from the same codebase using the parameter.ts file and the -c envName option.
- Define your environments in parameter.ts as shown above
- Each environment will have its own set of resources with environment-specific prefixes
To deploy a specific environment:
# Deploy the dev environment
npx cdk deploy --all -c envName=dev
# Deploy the prod environment
npx cdk deploy --all -c envName=prod
If no environment is specified, the "default" environment is used:
# Deploy the default environment
npx cdk deploy --all
- Stack Naming:
  - The main stacks for each environment will be prefixed with the environment name (e.g., dev-BedrockChatStack, prod-BedrockChatStack)
  - However, custom bot stacks (BrChatKbStack*) and API publishing stacks (ApiPublishmentStack*) do not receive environment prefixes as they are created dynamically at runtime
- Resource Naming:
  - Only some resources receive environment prefixes in their names (e.g., dev_ddb_export table, dev-FrontendWebAcl)
  - Most resources maintain their original names but are isolated by being in different stacks
- Environment Identification:
  - All resources are tagged with a CDKEnvironment tag containing the environment name
  - You can use this tag to identify which environment a resource belongs to
  - Example: CDKEnvironment: dev or CDKEnvironment: prod
- Default Environment Override: If you define a "default" environment in parameter.ts, it will override the settings in cdk.json. To continue using cdk.json, don't define a "default" environment in parameter.ts.
- Environment Requirements: To create environments other than "default", you must use parameter.ts. The -c envName option alone is not sufficient without corresponding environment definitions.
- Resource Isolation: Each environment creates its own set of resources, allowing you to have development, testing, and production environments in the same AWS account without conflicts.
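The selection rules in the notes above can be summarized as a small decision function. The project implements this in TypeScript, so the Python below is only an illustration of the stated rules, not the actual code; `resolve_params` and its arguments are hypothetical names.

```python
def resolve_params(env_name, parameter_ts: dict, cdk_json_context: dict) -> dict:
    """Pick the parameter set for an environment, following the notes above."""
    name = env_name or "default"
    if name in parameter_ts:
        # A definition in parameter.ts wins, including one named "default".
        return parameter_ts[name]
    if name == "default":
        # No "default" in parameter.ts: fall back to cdk.json.
        return cdk_json_context
    # -c envName alone is not enough without a matching definition.
    raise KeyError(f"environment {name!r} must be defined in parameter.ts")


parameter_ts = {"dev": {"bedrockRegion": "us-west-2", "enableRagReplicas": False}}
cdk_json_context = {"bedrockRegion": "us-east-1"}

print(resolve_params("dev", parameter_ts, cdk_json_context))
print(resolve_params(None, parameter_ts, cdk_json_context))
```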
Update enableMistral to true in cdk.json, and run npx cdk deploy.
...
"enableMistral": true,
[!Important] This project focuses on Anthropic Claude models; Mistral models have limited support. For example, prompt examples are based on Claude models. This is a Mistral-only option: once you enable Mistral models, you can only use Mistral models for all chat features, NOT both Claude and Mistral models.
Users can adjust the text generation parameters from the custom bot creation screen. If no custom bot is used, the default parameters set in config.py will be used.
DEFAULT_GENERATION_CONFIG = {
"max_tokens": 2000,
"top_k": 250,
"top_p": 0.999,
"temperature": 0.6,
"stop_sequences": ["Human: ", "Assistant: "],
}
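One way to picture how bot-level settings relate to these defaults is a per-key overlay. The backend's actual logic is not shown here, so this sketch assumes that any parameter a bot sets replaces the default and everything else falls through; `effective_generation_config` is a hypothetical name.

```python
import copy

DEFAULT_GENERATION_CONFIG = {
    "max_tokens": 2000,
    "top_k": 250,
    "top_p": 0.999,
    "temperature": 0.6,
    "stop_sequences": ["Human: ", "Assistant: "],
}


def effective_generation_config(bot_params=None) -> dict:
    """Overlay a bot's generation parameters on a copy of the defaults."""
    config = copy.deepcopy(DEFAULT_GENERATION_CONFIG)
    if bot_params:
        config.update(bot_params)  # bot-level values take precedence
    return config


print(effective_generation_config({"temperature": 0.2, "max_tokens": 512}))
```

The deep copy keeps the shared `stop_sequences` list from being modified through a returned config.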
If you deployed with the CLI and CDK, run npx cdk destroy. If not, access CloudFormation and delete BedrockChatStack and FrontendWafStack manually. Please note that FrontendWafStack is in the us-east-1 region.
This asset automatically detects the language using i18next-browser-languageDetector. You can switch languages from the application menu. Alternatively, you can use Query String to set the language as shown below.
https://example.com?lng=ja
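When generating links to the app, the language parameter can be appended with the standard library rather than by string concatenation. This is a small sketch; `with_language` is a hypothetical helper, and only the `lng` parameter name comes from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url: str, lang: str) -> str:
    """Return url with its lng query parameter set to lang."""
    parts = urlsplit(url)
    # Drop any existing lng value, keep other parameters in order.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "lng"]
    query.append(("lng", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_language("https://example.com", "ja"))
```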
This sample has self sign-up enabled by default. To disable self sign-up, open cdk.json and set selfSignUpEnabled to false. If you configure an external identity provider, the value will be ignored and self sign-up will be disabled automatically.
By default, this sample does not restrict the domains for sign-up email addresses. To allow sign-ups only from specific domains, open cdk.json and specify the domains as a list in allowedSignUpEmailDomains.
"allowedSignUpEmailDomains": ["example.com"],
This sample supports external identity providers. Currently Google and custom OIDC providers are supported.
This sample has the following groups to give permissions to users:
If you want newly created users to automatically join groups, you can specify them in cdk.json.
"autoJoinUserGroups": ["CreatingBotAllowed"],
By default, newly created users will be joined to the CreatingBotAllowed group.
enableRagReplicas is an option in cdk.json that controls the replica settings for the RAG database, specifically the Knowledge Bases using Amazon OpenSearch Serverless.
- Default: true
- true: Enhances availability by enabling additional replicas, making it suitable for production environments but increasing costs.
- false: Reduces costs by using fewer replicas, making it suitable for development and testing.
This is an account/region-level setting, affecting the entire application rather than individual bots.
[!Note] As of June 2024, Amazon OpenSearch Serverless supports 0.5 OCU, lowering entry costs for small-scale workloads. Production deployments can start with 2 OCUs, while dev/test workloads can use 1 OCU. OpenSearch Serverless automatically scales based on workload demands. For more detail, visit announcement.
Cross-region inference allows Amazon Bedrock to dynamically route model inference requests across multiple AWS regions, enhancing throughput and resilience during peak demand periods. To configure, edit cdk.json.
"enableBedrockCrossRegionInference": true
Lambda SnapStart improves cold start times for Lambda functions, providing faster response times for a better user experience. On the other hand, for Python functions, there is a charge depending on cache size, and SnapStart is currently not available in some regions. To disable SnapStart, edit cdk.json.
"enableLambdaSnapStart": false
You can configure a custom domain for the CloudFront distribution by setting the following parameters in cdk.json:
{
"alternateDomainName": "chat.example.com",
"hostedZoneId": "Z0123456789ABCDEF"
}
- alternateDomainName: The custom domain name for your chat application (e.g., chat.example.com)
- hostedZoneId: The ID of your Route 53 hosted zone where the domain records will be created
When these parameters are provided, the deployment will automatically:
- Create an ACM certificate with DNS validation in us-east-1 region
- Create the necessary DNS records in your Route 53 hosted zone
- Configure CloudFront to use your custom domain
[!Note] The domain must be managed by Route 53 in your AWS account. The hosted zone ID can be found in the Route 53 console.
See LOCAL DEVELOPMENT.
Thank you for considering contributing to this repository! We welcome bug fixes, language translations (i18n), feature enhancements, agent tools, and other improvements.
For feature enhancements and other improvements, before creating a Pull Request, we would greatly appreciate it if you could create a Feature Request Issue to discuss the implementation approach and details. For bug fixes and language translations (i18n), proceed with creating a Pull Request directly.
Please also take a look at the following guidelines before contributing:
This library is licensed under the MIT-0 License. See the LICENSE file.
Alternative AI tools for bedrock-claude-chat
Similar Open Source Tools

hayhooks
Hayhooks is a tool that simplifies the deployment and serving of Haystack pipelines as REST APIs. It allows users to wrap their pipelines with custom logic and expose them via HTTP endpoints, including OpenAI-compatible chat completion endpoints. With Hayhooks, users can easily convert their Haystack pipelines into API services with minimal boilerplate code.

langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

OpenAI-sublime-text
The OpenAI Completion plugin for Sublime Text provides first-class code assistant support within the editor. It utilizes LLM models to manipulate code, engage in chat mode, and perform various tasks. The plugin supports OpenAI, llama.cpp, and ollama models, allowing users to customize their AI assistant experience. It offers separated chat histories and assistant settings for different projects, enabling context-specific interactions. Additionally, the plugin supports Markdown syntax with code language syntax highlighting, server-side streaming for faster response times, and proxy support for secure connections. Users can configure the plugin's settings to set their OpenAI API key, adjust assistant modes, and manage chat history. Overall, the OpenAI Completion plugin enhances the Sublime Text editor with powerful AI capabilities, streamlining coding workflows and fostering collaboration with AI assistants.

june
june-va is a local voice chatbot that combines Ollama for language model capabilities, Hugging Face Transformers for speech recognition, and the Coqui TTS Toolkit for text-to-speech synthesis. It provides a flexible, privacy-focused solution for voice-assisted interactions on your local machine, ensuring that no data is sent to external servers. The tool supports various interaction modes including text input/output, voice input/text output, text input/audio output, and voice input/audio output. Users can customize the tool's behavior with a JSON configuration file and utilize voice conversion features for voice cloning. The application can be further customized using a configuration file with attributes for language model, speech-to-text model, and text-to-speech model configurations.

magic-cli
Magic CLI is a command line utility that leverages Large Language Models (LLMs) to enhance command line efficiency. It is inspired by projects like Amazon Q and GitHub Copilot for CLI. The tool allows users to suggest commands, search across command history, and generate commands for specific tasks using local or remote LLM providers. Magic CLI also provides configuration options for LLM selection and response generation. The project is still in early development, so users should expect breaking changes and bugs.

kwaak
Kwaak is a tool that allows users to run a team of autonomous AI agents locally from their own machine. It enables users to write code, improve test coverage, update documentation, and enhance code quality while focusing on building innovative projects. Kwaak is designed to run multiple agents in parallel, interact with codebases, answer questions about code, find examples, write and execute code, create pull requests, and more. It is free and open-source, allowing users to bring their own API keys or models via Ollama. Kwaak is part of the bosun.ai project, aiming to be a platform for autonomous code improvement.

llm-vscode
llm-vscode is an extension designed for all things LLM, utilizing llm-ls as its backend. It offers features such as code completion with 'ghost-text' suggestions, the ability to choose models for code generation via HTTP requests, ensuring prompt size fits within the context window, and code attribution checks. Users can configure the backend, suggestion behavior, keybindings, llm-ls settings, and tokenization options. Additionally, the extension supports testing models like Code Llama 13B, Phind/Phind-CodeLlama-34B-v2, and WizardLM/WizardCoder-Python-34B-V1.0. Development involves cloning llm-ls, building it, and setting up the llm-vscode extension for use.

lexido
Lexido is an innovative assistant for the Linux command line, designed to boost your productivity and efficiency. Powered by Gemini Pro 1.0 and utilizing the free API, Lexido offers smart suggestions for commands based on your prompts and importantly your current environment. Whether you're installing software, managing files, or configuring system settings, Lexido streamlines the process, making it faster and more intuitive.

cursor-tools
cursor-tools is a CLI tool designed to enhance AI agents with advanced skills, such as web search, repository context, documentation generation, GitHub integration, Xcode tools, and browser automation. It provides features like Perplexity for web search, Gemini 2.0 for codebase context, and Stagehand for browser operations. The tool requires API keys for Perplexity AI and Google Gemini, and supports global installation for system-wide access. It offers various commands for different tasks and integrates with Cursor Composer for AI agent usage.

mycoder
An open-source mono-repository containing the MyCoder agent and CLI. It leverages Anthropic's Claude API for intelligent decision making, has a modular architecture with various tool categories, supports parallel execution with sub-agents, can modify code by writing itself, features a smart logging system for clear output, and is human-compatible using README.md, project files, and shell commands to build its own context.

opencharacter
OpenCharacter is an open-source tool that allows users to create and run characters locally with local models or use the hosted version. The stack includes Next.js for frontend, TailwindCSS for styling, Drizzle ORM for database access, NextAuth for authentication, Cloudflare D1 for serverless databases, Cloudflare Pages for hosting, and ShadcnUI as the component library. Users can integrate OpenCharacter with OpenRouter by configuring the OpenRouter API key. The tool is fully scalable, composable, and cost-effective, with powerful tools like Wrangler for database management and migrations. No environment variables are needed, making it easy to use and deploy.

Bard-API
The Bard API is a Python package that returns responses from Google Bard through the value of a cookie. It is an unofficial API that operates through reverse-engineering, utilizing cookie values to interact with Google Bard for users struggling with frequent authentication problems or unable to authenticate via Google Authentication. The Bard API is not a free service, but rather a tool provided to assist developers with testing certain functionalities due to the delayed development and release of Google Bard's API. It has been designed with a lightweight structure that can easily adapt to the emergence of an official API. Therefore, using it for any other purposes is strongly discouraged. If you have access to a reliable official PaLM-2 API or Google Generative AI API, replace the provided response with the corresponding official code. Check out https://github.com/dsdanielpark/Bard-API/issues/262.

shellChatGPT
ShellChatGPT is a shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS, featuring integration with LocalAI, Ollama, Gemini, Mistral, Groq, and GitHub Models. It provides text and chat completions, vision, reasoning, and audio models, voice-in and voice-out chatting mode, text editor interface, markdown rendering support, session management, instruction prompt manager, integration with various service providers, command line completion, file picker dialogs, color scheme personalization, stdin and text file input support, and compatibility with Linux, FreeBSD, MacOS, and Termux for a responsive experience.

allms
allms is a versatile and powerful library designed to streamline the process of querying Large Language Models (LLMs). Developed by Allegro engineers, it simplifies working with LLM applications by providing a user-friendly interface, asynchronous querying, automatic retrying mechanism, error handling, and output parsing. It supports various LLM families hosted on different platforms like OpenAI, Google, Azure, and GCP. The library offers features for configuring endpoint credentials, batch querying with symbolic variables, and forcing structured output format. It also provides documentation, quickstart guides, and instructions for local development, testing, updating documentation, and making new releases.

tonic_validate
Tonic Validate is a framework for the evaluation of LLM outputs, such as Retrieval Augmented Generation (RAG) pipelines. Validate makes it easy to evaluate, track, and monitor your LLM and RAG applications. Validate allows you to evaluate your LLM outputs through the use of our provided metrics which measure everything from answer correctness to LLM hallucination. Additionally, Validate has an optional UI to visualize your evaluation results for easy tracking and monitoring.
For similar tasks

MITSUHA
OneReality is a virtual waifu/assistant that you can speak to through your mic and it'll speak back to you! It has many features such as: * You can speak to her with a mic * It can speak back to you * Has short-term memory and long-term memory * Can open apps * Smarter than you * Fluent in English, Japanese, Korean, and Chinese * Can control your smart home like Alexa if you set up Tuya (more info in Prerequisites) It is built with Python, Llama-cpp-python, Whisper, SpeechRecognition, PocketSphinx, VITS-fast-fine-tuning, VITS-simple-api, HyperDB, Sentence Transformers, and Tuya Cloud IoT.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.