bedrock-claude-chat
AWS-native chatbot using Bedrock + Claude (+Mistral)
Stars: 816
This repository is a sample chatbot using Anthropic's LLM Claude, one of the foundation models provided by Amazon Bedrock for generative AI. It allows users to have basic conversations with the chatbot, personalize it with their own instructions and external knowledge, and analyze usage for each user/bot on the administrator dashboard. The chatbot supports various languages, including English, Japanese, Korean, Chinese, French, German, and Spanish. Deployment is straightforward and can be done via the command line or with AWS CDK. The architecture is built on AWS managed services, eliminating the need for infrastructure management and ensuring scalability, reliability, and security.
README:
[!Warning] A major update to v2 is planned soon. v2 will not be backward compatible with v1, and existing RAG bots will no longer be usable. For more details, please refer to the migration guide.
[!Warning] If you are using an old version (e.g. v0.4.x) and wish to use the latest version, please refer to the migration guide. Without it, ALL DATA IN THE Aurora cluster WILL BE DESTROYED, and users will NO LONGER be able to use existing bots with knowledge or create new bots.
This repository is a sample chatbot using Anthropic's LLM Claude, one of the foundation models provided by Amazon Bedrock for generative AI.
Not only text but also images are available with Anthropic's Claude 3. Currently we support Haiku, Sonnet, and Opus.
Add your own instructions and provide external knowledge as URLs or files (a.k.a. RAG). The bot can be shared among application users, and a customized bot can also be published as a stand-alone API (see the details).
[!Important] For governance reasons, only allowed users are able to create customized bots. To allow the creation of customized bots, the user must be a member of the group called CreatingBotAllowed, which can be set up via the management console > Amazon Cognito User pools or the AWS CLI. Note that the user pool ID can be found under CloudFormation > BedrockChatStack > Outputs > AuthUserPoolIdxxxx.
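For example, an existing user can be added to the group with the AWS CLI. This is a minimal sketch; the pool ID and username are placeholders to replace with your own values.
# Add an existing Cognito user to the CreatingBotAllowed group
aws cognito-idp admin-add-user-to-group --user-pool-id <user pool id> --username <username> --group-name CreatingBotAllowed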
LLM-powered Agent
By using the Agent functionality, your chatbot can automatically handle more complex tasks. For example, to answer a user's question, the Agent can retrieve necessary information from external tools or break down the task into multiple steps for processing.
- English
- 日本語 (docs here)
- 한국어
- 中文
- Français
- Deutsch
- Español
- Italian
- In the us-east-1 region, open Bedrock Model access > Manage model access > check Anthropic / Claude 3 Haiku, Anthropic / Claude 3 Sonnet, Anthropic / Claude 3.5 Sonnet, and Cohere / Embed Multilingual, then Save changes.
- Open CloudShell in the region where you want to deploy.
- Run the deployment via the following commands. If you want to specify the version to deploy or need to apply security policies, specify the appropriate parameters from Optional Parameters.
git clone https://github.com/aws-samples/bedrock-claude-chat.git
cd bedrock-claude-chat
chmod +x bin.sh
./bin.sh
- You will be asked whether you are a new user or using v1. If you are not a continuing user from v0, please enter y.
You can specify the following parameters during deployment to enhance security and customization:
- --disable-self-register: Disable self-registration (default: enabled). If this flag is set, you will need to create all users on Cognito, and users will not be able to self-register their accounts.
- --ipv4-ranges: Comma-separated list of allowed IPv4 ranges. (default: allow all IPv4 addresses)
- --ipv6-ranges: Comma-separated list of allowed IPv6 ranges. (default: allow all IPv6 addresses)
- --disable-ipv6: Disable connections over IPv6. (default: enabled)
- --allowed-signup-email-domains: Comma-separated list of allowed email domains for sign-up. (default: no domain restriction)
- --bedrock-region: Define the region where Bedrock is available. (default: us-east-1)
- --version: The version of Bedrock Claude Chat to deploy. (default: latest version in development)
./bin.sh --disable-self-register --ipv4-ranges "192.0.2.0/25,192.0.2.128/25" --ipv6-ranges "2001:db8:1:2::/64,2001:db8:1:3::/64" --allowed-signup-email-domains "example.com,anotherexample.com" --bedrock-region "us-west-2" --version "v1.2.6"
- After about 35 minutes, you will get the following output; access the URL from your browser
Frontend URL: https://xxxxxxxxx.cloudfront.net
The sign-up screen will appear, where you can register your email and log in.
[!Important] Without setting the optional parameters, this deployment method allows anyone who knows the URL to sign up. For production use, it is strongly recommended to add IP address restrictions and disable self-signup to mitigate security risks (you can define allowed-signup-email-domains to restrict users so that only email addresses from your company's domain can sign up). Use both ipv4-ranges and ipv6-ranges for IP address restrictions, and disable self-signup by using disable-self-register when executing ./bin.sh.
[!TIP] If the Frontend URL does not appear or Bedrock Claude Chat does not work properly, it may be a problem with the latest version. In this case, please add --version "v1.2.6" to the parameters and try deployment again.
It's an architecture built on AWS managed services, eliminating the need for infrastructure management. Utilizing Amazon Bedrock, there's no need to communicate with APIs outside of AWS. This enables deploying scalable, reliable, and secure applications.
- Amazon DynamoDB: NoSQL database for conversation history storage
- Amazon API Gateway + AWS Lambda: Backend API endpoint (AWS Lambda Web Adapter, FastAPI)
- Amazon CloudFront + S3: Frontend application delivery (React, Tailwind CSS)
- AWS WAF: IP address restriction
- Amazon Cognito: User authentication
- Amazon Bedrock: Managed service for using foundation models via APIs. Claude is used for chat responses and Cohere for vector embedding
- Amazon EventBridge Pipes: Receives events from the DynamoDB stream and launches ECS tasks to embed external knowledge
- Amazon Elastic Container Service: Runs crawling, parsing, and embedding tasks. Cohere Multilingual is the model used for embedding
- Amazon Aurora PostgreSQL: Scalable vector store with the pgvector plugin
- Amazon Athena: Query service for analyzing the S3 bucket
Super-easy Deployment uses AWS CodeBuild to perform the deployment with CDK internally. This section describes the procedure for deploying directly with CDK.
- You need a UNIX environment, Docker, and a Node.js runtime. If you don't have them, you can also use Cloud9.
[!Important] If there is insufficient storage space in the local environment during deployment, CDK bootstrapping may result in an error. If you are running in Cloud9 etc., we recommend expanding the volume size of the instance before deploying.
- Clone this repository
git clone https://github.com/aws-samples/bedrock-claude-chat
- Install npm packages
cd bedrock-claude-chat
cd cdk
npm ci
- Install AWS CDK
npm i -g aws-cdk
- Before deploying with CDK, you need to bootstrap the target region once. In this example, we will deploy to the us-east-1 region. Please replace <account id> with your own account ID.
cdk bootstrap aws://<account id>/us-east-1
- If necessary, edit the following entries in cdk.json (see the example snippet after this list):
  - bedrockRegion: Region where Bedrock is available. NOTE: Bedrock does NOT support all regions for now.
  - allowedIpV4AddressRanges, allowedIpV6AddressRanges: Allowed IP address ranges.
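As a rough sketch, these entries in cdk.json might look like the following; the IP ranges shown are placeholder documentation values, not recommendations:
...
"bedrockRegion": "us-east-1",
"allowedIpV4AddressRanges": ["192.0.2.0/25", "192.0.2.128/25"],
"allowedIpV6AddressRanges": ["2001:db8:1:2::/64", "2001:db8:1:3::/64"],
...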
Deploy this sample project
cdk deploy --require-approval never --all
- You will get output similar to the following. The URL of the web app will be output in BedrockChatStack.FrontendURL, so please access it from your browser.
✅ BedrockChatStack
✨ Deployment time: 78.57s
Outputs:
BedrockChatStack.AuthUserPoolClientIdXXXXX = xxxxxxx
BedrockChatStack.AuthUserPoolIdXXXXXX = ap-northeast-1_XXXX
BedrockChatStack.BackendApiBackendApiUrlXXXXX = https://xxxxx.execute-api.ap-northeast-1.amazonaws.com
BedrockChatStack.FrontendURL = https://xxxxx.cloudfront.net
Update enableMistral to true in cdk.json, and run cdk deploy.
...
"enableMistral": true,
[!Important] This project focuses on Anthropic Claude models; the Mistral models have limited support. For example, prompt examples are based on Claude models. This is a Mistral-only option: once you toggle it to enable Mistral models, you can only use Mistral models for all chat features, NOT both Claude and Mistral models.
Users can adjust the text generation parameters from the custom bot creation screen. If no custom bot is used, the default parameters set in config.py will be used.
DEFAULT_GENERATION_CONFIG = {
    "max_tokens": 2000,  # maximum number of tokens to generate
    "top_k": 250,  # sample only from the top-K candidate tokens
    "top_p": 0.999,  # nucleus sampling probability mass
    "temperature": 0.6,  # sampling randomness (lower = more deterministic)
    "stop_sequences": ["Human: ", "Assistant: "],  # stop generation at these strings
}
If deploying via the CLI and CDK, run cdk destroy (see the sketch below). If not, access CloudFormation and delete BedrockChatStack and FrontendWafStack manually. Please note that FrontendWafStack is in the us-east-1 region.
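As a minimal sketch, assuming you deployed from the cdk directory as described above:
cd cdk
# Destroy all stacks, skipping the interactive confirmation prompt
cdk destroy --force --all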
By setting the following CRON-format schedules in cdk.json, you can stop and restart the Aurora Serverless resources created by the VectorStore construct. Applying this setting can reduce operating costs. By default, Aurora Serverless is always running. Note that the schedules are executed in UTC time.
...
"rdbSchedules": {
"stop": {
"minute": "50",
"hour": "10",
"day": "*",
"month": "*",
"year": "*"
},
"start": {
"minute": "40",
"hour": "2",
"day": "*",
"month": "*",
"year": "*"
}
}
This asset automatically detects the language using i18next-browser-languageDetector. You can switch languages from the application menu. Alternatively, you can use a query string to set the language, as shown below.
https://example.com?lng=ja
This sample has self sign-up enabled by default. To disable self sign-up, open cdk.json and set selfSignUpEnabled to false. If you configure an external identity provider, this value is ignored and self sign-up is automatically disabled.
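In cdk.json this corresponds to a one-line entry (a sketch):
"selfSignUpEnabled": false,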
By default, this sample does not restrict the domains for sign-up email addresses. To allow sign-ups only from specific domains, open cdk.json and specify the domains as a list in allowedSignUpEmailDomains.
"allowedSignUpEmailDomains": ["example.com"],
By default, this sample deploys 2 NAT gateways. You can reduce the number to cut costs if you don't need 2. Open cdk.json and change the natgatewayCount parameter; the default is shown below.
"natgatewayCount": 2
This sample supports external identity providers. Currently we support Google and custom OIDC providers.
This sample has user groups to give permissions to users (for example, the CreatingBotAllowed group described above).
If you want newly created users to automatically join groups, you can specify them in cdk.json.
"autoJoinUserGroups": ["CreatingBotAllowed"],
By default, newly created users will be joined to the CreatingBotAllowed group.
See LOCAL DEVELOPMENT.
Thank you for considering contributing to this repository! We welcome bug fixes, language translations (i18n), feature enhancements, agent tools, and other improvements.
For feature enhancements and other improvements, before creating a Pull Request, we would greatly appreciate it if you could create a Feature Request Issue to discuss the implementation approach and details. For bug fixes and language translations (i18n), proceed with creating a Pull Request directly.
Please also take a look at the following guidelines before contributing:
See here.
This library is licensed under the MIT-0 License. See the LICENSE file.
Alternative AI tools for bedrock-claude-chat
Similar Open Source Tools
lexido
Lexido is an innovative assistant for the Linux command line, designed to boost your productivity and efficiency. Powered by Gemini Pro 1.0 and utilizing the free API, Lexido offers smart suggestions for commands based on your prompts and importantly your current environment. Whether you're installing software, managing files, or configuring system settings, Lexido streamlines the process, making it faster and more intuitive.
june
june-va is a local voice chatbot that combines Ollama for language model capabilities, Hugging Face Transformers for speech recognition, and the Coqui TTS Toolkit for text-to-speech synthesis. It provides a flexible, privacy-focused solution for voice-assisted interactions on your local machine, ensuring that no data is sent to external servers. The tool supports various interaction modes including text input/output, voice input/text output, text input/audio output, and voice input/audio output. Users can customize the tool's behavior with a JSON configuration file and utilize voice conversion features for voice cloning. The application can be further customized using a configuration file with attributes for language model, speech-to-text model, and text-to-speech model configurations.
AIOS
AIOS, a Large Language Model (LLM) Agent operating system, embeds large language model into Operating Systems (OS) as the brain of the OS, enabling an operating system "with soul" -- an important step towards AGI. AIOS is designed to optimize resource allocation, facilitate context switch across agents, enable concurrent execution of agents, provide tool service for agents, maintain access control for agents, and provide a rich set of toolkits for LLM Agent developers.
aiconfig
AIConfig is a framework that makes it easy to build generative AI applications for production. It manages generative AI prompts, models and model parameters as JSON-serializable configs that can be version controlled, evaluated, monitored and opened in a local editor for rapid prototyping. It allows you to store and iterate on generative AI behavior separately from your application code, offering a streamlined AI development workflow.
magic-cli
Magic CLI is a command line utility that leverages Large Language Models (LLMs) to enhance command line efficiency. It is inspired by projects like Amazon Q and GitHub Copilot for CLI. The tool allows users to suggest commands, search across command history, and generate commands for specific tasks using local or remote LLM providers. Magic CLI also provides configuration options for LLM selection and response generation. The project is still in early development, so users should expect breaking changes and bugs.
aimeos-typo3
Aimeos is a professional, full-featured, and high-performance e-commerce extension for TYPO3. It can be installed in an existing TYPO3 website within 5 minutes and can be adapted, extended, overwritten, and customized to meet specific needs.
patchwork
PatchWork is an open-source framework designed for automating development tasks using large language models. It enables users to automate workflows such as PR reviews, bug fixing, security patching, and more through a self-hosted CLI agent and preferred LLMs. The framework consists of reusable atomic actions called Steps, customizable LLM prompts known as Prompt Templates, and LLM-assisted automations called Patchflows. Users can run Patchflows locally in their CLI/IDE or as part of CI/CD pipelines. PatchWork offers predefined patchflows like AutoFix, PRReview, GenerateREADME, DependencyUpgrade, and ResolveIssue, with the flexibility to create custom patchflows. Prompt templates are used to pass queries to LLMs and can be customized. Contributions to new patchflows, steps, and the core framework are encouraged, with chat assistants available to aid in the process. The roadmap includes expanding the patchflow library, introducing a debugger and validation module, supporting large-scale code embeddings, parallelization, fine-tuned models, and an open-source GUI. PatchWork is licensed under AGPL-3.0 terms, while custom patchflows and steps can be shared using the Apache-2.0 licensed patchwork template repository.
chatgpt-vscode
ChatGPT-VSCode is a Visual Studio Code integration that allows users to prompt OpenAI's GPT-4, GPT-3.5, GPT-3, and Codex models within the editor. It offers features like using improved models via OpenAI API Key, Azure OpenAI Service deployments, generating commit messages, storing conversation history, explaining and suggesting fixes for compile-time errors, viewing code differences, and more. Users can customize prompts, quick fix problems, save conversations, and export conversation history. The extension is designed to enhance developer experience by providing AI-powered assistance directly within VS Code.
langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
cognee
Cognee is an open-source framework designed for creating self-improving deterministic outputs for Large Language Models (LLMs) using graphs, LLMs, and vector retrieval. It provides a platform for AI engineers to enhance their models and generate more accurate results. Users can leverage Cognee to add new information, utilize LLMs for knowledge creation, and query the system for relevant knowledge. The tool supports various LLM providers and offers flexibility in adding different data types, such as text files or directories. Cognee aims to streamline the process of working with LLMs and improving AI models for better performance and efficiency.
sdfx
SDFX is the ultimate no-code platform for building and sharing AI apps with beautiful UI. It enables the creation of user-friendly interfaces for complex workflows by combining Comfy workflow with a UI. The tool is designed to merge the benefits of form-based UI and graph-node based UI, allowing users to create intricate graphs with a high-level UI overlay. SDFX is fully compatible with ComfyUI, abstracting the need for installing ComfyUI. It offers features like animated graph navigation, node bookmarks, UI debugger, custom nodes manager, app and template export, image and mask editor, and more. The tool compiles as a native app or web app, making it easy to maintain and add new features.
giskard
Giskard is an open-source Python library that automatically detects performance, bias & security issues in AI applications. The library covers LLM-based applications such as RAG agents, all the way to traditional ML models for tabular data.
llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM models, execute structured function calls and get structured output (objects). It provides a simple yet robust interface and supports llama-cpp-python and OpenAI endpoints with GBNF grammar support (like the llama-cpp-python server) and the llama.cpp backend server. It works by generating a formal GGML-BNF grammar of the user defined structures and functions, which is then used by llama.cpp to generate text valid to that grammar. In contrast to most GBNF grammar generators it also supports nested objects, dictionaries, enums and lists of them.
comfy-cli
Comfy-cli is a command line tool designed to facilitate the installation and management of ComfyUI, an open-source machine learning framework. Users can easily set up ComfyUI, install packages, and manage custom nodes directly from the terminal. The tool offers features such as easy installation, seamless package management, custom node management, checkpoint downloads, cross-platform compatibility, and comprehensive documentation. Comfy-cli simplifies the process of working with ComfyUI, making it convenient for users to handle various tasks related to the framework.
BentoML
BentoML is an open-source model serving library for building performant and scalable AI applications with Python. It comes with everything you need for serving optimization, model packaging, and production deployment.
For similar tasks
MITSUHA
OneReality is a virtual waifu/assistant that you can speak to through your mic and it'll speak back to you! It has many features such as: * You can speak to her with a mic * It can speak back to you * Has short-term memory and long-term memory * Can open apps * Smarter than you * Fluent in English, Japanese, Korean, and Chinese * Can control your smart home like Alexa if you set up Tuya (more info in Prerequisites) It is built with Python, Llama-cpp-python, Whisper, SpeechRecognition, PocketSphinx, VITS-fast-fine-tuning, VITS-simple-api, HyperDB, Sentence Transformers, and Tuya Cloud IoT.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.