generative-ai-cdk-constructs

AWS Generative AI CDK Constructs are sample implementations of AWS CDK for common generative AI patterns.

Stars: 444

The AWS Generative AI Constructs Library is an open-source extension of the AWS Cloud Development Kit (AWS CDK). It provides multi-service, well-architected patterns, called constructs, for quickly defining solutions in code to create predictable and repeatable infrastructure. The goal of AWS Generative AI CDK Constructs is to help developers build generative AI solutions using pattern-based definitions for their architecture. The patterns defined in AWS Generative AI CDK Constructs are high-level, multi-service abstractions of AWS CDK constructs that have default configurations based on well-architected best practices. The library is organized into logical modules using object-oriented techniques to create each architectural pattern model.

README:

AWS Generative AI CDK Constructs

All classes are under active development and subject to non-backward compatible changes or removal in any future version. These are not subject to the Semantic Versioning model. This means that while you may use them, you may need to update your source code when upgrading to a newer version of this package.

Introduction
CDK Versions
Contributing
Design guidelines and Development guide
Getting Started
Catalog
Sample Use Cases
Additional Resources
Contributors
Operational Metrics Collection
Roadmap
Deprecation
License
Legal Disclaimer

Introduction

The patterns defined in AWS Generative AI CDK Constructs are high-level, multi-service abstractions of AWS CDK constructs that have default configurations based on well-architected best practices. The library is organized into logical modules using object-oriented techniques to create each architectural pattern model.

CDK Versions

AWS Generative AI CDK Constructs and the AWS CDK are maintained by independent teams and have different release schedules. Each release of AWS Generative AI CDK Constructs is built against a specific version of the AWS CDK. The CHANGELOG.md file lists the CDK version associated with each AWS Generative AI CDK Constructs release. For instance, AWS Generative AI CDK Constructs v0.0.0 was built against AWS CDK v2.96.2. This means that to use AWS Generative AI CDK Constructs v0.0.0, your application must include AWS CDK v2.96.2 or later. You can continue to use the latest AWS CDK versions and upgrade your AWS Generative AI CDK Constructs version when new releases become available.
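
The compatibility rule above (the application's AWS CDK version must be at or above the version a release was built against) can be sketched as a small check. This is only an illustration: `parse_version` and `is_cdk_compatible` are hypothetical helper names, not part of this library or the AWS CDK, and the parser assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release suffix.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'v2.96.2' or '2.96.2' into a comparable (major, minor, patch) tuple."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def is_cdk_compatible(installed_cdk: str, built_against: str) -> bool:
    """Return True when the installed AWS CDK is at or above the required version."""
    return parse_version(installed_cdk) >= parse_version(built_against)


# The example from the text: constructs v0.0.0 was built against AWS CDK v2.96.2.
print(is_cdk_compatible("2.100.0", "v2.96.2"))  # True: newer CDK
print(is_cdk_compatible("2.90.0", "v2.96.2"))   # False: older CDK
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank `2.100.0` below `2.96.2`.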

Contributing

Contributions of all kinds are welcome! Check out our contributor guide.

Design guidelines and Development guide

If you want to add a new construct to the library, check out our design guidelines, then follow the development guide.

Getting Started

TypeScript

Create or use an existing CDK application in TypeScript.
- cdk init app --language typescript
Install the package:
- npm install @cdklabs/generative-ai-cdk-constructs
The package should be added to your package.json.
Import the library:
- import * as genai from '@cdklabs/generative-ai-cdk-constructs';

Python

Create or use an existing CDK application in Python
- cdk init app --language python
Install the package:
- pip install cdklabs.generative-ai-cdk-constructs
Import the library:
- import cdklabs.generative_ai_cdk_constructs
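
After the install step, one way to confirm that the package is importable is a quick lookup with the standard library. This is a minimal sketch: `is_installed` is a hypothetical helper written for this example, not something the library provides.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if module_name can be located in the current environment."""
    try:
        # find_spec imports parent packages but not the target module itself.
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (here, `cdklabs`) is itself missing.
        return False


MODULE = "cdklabs.generative_ai_cdk_constructs"
if is_installed(MODULE):
    print(f"{MODULE} is available")
else:
    print(f"{MODULE} not found; run: pip install cdklabs.generative-ai-cdk-constructs")
```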

NuGet

Create or use an existing CDK application in C#
- cdk init app --language csharp
Install the package while in the Visual Studio project:
- dotnet add package CdkLabs.GenerativeAICdkConstructs
Use the namespace:
- using Cdklabs.GenerativeAiCdkConstructs;

Go

Create or use an existing CDK application in Go
- cdk init app --language go
Get the module:
- go get github.com/cdklabs/generative-ai-cdk-constructs-go/generative-ai-cdk-constructs
Import the library:
- import "github.com/cdklabs/generative-ai-cdk-constructs-go/generative-ai-cdk-constructs"

NOTE: The Go distribution repository distributes the JSII tar-gzipped, versioned source from the source repository.

Java

Create or use an existing CDK application in Java
- cdk init app --language java
Add the dependency to your pom.xml:

<dependency>
    <groupId>io.github.cdklabs</groupId>
    <artifactId>generative-ai-cdk-constructs</artifactId>
    <version>Get the latest version and insert it here</version>
</dependency>

Refer to the documentation for additional guidance on a particular construct: Catalog

Catalog

The following constructs are available in the library:

L3 constructs

Construct	Description	AWS Services used
SageMaker model deployment (JumpStart)	Deploy a foundation model from Amazon SageMaker JumpStart to an Amazon SageMaker endpoint.	Amazon SageMaker
SageMaker model deployment (Hugging Face)	Deploy a foundation model from Hugging Face to an Amazon SageMaker endpoint.	Amazon SageMaker
SageMaker model deployment (Custom)	Deploy a foundation model from an S3 location to an Amazon SageMaker endpoint.	Amazon SageMaker
Amazon Bedrock Monitoring (Amazon CloudWatch Dashboard)	Amazon CloudWatch dashboard to monitor model usage from Amazon Bedrock.	Amazon CloudWatch
Bedrock Data Automation	Use the Amazon Bedrock Data Automation client to build and manage intelligent document processing, media analysis, and other multimodal data-centric automation solutions.	AWS Lambda, Amazon S3 bucket
Bedrock Batch Step Functions	Manage Bedrock model invocation jobs (batch inference) in AWS Step Functions state machines.	AWS Step Functions, AWS Lambda, Amazon EventBridge, Amazon Bedrock, AWS IAM

L2 Constructs

Construct	Description	AWS Services used
Amazon Bedrock	CDK L2 Constructs for Amazon Bedrock.	Amazon Bedrock, Amazon OpenSearch Serverless, AWS Lambda
Amazon OpenSearch Serverless Vector Collection	CDK L2 Constructs to create a vector collection.	Amazon OpenSearch Vector Index
Amazon OpenSearch Vector Index	CDK L1 Custom Resource to create a vector index.	Amazon OpenSearch Serverless, AWS Lambda

Sample Use Cases

The official samples repository includes a collection of functional use case implementations to demonstrate the usage of AWS Generative AI CDK Constructs. These can be used in the same way as architectural patterns, and can be conceptualized as an additional "higher-level" abstraction of those patterns. Those patterns (constructs) are composed together into stacks, forming a "CDK app".

Additional Resources

Resource	Type	Description
AWS re:Invent 2023 - Keynote with Dr. Werner Vogels	Keynote	Dr. Werner Vogels, Amazon.com's VP and CTO, announces the AWS Generative AI CDK Constructs during his AWS re:Invent 2023 keynote.
Workshop - Building Generative AI Apps on AWS with CDK	Workshop	In this workshop, you will explore how to build a sample generative AI app on AWS using CDK and Generative AI CDK Constructs.
Workshop - Hands on AWS CDK Generative AI Constructs	Workshop	In this workshop you will deploy projects that use CDK constructs from this library. Projects are from the amazon-bedrock-samples GitHub repository.
Build generative AI applications with Amazon Titan Text Premier, Amazon Bedrock, and AWS CDK	Blog post + Code sample	Blog post exploring building and deploying two sample applications powered by Amazon Titan Text Premier using the Generative AI CDK constructs.
aws-cdk-stack-builder-tool	Code sample	AWS CDK Builder is a browser-based tool designed to streamline bootstrapping of Infrastructure as Code (IaC) projects using the AWS Cloud Development Kit (CDK).
CDK Live! Building generative AI applications and architectures leveraging AWS CDK Constructs!	Video	CDK Live! episode focused on building and deploying generative AI applications and architectures on AWS using the AWS Cloud Development Kit (CDK) and the AWS Generative AI CDK Constructs.
Announcing AWS Generative AI CDK Constructs!	Blog post	Blog post announcing the release of the AWS Generative AI CDK Constructs.
Streamline insurance underwriting with generative AI using Amazon Bedrock	Blog post + Code sample	Blog post and code sample discussing how to use AWS generative artificial intelligence (AI) solutions like Amazon Bedrock to improve the underwriting process, including rule validation, underwriting guidelines adherence, and decision justification.
aws-genai-llm-chatbot	Code sample	Multi-Model and Multi-RAG Powered Chatbot Using AWS CDK on AWS allowing you to experiment with a variety of Large Language Models and Multimodal Language Models, settings and prompts in your own AWS account.
bedrock-claude-chat	Code sample	AWS-native chatbot using Bedrock + Claude (+Mistral).
amazon-bedrock-rag	Code sample	Fully managed RAG solution using Knowledge Bases for Amazon Bedrock.
Amazon Bedrock Multimodal Search	Code sample	Multimodal product search app built using Amazon Titan Multimodal Embeddings model.
Amazon Bedrock Knowledge Bases with Private Data	Blog post + Code sample	Blog post and associated code sample demonstrating how to integrate Knowledge Bases into Amazon Bedrock to provide foundational models with contextual data from private data sources.
Automating tasks using Amazon Bedrock Agents and AI	Blog post + Code sample	Blog post and associated code sample demonstrating how to deploy an Amazon Bedrock Agent and a Knowledge Base through a hotel and spa use case.
Agents for Amazon Bedrock - Powertools for AWS Lambda (Python)	Code sample	Create Agents for Amazon Bedrock using event handlers and auto generation of OpenAPI schemas.
Text to SQL Bedrock Agent	Code sample	Harnessing the power of natural language processing, the "Text to SQL Bedrock Agent" facilitates the automatic transformation of natural language questions into executable SQL queries.
Dynamic Text-to-SQL for Enterprise Workloads with Amazon Bedrock Agent	Code sample	Elevate your data analysis with an end-to-end agentic Text-to-SQL solution, built on AWS for enterprise-scale adaptability and resilience. Ideal for complex scenarios like fraud detection in financial services.

Contributors

Operational Metrics Collection

Generative AI CDK Constructs may collect anonymous operational metrics, including: the region a construct is deployed, the name and version of the construct deployed, and related information. We may use the metrics to maintain, provide, develop, and improve the constructs and AWS services.

Roadmap

The roadmap is available through the GitHub Project.

Deprecation

To understand our deprecation process, please refer to the dedicated documentation.

License

Apache-2.0

Legal Disclaimer

You should consider doing your own independent assessment before using the content in this library for production purposes. This may include (amongst other things) testing, securing, and optimizing the CDK constructs and other content, provided in this library, based on your specific quality control practices and standards.

For Tasks:

build generative ai solutions, define solutions in code, create predictable and repeatable infrastructure

For Jobs:

data scientist, machine learning engineer, software engineer, devops engineer, cloud architect

Alternative AI tools for generative-ai-cdk-constructs

Similar Open Source Tools

taranis-ai

Taranis AI is an advanced Open-Source Intelligence (OSINT) tool that leverages Artificial Intelligence to revolutionize information gathering and situational analysis. It navigates through diverse data sources like websites to collect unstructured news articles, utilizing Natural Language Processing and Artificial Intelligence to enhance content quality. Analysts then refine these AI-augmented articles into structured reports that serve as the foundation for deliverables such as PDF files, which are ultimately published.

GitHub stars: 358

leapfrogai

LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

GitHub stars: 255

aiops-modules

AIOps Modules is a collection of reusable Infrastructure as Code (IAC) modules that work with SeedFarmer CLI. The modules are decoupled and can be aggregated using GitOps principles to achieve desired use cases, removing heavy lifting for end users. They must be generic for reuse in Machine Learning and Foundation Model Operations domain, adhering to SeedFarmer Guide structure. The repository includes deployment steps, project manifests, and various modules for SageMaker, Mlflow, FMOps/LLMOps, MWAA, Step Functions, EKS, and example use cases. It also supports Industry Data Framework (IDF) and Autonomous Driving Data Framework (ADDF) Modules.

GitHub stars: 72

OmAgent

OmAgent is an open-source agent framework designed to streamline the development of on-device multimodal agents. It enables agents to empower various hardware devices, integrates speed-optimized SOTA multimodal models, provides SOTA multimodal agent algorithms, and focuses on optimizing the end-to-end computing pipeline for real-time user interaction experience. Key features include easy connection to diverse devices, scalability, flexibility, and workflow orchestration. The architecture emphasizes graph-based workflow orchestration, native multimodality, and device-centricity, allowing developers to create bespoke intelligent agent programs.

GitHub stars: 1.3k

refly

Refly.AI is an open-source AI-native creation engine that empowers users to transform ideas into production-ready content. It features a free-form canvas interface with multi-threaded conversations, knowledge base integration, contextual memory, intelligent search, WYSIWYG AI editor, and more. Users can leverage AI-powered capabilities, context memory, knowledge base integration, quotes, and AI document editing to enhance their content creation process. Refly offers both cloud and self-hosting options, making it suitable for individuals, enterprises, and organizations. The tool is designed to facilitate human-AI collaboration and streamline content creation workflows.

GitHub stars: 3.4k

kubesphere

KubeSphere is a distributed operating system for cloud-native application management, using Kubernetes as its kernel. It provides a plug-and-play architecture, allowing third-party applications to be seamlessly integrated into its ecosystem. KubeSphere is also a multi-tenant container platform with full-stack automated IT operation and streamlined DevOps workflows. It provides developer-friendly wizard web UI, helping enterprises to build out a more robust and feature-rich platform, which includes most common functionalities needed for enterprise Kubernetes strategy.

GitHub stars: 15.1k

katib

Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports Hyperparameter Tuning, Early Stopping, and Neural Architecture Search. Katib is agnostic to machine learning (ML) frameworks. It can tune hyperparameters of applications written in any language of the users’ choice and natively supports many ML frameworks, such as TensorFlow, Apache MXNet, PyTorch, XGBoost, and others. Katib can perform training jobs using any Kubernetes Custom Resources, with out-of-the-box support for Kubeflow Training Operator, Argo Workflows, Tekton Pipelines, and many more.

GitHub stars: 1.5k

mlcraft

Synmetrix (prev. MLCraft) is an open source data engineering platform and semantic layer for centralized metrics management. It provides a complete framework for modeling, integrating, transforming, aggregating, and distributing metrics data at scale. Key features include data modeling and transformations, semantic layer for unified data model, scheduled reports and alerts, versioning, role-based access control, data exploration, caching, and collaboration on metrics modeling. Synmetrix leverages Cube (Cube.js) for flexible data models that consolidate metrics from various sources, enabling downstream distribution via a SQL API for integration into BI tools, reporting, dashboards, and data science. Use cases include data democratization, business intelligence, embedded analytics, and enhancing accuracy in data handling and queries. The tool speeds up data-driven workflows from metrics definition to consumption by combining data engineering best practices with self-service analytics capabilities.

GitHub stars: 480

synmetrix

Synmetrix is an open source data engineering platform and semantic layer for centralized metrics management. It provides a complete framework for modeling, integrating, transforming, aggregating, and distributing metrics data at scale. Key features include data modeling and transformations, semantic layer for unified data model, scheduled reports and alerts, versioning, role-based access control, data exploration, caching, and collaboration on metrics modeling. Synmetrix leverages Cube.js to consolidate metrics from various sources and distribute them downstream via a SQL API. Use cases include data democratization, business intelligence and reporting, embedded analytics, and enhancing accuracy in data handling and queries. The tool speeds up data-driven workflows from metrics definition to consumption by combining data engineering best practices with self-service analytics capabilities.

GitHub stars: 531

MNN

MNN is a highly efficient and lightweight deep learning framework that supports inference and training of deep learning models. It has industry-leading performance for on-device inference and training. MNN has been integrated into various Alibaba Inc. apps and is used in scenarios like live broadcast, short video capture, search recommendation, and product searching by image. It is also utilized on embedded devices such as IoT. MNN-LLM and MNN-Diffusion are specific runtime solutions developed based on the MNN engine for deploying language models and diffusion models locally on different platforms. The framework is optimized for devices, supports various neural networks, and offers high performance with optimized assembly code and GPU support. MNN is versatile, easy to use, and supports hybrid computing on multiple devices.

GitHub stars: 10.1k

AI-Toolbox

AI-Toolbox is a C++ library aimed at representing and solving common AI problems, with a focus on MDPs, POMDPs, and related algorithms. It provides an easy-to-use interface that is extensible to many problems while maintaining readable code. The toolbox includes tutorials for beginners in reinforcement learning and offers Python bindings for seamless integration. It features utilities for combinatorics, polytopes, linear programming, sampling, distributions, statistics, belief updating, data structures, logging, seeding, and more. Additionally, it supports bandit/normal games, single agent MDP/stochastic games, single agent POMDP, and factored/joint multi-agent scenarios.

GitHub stars: 657

all-rag-techniques

This repository provides a hands-on approach to Retrieval-Augmented Generation (RAG) techniques, simplifying advanced concepts into understandable implementations using Python libraries like openai, numpy, and matplotlib. It offers a collection of Jupyter Notebooks with concise explanations, step-by-step implementations, code examples, evaluations, and visualizations for various RAG techniques. The goal is to make RAG more accessible and demystify its workings for educational purposes.

GitHub stars: 504

langkit

LangKit is an open-source text metrics toolkit for monitoring language models. It offers methods for extracting signals from input/output text, compatible with whylogs. Features include text quality, relevance, security, sentiment, toxicity analysis. Installation via PyPI. Modules contain UDFs for whylogs. Benchmarks show throughput on AWS instances. FAQs available.

GitHub stars: 823

LLMs-from-scratch

This repository contains the code for coding, pretraining, and finetuning a GPT-like LLM and is the official code repository for the book Build a Large Language Model (From Scratch). In _Build a Large Language Model (From Scratch)_, you'll discover how LLMs work from the inside out. In this book, I'll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples. The method described in this book for training and developing your own small-but-functional model for educational purposes mirrors the approach used in creating large-scale foundational models such as those behind ChatGPT.

GitHub stars: 43.7k

second-brain-ai-assistant-course

This open-source course teaches how to build an advanced RAG and LLM system using LLMOps and ML systems best practices. It helps you create an AI assistant that leverages your personal knowledge base to answer questions, summarize documents, and provide insights. The course covers topics such as LLM system architecture, pipeline orchestration, large-scale web crawling, model fine-tuning, and advanced RAG features. It is suitable for ML/AI engineers and data/software engineers & data scientists looking to level up to production AI systems. The course is free, with minimal costs for tools like OpenAI's API and Hugging Face's Dedicated Endpoints. Participants will build two separate Python applications for offline ML pipelines and online inference pipeline.

GitHub stars: 539

For similar jobs

minio

MinIO is a High Performance Object Storage released under GNU Affero General Public License v3.0. It is API compatible with Amazon S3 cloud storage service. Use MinIO to build high performance infrastructure for machine learning, analytics and application data workloads.

GitHub stars: 46.0k

ai-on-gke

This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE). Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. A robust AI/ML platform considers the following layers:
- Infrastructure orchestration that supports GPUs and TPUs for training and serving workloads at scale
- Flexible integration with distributed computing and data processing frameworks
- Support for multiple teams on the same infrastructure to maximize utilization of resources

GitHub stars: 280

kong

Kong, or Kong API Gateway, is a cloud-native, platform-agnostic, scalable API Gateway distinguished for its high performance and extensibility via plugins. It also provides advanced AI capabilities with multi-LLM support. By providing functionality for proxying, routing, load balancing, health checking, authentication (and more), Kong serves as the central layer for orchestrating microservices or conventional API traffic with ease. Kong runs natively on Kubernetes thanks to its official Kubernetes Ingress Controller.

GitHub stars: 40.4k

AI-in-a-Box

AI-in-a-Box is a curated collection of solution accelerators that can help engineers establish their AI/ML environments and solutions rapidly and with minimal friction, while maintaining the highest standards of quality and efficiency. It provides essential guidance on the responsible use of AI and LLM technologies, specific security guidance for Generative AI (GenAI) applications, and best practices for scaling OpenAI applications within Azure. The available accelerators include: Azure ML Operationalization in-a-box, Edge AI in-a-box, Doc Intelligence in-a-box, Image and Video Analysis in-a-box, Cognitive Services Landing Zone in-a-box, Semantic Kernel Bot in-a-box, NLP to SQL in-a-box, Assistants API in-a-box, and Assistants API Bot in-a-box.

GitHub stars: 527

awsome-distributed-training

This repository contains reference architectures and test cases for distributed model training with Amazon SageMaker HyperPod, AWS ParallelCluster, AWS Batch, and Amazon EKS. The test cases cover different types and sizes of models as well as different frameworks and parallel optimizations (PyTorch DDP/FSDP, MegatronLM, NemoMegatron...).

GitHub stars: 230

model_server

OpenVINO™ Model Server (OVMS) is a high-performance system for serving models. Implemented in C++ for scalability and optimized for deployment on Intel architectures, the model server uses the same architecture and API as TensorFlow Serving and KServe while applying OpenVINO for inference execution. Inference service is provided via gRPC or REST API, making deploying new algorithms and AI experiments easy.

GitHub stars: 718

dify-helm

Deploy langgenius/dify, an LLM-based chatbot app, on Kubernetes with a Helm chart.

GitHub stars: 340
