GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
Stars: 2160
NVIDIA Generative AI Examples are state-of-the-art examples that are easy to deploy, test, and extend. All examples run on the high performance NVIDIA CUDA-X software stack and NVIDIA GPUs. These examples showcase the capabilities of NVIDIA's Generative AI platform, which includes tools, frameworks, and models for building and deploying generative AI applications.
README:
This repository is a starting point for developers looking to integrate with the NVIDIA software ecosystem to speed up their generative AI systems. Whether you are building RAG pipelines, agentic workflows, or fine-tuning models, this repository will help you integrate NVIDIA, seamlessly and natively, with your development stack.
This example implements a GPU-accelerated pipeline for creating and querying knowledge graphs using RAG by leveraging NIM microservices and the RAPIDS ecosystem to process large-scale datasets efficiently.
- Build an Agentic RAG Pipeline with Llama 3.1 and NVIDIA NeMo Retriever NIM microservices [Blog, Notebook]
- NVIDIA Morpheus, NIM microservices, and RAG pipelines integrated to create LLM-based agent pipelines
- Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints by Amit Bleiweiss. [Blog, Notebook]
For more information, refer to the Generative AI Example releases.
Experience NVIDIA RAG Pipelines with just a few steps!
-
Get your NVIDIA API key.
- Go to the NVIDIA API Catalog.
- Select any model.
- Click Get API Key.
- Run:
export NVIDIA_API_KEY=nvapi-...
-
Clone the repository.
git clone https://github.com/nvidia/GenerativeAIExamples.git
-
Build and run the basic RAG pipeline.
cd GenerativeAIExamples/RAG/examples/basic_rag/langchain/ docker compose up -d --build
-
Go to https://localhost:8090/ and submit queries to the sample RAG Playground.
-
Stop containers when done.
docker compose down
NVIDIA has first-class support for popular generative AI developer frameworks like LangChain, LlamaIndex, and Haystack. These end-to-end notebooks show how to integrate NIM microservices using your preferred generative AI development framework.
Use these notebooks to learn about the LangChain and LlamaIndex connectors.
- RAG
- Agents
By default, these end-to-end examples use preview NIM endpoints on NVIDIA API Catalog. Alternatively, you can run any of the examples on premises.
Example tools and tutorials to enhance LLM development and productivity when using NVIDIA RAG pipelines.
- NVIDIA Tokkio LLM-RAG: Use Tokkio to add avatar animation for RAG responses.
- Hybrid RAG Project on AI Workbench: Run an NVIDIA AI Workbench example project for RAG.
- Changing the Inference or Embedded Model
- Customizing the Vector Database
- Customizing the Chain Server:
- Configuring LLM Parameters at Runtime
- Supporting Multi-Turn Conversations
- Speaking Queries and Listening to Responses with NVIDIA Riva
- Support Matrix
- Architecture
- Using the Sample Chat Web Application
- RAG Playground Web Application
- Software Component Configuration
We're posting these examples on GitHub to support the NVIDIA LLM community and facilitate feedback. We invite contributions! Open a GitHub issue or pull request! Check out the community examples and notebooks.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for GenerativeAIExamples
Similar Open Source Tools
GenerativeAIExamples
NVIDIA Generative AI Examples are state-of-the-art examples that are easy to deploy, test, and extend. All examples run on the high performance NVIDIA CUDA-X software stack and NVIDIA GPUs. These examples showcase the capabilities of NVIDIA's Generative AI platform, which includes tools, frameworks, and models for building and deploying generative AI applications.
ConvoForm
ConvoForm.com transforms traditional forms into interactive conversational experiences, powered by AI for an enhanced user journey. It offers AI-Powered Form Generation, Real-time Form Editing and Preview, and Customizable Submission Pages. The tech stack includes Next.js for frontend, tRPC for backend, GPT-3.5-Turbo for AI integration, and Socket.io for real-time updates. Local setup requires Node.js, pnpm, Git, PostgreSQL database, Clerk for Authentication, OpenAI key, Redis Database, and Sentry for monitoring. The project is open for contributions and is licensed under the MIT License.
CursorLens
Cursor Lens is an open-source tool that acts as a proxy between Cursor and various AI providers, logging interactions and providing detailed analytics to help developers optimize their use of AI in their coding workflow. It supports multiple AI providers, captures and logs all requests, provides visual analytics on AI usage, allows users to set up and switch between different AI configurations, offers real-time monitoring of AI interactions, tracks token usage, estimates costs based on token usage and model pricing. Built with Next.js, React, PostgreSQL, Prisma ORM, Vercel AI SDK, Tailwind CSS, and shadcn/ui components.
Prompt_Engineering
Prompt Engineering Techniques is a comprehensive repository for learning, building, and sharing prompt engineering techniques, from basic concepts to advanced strategies for leveraging large language models. It provides step-by-step tutorials, practical implementations, and a platform for showcasing innovative prompt engineering techniques. The repository covers fundamental concepts, core techniques, advanced strategies, optimization and refinement, specialized applications, and advanced applications in prompt engineering.
llm-zoomcamp
LLM Zoomcamp is a free online course focusing on real-life applications of Large Language Models (LLMs). Over 10 weeks, participants will learn to build an AI bot capable of answering questions based on a knowledge base. The course covers topics such as LLMs, RAG, open-source LLMs, vector databases, orchestration, monitoring, and advanced RAG systems. Pre-requisites include comfort with programming, Python, and the command line, with no prior exposure to AI or ML required. The course features a pre-course workshop and is led by instructors Alexey Grigorev and Magdalena Kuhn, with support from sponsors and partners.
tap4-ai-webui
Tap4 AI Web UI is an open source AI tools directory built by Tap4 AI Tools Directory. The project aims to help everyone build their own AI Tools Directory easily. Users can fork the project, deploy it to Vercel with one click, and update their own AI tools using the data list in the project. The web UI features internationalization, SEO friendliness, dynamic sitemap generation, fast shipping, NEXT 14 with app route, and integration with Supabase serverless database.
AI-Gateway
The AI-Gateway repository explores the AI Gateway pattern through a series of experimental labs, focusing on Azure API Management for handling AI services APIs. The labs provide step-by-step instructions using Jupyter notebooks with Python scripts, Bicep files, and APIM policies. The goal is to accelerate experimentation of advanced use cases and pave the way for further innovation in the rapidly evolving field of AI. The repository also includes a Mock Server to mimic the behavior of the OpenAI API for testing and development purposes.
obsidian-ai-assistant
Obsidian AI Assistant is a simple plugin that enables interactions with various AI models such as OpenAI ChatGPT, Anthropic Claude, OpenAI DALL·E, and OpenAI Whisper directly from Obsidian notes. The plugin offers features like text assistance, image generation, and speech-to-text functionality. Users can chat with the AI assistant, generate images for notes, and dictate notes using speech-to-text. The plugin allows customization of text models, image generation options, and language settings for speech-to-text. It requires official API keys for using OpenAI and Anthropic Claude models.
ai-driven-dev-community
AI Driven Dev Community is a repository aimed at helping developers become more efficient by utilizing AI tools in their daily coding tasks. It provides a collection of tools, prompts, snippets, and agents for developers to integrate AI into their workflow. The repository is regularly updated with new resources and focuses on best practices for using AI in development work. Users can find tools like Espanso, ChatGPT, GitHub Copilot, and VSCode recommended for enhancing their coding experience. Additionally, the repository offers guidance on customizing AI for developers, installing AI toolbox for software engineers, and contributing to the community through easy steps.
AI-Youtube-Shorts-Generator
AI Youtube Shorts Generator is a Python tool that utilizes GPT-4 and Whisper to generate engaging YouTube shorts from long-form videos. It downloads videos, transcribes them, extracts highlights, detects speakers, and crops content vertically for shorts. The tool requires Python 3.7 or higher, FFmpeg, and OpenCV. Users can contribute to the project under the MIT License.
extensionOS
Extension | OS is an open-source browser extension that brings AI directly to users' web browsers, allowing them to access powerful models like LLMs seamlessly. Users can create prompts, fix grammar, and access intelligent assistance without switching tabs. The extension aims to revolutionize online information interaction by integrating AI into everyday browsing experiences. It offers features like Prompt Factory for tailored prompts, seamless LLM model access, secure API key storage, and a Mixture of Agents feature. The extension was developed to empower users to unleash their creativity with custom prompts and enhance their browsing experience with intelligent assistance.
RAG_Techniques
Advanced RAG Techniques is a comprehensive collection of cutting-edge Retrieval-Augmented Generation (RAG) tutorials aimed at enhancing the accuracy, efficiency, and contextual richness of RAG systems. The repository serves as a hub for state-of-the-art RAG enhancements, comprehensive documentation, practical implementation guidelines, and regular updates with the latest advancements. It covers a wide range of techniques from foundational RAG methods to advanced retrieval methods, iterative and adaptive techniques, evaluation processes, explainability and transparency features, and advanced architectures integrating knowledge graphs and recursive processing.
BotSharp
BotSharp is an open-source machine learning framework for building AI bot platforms. It provides a comprehensive set of tools and components for developing and deploying intelligent virtual assistants. BotSharp is designed to be modular and extensible, allowing developers to easily integrate it with their existing systems and applications. With BotSharp, you can quickly and easily create AI-powered chatbots, virtual assistants, and other conversational AI applications.
rowfill
Rowfill is an open-source document processing platform designed for knowledge workers. It offers advanced AI capabilities to extract, analyze, and process data from complex documents, images, and PDFs. The platform features advanced OCR and processing functionalities, auto-schema generation, and custom actions for creating tailored workflows. It prioritizes privacy and security by supporting Local LLMs like Llama and Mistral, syncing with company data while maintaining privacy, and being open source with AGPLv3 licensing. Rowfill is a versatile tool that aims to streamline document processing tasks for users in various industries.
paperless-ai
Paperless-AI is an automated document analyzer tool designed for Paperless-ngx users. It utilizes the OpenAI API and Ollama (Mistral, llama, phi 3, gemma 2) to automatically scan, analyze, and tag documents. The tool offers features such as automatic document scanning, AI-powered document analysis, automatic title and tag assignment, manual mode for analyzing documents, easy setup through a web interface, document processing dashboard, error handling, and Docker support. Users can configure the tool through a web interface and access a debug interface for monitoring and troubleshooting. Paperless-AI aims to streamline document organization and analysis processes for users with access to Paperless-ngx and AI capabilities.
Instrukt
Instrukt is a terminal-based AI integrated environment that allows users to create and instruct modular AI agents, generate document indexes for question-answering, and attach tools to any agent. It provides a platform for users to interact with AI agents in natural language and run them inside secure containers for performing tasks. The tool supports custom AI agents, chat with code and documents, tools customization, prompt console for quick interaction, LangChain ecosystem integration, secure containers for agent execution, and developer console for debugging and introspection. Instrukt aims to make AI accessible to everyone by providing tools that empower users without relying on external APIs and services.
For similar tasks
GenerativeAIExamples
NVIDIA Generative AI Examples are state-of-the-art examples that are easy to deploy, test, and extend. All examples run on the high performance NVIDIA CUDA-X software stack and NVIDIA GPUs. These examples showcase the capabilities of NVIDIA's Generative AI platform, which includes tools, frameworks, and models for building and deploying generative AI applications.
aiconfig
AIConfig is a framework that makes it easy to build generative AI applications for production. It manages generative AI prompts, models and model parameters as JSON-serializable configs that can be version controlled, evaluated, monitored and opened in a local editor for rapid prototyping. It allows you to store and iterate on generative AI behavior separately from your application code, offering a streamlined AI development workflow.
learn-generative-ai
Learn Cloud Applied Generative AI Engineering (GenEng) is a course focusing on the application of generative AI technologies in various industries. The course covers topics such as the economic impact of generative AI, the role of developers in adopting and integrating generative AI technologies, and the future trends in generative AI. Students will learn about tools like OpenAI API, LangChain, and Pinecone, and how to build and deploy Large Language Models (LLMs) for different applications. The course also explores the convergence of generative AI with Web 3.0 and its potential implications for decentralized intelligence.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
agentcloud
AgentCloud is an open-source platform that enables companies to build and deploy private LLM chat apps, empowering teams to securely interact with their data. It comprises three main components: Agent Backend, Webapp, and Vector Proxy. To run this project locally, clone the repository, install Docker, and start the services. The project is licensed under the GNU Affero General Public License, version 3 only. Contributions and feedback are welcome from the community.
oss-fuzz-gen
This framework generates fuzz targets for real-world `C`/`C++` projects with various Large Language Models (LLM) and benchmarks them via the `OSS-Fuzz` platform. It manages to successfully leverage LLMs to generate valid fuzz targets (which generate non-zero coverage increase) for 160 C/C++ projects. The maximum line coverage increase is 29% from the existing human-written targets.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.