llamafarm

Deploy any AI model, agents, database, RAG, and pipeline locally in minutes
LlamaFarm is a comprehensive AI framework that empowers users to build powerful AI applications locally, with full control over costs and deployment options. It provides modular components for RAG systems, vector databases, model management, prompt engineering, and fine-tuning. Users can create differentiated AI products without needing extensive ML expertise, using simple CLI commands and YAML configs. The framework supports local-first development, production-ready components, strategy-based configuration, and deployment anywhere from laptops to the cloud.

README:

🦙 LlamaFarm - Build Powerful AI Locally, Deploy Anywhere


The Complete AI Development Framework - From Local Prototypes to Production Systems

License: MIT · Python 3.8+ · Go 1.19+ · PRs Welcome · Discord

🚀 Quick Start · 📚 Documentation · 🏗️ Architecture · 🤝 Contributing


🚀 What is LlamaFarm?

🚧 Building in the Open: We're actively developing LlamaFarm and not everything is working yet. Join us as we build the future of local-first AI development! Check our roadmap to see what's coming and how you can contribute.

Why LlamaFarm?

The AI revolution should be accessible to everyone, not just ML experts and big tech companies. You shouldn't need a PhD to build powerful AI applications - just a CLI, your config files, and your data. Too many teams are stuck between expensive cloud APIs that lock you in and complex open-source tools that take months of ML expertise to productionize. LlamaFarm changes this: full control and production-ready AI with simple commands and YAML configs. If you can write config files and run CLI commands, you can build sophisticated AI systems - build locally with your data, keep complete control over costs, and deploy anywhere from your laptop to the cloud, all with the same straightforward interface.

LlamaFarm is a comprehensive, modular framework for building AI projects that run locally, support collaboration, and deploy anywhere. We provide battle-tested components for RAG systems, vector databases, model management, prompt engineering, and (soon) fine-tuning - all designed to work seamlessly together or independently.

We're not local-only zealots - use cloud APIs where they make sense for your needs; LlamaFarm helps with that too. But we believe the real value in the AI economy comes from building something uniquely yours, not just wrapping another UI around GPT-5. True innovation happens when you can train on your proprietary data, fine-tune for your specific use cases, and maintain full control over your AI stack. LlamaFarm gives you the tools to create differentiated AI products that your competitors can't simply copy by calling the same API.

LlamaFarm gives you complete control over your AI stack. Unlike cloud-only solutions, we provide:

  • 🏠 Local-First Development - Build and test entirely on your machine
  • 🔧 Production-Ready Components - Battle-tested modules that scale from laptop to cluster
  • 🎯 Strategy-Based Configuration - Smart defaults with extensive customization
  • 🚀 Deploy Anywhere - Same code runs locally, on-premise, or in any cloud

🎭 Perfect For

  • Developers who want to build AI applications without vendor lock-in
  • Teams needing cost control and data privacy
  • Enterprises requiring scalable, secure AI infrastructure
  • Researchers experimenting with cutting-edge techniques

🏗️ Core Components

LlamaFarm is built as a modular system where each component can be used independently or orchestrated together for powerful AI applications.

⚙️ System Components

🚀 Runtime

The execution environment that orchestrates all components and manages the application lifecycle.

  • Process Management: Handles component initialization and shutdown
  • API/Access Layer: Send queries to /chat and documents to /data and get full results back (see the client sketch after this list)
  • Resource Allocation: Manages memory, CPU, and GPU resources efficiently
  • Service Discovery: Automatically finds and connects components
  • Health Monitoring: Tracks component status and performance metrics
  • Error Recovery: Automatic restart and fallback mechanisms
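
The full runtime is still on the roadmap (see below), so the API surface may change. As a rough illustration of the API/Access layer, a client for the /chat and /data endpoints might look like the following - the base URL, payload fields, and response shape are assumptions, not a documented contract:

# Hypothetical client for the runtime's /chat and /data endpoints.
# Base URL, payload fields, and response shape are assumptions.
import requests

BASE_URL = "http://localhost:8000"  # assumed default runtime address

def upload_document(path):
    """Send a local file to the (assumed) /data ingestion endpoint."""
    with open(path, "rb") as f:
        resp = requests.post(f"{BASE_URL}/data", files={"file": f})
    resp.raise_for_status()
    return resp.json()

def chat(message, strategy="research"):
    """Send a query to the (assumed) /chat endpoint and return the result."""
    resp = requests.post(f"{BASE_URL}/chat",
                         json={"message": message, "strategy": strategy})
    resp.raise_for_status()
    return resp.json()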

📦 Deployer

Zero-configuration deployment system that works from local development to production clusters.

  • Environment Detection: Automatically adapts to local, Docker, or cloud environments (a detection sketch follows this list)
  • Configuration Management: Handles environment variables and secrets securely
  • Scaling: Horizontal and vertical scaling based on load
  • Load Balancing: Distributes requests across multiple instances
  • Rolling Updates: Zero-downtime deployments with automatic rollback
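
The deployer is likewise still in development, but environment detection of this kind usually probes well-known markers. A minimal sketch - the markers checked here are general container conventions, not LlamaFarm specifics:

# Environment-detection sketch: probe common markers to decide whether
# we are running locally, inside Docker, or on Kubernetes. These checks
# are general conventions, not LlamaFarm-specific behavior.
import os

def detect_environment():
    if os.environ.get("KUBERNETES_SERVICE_HOST"):
        return "kubernetes"   # injected into every Kubernetes pod
    if os.path.exists("/.dockerenv"):
        return "docker"       # created by the Docker runtime in containers
    return "local"

print(detect_environment())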

🧠 AI Components

🔍 Data Pipeline (RAG)

Complete document processing and retrieval system for building knowledge-augmented applications.

  • Document Ingestion: Parse 15+ formats (PDF, Word, Excel, HTML, Markdown, etc.)
  • Smart Extraction: Extract entities, keywords, statistics without LLMs
  • Vector Storage: Integration with 8+ vector databases (Chroma, Pinecone, FAISS, etc.)
  • Hybrid Search: Combine semantic, keyword, and metadata-based retrieval (a scoring sketch follows this list)
  • Chunking Strategies: Adaptive chunking based on document type and use case
  • Incremental Updates: Efficiently update knowledge base without full reprocessing
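
Hybrid retrieval is configured later in this README with weights like {dense: 0.7, sparse: 0.3}; the core idea is a weighted blend of semantic and keyword scores. A minimal sketch of that scoring step - the score sources and min-max normalization are illustrative assumptions, not LlamaFarm internals:

# Weighted hybrid scoring sketch: blend dense (semantic) and sparse
# (keyword) relevance scores per document.

def normalize(scores):
    """Min-max normalize so both score sets share a 0..1 scale."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_rank(dense, sparse, w_dense=0.7, w_sparse=0.3):
    """Rank documents by a weighted blend of dense and sparse scores."""
    dense, sparse = normalize(dense), normalize(sparse)
    docs = dense.keys() | sparse.keys()
    blended = {d: w_dense * dense.get(d, 0.0) + w_sparse * sparse.get(d, 0.0)
               for d in docs}
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)

# doc2 is strong on both signals, so it tops the blended ranking
print(hybrid_rank({"doc1": 0.9, "doc2": 0.8, "doc3": 0.1},
                  {"doc1": 1.0, "doc2": 12.0, "doc3": 5.0}))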

🤖 Models

Unified interface for all LLM operations with enterprise-grade features.

  • Multi-Provider Support: 25+ providers (OpenAI, Anthropic, Google, Ollama, etc.)
  • Automatic Failover: Seamless fallback between providers when errors occur (the pattern is sketched after this list)
  • Fine-Tuning Pipeline: Train custom models on your data (Coming Q2 2025)
  • Cost Optimization: Route queries to cheapest capable model
  • Load Balancing: Distribute across multiple API keys and endpoints
  • Response Caching: Intelligent caching to reduce API costs
  • Model Configuration: Per-model temperature, token limits, and parameters
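
Automatic failover (and cheapest-capable routing) boils down to trying an ordered provider list and falling through on errors. A minimal sketch of the pattern - the provider callables below are stubs, not the actual models API:

# Failover sketch: try providers in priority (e.g. cheapest-first) order
# and fall through on errors. The callables are stubs for illustration.
def chat_with_failover(prompt, providers):
    """Try providers in priority order; first success wins."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:        # rate limit, outage, auth error, ...
            errors.append((name, exc))  # record the failure, try the next
    raise RuntimeError(f"all providers failed: {errors}")

def flaky_cloud(prompt):
    raise TimeoutError("simulated provider outage")

def local_model(prompt):
    return f"[local model answer to] {prompt}"

# Ordered preferred/cheapest-first; each entry is (name, callable)
providers = [("cloud-primary", flaky_cloud), ("local-fallback", local_model)]
print(chat_with_failover("Explain quantum entanglement", providers))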

📝 Prompts

Enterprise prompt management system with version control and A/B testing.

  • Template Library: 20+ pre-built templates for common use cases
  • Dynamic Variables: Jinja2 templating with type validation (roadmap; rendering is sketched after this list)
  • Strategy Selection: Automatically choose best template based on context
  • Version Control: Track prompt changes and performance over time (roadmap)
  • A/B Testing: Compare prompt variations with built-in analytics (roadmap)
  • Chain-of-Thought: Built-in support for reasoning chains
  • Multi-Agent: Coordinate multiple specialized prompts (roadmap)
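
Dynamic variables and several other items above are still on the roadmap, but the underlying Jinja2 mechanics are standard. A minimal rendering sketch - the template text and variable names are invented for illustration, not shipped templates:

# Jinja2 rendering sketch: how a prompt template with dynamic variables
# gets filled in. Template text and variables are illustrative only.
from jinja2 import Template  # pip install jinja2

template = Template(
    "You are a {{ role }} assistant.\n"
    "Answer the question using the context below.\n\n"
    "Context:\n{{ context }}\n\nQuestion: {{ question }}"
)

prompt = template.render(
    role="research",
    context="(retrieved document chunks go here)",
    question="What are the key findings about climate change?",
)
print(prompt)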

🔄 How Components Work Together

  1. User Request → Runtime receives and validates the request
  2. Context Retrieval → Data Pipeline searches relevant documents
  3. Prompt Selection → Prompts system chooses optimal template
  4. Model Execution → Models component handles LLM interaction with automatic failover
  5. Response Delivery → Runtime returns formatted response to user

Each component is independent but designed to work seamlessly together through standardized interfaces.
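
In code, that flow reduces to validate, retrieve, render, generate, and return. A minimal sketch of the five steps - every function here is a trivial stand-in to show the shape of the flow, not the actual LlamaFarm API:

# Request-flow sketch mirroring the five steps above. All functions are
# stand-ins for the components they represent.
def validate(query):
    assert query.strip(), "empty query"

def search_documents(query):
    return ["(retrieved chunk 1)", "(retrieved chunk 2)"]

def select_template(query):
    return "Context: {ctx}\nQuestion: {q}\nAnswer:"

def generate(template, docs, query):
    prompt = template.format(ctx=" ".join(docs), q=query)
    return f"[model output for prompt of {len(prompt)} chars]"

def handle_request(query):
    validate(query)                           # 1. Runtime validates
    docs = search_documents(query)            # 2. Data Pipeline retrieves
    template = select_template(query)         # 3. Prompts selects template
    answer = generate(template, docs, query)  # 4. Models generates (failover inside)
    return {"response": answer}               # 5. Runtime returns the result

print(handle_request("What are the implications?"))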


🚀 Quick Start

Installation

# Clone and set up manually
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm

Full Dev Mode:

npm install -g nx  
nx init --useDotNxInstallation --interactive=false
nx start server

🎯 Getting Started

💡 Important: All our demos use the REAL CLI and REAL configuration system - what you see in the demos is exactly how you'll use LlamaFarm in production!

For the best experience getting started with LlamaFarm, we recommend exploring our component documentation and running the interactive demos:

📚 RAG System (Document Processing & Retrieval)

  • Read the RAG Documentation - Complete guide to document ingestion, embedding, and retrieval
  • Run the Interactive Demos:
    cd rag
    uv sync
    
    # Interactive setup wizard - guides you through configuration
    uv run python setup_demo.py
    
    # Or try specific demos with the real CLI:
    uv run python cli.py demo research_papers    # Academic paper analysis
    uv run python cli.py demo customer_support   # Support ticket processing
    uv run python cli.py demo code_analysis      # Source code understanding
    
    # Use your own documents:
    uv run python cli.py ingest ./your-docs/ --strategy research
    uv run python cli.py search "your query here" --top-k 5

🤖 Models (LLM Management & Optimization)

  • Read the Models Documentation - Multi-provider support, fallback strategies, and cost optimization
  • Run the Interactive Demos:
    cd models
    uv sync
    
    # Try our showcase demos:
    uv run python demos/demo1_cloud_fallback.py  # Automatic provider fallback
    uv run python demos/demo2_multi_model.py     # Smart model routing
    uv run python demos/demo3_training.py        # Fine-tuning pipeline (preview)
    
    # Or use the real CLI directly:
    uv run python cli.py chat --strategy balanced "Explain quantum computing"
    uv run python cli.py chat --primary gpt-4 --fallback claude-3 "Write a haiku"
    
    # Test with your own config:
    uv run python cli.py setup your-strategy.yaml --verify
    uv run python cli.py demo your-strategy

📝 Prompts (Coming Soon)

The prompts system is under active development. For now, explore the template system:

cd prompts
uv sync
uv run python -m prompts.cli template list  # View available templates
uv run python -m prompts.cli execute "Your task" --template research

🎮 Try It Live

RAG Pipeline Example

# Ingest documents with smart extraction
uv run python rag/cli.py ingest samples/ \
  --extractors keywords entities statistics \
  --strategy research

# Search with advanced retrieval
uv run python rag/cli.py search \
  "What are the key findings about climate change?" \
  --top-k 5 --rerank

Multi-Model Chat Example

# Chat with automatic fallback
uv run python models/cli.py chat \
  --primary gpt-4 \
  --fallback claude-3 \
  --local-fallback llama3.2 \
  "Explain quantum entanglement"

Smart Prompt Example

# Use domain-specific templates
uv run python prompts/cli.py execute \
  "Analyze this medical report for anomalies" \
  --strategy medical \
  --template diagnostic_analysis

🎯 Configuration System

LlamaFarm uses a strategy-based configuration system that adapts to your use case:

Strategy Configuration Example

# config/strategies.yaml
strategies:
  research:
    rag:
      embedder: "sentence-transformers"
      chunk_size: 512
      overlap: 50
      retrievers:
        - type: "hybrid"
          weights: {dense: 0.7, sparse: 0.3}
    models:
      primary: "gpt-4"
      fallback: "claude-3-opus"
      temperature: 0.3
    prompts:
      template: "academic_research"
      style: "formal"
      citations: true
  
  customer_support:
    rag:
      embedder: "openai"
      chunk_size: 256
      retrievers:
        - type: "similarity"
          top_k: 3
    models:
      primary: "gpt-3.5-turbo"
      temperature: 0.7
    prompts:
      template: "conversational"
      style: "friendly"
      include_context: true

Using Strategies

# Apply strategy across all components
export LLAMAFARM_STRATEGY=research

# Or specify per command
uv run python rag/cli.py ingest docs/ --strategy research
uv run python models/cli.py chat --strategy customer_support "Help me with my order"
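
Under the hood, resolving a strategy amounts to "explicit flag, else LLAMAFARM_STRATEGY, else a default", then pulling the named block from the YAML file. A sketch of that lookup - the file path, default name, and precedence are assumptions based on the examples above:

# Strategy-resolution sketch: an explicit name (e.g. a --strategy flag)
# wins, then the LLAMAFARM_STRATEGY env var, then a default. Path and
# precedence are assumptions based on the examples above.
import os
import yaml  # pip install pyyaml

def load_strategy(name=None, path="config/strategies.yaml"):
    name = name or os.environ.get("LLAMAFARM_STRATEGY", "research")
    with open(path) as f:
        strategies = yaml.safe_load(f)["strategies"]
    return strategies[name]  # e.g. {"rag": {...}, "models": {...}, ...}

cfg = load_strategy()
print(cfg["models"]["primary"])  # "gpt-4" for the research strategy above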

📚 Documentation

📖 Comprehensive Guides

Component | Description | Documentation
--- | --- | ---
RAG System | Document processing, embedding, retrieval | 📚 RAG Guide
Models | LLM providers, management, optimization | 🤖 Models Guide
Prompts | Templates, strategies, evaluation | 📝 Prompts Guide
CLI | Command-line tools and utilities | ⚡ CLI Reference
API | REST API services | 🔌 API Docs

🎓 Tutorials

🔧 Examples

Check out our examples/ directory for complete working applications:

  • 📚 Knowledge Base Assistant
  • 💬 Customer Support Bot
  • 📊 Document Analysis Pipeline
  • 🔍 Semantic Search Engine
  • 🤖 Multi-Agent System

🚢 Deployment Options

Local Development

# Run with hot-reload
uv run python main.py --dev

# Or use Docker
docker-compose up -d

Production Deployment

# docker-compose.prod.yml
version: '3.8'
services:
  llamafarm:
    image: llamafarm/llamafarm:latest
    environment:
      - STRATEGY=production
      - WORKERS=4
    volumes:
      - ./config:/app/config
      - ./data:/app/data
    ports:
      - "8000:8000"
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 4G

Cloud Deployment

  • AWS: ECS, Lambda, SageMaker
  • GCP: Cloud Run, Vertex AI
  • Azure: Container Instances, ML Studio
  • Self-Hosted: Kubernetes, Docker Swarm

See deployment guide for detailed instructions.


🛠️ Advanced Features

🔄 Pipeline Composition

from llamafarm import Pipeline, RAG, Models, Prompts

# Create a complete AI pipeline (the chained calls are wrapped in
# parentheses so the fluent style is valid Python)
pipeline = (
    Pipeline(strategy="research")
    .add(RAG.ingest("documents/"))
    .add(Prompts.select_template())
    .add(Models.generate())
    .add(RAG.store_results())
)

# Execute with monitoring
results = pipeline.run(
    query="What are the implications?",
    monitor=True,
    cache=True
)

🎯 Custom Strategies

from llamafarm.strategies import Strategy

class MedicalStrategy(Strategy):
    """Custom strategy for medical document analysis"""
    
    def configure_rag(self):
        return {
            "extractors": ["medical_entities", "dosages", "symptoms"],
            "embedder": "biobert",
            "chunk_size": 256
        }
    
    def configure_models(self):
        return {
            "primary": "med-palm-2",
            "temperature": 0.1,
            "require_citations": True
        }

📊 Monitoring & Analytics

from llamafarm.monitoring import Monitor

monitor = Monitor()
monitor.track_usage()
monitor.analyze_costs()
monitor.export_metrics("prometheus")

🌍 Community & Ecosystem

🤝 Contributing

We welcome contributions! See our Contributing Guide for:

  • 🐛 Reporting bugs
  • 💡 Suggesting features
  • 🔧 Submitting PRs
  • 📚 Improving docs

🏆 Contributors

Bobby Radford
Bobby Radford

💻
Matt Hamann
Matt Hamann

💻
Rob Thelen
Rob Thelen

💻
rachradulo
rachradulo

💻
Racheal Ochalek
Racheal Ochalek

💻
Davon Davis
Davon Davis

💻
github-actions[bot]
github-actions[bot]

💻

🔗 Integration Partners

  • Vector DBs: ChromaDB, Pinecone, Weaviate, Qdrant, FAISS
  • LLM Providers: OpenAI, Anthropic, Google, Cohere, Together, Groq
  • Deployment: Docker, Kubernetes, AWS, GCP, Azure
  • Monitoring: Prometheus, Grafana, DataDog, New Relic

🚦 Roadmap

✅ Released

  • RAG System with 10+ parsers and 5+ extractors
  • 25+ LLM provider integrations
  • 20+ prompt templates with strategies
  • CLI tools for all components
  • Docker deployment support

🚀 Coming Soon

  • Full Runtime System - Complete orchestration layer for managing all components with health monitoring, resource allocation, and automatic recovery
  • Production Deployer - Zero-configuration deployment from local development to cloud with automatic scaling and load balancing
  • Fine-tuning Pipeline - Train custom models on your data with integrated evaluation and deployment
  • Web UI Dashboard - Visual interface for monitoring, configuration, and management
  • Enhanced CLI - Unified command interface across all components

🚧 In Progress

  • Fine-tuning pipeline (Looking for contributors with ML experience)
  • Advanced caching system (Redis/Memcached integration - 40% complete)
  • GraphRAG implementation (Design phase - Join discussion)
  • Multi-modal support (Vision models integration - Early prototype)
  • Agent orchestration (LangGraph integration planned)

📅 Planned (late-2025)

  • AutoML for strategy optimization (Q4 2025 - Seeking ML engineers)
  • Distributed training (Q4 2025 - Partnership opportunities welcome)
  • Edge deployment (Q4 2025 - IoT and mobile focus)
  • Mobile SDKs (iOS/Android - Looking for mobile developers)
  • Web UI dashboard (Q4 2025 - React/Vue developers needed)

🤝 Want to Contribute?

We're actively looking for contributors in these areas:

  • 🧠 Machine Learning: Fine-tuning, distributed training
  • 📱 Mobile Development: iOS/Android SDKs
  • 🎨 Frontend: Web UI dashboard
  • 🔍 Search: GraphRAG and advanced retrieval
  • 📚 Documentation: Tutorials and examples

📄 License

LlamaFarm is MIT licensed. See LICENSE for details.


🙏 Acknowledgments

LlamaFarm stands on the shoulders of giants:

  • 🦜 LangChain - LLM orchestration inspiration
  • 🤗 Transformers - Model implementations
  • 🎯 ChromaDB - Vector database excellence
  • 🚀 uv - Lightning-fast package management

See CREDITS.md for complete acknowledgments.


🦙 Ready to Build Production AI?

Join thousands of developers building with LlamaFarm

⭐ Star on GitHub · 💬 Join Discord · 📚 Read Docs


Build locally. Deploy anywhere. Own your AI.
