
ChordMiniApp
Chord recognition, beat tracking, guitar diagrams, and lyrics transcription application with LLM-powered, context-aware analysis of uploaded audio and YouTube videos

ChordMini is an advanced music analysis platform with AI-powered chord recognition, beat detection, and synchronized lyrics. It features a clean, intuitive interface for YouTube search, chord progression visualization, interactive guitar diagrams with accurate fingering patterns, a lead sheet with an AI assistant for synchronized lyrics transcription, and add-on features such as Roman Numeral Analysis, Key Modulation Signals, Simplified Chord Notation, and Enhanced Chord Correction. Setup requires Node.js, Python 3.9+, and a Firebase account. It uses a hybrid backend architecture for local development and production deployments, with beat detection, chord recognition, lyrics processing, rate limiting, and audio processing for MP3, WAV, and FLAC formats. ChordMini provides a comprehensive music analysis workflow from user input to visualization, including dual input support, environment-aware processing, intelligent caching, an advanced ML pipeline, and rich visualization options.
README:
- Advanced music analysis platform with AI-powered chord recognition, beat detection, and synchronized lyrics.
- Clean, intuitive interface for YouTube search, URL input, and recent video access.
- Chord progression visualization with synchronized beat detection and grid layout, with add-on features: Roman Numeral Analysis, Key Modulation Signals, Simplified Chord Notation, and Enhanced Chord Correction.
- Interactive guitar chord diagrams with accurate fingering patterns from the official @tombatossals/chords-db database, featuring multiple chord positions and synchronized beat grid integration.
- Synchronized lyrics transcription with an AI chatbot for contextual music analysis and translation support.
- Node.js 18+ and npm
- Python 3.9+ (for backend)
- Firebase account (free tier)
1. Clone and install (with submodules in one command, for fresh clones)

git clone --recursive https://github.com/ptnghia-j/ChordMiniApp.git
cd ChordMiniApp
npm install

Verify the model submodules are present:

ls -la python_backend/models/Beat-Transformer/
ls -la python_backend/models/Chord-CNN-LSTM/
ls -la python_backend/models/ChordMini/

Install FluidSynth for MIDI synthesis:

# Windows
choco install fluidsynth
# macOS
brew install fluidsynth
# Linux (Debian/Ubuntu-based)
sudo apt update
sudo apt install fluidsynth

2. Environment setup

cp .env.example .env.local

Edit .env.local:

NEXT_PUBLIC_PYTHON_API_URL=http://localhost:5001
NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your_project.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your_project_id
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project.appspot.com
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=your_sender_id
NEXT_PUBLIC_FIREBASE_APP_ID=your_app_id

3. Start Python backend (Terminal 1)

cd python_backend
python -m venv myenv
source myenv/bin/activate  # On Windows: myenv\Scripts\activate
pip install -r requirements.txt
python app.py

4. Start frontend (Terminal 2)

npm run dev

5. Open application

Visit http://localhost:3000
1. Create Firebase project
- Visit the Firebase Console
- Click "Create a project"
- Follow the setup wizard

2. Enable Firestore Database
- Go to "Firestore Database" in the sidebar
- Click "Create database"
- Choose "Start in test mode" for development

3. Get Firebase configuration
- Go to Project Settings (gear icon)
- Scroll down to "Your apps"
- Click "Add app" → Web app
- Copy the configuration values to your .env.local
4. Create Firestore collections
The app uses the following Firestore collections. They are created automatically on first write (no manual creation required); a minimal caching sketch follows the list:
- transcriptions - Beat and chord analysis results (docId: ${videoId}_${beatModel}_${chordModel})
- translations - Lyrics translation cache (docId: cacheKey based on content hash)
- lyrics - Music.ai transcription results (docId: videoId)
- keyDetections - Musical key analysis cache (docId: cacheKey)
- audioFiles - Audio file metadata and URLs (docId: videoId)
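For illustration, here is a minimal caching sketch in Python using the firebase-admin SDK. The app itself reads and writes these collections from the Next.js frontend; the helper names and stored fields below are assumptions, only the docId convention comes from the list above.

import firebase_admin
from firebase_admin import firestore

firebase_admin.initialize_app()  # reads GOOGLE_APPLICATION_CREDENTIALS
db = firestore.client()

def cache_transcription(video_id, beat_model, chord_model, result):
    # docId convention from the list above: ${videoId}_${beatModel}_${chordModel}
    doc_id = f"{video_id}_{beat_model}_{chord_model}"
    db.collection("transcriptions").document(doc_id).set(result)

def get_cached_transcription(video_id, beat_model, chord_model):
    doc_id = f"{video_id}_{beat_model}_{chord_model}"
    snap = db.collection("transcriptions").document(doc_id).get()
    return snap.to_dict() if snap.exists else None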
5. Enable Anonymous Authentication
- In the Firebase Console: Authentication → Sign-in method → enable Anonymous

6. Configure Firebase Storage
- Set the environment variable: NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project_id.appspot.com
- Folder structure: audio/ for audio files, video/ for optional video files
- Filename pattern requirement: filenames must include the 11-character YouTube video ID in brackets, e.g. audio_[VIDEOID]_timestamp.mp3 (enforced by Storage rules; a validation sketch follows this list)
- File size limits (enforced by Storage rules):
  - Audio: up to 50MB
  - Video: up to 100MB
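A quick sanity check for that filename convention before uploading, as a sketch only; the authoritative enforcement is the Storage rules, whose exact pattern may differ from this regex.

import re

# 11-character YouTube video ID in brackets, e.g. audio_[dQw4w9WgXcQ]_1700000000.mp3
FILENAME_RE = re.compile(r"^audio_\[[A-Za-z0-9_-]{11}\]_.+\.(mp3|wav|flac)$")

def is_valid_audio_filename(name: str) -> bool:
    return FILENAME_RE.match(name) is not None

assert is_valid_audio_filename("audio_[dQw4w9WgXcQ]_1700000000.mp3")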
Additional API keys (set as environment variables):

Music.ai (lyrics transcription):
# 1. Sign up at music.ai
# 2. Get API key from dashboard
# 3. Add to .env.local
NEXT_PUBLIC_MUSIC_AI_API_KEY=your_key_here

Google Gemini (translations and key detection):
# 1. Visit Google AI Studio
# 2. Generate API key
# 3. Add to .env.local
NEXT_PUBLIC_GEMINI_API_KEY=your_key_here
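As a sketch of what a Gemini call can look like, using the google-generativeai Python SDK. The app makes its Gemini calls from its own routes; the model name and prompt below are assumptions for illustration.

import os
import google.generativeai as genai

genai.configure(api_key=os.environ["NEXT_PUBLIC_GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # model choice is illustrative

def translate_lyrics(lyrics: str, target_language: str) -> str:
    prompt = f"Translate these lyrics into {target_language}, preserving line breaks:\n{lyrics}"
    return model.generate_content(prompt).text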
To be added in next version...
ChordMiniApp uses a hybrid backend architecture:

For local development, run the Python backend on localhost:5001:
- URL: http://localhost:5001
- Port note: port 5001 avoids a conflict with the macOS AirPlay/AirTunes service on port 5000

For production deployments, the backend is configured on your own server; set its URL in the NEXT_PUBLIC_PYTHON_API_URL environment variable.
- Python 3.9+ (3.9-3.11 recommended)
- Virtual environment (venv or conda)
- Git for cloning dependencies
- System dependencies (varies by OS)
1. Navigate to the backend directory

cd python_backend

2. Create and activate a virtual environment

python -m venv myenv
# On macOS/Linux:
source myenv/bin/activate
# On Windows:
myenv\Scripts\activate

3. Install dependencies

pip install --no-cache-dir "Cython>=0.29.0" numpy==1.22.4
pip install --no-cache-dir "madmom>=0.16.1"
pip install --no-cache-dir -r requirements.txt

(The quotes around the version specifiers keep the shell from treating ">=" as a redirect.)

4. Start the local backend on port 5001

python app.py

The backend starts on http://localhost:5001 and should display:

Starting Flask app on port 5001
App is ready to serve requests

Note: port 5001 avoids a conflict with macOS AirPlay/AirTunes on port 5000.

5. Verify the backend is running

Open a new terminal and test the backend (a minimal sketch of this endpoint follows these steps):

curl http://localhost:5001/health
# Should return: {"status": "healthy"}

6. Start the frontend development server

# In the main project directory (new terminal)
npm run dev

The frontend will automatically connect to http://localhost:5001 based on your .env.local configuration.
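For reference, the /health check in step 5 can be served by something as small as the following sketch; the real app.py may do more (model loading, warm-up checks, and so on).

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    return jsonify({"status": "healthy"})

if __name__ == "__main__":
    # Port 5001 avoids the macOS AirPlay/AirTunes conflict on port 5000.
    app.run(host="0.0.0.0", port=5001)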
- Beat Detection: Beat-Transformer and madmom models
- Chord Recognition: Chord-CNN-LSTM, BTC-SL, BTC-PL models
- Lyrics Processing: Genius.com integration
- Rate Limiting: IP-based rate limiting with Flask-Limiter
- Audio Processing: Support for MP3, WAV, FLAC formats
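A sketch of how uploads might be screened against those formats; this is illustrative, and the backend's real validation may differ.

import os

SUPPORTED_FORMATS = {".mp3", ".wav", ".flac"}

def check_audio_format(filename: str) -> str:
    # Reject anything outside the supported audio formats up front.
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format {ext!r}; expected one of {sorted(SUPPORTED_FORMATS)}")
    return ext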
Create a .env file in the python_backend directory:
# Optional: Redis URL for distributed rate limiting
REDIS_URL=redis://localhost:6379
# Optional: Genius API for lyrics
GENIUS_ACCESS_TOKEN=your_genius_token
# Flask configuration
FLASK_MAX_CONTENT_LENGTH_MB=150
CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
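Here is how the optional REDIS_URL might plug into IP-based rate limiting; a sketch assuming Flask-Limiter 3.x, with placeholder limits and an illustrative endpoint name, not the app's real values.

import os
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
limiter = Limiter(
    get_remote_address,  # rate-limit per client IP
    app=app,
    storage_uri=os.getenv("REDIS_URL", "memory://"),  # Redis if set, else in-process
    default_limits=["60 per minute"],  # placeholder limit
)

@app.route("/api/detect-beats", methods=["POST"])  # endpoint name is illustrative
@limiter.limit("10 per minute")  # tighter limit for heavy ML endpoints
def detect_beats():
    return {"status": "queued"}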
Backend connectivity issues:
# 1. Verify backend is running
curl http://localhost:5001/health
# Expected: {"status": "healthy"}
# 2. Check if port 5001 is in use
lsof -i :5001 # macOS/Linux
netstat -ano | findstr :5001 # Windows
# 3. Verify environment configuration
cat .env.local | grep PYTHON_API_URL
# Expected: NEXT_PUBLIC_PYTHON_API_URL=http://localhost:5001
# 4. Check for macOS AirTunes conflict (if using port 5000)
curl -I http://localhost:5000/health
# If you see "Server: AirTunes", that's the conflict we're avoiding
Frontend connection errors:
# Check browser console for errors like:
# "Failed to fetch" or "Network Error"
# This usually means the backend is not running on port 5001
# Restart both frontend and backend:
# Terminal 1 (Backend):
cd python_backend && python app.py
# Terminal 2 (Frontend):
npm run dev
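These connectivity checks can also be scripted; a small sketch using the requests library:

import requests

def check_backend(base_url="http://localhost:5001"):
    try:
        r = requests.get(f"{base_url}/health", timeout=5)
        r.raise_for_status()
        print("Backend OK:", r.json())
    except requests.ConnectionError:
        # Matches the "Failed to fetch" symptom in the browser console.
        print("Backend not reachable - is `python app.py` running in python_backend/?")
    except requests.HTTPError as err:
        print("Backend responded with an error:", err)

if __name__ == "__main__":
    check_backend()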
Import errors:
# Ensure virtual environment is activated
source myenv/bin/activate # macOS/Linux
myenv\Scripts\activate # Windows
# Reinstall dependencies
pip install -r requirements.txt
ChordMini provides a comprehensive music analysis workflow from user input to visualization. This diagram shows the complete user journey and data processing pipeline:
graph TD
%% User Entry Points
A[User Input] --> B{Input Type}
B -->|YouTube URL/Search| C[YouTube Search Interface]
B -->|Direct Audio File| D[File Upload Interface]
%% YouTube Workflow
C --> E[YouTube Search API]
E --> F[Video Selection]
F --> G[Navigate to /analyze/videoId]
%% Direct Upload Workflow
D --> H[Vercel Blob Upload]
H --> I[Navigate to /analyze with blob URL]
%% Cache Check Phase
G --> J{Firebase Cache Check}
J -->|Cache Hit| K[Load Cached Results]
J -->|Cache Miss| L[Audio Extraction Pipeline]
%% Audio Extraction (Environment-Aware)
L --> M{Environment Detection}
M -->|Development| N[yt-dlp Service]
M -->|Production| O[yt-mp3-go Service]
N --> P[Audio URL Generation]
O --> P
%% ML Analysis Pipeline
P --> Q[Parallel ML Processing]
I --> Q
Q --> R[Beat Detection]
Q --> S[Chord Recognition]
Q --> T[Key Detection]
%% ML Backend Processing
R --> U[Beat-Transformer/madmom Models]
S --> V[Chord-CNN-LSTM/BTC Models]
T --> W[Gemini AI Key Analysis]
%% Results Processing
U --> X[Beat Data]
V --> Y[Chord Progression Data]
W --> Z[Musical Key & Corrections]
%% Synchronization & Caching
X --> AA[Chord-Beat Synchronization]
Y --> AA
Z --> AA
AA --> BB[Firebase Result Caching]
AA --> CC[Real-time Visualization]
%% Visualization Tabs
CC --> DD[Beat & Chord Map Tab]
CC --> EE[Guitar Chords Tab - Beta]
CC --> FF[Lyrics & Chords Tab - Beta]
%% Beat & Chord Map Features
DD --> GG[Interactive Chord Grid]
DD --> HH[Synchronized Audio Player]
DD --> II[Real-time Beat Highlighting]
%% Guitar Chords Features
EE --> JJ[Animated Chord Progression]
EE --> KK[Guitar Chord Diagrams]
EE --> LL[Responsive Design 7/5/3/2/1]
%% Lyrics Features
FF --> MM[Lyrics Transcription]
FF --> NN[Multi-language Translation]
FF --> OO[Word-level Synchronization]
%% External API Integration
MM --> PP[Music.ai Transcription]
NN --> QQ[Gemini AI Translation]
%% Cache Integration
K --> CC
BB --> RR[Enhanced Metadata Storage]
%% Error Handling & Fallbacks
L --> SS{Extraction Failed?}
SS -->|Yes| TT[Fallback Service]
SS -->|No| P
TT --> P
%% Styling
style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px
style CC fill:#c8e6c9,stroke:#388e3c,stroke-width:3px
style EE fill:#fff3e0,stroke:#f57c00,stroke-width:2px
style FF fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
style Q fill:#ffebee,stroke:#d32f2f,stroke-width:2px
style AA fill:#e8f5e8,stroke:#4caf50,stroke-width:2px
%% Environment Indicators
style N fill:#e3f2fd,stroke:#1976d2
style O fill:#e8f5e8,stroke:#4caf50
Entry points:
- YouTube Integration: URL/search → video selection → analysis
- Direct Upload: audio file → blob storage → immediate analysis

Environment-aware processing:
- Development: localhost:5001 Python backend with yt-dlp (avoiding the macOS AirTunes port conflict)
- Production: Google Cloud Run backend with yt-mp3-go

Caching strategy:
- Firebase Cache: analysis results with enhanced metadata
- Cache Hit: instant loading of previous analyses
- Cache Miss: full ML processing pipeline

ML pipeline (a parallel fan-out sketch follows this list):
- Parallel Processing: beat detection + chord recognition + key analysis
- Multiple Models: Beat-Transformer/madmom, Chord-CNN-LSTM/BTC variants
- AI Integration: Gemini AI for key detection and enharmonic corrections

Visualization tabs:
- Beat & Chord Map: interactive grid with synchronized playback
- Guitar Chords [Beta]: responsive chord diagrams (7/5/3/2/1 layouts)
- Lyrics & Chords [Beta]: multi-language transcription with word-level sync
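A sketch of that parallel fan-out; the function names are illustrative stand-ins, not the app's actual API.

from concurrent.futures import ThreadPoolExecutor

def analyze(audio_path, detect_beats, recognize_chords, detect_key):
    # Run the three analyses concurrently, then gather results for
    # chord-beat synchronization and Firebase caching downstream.
    with ThreadPoolExecutor(max_workers=3) as pool:
        beats = pool.submit(detect_beats, audio_path)
        chords = pool.submit(recognize_chords, audio_path)
        key = pool.submit(detect_key, audio_path)
        return {"beats": beats.result(), "chords": chords.result(), "key": key.result()}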
Frontend:
- Next.js 15.3.1 - React framework with App Router
- React 19 - UI library with latest features
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- Framer Motion - Smooth animations and transitions
- Chart.js - Data visualization for audio analysis
- @tombatossals/react-chords - Guitar chord diagram visualization with official chord database
- React Player - Video playback integration
Backend and infrastructure:
- Python Flask - Lightweight backend framework
- Google Cloud Run - Serverless container deployment
- Custom ML Models - Chord recognition and beat detection
- Firebase Firestore - NoSQL database for caching
- Vercel Blob - File storage for audio processing
External services and APIs:
- YouTube Search API - github.com/damonwonghv/youtube-search-api
- yt-dlp - github.com/yt-dlp/yt-dlp - YouTube audio extraction
- yt-mp3-go - github.com/vukan322/yt-mp3-go - Alternative audio extraction
- LRClib - github.com/tranxuanthang/lrclib - Lyrics synchronization
- Music.ai SDK - AI-powered music transcription
- Google Gemini API - AI language model for translations
Music analysis:
- Beat Detection - Automatic tempo and beat tracking
- Chord Recognition - AI-powered chord progression analysis
- Key Detection - Musical key identification with Gemini AI

Guitar chord features:
- Accurate Chord Database Integration - Official @tombatossals/chords-db with verified chord fingering patterns
- Enhanced Chord Recognition - Support for both ML model colon notation (C:minor) and standard notation (Cm); see the normalization sketch after this list
- Interactive Chord Diagrams - Visual guitar fingering patterns with correct fret positions and finger placements
- Responsive Design - Adaptive chord count (7/5/3/2/1 for xl/lg/md/sm/xs)
- Smooth Animations - Transitions with optimized scaling
- Unicode Notation - Proper musical symbols (♯, ♭) with enharmonic equivalents
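A sketch of that notation normalization; the quality map is illustrative and covers only a few common cases, not the full vocabulary the app handles.

QUALITY_MAP = {
    "maj": "", "major": "",
    "min": "m", "minor": "m",
    "dim": "dim", "aug": "aug",
    "maj7": "maj7", "min7": "m7", "7": "7",
}

def normalize_chord(label: str) -> str:
    # Convert ML-model colon notation (C:minor) to standard notation (Cm).
    if ":" not in label:
        return label  # already standard notation, e.g. "Cm"
    root, quality = label.split(":", 1)
    return root + QUALITY_MAP.get(quality, quality)

assert normalize_chord("C:minor") == "Cm"
assert normalize_chord("G:maj") == "G"
assert normalize_chord("Am") == "Am"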
Lyrics:
- Synchronized Lyrics - Time-aligned lyrics display
- Multi-language Support - Translation with Gemini AI
- Word-level Timing - Precise synchronization with Music.ai

User experience:
- Dark/Light Mode - Automatic theme switching
- Responsive Design - Mobile-first approach
- Performance Optimized - Lazy loading and code splitting
- Offline Support - Service worker for caching
ChordMiniApp provides production-ready Docker images for easy deployment:
# Production deployment with Docker Compose
curl -O https://raw.githubusercontent.com/ptnghia-j/ChordMiniApp/main/docker-compose.prod.yml
docker-compose -f docker-compose.prod.yml up -d
Available on multiple registries:
- Docker Hub: ptnghia/chordmini-frontend:latest, ptnghia/chordmini-backend:latest
- GitHub Container Registry: ghcr.io/ptnghia-j/chordminiapp/frontend:latest
Deployment targets:
- Cloud platforms: AWS, GCP, Azure, DigitalOcean
- Container orchestration: Kubernetes, Docker Swarm
- Edge computing: Raspberry Pi 4+ (ARM64 support)
For custom deployments, see the Local Setup section above.
We welcome contributions! Please see our Contributing Guidelines for details.