ai-gradio

A Python package that makes it easy for developers to create AI apps powered by various AI providers.

Stars: 1065

Visit

ai-gradio is a Python package that simplifies the creation of machine learning apps using various models like OpenAI, Google's Gemini, Anthropic's Claude, LumaAI, CrewAI, XAI's Grok, and Hyperbolic. It provides easy installation with support for different providers and offers features like text chat, voice chat, video chat, code generation interfaces, and AI agent teams. Users can set API keys for different providers and customize interfaces for specific tasks.

README:

ai-gradio

A Python package that makes it easy for developers to create machine learning apps powered by various AI providers. Built on top of Gradio, it provides a unified interface for multiple AI models and services.

Features

Core Features

Multi-Provider Support: Integrate with 15+ AI providers including OpenAI, Google Gemini, Anthropic, and more
Text Chat: Interactive chat interfaces for all text models
Voice Chat: Real-time voice interactions with OpenAI models
Video Chat: Video processing capabilities with Gemini models
Code Generation: Specialized interfaces for coding assistance
Multi-Modal: Support for text, image, and video inputs
Agent Teams: CrewAI integration for collaborative AI tasks
Browser Automation: AI agents that can perform web-based tasks

Model Support

Core Language Models

Provider	Models
OpenAI	gpt-4-turbo, gpt-4, gpt-3.5-turbo
Anthropic	claude-3-opus, claude-3-sonnet, claude-3-haiku
Gemini	gemini-pro, gemini-pro-vision, gemini-2.0-flash-exp
Groq	llama-3.2-70b-chat, mixtral-8x7b-chat

Specialized Models

Provider	Type	Models
LumaAI	Generation	dream-machine, photon-1
DeepSeek	Multi-purpose	deepseek-chat, deepseek-coder, deepseek-vision
CrewAI	Agent Teams	Support Team, Article Team
Qwen	Language	qwen-turbo, qwen-plus, qwen-max
Browser	Automation	browser-use-agent

Installation

Basic Installation

# Install core package
pip install ai-gradio

# Install with specific provider support
pip install 'ai-gradio[openai]'     # OpenAI support
pip install 'ai-gradio[gemini]'     # Google Gemini support
pip install 'ai-gradio[anthropic]'  # Anthropic Claude support
pip install 'ai-gradio[groq]'       # Groq support

# Install all providers
pip install 'ai-gradio[all]'

Additional Providers

pip install 'ai-gradio[crewai]'     # CrewAI support
pip install 'ai-gradio[lumaai]'     # LumaAI support
pip install 'ai-gradio[xai]'        # XAI/Grok support
pip install 'ai-gradio[cohere]'     # Cohere support
pip install 'ai-gradio[sambanova]'  # SambaNova support
pip install 'ai-gradio[hyperbolic]' # Hyperbolic support
pip install 'ai-gradio[deepseek]'   # DeepSeek support
pip install 'ai-gradio[smolagents]' # SmolagentsAI support
pip install 'ai-gradio[fireworks]'  # Fireworks support
pip install 'ai-gradio[together]'   # Together support
pip install 'ai-gradio[qwen]'       # Qwen support
pip install 'ai-gradio[browser]'    # Browser support

Usage

API Key Configuration

# Core Providers
export OPENAI_API_KEY=<your token>
export GEMINI_API_KEY=<your token>
export ANTHROPIC_API_KEY=<your token>
export GROQ_API_KEY=<your token>
export TAVILY_API_KEY=<your token>  # Required for Langchain agents

# Additional Providers (as needed)
export LUMAAI_API_KEY=<your token>
export XAI_API_KEY=<your token>
export COHERE_API_KEY=<your token>
# ... (other provider keys)

# Twilio credentials (required for WebRTC voice chat)
export TWILIO_ACCOUNT_SID=<your Twilio account SID>
export TWILIO_AUTH_TOKEN=<your Twilio auth token>

Quick Start

import gradio as gr
import ai_gradio

# Create a simple chat interface
gr.load(
    name='openai:gpt-4-turbo',  # or 'gemini:gemini-1.5-flash', 'groq:llama-3.2-70b-chat'
    src=ai_gradio.registry,
    title='AI Chat',
    description='Chat with an AI model'
).launch()

# Create a chat interface with Transformers models
gr.load(
    name='transformers:phi-4',  # or 'transformers:tulu-3', 'transformers:olmo-2-13b'
    src=ai_gradio.registry,
    title='Local AI Chat',
    description='Chat with locally running models'
).launch()

# Create a coding assistant with OpenAI
gr.load(
    name='openai:gpt-4-turbo',
    src=ai_gradio.registry,
    coder=True,
    title='OpenAI Code Assistant',
    description='OpenAI Code Generator'
).launch()

# Create a coding assistant with Gemini
gr.load(
    name='gemini:gemini-2.0-flash-thinking-exp-1219',  # or 'openai:gpt-4-turbo', 'anthropic:claude-3-opus'
    src=ai_gradio.registry,
    coder=True,
    title='Gemini Code Generator',
).launch()

Advanced Features

Voice Chat

gr.load(
    name='openai:gpt-4-turbo',
    src=ai_gradio.registry,
    enable_voice=True,
    title='AI Voice Assistant'
).launch()

Camera Mode

# Create a vision-enabled interface with camera support
gr.load(
    name='gemini:gemini-2.0-flash-exp',
    src=ai_gradio.registry,
    camera=True,
).launch()

Multi-Provider Interface

import gradio as gr
import ai_gradio

with gr.Blocks() as demo:
    with gr.Tab("Text"):
        gr.load('openai:gpt-4-turbo', src=ai_gradio.registry)
    with gr.Tab("Vision"):
        gr.load('gemini:gemini-pro-vision', src=ai_gradio.registry)
    with gr.Tab("Code"):
        gr.load('deepseek:deepseek-coder', src=ai_gradio.registry)

demo.launch()

CrewAI Teams

# Article Creation Team
gr.load(
    name='crewai:gpt-4-turbo',
    src=ai_gradio.registry,
    crew_type='article',
    title='AI Writing Team'
).launch()

Browser Automation

playwright install

use python 3.11+ for browser use

import gradio as gr
import ai_gradio

# Create a browser automation interface
gr.load(
    name='browser:gpt-4-turbo',
    src=ai_gradio.registry,
    title='AI Browser Assistant',
    description='Let AI help with web tasks'
).launch()

Example tasks:

Flight searches on Google Flights
Weather lookups
Product price comparisons
News searches

Swarms Integration

import gradio as gr
import ai_gradio

# Create a chat interface with Swarms
gr.load(
    name='swarms:gpt-4-turbo',  # or other OpenAI models
    src=ai_gradio.registry,
    agent_name="Stock-Analysis-Agent",  # customize agent name
    title='Swarms Chat',
    description='Chat with an AI agent powered by Swarms'
).launch()

Langchain Agents

import gradio as gr
import ai_gradio

# Create a Langchain agent interface
gr.load(
    name='langchain:gpt-4-turbo',  # or other supported models
    src=ai_gradio.registry,
    title='Langchain Agent',
    description='AI agent powered by Langchain'
).launch()

Requirements

Core Requirements

Python 3.10+
gradio >= 5.9.1

Optional Features

Voice Chat: gradio-webrtc, numba==0.60.0, pydub, librosa
Video Chat: opencv-python, Pillow
Agent Teams: crewai>=0.1.0, langchain>=0.1.0

Troubleshooting

Authentication Issues

If you encounter 401 errors, verify your API keys:

import os

# Set API keys manually if needed
os.environ["OPENAI_API_KEY"] = "your-api-key"
os.environ["GEMINI_API_KEY"] = "your-api-key"

Provider Installation

If you see "no providers installed" errors:

# Install specific provider
pip install 'ai-gradio[provider_name]'

# Or install all providers
pip install 'ai-gradio[all]'

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

For Tasks:

Click tags to check more tools for each tasks

create ai chat generate web applications chat with ai model create articles with ai collaborate with ai agents

For Jobs:

machine learning engineer data scientist ai developer software engineer research scientist

Alternative AI tools for ai-gradio

Similar Open Source Tools

ai-gradio

github

: 1.1k

aio-scrapy

Aio-scrapy is an asyncio-based web crawling and web scraping framework inspired by Scrapy. It supports distributed crawling/scraping, implements compatibility with scrapyd, and provides options for using redis queue and rabbitmq queue. The framework is designed for fast extraction of structured data from websites. Aio-scrapy requires Python 3.9+ and is compatible with Linux, Windows, macOS, and BSD systems.

github

: 52

retro-aim-server

Retro AIM Server is an instant messaging server that revives AOL Instant Messenger clients from the 2000s. It supports Windows AIM client versions 5.0-5.9, away messages, buddy icons, buddy list, chat rooms, instant messaging, user profiles, blocking/visibility toggle/idle notification, and warning. The Management API provides functionality for administering the server, including listing users, creating users, changing passwords, and listing active sessions.

github

: 128

xlings

Xlings is a developer tool for programming learning, development, and course building. It provides features such as software installation, one-click environment setup, project dependency management, and cross-platform language package management. Additionally, it offers real-time compilation and running, AI code suggestions, tutorial project creation, automatic code checking for practice, and demo examples collection.

github

: 361

cloudflare-ai-web

Cloudflare-ai-web is a lightweight and easy-to-use tool that allows you to quickly deploy a multi-modal AI platform using Cloudflare Workers AI. It supports serverless deployment, password protection, and local storage of chat logs. With a size of only ~638 kB gzip, it is a great option for building AI-powered applications without the need for a dedicated server.

github

: 1.9k

Q-Bench

Q-Bench is a benchmark for general-purpose foundation models on low-level vision, focusing on multi-modality LLMs performance. It includes three realms for low-level vision: perception, description, and assessment. The benchmark datasets LLVisionQA and LLDescribe are collected for perception and description tasks, with open submission-based evaluation. An abstract evaluation code is provided for assessment using public datasets. The tool can be used with the datasets API for single images and image pairs, allowing for automatic download and usage. Various tasks and evaluations are available for testing MLLMs on low-level vision tasks.

github

: 224

gemini-openai-proxy

Gemini-OpenAI-Proxy is a proxy software designed to convert OpenAI API protocol calls into Google Gemini Pro protocol, allowing software using OpenAI protocol to utilize Gemini Pro models seamlessly. It provides an easy integration of Gemini Pro's powerful features without the need for complex development work.

github

: 264

weixin-dyh-ai

WeiXin-Dyh-AI is a backend management system that supports integrating WeChat subscription accounts with AI services. It currently supports integration with Ali AI, Moonshot, and Tencent Hyunyuan. Users can configure different AI models to simulate and interact with AI in multiple modes: text-based knowledge Q&A, text-to-image drawing, image description, text-to-voice conversion, enabling human-AI conversations on WeChat. The system allows hierarchical AI prompt settings at system, subscription account, and WeChat user levels. Users can configure AI model types, providers, and specific instances. The system also supports rules for allocating models and keys at different levels. It addresses limitations of WeChat's messaging system and offers features like text-based commands and voice support for interactions with AI.

github

: 117

LongLLaVA

LongLLaVA is a tool for scaling multi-modal LLMs to 1000 images efficiently via hybrid architecture. It includes stages for single-image alignment, instruction-tuning, and multi-image instruction-tuning, with evaluation through a command line interface and model inference. The tool aims to achieve GPT-4V level capabilities and beyond, providing reproducibility of results and benchmarks for efficiency and performance.

github

: 158

cool-admin-midway

Cool-admin (midway version) is a cool open-source backend permission management system that supports modular, plugin-based, rapid CRUD development. It facilitates the quick construction and iteration of backend management systems, deployable in various ways such as serverless, docker, and traditional servers. It features AI coding for generating APIs and frontend pages, flow orchestration for drag-and-drop functionality, modular and plugin-based design for clear and maintainable code. The tech stack includes Node.js, Midway.js, Koa.js, TypeScript for backend, and Vue.js, Element-Plus, JSX, Pinia, Vue Router for frontend. It offers friendly technology choices for both frontend and backend developers, with TypeScript syntax similar to Java and PHP for backend developers. The tool is suitable for those looking for a modern, efficient, and fast development experience.

github

: 2.7k

deepseek-free-api

DeepSeek Free API is a high-speed streaming output tool that supports multi-turn conversations and zero-configuration deployment. It is compatible with the ChatGPT interface and offers multiple token support. The tool provides eight free APIs for various AI interfaces. Users can access the tool online, prepare for integration, deploy using Docker, Docker-compose, Render, Vercel, or native deployment methods. It also offers client recommendations for faster integration and supports dialogue completion and userToken live checks. The tool comes with important considerations for Nginx reverse proxy optimization and token statistics.

github

: 722

metaso-free-api

Metaso AI Free service supports high-speed streaming output, secret tower AI super network search (full network or academic as well as concise, in-depth, research three modes), zero-configuration deployment, multi-token support. Fully compatible with ChatGPT interface. It also has seven other free APIs available for use. The tool provides various deployment options such as Docker, Docker-compose, Render, Vercel, and native deployment. Users can access the tool for chat completions and token live checks. Note: Reverse API is unstable, it is recommended to use the official Metaso AI website to avoid the risk of banning. This project is for research and learning purposes only, not for commercial use.

github

: 372

agentscope

AgentScope is a multi-agent platform designed to empower developers to build multi-agent applications with large-scale models. It features three high-level capabilities: Easy-to-Use, High Robustness, and Actor-Based Distribution. AgentScope provides a list of `ModelWrapper` to support both local model services and third-party model APIs, including OpenAI API, DashScope API, Gemini API, and ollama. It also enables developers to rapidly deploy local model services using libraries such as ollama (CPU inference), Flask + Transformers, Flask + ModelScope, FastChat, and vllm. AgentScope supports various services, including Web Search, Data Query, Retrieval, Code Execution, File Operation, and Text Processing. Example applications include Conversation, Game, and Distribution. AgentScope is released under Apache License 2.0 and welcomes contributions.

github

: 5.8k

pyllms

PyLLMs is a minimal Python library designed to connect to various Language Model Models (LLMs) such as OpenAI, Anthropic, Google, AI21, Cohere, Aleph Alpha, and HuggingfaceHub. It provides a built-in model performance benchmark for fast prototyping and evaluating different models. Users can easily connect to top LLMs, get completions from multiple models simultaneously, and evaluate models on quality, speed, and cost. The library supports asynchronous completion, streaming from compatible models, and multi-model initialization for testing and comparison. Additionally, it offers features like passing chat history, system messages, counting tokens, and benchmarking models based on quality, speed, and cost.

github

: 734

Groq2API

Groq2API is a REST API wrapper around the Groq2 model, a large language model trained by Google. The API allows you to send text prompts to the model and receive generated text responses. The API is easy to use and can be integrated into a variety of applications.

github

: 288

emohaa-free-api

Emohaa AI Free API is a free API that allows you to access the Emohaa AI chatbot. Emohaa AI is a powerful chatbot that can understand and respond to a wide range of natural language queries. It can be used for a variety of purposes, such as customer service, information retrieval, and language translation. The Emohaa AI Free API is easy to use and can be integrated into any application. It is a great way to add AI capabilities to your projects without having to build your own chatbot from scratch.

github

: 77

For similar tasks

ai-gradio

github

: 1.1k

For similar jobs

promptflow

**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

github

: 9.2k

deepeval

DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

github

: 4.5k

MegaDetector

MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".

github

: 106

leapfrogai

LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

github

: 255

llava-docker

This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

github

: 59

carrot

The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

github

: 17.1k

TrustLLM

TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.

github

: 412

AI-YinMei

AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.

github

: 529