![ai-gradio](/statics/github-mark.png)
ai-gradio
A Python package that makes it easy for developers to create AI apps powered by various AI providers.
Stars: 1065
![screenshot](/screenshots_githubs/AK391-ai-gradio.jpg)
ai-gradio is a Python package that simplifies the creation of machine learning apps using various models like OpenAI, Google's Gemini, Anthropic's Claude, LumaAI, CrewAI, XAI's Grok, and Hyperbolic. It provides easy installation with support for different providers and offers features like text chat, voice chat, video chat, code generation interfaces, and AI agent teams. Users can set API keys for different providers and customize interfaces for specific tasks.
README:
A Python package that makes it easy for developers to create machine learning apps powered by various AI providers. Built on top of Gradio, it provides a unified interface for multiple AI models and services.
- Multi-Provider Support: Integrate with 15+ AI providers including OpenAI, Google Gemini, Anthropic, and more
- Text Chat: Interactive chat interfaces for all text models
- Voice Chat: Real-time voice interactions with OpenAI models
- Video Chat: Video processing capabilities with Gemini models
- Code Generation: Specialized interfaces for coding assistance
- Multi-Modal: Support for text, image, and video inputs
- Agent Teams: CrewAI integration for collaborative AI tasks
- Browser Automation: AI agents that can perform web-based tasks
| Provider | Models |
|---|---|
| OpenAI | gpt-4-turbo, gpt-4, gpt-3.5-turbo |
| Anthropic | claude-3-opus, claude-3-sonnet, claude-3-haiku |
| Gemini | gemini-pro, gemini-pro-vision, gemini-2.0-flash-exp |
| Groq | llama-3.2-70b-chat, mixtral-8x7b-chat |
| Provider | Type | Models |
|---|---|---|
| LumaAI | Generation | dream-machine, photon-1 |
| DeepSeek | Multi-purpose | deepseek-chat, deepseek-coder, deepseek-vision |
| CrewAI | Agent Teams | Support Team, Article Team |
| Qwen | Language | qwen-turbo, qwen-plus, qwen-max |
| Browser | Automation | browser-use-agent |
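Every model in the tables above is addressed with the same `provider:model` string passed to `gr.load`. A minimal sketch, assuming the package is installed and the matching API key exported (see the sections below); the exact model strings follow the pattern used in the Quick Start examples:

```python
import gradio as gr
import ai_gradio

# Any provider from the tables above is selected via a "provider:model" name
gr.load(
    name='anthropic:claude-3-haiku',  # e.g. 'groq:mixtral-8x7b-chat' works the same way
    src=ai_gradio.registry,
    title='Claude Chat'
).launch()
```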
```bash
# Install core package
pip install ai-gradio

# Install with specific provider support
pip install 'ai-gradio[openai]'      # OpenAI support
pip install 'ai-gradio[gemini]'      # Google Gemini support
pip install 'ai-gradio[anthropic]'   # Anthropic Claude support
pip install 'ai-gradio[groq]'        # Groq support
pip install 'ai-gradio[crewai]'      # CrewAI support
pip install 'ai-gradio[lumaai]'      # LumaAI support
pip install 'ai-gradio[xai]'         # XAI/Grok support
pip install 'ai-gradio[cohere]'      # Cohere support
pip install 'ai-gradio[sambanova]'   # SambaNova support
pip install 'ai-gradio[hyperbolic]'  # Hyperbolic support
pip install 'ai-gradio[deepseek]'    # DeepSeek support
pip install 'ai-gradio[smolagents]'  # Smolagents support
pip install 'ai-gradio[fireworks]'   # Fireworks support
pip install 'ai-gradio[together]'    # Together support
pip install 'ai-gradio[qwen]'        # Qwen support
pip install 'ai-gradio[browser]'     # Browser automation support

# Install all providers
pip install 'ai-gradio[all]'
```
```bash
# Core Providers
export OPENAI_API_KEY=<your token>
export GEMINI_API_KEY=<your token>
export ANTHROPIC_API_KEY=<your token>
export GROQ_API_KEY=<your token>
export TAVILY_API_KEY=<your token>  # Required for Langchain agents

# Additional Providers (as needed)
export LUMAAI_API_KEY=<your token>
export XAI_API_KEY=<your token>
export COHERE_API_KEY=<your token>
# ... (other provider keys)

# Twilio credentials (required for WebRTC voice chat)
export TWILIO_ACCOUNT_SID=<your Twilio account SID>
export TWILIO_AUTH_TOKEN=<your Twilio auth token>
```
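Keys can also be checked or set from Python before an interface is loaded. A small sketch using only the variable names listed above:

```python
import os

# Fail fast if a key needed by the app below is missing
required = ["OPENAI_API_KEY", "GEMINI_API_KEY"]
missing = [key for key in required if not os.environ.get(key)]
if missing:
    raise RuntimeError(f"Missing API keys: {', '.join(missing)}")
```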
```python
import gradio as gr
import ai_gradio

# Create a simple chat interface
gr.load(
    name='openai:gpt-4-turbo',  # or 'gemini:gemini-1.5-flash', 'groq:llama-3.2-70b-chat'
    src=ai_gradio.registry,
    title='AI Chat',
    description='Chat with an AI model'
).launch()

# Create a chat interface with Transformers models
gr.load(
    name='transformers:phi-4',  # or 'transformers:tulu-3', 'transformers:olmo-2-13b'
    src=ai_gradio.registry,
    title='Local AI Chat',
    description='Chat with locally running models'
).launch()
```
```python
# Create a coding assistant with OpenAI
gr.load(
    name='openai:gpt-4-turbo',
    src=ai_gradio.registry,
    coder=True,
    title='OpenAI Code Assistant',
    description='OpenAI Code Generator'
).launch()

# Create a coding assistant with Gemini
gr.load(
    name='gemini:gemini-2.0-flash-thinking-exp-1219',  # or 'openai:gpt-4-turbo', 'anthropic:claude-3-opus'
    src=ai_gradio.registry,
    coder=True,
    title='Gemini Code Generator',
).launch()
```
```python
# Create a voice-enabled chat interface (requires the Twilio credentials above for WebRTC)
gr.load(
    name='openai:gpt-4-turbo',
    src=ai_gradio.registry,
    enable_voice=True,
    title='AI Voice Assistant'
).launch()
```
```python
# Create a vision-enabled interface with camera support
gr.load(
    name='gemini:gemini-2.0-flash-exp',
    src=ai_gradio.registry,
    camera=True,
).launch()
```
```python
import gradio as gr
import ai_gradio

# Combine multiple models in a single tabbed app
with gr.Blocks() as demo:
    with gr.Tab("Text"):
        gr.load('openai:gpt-4-turbo', src=ai_gradio.registry)
    with gr.Tab("Vision"):
        gr.load('gemini:gemini-pro-vision', src=ai_gradio.registry)
    with gr.Tab("Code"):
        gr.load('deepseek:deepseek-coder', src=ai_gradio.registry)

demo.launch()
```
```python
# Article Creation Team
gr.load(
    name='crewai:gpt-4-turbo',
    src=ai_gradio.registry,
    crew_type='article',
    title='AI Writing Team'
).launch()
```
Browser automation requires Python 3.11+. Install the Playwright browsers first:

```bash
playwright install
```
```python
import gradio as gr
import ai_gradio

# Create a browser automation interface
gr.load(
    name='browser:gpt-4-turbo',
    src=ai_gradio.registry,
    title='AI Browser Assistant',
    description='Let AI help with web tasks'
).launch()
```
Example tasks:
- Flight searches on Google Flights
- Weather lookups
- Product price comparisons
- News searches
```python
import gradio as gr
import ai_gradio

# Create a chat interface with Swarms
gr.load(
    name='swarms:gpt-4-turbo',  # or other OpenAI models
    src=ai_gradio.registry,
    agent_name="Stock-Analysis-Agent",  # customize the agent name
    title='Swarms Chat',
    description='Chat with an AI agent powered by Swarms'
).launch()
```
```python
import gradio as gr
import ai_gradio

# Create a Langchain agent interface (requires TAVILY_API_KEY, see above)
gr.load(
    name='langchain:gpt-4-turbo',  # or other supported models
    src=ai_gradio.registry,
    title='Langchain Agent',
    description='AI agent powered by Langchain'
).launch()
```
Requirements:
- Python 3.10+
- gradio >= 5.9.1
- Voice Chat: gradio-webrtc, numba==0.60.0, pydub, librosa
- Video Chat: opencv-python, Pillow
- Agent Teams: crewai>=0.1.0, langchain>=0.1.0
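If a provider extra does not pull in one of these optional dependencies, it can be installed directly. A sketch for the voice chat packages listed above (version pins may differ in practice):

```bash
pip install gradio-webrtc 'numba==0.60.0' pydub librosa
```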
If you encounter 401 errors, verify your API keys:
```python
import os

# Set API keys manually if needed
os.environ["OPENAI_API_KEY"] = "your-api-key"
os.environ["GEMINI_API_KEY"] = "your-api-key"
```
If you see "no providers installed" errors:
```bash
# Install specific provider
pip install 'ai-gradio[provider_name]'

# Or install all providers
pip install 'ai-gradio[all]'
```
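To confirm which providers were actually registered after installation, the registry can be inspected. A rough sketch that assumes `ai_gradio.registry` is an iterable, dict-like mapping; the real structure may differ:

```python
import ai_gradio

# Assumption: the registry is iterable; fall back to printing it directly if not
try:
    print(sorted(ai_gradio.registry))
except TypeError:
    print(ai_gradio.registry)
```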
Contributions are welcome! Please feel free to submit a Pull Request.
Alternative AI tools for ai-gradio
Similar Open Source Tools
![aio-scrapy Screenshot](/screenshots_githubs/ConlinH-aio-scrapy.jpg)
aio-scrapy
Aio-scrapy is an asyncio-based web crawling and web scraping framework inspired by Scrapy. It supports distributed crawling/scraping, implements compatibility with scrapyd, and provides options for using redis queue and rabbitmq queue. The framework is designed for fast extraction of structured data from websites. Aio-scrapy requires Python 3.9+ and is compatible with Linux, Windows, macOS, and BSD systems.
![retro-aim-server Screenshot](/screenshots_githubs/mk6i-retro-aim-server.jpg)
retro-aim-server
Retro AIM Server is an instant messaging server that revives AOL Instant Messenger clients from the 2000s. It supports Windows AIM client versions 5.0-5.9, away messages, buddy icons, buddy list, chat rooms, instant messaging, user profiles, blocking/visibility toggle/idle notification, and warning. The Management API provides functionality for administering the server, including listing users, creating users, changing passwords, and listing active sessions.
![xlings Screenshot](/screenshots_githubs/d2learn-xlings.jpg)
xlings
Xlings is a developer tool for programming learning, development, and course building. It provides features such as software installation, one-click environment setup, project dependency management, and cross-platform language package management. Additionally, it offers real-time compilation and running, AI code suggestions, tutorial project creation, automatic code checking for practice, and demo examples collection.
![cloudflare-ai-web Screenshot](/screenshots_githubs/Jazee6-cloudflare-ai-web.jpg)
cloudflare-ai-web
Cloudflare-ai-web is a lightweight and easy-to-use tool that allows you to quickly deploy a multi-modal AI platform using Cloudflare Workers AI. It supports serverless deployment, password protection, and local storage of chat logs. With a size of only ~638 kB gzip, it is a great option for building AI-powered applications without the need for a dedicated server.
![Q-Bench Screenshot](/screenshots_githubs/Q-Future-Q-Bench.jpg)
Q-Bench
Q-Bench is a benchmark for general-purpose foundation models on low-level vision, focusing on multi-modality LLMs performance. It includes three realms for low-level vision: perception, description, and assessment. The benchmark datasets LLVisionQA and LLDescribe are collected for perception and description tasks, with open submission-based evaluation. An abstract evaluation code is provided for assessment using public datasets. The tool can be used with the datasets API for single images and image pairs, allowing for automatic download and usage. Various tasks and evaluations are available for testing MLLMs on low-level vision tasks.
![gemini-openai-proxy Screenshot](/screenshots_githubs/zuisong-gemini-openai-proxy.jpg)
gemini-openai-proxy
Gemini-OpenAI-Proxy is a proxy software designed to convert OpenAI API protocol calls into Google Gemini Pro protocol, allowing software using OpenAI protocol to utilize Gemini Pro models seamlessly. It provides an easy integration of Gemini Pro's powerful features without the need for complex development work.
![weixin-dyh-ai Screenshot](/screenshots_githubs/aceway-weixin-dyh-ai.jpg)
weixin-dyh-ai
WeiXin-Dyh-AI is a backend management system that supports integrating WeChat subscription accounts with AI services. It currently supports integration with Ali AI, Moonshot, and Tencent Hyunyuan. Users can configure different AI models to simulate and interact with AI in multiple modes: text-based knowledge Q&A, text-to-image drawing, image description, text-to-voice conversion, enabling human-AI conversations on WeChat. The system allows hierarchical AI prompt settings at system, subscription account, and WeChat user levels. Users can configure AI model types, providers, and specific instances. The system also supports rules for allocating models and keys at different levels. It addresses limitations of WeChat's messaging system and offers features like text-based commands and voice support for interactions with AI.
![LongLLaVA Screenshot](/screenshots_githubs/FreedomIntelligence-LongLLaVA.jpg)
LongLLaVA
LongLLaVA is a tool for scaling multi-modal LLMs to 1000 images efficiently via hybrid architecture. It includes stages for single-image alignment, instruction-tuning, and multi-image instruction-tuning, with evaluation through a command line interface and model inference. The tool aims to achieve GPT-4V level capabilities and beyond, providing reproducibility of results and benchmarks for efficiency and performance.
![cool-admin-midway Screenshot](/screenshots_githubs/cool-team-official-cool-admin-midway.jpg)
cool-admin-midway
Cool-admin (midway version) is a cool open-source backend permission management system that supports modular, plugin-based, rapid CRUD development. It facilitates the quick construction and iteration of backend management systems, deployable in various ways such as serverless, docker, and traditional servers. It features AI coding for generating APIs and frontend pages, flow orchestration for drag-and-drop functionality, modular and plugin-based design for clear and maintainable code. The tech stack includes Node.js, Midway.js, Koa.js, TypeScript for backend, and Vue.js, Element-Plus, JSX, Pinia, Vue Router for frontend. It offers friendly technology choices for both frontend and backend developers, with TypeScript syntax similar to Java and PHP for backend developers. The tool is suitable for those looking for a modern, efficient, and fast development experience.
![deepseek-free-api Screenshot](/screenshots_githubs/LLM-Red-Team-deepseek-free-api.jpg)
deepseek-free-api
DeepSeek Free API is a high-speed streaming output tool that supports multi-turn conversations and zero-configuration deployment. It is compatible with the ChatGPT interface and offers multiple token support. The tool provides eight free APIs for various AI interfaces. Users can access the tool online, prepare for integration, deploy using Docker, Docker-compose, Render, Vercel, or native deployment methods. It also offers client recommendations for faster integration and supports dialogue completion and userToken live checks. The tool comes with important considerations for Nginx reverse proxy optimization and token statistics.
![metaso-free-api Screenshot](/screenshots_githubs/LLM-Red-Team-metaso-free-api.jpg)
metaso-free-api
Metaso AI Free API supports high-speed streaming output, Metaso ("Secret Tower") AI network-wide search (full-web or academic scope, each offering concise, in-depth, and research modes), zero-configuration deployment, and multi-token support. It is fully compatible with the ChatGPT interface, and seven other free APIs are also available. The tool provides various deployment options such as Docker, Docker-compose, Render, Vercel, and native deployment. Users can access the tool for chat completions and token live checks. Note: the reverse API is unstable; it is recommended to use the official Metaso AI website to avoid the risk of account bans. This project is for research and learning purposes only, not for commercial use.
![agentscope Screenshot](/screenshots_githubs/modelscope-agentscope.jpg)
agentscope
AgentScope is a multi-agent platform designed to empower developers to build multi-agent applications with large-scale models. It features three high-level capabilities: Easy-to-Use, High Robustness, and Actor-Based Distribution. AgentScope provides a list of `ModelWrapper` to support both local model services and third-party model APIs, including OpenAI API, DashScope API, Gemini API, and ollama. It also enables developers to rapidly deploy local model services using libraries such as ollama (CPU inference), Flask + Transformers, Flask + ModelScope, FastChat, and vllm. AgentScope supports various services, including Web Search, Data Query, Retrieval, Code Execution, File Operation, and Text Processing. Example applications include Conversation, Game, and Distribution. AgentScope is released under Apache License 2.0 and welcomes contributions.
![pyllms Screenshot](/screenshots_githubs/kagisearch-pyllms.jpg)
pyllms
PyLLMs is a minimal Python library designed to connect to various Language Model Models (LLMs) such as OpenAI, Anthropic, Google, AI21, Cohere, Aleph Alpha, and HuggingfaceHub. It provides a built-in model performance benchmark for fast prototyping and evaluating different models. Users can easily connect to top LLMs, get completions from multiple models simultaneously, and evaluate models on quality, speed, and cost. The library supports asynchronous completion, streaming from compatible models, and multi-model initialization for testing and comparison. Additionally, it offers features like passing chat history, system messages, counting tokens, and benchmarking models based on quality, speed, and cost.
![Groq2API Screenshot](/screenshots_githubs/Star-Studio-Develop-Groq2API.jpg)
Groq2API
Groq2API is a REST API wrapper around the Groq2 model, a large language model trained by Google. The API allows you to send text prompts to the model and receive generated text responses. The API is easy to use and can be integrated into a variety of applications.
![emohaa-free-api Screenshot](/screenshots_githubs/LLM-Red-Team-emohaa-free-api.jpg)
emohaa-free-api
Emohaa AI Free API is a free API that allows you to access the Emohaa AI chatbot. Emohaa AI is a powerful chatbot that can understand and respond to a wide range of natural language queries. It can be used for a variety of purposes, such as customer service, information retrieval, and language translation. The Emohaa AI Free API is easy to use and can be integrated into any application. It is a great way to add AI capabilities to your projects without having to build your own chatbot from scratch.
For similar jobs
![promptflow Screenshot](/screenshots_githubs/microsoft-promptflow.jpg)
promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.
![deepeval Screenshot](/screenshots_githubs/confident-ai-deepeval.jpg)
deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.
![MegaDetector Screenshot](/screenshots_githubs/agentmorris-MegaDetector.jpg)
MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aim to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our overview of the field, affectionately titled "Everything I know about machine learning and camera traps".
![leapfrogai Screenshot](/screenshots_githubs/defenseunicorns-leapfrogai.jpg)
leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.
![llava-docker Screenshot](/screenshots_githubs/ashleykleynhans-llava-docker.jpg)
llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.
![carrot Screenshot](/screenshots_githubs/xx025-carrot.jpg)
carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.
![TrustLLM Screenshot](/screenshots_githubs/HowieHwong-TrustLLM.jpg)
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.
![AI-YinMei Screenshot](/screenshots_githubs/worm128-AI-YinMei.jpg)
AI-YinMei
AI-YinMei is an AI virtual anchor (VTuber) development tool (NVIDIA GPU version). It supports fastgpt knowledge-base chat dialogue with a complete LLM stack ([fastgpt] + [one-api] + [Xinference]); Bilibili live-stream barrage replies and welcome messages for viewers entering the stream; speech synthesis via Microsoft edge-tts, Bert-VITS2, and GPT-SoVITS; expression control through Vtuber Studio; stable-diffusion-webui image generation output to the OBS live room; NSFW filtering of generated images (public-NSFW-y-distinguish); image search via DuckDuckGo (requires VPN access) and Baidu image search (no VPN required); an AI reply chat box [html plug-in]; AI singing (Auto-Convert-Music) with a playlist [html plug-in]; dancing, expression video playback, head-patting and gift-smashing actions, automatic dancing when singing starts, and automatic idle motions while chatting and singing; multi-scene switching, background music switching, and automatic day/night scene changes; and the ability to let the AI decide on its own when to sing or draw.