models
CLI and TUI for browsing AI models, benchmarks, and coding agents. Compare 2000+ models across 85+ providers, explore ~400 benchmark entries from Artificial Analysis, and track 12+ coding agents with version detection.
Stars: 63
A fast CLI and TUI for browsing AI models, benchmarks, and coding agents. It lets you browse 2000+ models across 85+ providers from models.dev, track AI coding assistants with version detection and GitHub integration, and compare model performance across 15+ benchmarks from Artificial Analysis. Features include CLI commands, an interactive TUI, cross-provider search, copy to clipboard, and JSON output, along with a curated catalog of AI coding assistants, auto-updating benchmark data, per-model open weights detection, and a benchmark detail panel. Tracked agents are customizable, and benchmarks can be quick-sorted. Data comes from models.dev, Artificial Analysis, the curated catalog in data/agents.json, and the GitHub API.
README:
A fast CLI and TUI for browsing AI models, benchmarks, and coding agents.
- Models Tab: Browse 2000+ models across 85+ providers from models.dev, categorized by type (Origin, Cloud, Inference, Gateway, Dev Tool)
- Agents Tab: Track AI coding assistants (Claude Code, Aider, Cursor, etc.) with version detection and GitHub integration
- Benchmarks Tab: Compare model performance across 15+ benchmarks from Artificial Analysis, with creator filtering by source, region, and type
- 91% match rate: three-stage matching pipeline using Jaro-Winkler similarity (strsim) to determine open/closed status per model
- Global fallback: when creator-scoped matching fails, searches all models.dev providers for the best slug match
- Known creator overrides — hardcoded open/closed status for 12 well-known creators absent from models.dev (IBM, AI2, TII, etc.)
- No more "Mixed" labels — removed CreatorOpenness fallback; unmatched models show an em dash instead of misleading labels
- jsDelivr cache purging — GitHub Action now purges CDN cache after committing new data for faster propagation
- No disk cache — benchmark data fetched fresh from CDN on every launch for simplicity
- Price sort columns: sort benchmarks by input, output, or blended price per million tokens via the [s] cycle
- Per-model source detection: runtime matching of AA entries against models.dev data
- Source filter: [4] cycles through All / Open / Closed
- Region and type grouping: [5] and [6] toggle grouped layout with colored section headers
- Optimized release binary — strip, LTO, single codegen unit, panic=abort (~6MB, down from ~11MB)
- Dedicated Benchmarks tab — browse ~400 model entries from Artificial Analysis with quality, speed, and pricing data
- Creator sidebar with 40+ creators, classified by region and type with grouping toggles
- Quick-sort keys: [1] Intelligence, [2] Date, [3] Speed; press again to flip direction
- Dynamic columns, detail panel, TTFAT, AIME benchmarks and more
- Provider categories — filter and group providers by type (Origin, Cloud, Inference, Gateway, Dev Tool)
- OpenClaw agent added to the agents catalog
- Responsive layouts — models tab detail panel scales with terminal height
- CLI commands for scripting and quick lookups
- Interactive TUI for browsing and comparing models
- Provider categories — filter and group providers by type (Origin, Cloud, Inference, Gateway, Dev Tool)
- Cross-provider search to compare the same model across different providers
- Copy to clipboard with a single keypress
- JSON output for scripting and automation
- Curated catalog of 12+ AI coding assistants
- Version detection — automatically detects installed agents
- GitHub integration — stars, releases, changelogs, update availability
- Persistent cache — instant startup with ETag-based conditional fetching
- Customizable tracking — choose which agents to monitor
- ~400 benchmark entries from Artificial Analysis with quality, speed, and pricing scores
- Auto-updating — data fetched fresh from CDN on every launch; GitHub Action refreshes source data every 6 hours
- Creator sidebar with 40+ creators — group by region or type with colored section headers
- Per-model open weights detection — runtime matching against models.dev, with source filter toggle
- Quick-sort keys — instantly sort by Intelligence, Date, or Speed
- Dynamic columns — list columns adapt to show the most relevant benchmarks for the active sort
- Detail panel — full benchmark breakdown with indexes, scores, performance, and pricing
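The matching behind open weights detection is described above only in outline: Jaro-Winkler similarity, creator-scoped matching first, then a global fallback across all providers. The Python sketch below illustrates that idea. It is not the project's code (which uses the Rust strsim crate): the function names, the 0.9 threshold, and the `catalog` mapping are all hypothetical, and only the scoped and global lookups are modeled.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order are transpositions.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def best_match(slug, candidates, threshold):
    """Return (candidate, score) for the best candidate at or above threshold."""
    best = None
    for cand in candidates:
        score = jaro_winkler(slug, cand)
        if score >= threshold and (best is None or score > best[1]):
            best = (cand, score)
    return best


def match_slug(slug, creator, catalog, threshold=0.9):
    """Try the creator's own slugs first, then fall back to every slug.

    `catalog` maps creator name -> list of slugs (hypothetical shape);
    the 0.9 threshold is an assumption, not a value from the project.
    """
    scoped = best_match(slug, catalog.get(creator, []), threshold)
    if scoped is not None:
        return scoped
    everything = [s for slugs in catalog.values() for s in slugs]
    return best_match(slug, everything, threshold)
```

A call such as `match_slug("gpt-4o", "unknown", {"openai": ["gpt-4o"]})` finds nothing under the unknown creator and then succeeds through the global fallback.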
cargo install modelsdev
brew install arimxyer/tap/models
scoop bucket add arimxyer https://github.com/arimxyer/scoop-bucket
scoop install models
Download the latest release for your platform from GitHub Releases.
git clone https://github.com/arimxyer/models
cd models
cargo build --release
./target/release/models
Just run models with no arguments to launch the interactive browser:
models
Global
| Key | Action |
|---|---|
| ] / [ | Switch tabs (Models / Agents / Benchmarks) |
| ? | Show context-aware help |
| q | Quit |
Navigation
| Key | Action |
|---|---|
| j / ↓ | Move down |
| k / ↑ | Move up |
| g | Jump to first item |
| G | Jump to last item |
| Ctrl+d / PageDown | Page down |
| Ctrl+u / PageUp | Page up |
| Tab / Shift+Tab | Switch panels |
| ← / → | Switch panels |
Search
| Key | Action |
|---|---|
| / | Enter search mode |
| Enter / Esc | Exit search mode |
| Esc | Clear search (in normal mode) |
Models Tab
Filters & Sort
| Key | Action |
|---|---|
| s | Cycle sort (name → date → cost → context) |
| 1 | Toggle reasoning filter |
| 2 | Toggle tools filter |
| 3 | Toggle open weights filter |
| 4 | Cycle provider category filter (All → Origin → Cloud → Inference → Gateway → Tool) |
| 5 | Toggle category grouping |
Copy & Open
| Key | Action |
|---|---|
| c | Copy provider/model-id |
| C | Copy model-id only |
| o | Open provider docs in browser |
| D | Copy provider docs URL |
| A | Copy provider API URL |
Agents Tab
Filters & Sort
| Key | Action |
|---|---|
| s | Cycle sort (name → updated → stars → status) |
| 1 | Toggle installed filter |
| 2 | Toggle CLI tools filter |
| 3 | Toggle open source filter |
Actions
| Key | Action |
|---|---|
| a | Open tracked agents picker |
| o | Open docs in browser |
| r | Open GitHub repo |
| c | Copy agent name |
By default, models tracks 4 popular agents: Claude Code, Codex, Gemini CLI, and OpenCode.
Press a in the Agents tab to open the picker and customize which agents you track. Your preferences are saved to ~/.config/models/config.toml.
You can also add custom agents not in the catalog:
# ~/.config/models/config.toml
[[agents.custom]]
name = "My Agent"
repo = "owner/repo"
binary = "my-agent"
version_command = ["--version"]
See Custom Agents for the full reference.
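How models performs version detection internally is not described here. As a rough illustration of what the `binary` and `version_command` fields make possible, the Python sketch below runs the configured command and extracts the first version-like token from its output. The function names and the version regex are assumptions made for this example.

```python
import re
import shutil
import subprocess

# Assumption: a version looks like MAJOR.MINOR or MAJOR.MINOR.PATCH.
VERSION_RE = re.compile(r"\d+\.\d+(?:\.\d+)?")


def extract_version(output: str):
    """Return the first version-like token in the text, or None."""
    found = VERSION_RE.search(output)
    return found.group(0) if found else None


def detect_version(binary: str, version_command: list):
    """Run `<binary> <version_command...>` and parse a version from its output.

    Returns None when the binary is not on PATH, cannot be run, or times out.
    """
    path = shutil.which(binary)
    if path is None:
        return None  # not installed
    try:
        result = subprocess.run(
            [path, *version_command],
            capture_output=True,
            text=True,
            timeout=5,
            check=False,  # some tools exit non-zero yet still print a version
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print the version to stderr, so search both streams.
    return extract_version(result.stdout + result.stderr)
```

With the config above, the equivalent call would be `detect_version("my-agent", ["--version"])`.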
Benchmarks Tab
Quick Sort (press again to toggle direction)
| Key | Action |
|---|---|
| 1 | Sort by Intelligence index |
| 2 | Sort by Release date |
| 3 | Sort by Speed (tok/s) |
Filters & Grouping
| Key | Action |
|---|---|
| 4 | Cycle source filter (All / Open / Closed) |
| 5 | Toggle region grouping |
| 6 | Toggle type grouping |
Sort (full cycle)
| Key | Action |
|---|---|
| s | Cycle through all 20 sort columns |
| S | Toggle sort direction (asc/desc) |
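The sort cycle includes input, output, and blended price per million tokens. A blended price is a weighted average of the input and output prices; the Python sketch below shows the arithmetic and a sort on the result. The 3:1 input-to-output weighting used as the default is an assumption, since the README does not state which weighting the tool applies, and the second entry in the usage example is made up.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of input and output price per million tokens.

    The 3:1 default weighting is an assumption for illustration.
    """
    total = input_weight + output_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (input_per_m * input_weight + output_per_m * output_weight) / total


# (name, input $/M, output $/M); the second row is a hypothetical entry.
entries = [
    ("claude-opus-4-5", 5.00, 25.00),
    ("hypothetical-small-model", 1.00, 2.00),
]

# Sort ascending by blended price, cheapest first.
by_blended = sorted(entries, key=lambda e: blended_price(e[1], e[2]))
```

With $5.00 input and $25.00 output, a 3:1 weighting gives (3 × 5 + 25) / 4 = $10.00 per million tokens.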
Actions
| Key | Action |
|---|---|
| c | Copy benchmark name |
| o | Open Artificial Analysis page |
models list providers
# All models
models list models
# Models from a specific provider
models list models anthropic
models show claude-opus-4-5-20251101
Claude Opus 4.5
===============
ID: claude-opus-4-5-20251101
Provider: Anthropic (anthropic)
Family: claude-opus
Limits
------
Context: 200k tokens
Max Output: 64k tokens
Pricing (per million tokens)
----------------------------
Input: $5.00
Output: $25.00
Cache Read: $0.50
Cache Write: $6.25
Capabilities
------------
Reasoning: Yes
Tool Use: Yes
Attachments: Yes
Modalities: text, image, pdf -> text
Metadata
--------
Released: 2025-11-01
Updated: 2025-11-01
Knowledge: 2025-03-31
Open Weights: No
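The same record can be consumed from a script, since the CLI commands accept a --json flag. The Python sketch below wraps that call and parses the result. The JSON schema is not documented here, so the key names in the usage comment ("id", "provider") are assumptions; inspect the real output before relying on them. The helper names are hypothetical.

```python
import json
import subprocess


def run_models_json(args, runner=subprocess.run):
    """Run `models <args> --json` and return the parsed JSON document.

    `runner` is injectable so the function can be exercised without the CLI.
    """
    result = runner(
        ["models", *args, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(result.stdout)


def pick(record, *keys):
    """Keep only the requested keys that are present in a JSON object."""
    return {key: record[key] for key in keys if key in record}


# Usage (requires the models binary on PATH). The key names are assumptions:
#   record = run_models_json(["show", "claude-opus-4-5"])
#   print(pick(record, "id", "provider"))
```

Because missing keys are skipped by `pick`, the script keeps working if the real schema uses different field names; it simply returns fewer fields.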
models search "gpt-4"
models search "claude opus"
All commands support --json for scripting:
models list providers --json
models show claude-opus-4-5 --json
models search "llama" --json
Many thanks to the projects below; this application couldn't have been made without them doing the legwork. Shout out to the sources:
- Model data: Fetched from models.dev, an open-source database of AI models maintained by SST
- Benchmark data: Fetched from Artificial Analysis — quality indexes, benchmark scores, speed, and pricing for ~400 model entries
- Agent data: Curated catalog in data/agents.json; contributions welcome!
- GitHub data: Fetched from GitHub API (stars, releases, changelogs)
MIT
Alternative AI tools for models
Similar Open Source Tools
tokscale
Tokscale is a high-performance CLI tool and visualization dashboard for tracking token usage and costs across multiple AI coding agents. It helps monitor and analyze token consumption from various AI coding tools, providing real-time pricing calculations using LiteLLM's pricing data. Inspired by the Kardashev scale, Tokscale measures token consumption as users scale the ranks of AI-augmented development. It offers interactive TUI mode, multi-platform support, real-time pricing, detailed breakdowns, web visualization, flexible filtering, and social platform features.
llamafarm
LlamaFarm is a comprehensive AI framework that empowers users to build powerful AI applications locally, with full control over costs and deployment options. It provides modular components for RAG systems, vector databases, model management, prompt engineering, and fine-tuning. Users can create differentiated AI products without needing extensive ML expertise, using simple CLI commands and YAML configs. The framework supports local-first development, production-ready components, strategy-based configuration, and deployment anywhere from laptops to the cloud.
llm-checker
LLM Checker is an AI-powered CLI tool that analyzes your hardware to recommend optimal LLM models. It features deterministic scoring across 35+ curated models with hardware-calibrated memory estimation. The tool helps users understand memory bandwidth, VRAM limits, and performance characteristics to choose the right LLM for their hardware. It provides actionable recommendations in seconds by scoring compatible models across four dimensions: Quality, Speed, Fit, and Context. LLM Checker is designed to work on any Node.js 16+ system, with optional SQLite search features for advanced functionality.
oh-my-pi
oh-my-pi is an AI coding agent for the terminal, providing tools for interactive coding, AI-powered git commits, Python code execution, LSP integration, time-traveling streamed rules, interactive code review, task management, interactive questioning, custom TypeScript slash commands, universal config discovery, MCP & plugin system, web search & fetch, SSH tool, Cursor provider integration, multi-credential support, image generation, TUI overhaul, edit fuzzy matching, and more. It offers a modern terminal interface with smart session management, supports multiple AI providers, and includes various tools for coding, task management, code review, and interactive questioning.
dexto
Dexto is a lightweight runtime for creating and running AI agents that turn natural language into real-world actions. It serves as the missing intelligence layer for building AI applications, standalone chatbots, or as the reasoning engine inside larger products. Dexto features a powerful CLI and Web UI for running AI agents, supports multiple interfaces, allows hot-swapping of LLMs from various providers, connects to remote tool servers via the Model Context Protocol, is config-driven with version-controlled YAML, offers production-ready core features, extensibility for custom services, and enables multi-agent collaboration via MCP and A2A.
claude-talk-to-figma-mcp
A Model Context Protocol (MCP) plugin named Claude Talk to Figma MCP that enables Claude Desktop and other AI tools to interact directly with Figma for AI-assisted design capabilities. It provides document interaction, element creation, smart modifications, text mastery, and component integration. Users can connect the plugin to Figma, start designing, and utilize various tools for document analysis, element creation, modification, text manipulation, and component management. The project offers installation instructions, AI client configuration options, usage patterns, command references, troubleshooting support, testing guidelines, architecture overview, contribution guidelines, version history, and licensing information.
ruby_llm-agents
RubyLLM::Agents is a production-ready Rails engine for building, managing, and monitoring LLM-powered AI agents. It seamlessly integrates with Rails apps, providing features like automatic execution tracking, cost analytics, budget controls, and a real-time dashboard. Users can build intelligent AI agents in Ruby using a clean DSL and support various LLM providers like OpenAI GPT-4, Anthropic Claude, and Google Gemini. The engine offers features such as agent DSL configuration, execution tracking, cost analytics, reliability with retries and fallbacks, budget controls, multi-tenancy support, async execution with Ruby fibers, real-time dashboard, streaming, conversation history, image operations, alerts, and more.
mindnlp
MindNLP is an open-source NLP library based on MindSpore. It provides a platform for solving natural language processing tasks, containing many common approaches in NLP. It can help researchers and developers to construct and train models more conveniently and rapidly. Key features of MindNLP include:
- Comprehensive data processing: Several classical NLP datasets are packaged into a friendly module for easy use, such as Multi30k, SQuAD, CoNLL, etc.
- Friendly NLP model toolset: MindNLP provides various configurable components, making it easy to customize models.
- Easy-to-use engine: MindNLP simplifies the complicated training process in MindSpore. It supports Trainer and Evaluator interfaces to train and evaluate models easily.
MindNLP supports a wide range of NLP tasks, including:
- Language modeling
- Machine translation
- Question answering
- Sentiment analysis
- Sequence labeling
- Summarization
MindNLP also supports industry-leading Large Language Models (LLMs), including Llama, GLM, RWKV, etc. Support related to large language models, including pre-training, fine-tuning, and inference demo examples, can be found in the "llm" directory. To install MindNLP, you can either install it from PyPI, download the daily build wheel, or install it from source. The installation instructions are provided in the documentation. MindNLP is released under the Apache 2.0 license. If you find this project useful in your research, please consider citing the following paper:
@misc{mindnlp2022, title={{MindNLP}: a MindSpore NLP library}, author={MindNLP Contributors}, howpublished = {\url{https://github.com/mindlab-ai/mindnlp}}, year={2022} }
agentscope
AgentScope is a multi-agent platform designed to empower developers to build multi-agent applications with large-scale models. It features three high-level capabilities: Easy-to-Use, High Robustness, and Actor-Based Distribution. AgentScope provides a list of `ModelWrapper` to support both local model services and third-party model APIs, including OpenAI API, DashScope API, Gemini API, and ollama. It also enables developers to rapidly deploy local model services using libraries such as ollama (CPU inference), Flask + Transformers, Flask + ModelScope, FastChat, and vllm. AgentScope supports various services, including Web Search, Data Query, Retrieval, Code Execution, File Operation, and Text Processing. Example applications include Conversation, Game, and Distribution. AgentScope is released under Apache License 2.0 and welcomes contributions.
everything-claude-code
The 'Everything Claude Code' repository is a comprehensive collection of production-ready agents, skills, hooks, commands, rules, and MCP configurations developed over 10+ months. It includes guides for setup, foundations, and philosophy, as well as detailed explanations of various topics such as token optimization, memory persistence, continuous learning, verification loops, parallelization, and subagent orchestration. The repository also provides updates on bug fixes, multi-language rules, installation wizard, PM2 support, OpenCode plugin integration, unified commands and skills, and cross-platform support. It offers a quick start guide for installation, ecosystem tools like Skill Creator and Continuous Learning v2, requirements for CLI version compatibility, key concepts like agents, skills, hooks, and rules, running tests, contributing guidelines, OpenCode support, background information, important notes on context window management and customization, star history chart, and relevant links.
axonhub
AxonHub is an all-in-one AI development platform that serves as an AI gateway allowing users to switch between model providers without changing any code. It provides features like vendor lock-in prevention, integration simplification, observability enhancement, and cost control. Users can access any model using any SDK with zero code changes. The platform offers full request tracing, enterprise RBAC, smart load balancing, and real-time cost tracking. AxonHub supports multiple databases, provides a unified API gateway, and offers flexible model management and API key creation for authentication. It also integrates with various AI coding tools and SDKs for seamless usage.
deepfabric
DeepFabric is a CLI tool and SDK designed for researchers and developers to generate high-quality synthetic datasets at scale using large language models. It leverages a graph and tree-based architecture to create diverse and domain-specific datasets while minimizing redundancy. The tool supports generating Chain of Thought datasets for step-by-step reasoning tasks and offers multi-provider support for using different language models. DeepFabric also allows for automatic dataset upload to Hugging Face Hub and uses YAML configuration files for flexibility in dataset generation.
Edit-Banana
Edit Banana is a universal content re-editor that allows users to transform fixed content into fully manipulatable assets. Powered by SAM 3 and multimodal large models, it enables high-fidelity reconstruction while preserving original diagram details and logical relationships. The platform offers advanced segmentation, fixed multi-round VLM scanning, high-quality OCR, user system with credits, multi-user concurrency, and a web interface. Users can upload images or PDFs to get editable DrawIO (XML) or PPTX files in seconds. The project structure includes components for segmentation, text extraction, frontend, models, and scripts, with detailed installation and setup instructions provided. The tool is open-source under the Apache License 2.0, allowing commercial use and secondary development.
ai-coders-context
The @ai-coders/context repository provides the Ultimate MCP for AI Agent Orchestration, Context Engineering, and Spec-Driven Development. It simplifies context engineering for AI by offering a universal process called PREVC, which consists of Planning, Review, Execution, Validation, and Confirmation steps. The tool aims to address the problem of context fragmentation by introducing a single `.context/` directory that works universally across different tools. It enables users to create structured documentation, generate agent playbooks, manage workflows, provide on-demand expertise, and sync across various AI tools. The tool follows a structured, spec-driven development approach to improve AI output quality and ensure reproducible results across projects.
augustus
Augustus is a Go-based LLM vulnerability scanner designed for security professionals to test large language models against a wide range of adversarial attacks. It integrates with 28 LLM providers, covers 210+ adversarial attacks including prompt injection, jailbreaks, encoding exploits, and data extraction, and produces actionable vulnerability reports. The tool is built for production security testing with features like concurrent scanning, rate limiting, retry logic, and timeout handling out of the box.
For similar tasks
sd-civitai-browser-plus
sd-civitai-browser-plus is an extension designed for Automatic1111's Stable Diffusion Web UI, providing features to browse models from CivitAI, check for updates, download specific model versions hassle-free, assign tags to models, access model info quickly, and download models at high speed using Aria2. The extension offers a sleek and intuitive user interface and is actively maintained, with feature requests welcome. It also addresses known issues like frozen downloads with possible solutions. The tool is actively developed with regular updates and bug fixes, ensuring a smooth user experience.
LLaVA-pp
This repository, LLaVA++, extends the visual capabilities of the LLaVA 1.5 model by incorporating the latest LLMs, Phi-3 Mini Instruct 3.8B and LLaMA-3 Instruct 8B. It provides various models for instruction-following LMMs and academic-task-oriented datasets, along with training scripts for Phi-3-V and LLaMA-3-V. The repository also includes installation instructions and acknowledgments to related open-source contributions.
For similar jobs
explain-openclaw
Explain OpenClaw is a comprehensive documentation repository for the OpenClaw framework, a self-hosted AI assistant platform. It covers various aspects such as plain English explanations, technical architecture, deployment scenarios, privacy and safety measures, security audits, worst-case security scenarios, optimizations, and AI model comparisons. The repository serves as a living knowledge base with beginner-friendly explanations and detailed technical insights for contributors.
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:
- Self-contained, with no need for a DBMS or cloud service.
- OpenAPI interface, easy to integrate with existing infrastructure (e.g., Cloud IDE).
- Supports consumer-grade GPUs.


