llm-checker
Advanced CLI tool that scans your hardware and tells you exactly which LLM or sLLM models you can run locally, with full Ollama integration.
Stars: 161
LLM Checker is an AI-powered CLI tool that analyzes your hardware to recommend optimal LLM models. It features deterministic scoring across 35+ curated models with hardware-calibrated memory estimation. The tool helps users understand memory bandwidth, VRAM limits, and performance characteristics to choose the right LLM for their hardware. It provides actionable recommendations in seconds by scoring compatible models across four dimensions: Quality, Speed, Fit, and Context. LLM Checker is designed to work on any Node.js 16+ system, with optional SQLite search features for advanced functionality.
README:
Intelligent Ollama Model Selector
AI-powered CLI that analyzes your hardware and recommends optimal LLM models
Deterministic scoring across 35+ curated models with hardware-calibrated memory estimation
Installation • Quick Start • Commands • Scoring • Hardware
Choosing the right LLM for your hardware is complex. With thousands of model variants, quantization levels, and hardware configurations, finding the optimal model requires understanding memory bandwidth, VRAM limits, and performance characteristics.
LLM Checker solves this. It analyzes your system, scores every compatible model across four dimensions (Quality, Speed, Fit, Context), and delivers actionable recommendations in seconds.
| | Feature | Description |
|---|---|---|
| 35+ | Curated Models | Hand-picked catalog covering all major families and sizes (1B-32B) |
| 4D | Scoring Engine | Quality, Speed, Fit, Context — weighted by use case |
| Multi-GPU | Hardware Detection | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU |
| Calibrated | Memory Estimation | Bytes-per-parameter formula validated against real Ollama sizes |
| Zero | Native Dependencies | Pure JavaScript — works on any Node.js 16+ system |
| Optional | SQLite Search | Install sql.js to unlock sync, search, and smart-recommend |
```bash
# Install globally
npm install -g llm-checker

# Or run directly with npx
npx llm-checker hw-detect
```

Requirements:
- Node.js 16+ (any version: 16, 18, 20, 22, 24)
- Ollama installed for running models

Optional: for database search features (sync, search, smart-recommend):

```bash
npm install sql.js
```

```bash
# 1. Detect your hardware capabilities
llm-checker hw-detect

# 2. Get full analysis with compatible models
llm-checker check

# 3. Get intelligent recommendations by category
llm-checker recommend

# 4. (Optional) Sync full database and search
llm-checker sync
llm-checker search qwen --use-case coding
```

| Command | Description |
|---|---|
| `hw-detect` | Detect GPU/CPU capabilities, memory, backends |
| `check` | Full system analysis with compatible models and recommendations |
| `recommend` | Intelligent recommendations by category (coding, reasoning, multimodal, etc.) |
| `installed` | Rank your installed Ollama models by compatibility |
| Command | Description |
|---|---|
| `sync` | Download the latest model catalog from the Ollama registry |
| `search <query>` | Search models with filters and intelligent scoring |
| `smart-recommend` | Advanced recommendations using the full scoring engine |
| Command | Description |
|---|---|
| `ai-check` | AI-powered model evaluation with meta-analysis |
| `ai-run` | AI-powered model selection and execution |
```bash
llm-checker hw-detect
```

```
Summary:
  Apple M4 Pro (24GB Unified Memory)
  Tier: MEDIUM HIGH
  Max model size: 15GB
  Best backend: metal

CPU:
  Apple M4 Pro
  Cores: 12 (12 physical)
  SIMD: NEON

Metal:
  GPU Cores: 16
  Unified Memory: 24GB
  Memory Bandwidth: 273GB/s
```
```bash
llm-checker recommend
```

```
INTELLIGENT RECOMMENDATIONS BY CATEGORY
Hardware Tier: HIGH | Models Analyzed: 205

Coding:
  qwen2.5-coder:14b (14B)
  Score: 78/100
  Command: ollama pull qwen2.5-coder:14b

Reasoning:
  deepseek-r1:14b (14B)
  Score: 86/100
  Command: ollama pull deepseek-r1:14b

Multimodal:
  llama3.2-vision:11b (11B)
  Score: 83/100
  Command: ollama pull llama3.2-vision:11b
```
```bash
llm-checker search llama -l 5
llm-checker search coding --use-case coding
llm-checker search qwen --quant Q4_K_M --max-size 8
```

| Option | Description |
|---|---|
| `-l, --limit <n>` | Number of results (default: 10) |
| `-u, --use-case <type>` | Optimize for: general, coding, chat, reasoning, creative, fast |
| `--max-size <gb>` | Maximum model size in GB |
| `--quant <type>` | Filter by quantization: Q4_K_M, Q8_0, FP16, etc. |
| `--family <name>` | Filter by model family |
The built-in catalog includes 35+ models from the most popular Ollama families:
| Family | Models | Best For |
|---|---|---|
| Qwen 2.5/3 | 7B, 14B, Coder 7B/14B/32B, VL 3B/7B | Coding, general, vision |
| Llama 3.x | 1B, 3B, 8B, Vision 11B | General, chat, multimodal |
| DeepSeek | R1 8B/14B/32B, Coder V2 16B | Reasoning, coding |
| Phi-4 | 14B | Reasoning, math |
| Gemma 2 | 2B, 9B | General, efficient |
| Mistral | 7B, Nemo 12B | Creative, chat |
| CodeLlama | 7B, 13B | Coding |
| LLaVA | 7B, 13B | Vision |
| Embeddings | nomic-embed-text, mxbai-embed-large, bge-m3, all-minilm | RAG, search |
Models are automatically combined with any locally installed Ollama models for scoring.
Models are evaluated across four dimensions, weighted by use case:
| Dimension | Description |
|---|---|
| Quality (Q) | Model family reputation + parameter count + quantization penalty |
| Speed (S) | Estimated tokens/sec based on hardware backend and model size |
| Fit (F) | Memory utilization efficiency (how well it fits in available RAM) |
| Context (C) | Context window capability vs. target context length |
Three scoring systems are available, each optimized for different workflows:
Deterministic Selector (primary — used by `check` and `recommend`):

| Category | Quality | Speed | Fit | Context |
|---|---|---|---|---|
| `general` | 45% | 35% | 15% | 5% |
| `coding` | 55% | 20% | 15% | 10% |
| `reasoning` | 60% | 10% | 20% | 10% |
| `multimodal` | 50% | 15% | 20% | 15% |
Scoring Engine (used by `smart-recommend` and `search`):

| Use Case | Quality | Speed | Fit | Context |
|---|---|---|---|---|
| `general` | 40% | 35% | 15% | 10% |
| `coding` | 55% | 20% | 15% | 10% |
| `reasoning` | 60% | 15% | 10% | 15% |
| `chat` | 40% | 40% | 15% | 5% |
| `fast` | 25% | 55% | 15% | 5% |
| `quality` | 65% | 10% | 15% | 10% |
All weights are centralized in src/models/scoring-config.js.
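To make the weighting concrete, here is a minimal sketch of how four normalized dimension scores could be combined into a single 0-100 score. The weight values mirror the Deterministic Selector table above; the `WEIGHTS` object and `scoreModel` function are illustrative names, not the actual exports of `src/models/scoring-config.js`.

```javascript
// Illustrative sketch only: field and function names are assumptions;
// the weight values come from the Deterministic Selector table above.
const WEIGHTS = {
  general:    { quality: 0.45, speed: 0.35, fit: 0.15, context: 0.05 },
  coding:     { quality: 0.55, speed: 0.20, fit: 0.15, context: 0.10 },
  reasoning:  { quality: 0.60, speed: 0.10, fit: 0.20, context: 0.10 },
  multimodal: { quality: 0.50, speed: 0.15, fit: 0.20, context: 0.15 },
};

// Combine per-dimension scores (assumed to be normalized to 0-100)
// into a single weighted score for the given category.
function scoreModel(dims, category = 'general') {
  const w = WEIGHTS[category] ?? WEIGHTS.general;
  return Math.round(
    dims.quality * w.quality +
    dims.speed * w.speed +
    dims.fit * w.fit +
    dims.context * w.context
  );
}

// Example: a strong coding model on mid-range hardware.
console.log(scoreModel({ quality: 85, speed: 60, fit: 90, context: 70 }, 'coding')); // 79
```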
Memory requirements are calculated using calibrated bytes-per-parameter values:
| Quantization | Bytes/Param | 7B Model | 14B Model | 32B Model |
|---|---|---|---|---|
| Q8_0 | 1.05 | ~8 GB | ~16 GB | ~35 GB |
| Q4_K_M | 0.58 | ~5 GB | ~9 GB | ~20 GB |
| Q3_K | 0.48 | ~4 GB | ~8 GB | ~17 GB |
The selector automatically picks the best quantization that fits your available memory.
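As a rough sketch of that calculation (the constants are the bytes-per-parameter values from the table above; the helper names are hypothetical), the estimate and the fall-back to a smaller quant look roughly like this. The table's figures run a little higher than the raw weight estimate, presumably because they include runtime overhead such as the KV cache.

```javascript
// Bytes-per-parameter values from the table above; identifiers are illustrative.
const BYTES_PER_PARAM = { Q8_0: 1.05, Q4_K_M: 0.58, Q3_K: 0.48 };

// Estimated weight memory (in GiB) for a model with `paramsB` billion parameters.
function estimateMemoryGB(paramsB, quant) {
  return (paramsB * 1e9 * BYTES_PER_PARAM[quant]) / 1024 ** 3;
}

// Walk from the highest-quality quant down and keep the first one that fits
// within the usable memory budget; return null if nothing fits.
function pickQuant(paramsB, usableMemoryGB) {
  for (const quant of ['Q8_0', 'Q4_K_M', 'Q3_K']) {
    if (estimateMemoryGB(paramsB, quant) <= usableMemoryGB) return quant;
  }
  return null;
}

console.log(estimateMemoryGB(14, 'Q4_K_M').toFixed(1)); // ~7.6 GiB for the weights alone
console.log(pickQuant(32, 15));                         // 'Q3_K' on a 15 GB budget
```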
Apple Silicon
- M1, M1 Pro, M1 Max, M1 Ultra
- M2, M2 Pro, M2 Max, M2 Ultra
- M3, M3 Pro, M3 Max
- M4, M4 Pro, M4 Max
NVIDIA (CUDA)
- RTX 50 Series (5090, 5080, 5070 Ti, 5070)
- RTX 40 Series (4090, 4080, 4070 Ti, 4070, 4060 Ti, 4060)
- RTX 30 Series (3090 Ti, 3090, 3080 Ti, 3080, 3070 Ti, 3070, 3060 Ti, 3060)
- Data Center (H100, A100, A10, L40, T4)
AMD (ROCm)
- RX 7900 XTX, 7900 XT, 7800 XT, 7700 XT
- RX 6900 XT, 6800 XT, 6800
- Instinct MI300X, MI300A, MI250X, MI210
Intel
- Arc A770, A750, A580, A380
- Integrated Iris Xe, UHD Graphics
CPU Backends
- AVX-512 + AMX (Intel Sapphire Rapids, Emerald Rapids)
- AVX-512 (Intel Ice Lake+, AMD Zen 4)
- AVX2 (Most modern x86 CPUs)
- ARM NEON (Apple Silicon, AWS Graviton, Ampere Altra)
```
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│    Hardware     │─────>│      Model      │─────>│  Deterministic  │
│    Detection    │      │  Catalog (35+)  │      │    Selector     │
└─────────────────┘      └─────────────────┘      └─────────────────┘
         │                        │                        │
  Detects GPU/CPU          JSON catalog +           4D scoring
  Memory / Backend         Installed models         Per-category weights
  Usable memory calc       Auto-dedup               Memory calibration
                                                           │
                                                           v
                                                  ┌─────────────────┐
                                                  │     Ranked      │
                                                  │ Recommendations │
                                                  └─────────────────┘
```
Selector Pipeline:
- Hardware profiling — CPU, GPU, RAM, acceleration backend
- Model pool — Merge catalog + installed Ollama models (deduped)
- Category filter — Keep models relevant to the use case
- Quantization selection — Best quant that fits in memory budget
- 4D scoring — Q, S, F, C with category-specific weights
- Ranking — Top N candidates returned
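A compact, self-contained sketch of the merge/filter/score/rank flow described above. All identifiers and the toy data are illustrative, not the project's real modules; scoring is abstracted into a caller-supplied function like the one sketched earlier.

```javascript
// Illustrative pipeline sketch: merge catalog + installed models, dedupe by
// name, filter by category, score, and return the top-N ranked candidates.
function selectModels({ catalog, installed, category, scoreFn, topN = 3 }) {
  const pool = new Map();
  for (const m of [...catalog, ...installed]) pool.set(m.name, m); // dedupe (installed wins)

  return [...pool.values()]
    .filter(m => m.categories.includes(category))  // keep only relevant models
    .map(m => ({ ...m, score: scoreFn(m) }))       // 4D score with category weights
    .sort((a, b) => b.score - a.score)             // rank
    .slice(0, topN);                               // top-N recommendations
}

// Toy usage with a trivial stand-in scoring function.
const ranked = selectModels({
  catalog: [{ name: 'qwen2.5-coder:14b', categories: ['coding'], quality: 85 }],
  installed: [{ name: 'llama3.1:8b', categories: ['general', 'coding'], quality: 70 }],
  category: 'coding',
  scoreFn: m => m.quality,
});
console.log(ranked.map(m => `${m.name}: ${m.score}`)); // ['qwen2.5-coder:14b: 85', 'llama3.1:8b: 70']
```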
Detect your hardware:

```bash
llm-checker hw-detect
```

Get recommendations for all categories:

```bash
llm-checker recommend
```

Full system analysis with compatible models:

```bash
llm-checker check
```

Find the best coding model:

```bash
llm-checker recommend --category coding
```

Search for small, fast models under 5GB:

```bash
llm-checker search "7b" --max-size 5 --use-case fast
```

Get high-quality reasoning models:

```bash
llm-checker smart-recommend --use-case reasoning
```

```bash
git clone https://github.com/Pavelevich/llm-checker.git
cd llm-checker
npm install
node bin/enhanced_cli.js hw-detect
```

```
src/
  models/
    deterministic-selector.js     # Primary selection algorithm
    scoring-config.js             # Centralized scoring weights
    scoring-engine.js             # Advanced scoring (smart-recommend)
    catalog.json                  # Curated model catalog (35+ models)
  ai/
    multi-objective-selector.js   # Multi-objective optimization
    ai-check-selector.js          # LLM-based evaluation
  hardware/
    detector.js                   # Hardware detection
    unified-detector.js           # Cross-platform detection
  data/
    model-database.js             # SQLite storage (optional)
    sync-manager.js               # Database sync from Ollama registry
bin/
  enhanced_cli.js                 # CLI entry point
```
MIT License — see LICENSE for details.