llm-checker

Advanced CLI tool that scans your hardware and tells you exactly which LLM or sLLM models you can run locally, with full Ollama integration.

LLM Checker is an AI-powered CLI tool that analyzes your hardware to recommend optimal LLM models. It features deterministic scoring across 35+ curated models with hardware-calibrated memory estimation. The tool helps users understand memory bandwidth, VRAM limits, and performance characteristics to choose the right LLM for their hardware. It provides actionable recommendations in seconds by scoring compatible models across four dimensions: Quality, Speed, Fit, and Context. LLM Checker is designed to work on any Node.js 16+ system, with optional SQLite search features for advanced functionality.

README:

LLM Checker

Intelligent Ollama Model Selector

AI-powered CLI that analyzes your hardware and recommends optimal LLM models
Deterministic scoring across 35+ curated models with hardware-calibrated memory estimation

Installation · Quick Start · Commands · Scoring · Hardware


Why LLM Checker?

Choosing the right LLM for your hardware is complex. With thousands of model variants, quantization levels, and hardware configurations, finding the optimal model requires understanding memory bandwidth, VRAM limits, and performance characteristics.

LLM Checker solves this. It analyzes your system, scores every compatible model across four dimensions (Quality, Speed, Fit, Context), and delivers actionable recommendations in seconds.


Features

| Feature | Description |
|---|---|
| 35+ Curated Models | Hand-picked catalog covering all major families and sizes (1B-32B) |
| 4D Scoring Engine | Quality, Speed, Fit, Context — weighted by use case |
| Multi-GPU Hardware Detection | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU |
| Calibrated Memory Estimation | Bytes-per-parameter formula validated against real Ollama sizes |
| Zero Native Dependencies | Pure JavaScript — works on any Node.js 16+ system |
| Optional SQLite Search | Install sql.js to unlock sync, search, and smart-recommend |

Installation

# Install globally
npm install -g llm-checker

# Or run directly with npx
npx llm-checker hw-detect

Requirements:

  • Node.js 16+ (any version: 16, 18, 20, 22, 24)
  • Ollama installed for running models

Optional: For database search features (sync, search, smart-recommend):

npm install sql.js

Quick Start

# 1. Detect your hardware capabilities
llm-checker hw-detect

# 2. Get full analysis with compatible models
llm-checker check

# 3. Get intelligent recommendations by category
llm-checker recommend

# 4. (Optional) Sync full database and search
llm-checker sync
llm-checker search qwen --use-case coding

Commands

Core Commands

| Command | Description |
|---|---|
| hw-detect | Detect GPU/CPU capabilities, memory, backends |
| check | Full system analysis with compatible models and recommendations |
| recommend | Intelligent recommendations by category (coding, reasoning, multimodal, etc.) |
| installed | Rank your installed Ollama models by compatibility |

Advanced Commands (require sql.js)

| Command | Description |
|---|---|
| sync | Download the latest model catalog from the Ollama registry |
| search <query> | Search models with filters and intelligent scoring |
| smart-recommend | Advanced recommendations using the full scoring engine |

AI Commands

| Command | Description |
|---|---|
| ai-check | AI-powered model evaluation with meta-analysis |
| ai-run | AI-powered model selection and execution |

hw-detect — Hardware Analysis

llm-checker hw-detect
Summary:
  Apple M4 Pro (24GB Unified Memory)
  Tier: MEDIUM HIGH
  Max model size: 15GB
  Best backend: metal

CPU:
  Apple M4 Pro
  Cores: 12 (12 physical)
  SIMD: NEON

Metal:
  GPU Cores: 16
  Unified Memory: 24GB
  Memory Bandwidth: 273GB/s

recommend — Category Recommendations

llm-checker recommend
INTELLIGENT RECOMMENDATIONS BY CATEGORY
Hardware Tier: HIGH | Models Analyzed: 205

Coding:
   qwen2.5-coder:14b (14B)
   Score: 78/100
   Command: ollama pull qwen2.5-coder:14b

Reasoning:
   deepseek-r1:14b (14B)
   Score: 86/100
   Command: ollama pull deepseek-r1:14b

Multimodal:
   llama3.2-vision:11b (11B)
   Score: 83/100
   Command: ollama pull llama3.2-vision:11b

search — Model Search

llm-checker search llama -l 5
llm-checker search coding --use-case coding
llm-checker search qwen --quant Q4_K_M --max-size 8

| Option | Description |
|---|---|
| -l, --limit <n> | Number of results (default: 10) |
| -u, --use-case <type> | Optimize for: general, coding, chat, reasoning, creative, fast |
| --max-size <gb> | Maximum model size in GB |
| --quant <type> | Filter by quantization: Q4_K_M, Q8_0, FP16, etc. |
| --family <name> | Filter by model family |

Model Catalog

The built-in catalog includes 35+ models from the most popular Ollama families:

| Family | Models | Best For |
|---|---|---|
| Qwen 2.5/3 | 7B, 14B, Coder 7B/14B/32B, VL 3B/7B | Coding, general, vision |
| Llama 3.x | 1B, 3B, 8B, Vision 11B | General, chat, multimodal |
| DeepSeek | R1 8B/14B/32B, Coder V2 16B | Reasoning, coding |
| Phi-4 | 14B | Reasoning, math |
| Gemma 2 | 2B, 9B | General, efficient |
| Mistral | 7B, Nemo 12B | Creative, chat |
| CodeLlama | 7B, 13B | Coding |
| LLaVA | 7B, 13B | Vision |
| Embeddings | nomic-embed-text, mxbai-embed-large, bge-m3, all-minilm | RAG, search |

Models are automatically combined with any locally installed Ollama models for scoring.
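
For illustration, here is a minimal sketch of how that merge could work, deduplicating by model tag. The mergedPool helper and the parsing of ollama list output are assumptions for this example, not the tool's actual code:

// Hedged sketch: combine the built-in catalog with locally installed Ollama
// models, deduplicating by tag. The tool's real auto-dedup logic may differ.
const { execSync } = require('node:child_process');

function mergedPool(catalog) {
  const seen = new Set(catalog.map(m => m.tag));   // e.g. 'qwen2.5-coder:14b'
  const lines = execSync('ollama list', { encoding: 'utf8' })
    .trim()
    .split('\n')
    .slice(1);                                     // skip the header row
  const installed = lines
    .map(line => line.split(/\s+/)[0])             // first column is NAME:TAG
    .filter(tag => tag && !seen.has(tag))
    .map(tag => ({ tag, installed: true }));
  return [...catalog, ...installed];
}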


Scoring System

Models are evaluated across four dimensions, weighted by use case:

| Dimension | Description |
|---|---|
| Q (Quality) | Model family reputation + parameter count + quantization penalty |
| S (Speed) | Estimated tokens/sec based on hardware backend and model size |
| F (Fit) | Memory utilization efficiency (how well it fits in available RAM) |
| C (Context) | Context window capability vs. target context length |
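
The exact per-dimension formulas are internal to the selector, but as a rough illustration, one plausible shape for the Fit dimension is sketched below. The 70% sweet spot and the linear falloff are assumptions, not the actual formula:

// Assumed shape for Fit: peak score when a model uses most of the memory
// budget while leaving headroom; zero when it does not fit at all.
function fitScore(modelGB, budgetGB) {
  if (modelGB > budgetGB) return 0;       // model does not fit
  const utilization = modelGB / budgetGB; // in (0, 1]
  const SWEET_SPOT = 0.7;                 // assumed headroom target
  const distance = Math.abs(utilization - SWEET_SPOT) / SWEET_SPOT;
  return Math.round(Math.max(0, 1 - distance) * 100);
}

console.log(fitScore(9, 15));  // 86: ~60% utilization, near the sweet spot
console.log(fitScore(16, 15)); // 0: does not fit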

Scoring Weights by Use Case

Three scoring systems are available, each optimized for different workflows:

Deterministic Selector (primary — used by check and recommend):

| Category | Quality | Speed | Fit | Context |
|---|---|---|---|---|
| general | 45% | 35% | 15% | 5% |
| coding | 55% | 20% | 15% | 10% |
| reasoning | 60% | 10% | 20% | 10% |
| multimodal | 50% | 15% | 20% | 15% |

Scoring Engine (used by smart-recommend and search):

| Use Case | Quality | Speed | Fit | Context |
|---|---|---|---|---|
| general | 40% | 35% | 15% | 10% |
| coding | 55% | 20% | 15% | 10% |
| reasoning | 60% | 15% | 10% | 15% |
| chat | 40% | 40% | 15% | 5% |
| fast | 25% | 55% | 15% | 5% |
| quality | 65% | 10% | 15% | 10% |

All weights are centralized in src/models/scoring-config.js.
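
As a sketch of how those weights might be applied, the snippet below combines the four dimension scores using the deterministic selector's coding and reasoning rows. The WEIGHTS object and score helper are illustrative, not the actual scoring-config.js exports:

// Illustrative weighted sum over the four dimension scores (0-100 each),
// using the deterministic-selector rows from the table above.
const WEIGHTS = {
  coding:    { quality: 0.55, speed: 0.20, fit: 0.15, context: 0.10 },
  reasoning: { quality: 0.60, speed: 0.10, fit: 0.20, context: 0.10 },
};

function score(dims, category) {
  const w = WEIGHTS[category];
  return Math.round(
    dims.quality * w.quality +
    dims.speed   * w.speed +
    dims.fit     * w.fit +
    dims.context * w.context
  );
}

// Example: a strong coding model on mid-range hardware.
console.log(score({ quality: 90, speed: 55, fit: 80, context: 70 }, 'coding')); // 80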

Memory Estimation

Memory requirements are calculated using calibrated bytes-per-parameter values:

| Quantization | Bytes/Param | 7B Model | 14B Model | 32B Model |
|---|---|---|---|---|
| Q8_0 | 1.05 | ~8 GB | ~16 GB | ~35 GB |
| Q4_K_M | 0.58 | ~5 GB | ~9 GB | ~20 GB |
| Q3_K | 0.48 | ~4 GB | ~8 GB | ~17 GB |

The selector automatically picks the best quantization that fits your available memory.
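
A minimal sketch of that calculation follows. The flat 1 GB overhead is an assumption chosen to approximate the rounded figures in the table; the real calibration may use per-model offsets:

// Bytes-per-parameter estimate from the table above, plus an assumed flat
// 1 GB of runtime overhead (KV cache, buffers) to match the rounded values.
const BYTES_PER_PARAM = { Q8_0: 1.05, Q4_K_M: 0.58, Q3_K: 0.48 };

function estimateGB(paramsBillion, quant) {
  // billions of params x bytes/param gives GB directly (the 1e9 cancels out)
  return paramsBillion * BYTES_PER_PARAM[quant] + 1;
}

// Pick the highest-quality quantization that fits the memory budget.
function bestQuant(paramsBillion, budgetGB) {
  for (const quant of ['Q8_0', 'Q4_K_M', 'Q3_K']) {
    if (estimateGB(paramsBillion, quant) <= budgetGB) return quant;
  }
  return null; // does not fit at any supported quantization
}

console.log(estimateGB(14, 'Q4_K_M').toFixed(1)); // '9.1' -> the ~9 GB row above
console.log(bestQuant(14, 15));                   // 'Q4_K_M' (Q8_0 needs ~16 GB)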


Supported Hardware

Apple Silicon
  • M1, M1 Pro, M1 Max, M1 Ultra
  • M2, M2 Pro, M2 Max, M2 Ultra
  • M3, M3 Pro, M3 Max
  • M4, M4 Pro, M4 Max
NVIDIA (CUDA)
  • RTX 50 Series (5090, 5080, 5070 Ti, 5070)
  • RTX 40 Series (4090, 4080, 4070 Ti, 4070, 4060 Ti, 4060)
  • RTX 30 Series (3090 Ti, 3090, 3080 Ti, 3080, 3070 Ti, 3070, 3060 Ti, 3060)
  • Data Center (H100, A100, A10, L40, T4)
AMD (ROCm)
  • RX 7900 XTX, 7900 XT, 7800 XT, 7700 XT
  • RX 6900 XT, 6800 XT, 6800
  • Instinct MI300X, MI300A, MI250X, MI210
Intel
  • Arc A770, A750, A580, A380
  • Integrated Iris Xe, UHD Graphics
CPU Backends
  • AVX-512 + AMX (Intel Sapphire Rapids, Emerald Rapids)
  • AVX-512 (Intel Ice Lake+, AMD Zen 4)
  • AVX2 (Most modern x86 CPUs)
  • ARM NEON (Apple Silicon, AWS Graviton, Ampere Altra)

Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Hardware       │────>│  Model          │────>│  Deterministic  │
│  Detection      │     │  Catalog (35+)  │     │  Selector       │
└─────────────────┘     └─────────────────┘     └─────────────────┘
        │                       │                       │
   Detects GPU/CPU         JSON catalog +           4D scoring
   Memory / Backend        Installed models         Per-category weights
   Usable memory calc      Auto-dedup               Memory calibration
                                                        │
                                                        v
                                               ┌─────────────────┐
                                               │  Ranked         │
                                               │  Recommendations│
                                               └─────────────────┘

Selector Pipeline (a toy version is sketched in code after the list):

  1. Hardware profiling — CPU, GPU, RAM, acceleration backend
  2. Model pool — Merge catalog + installed Ollama models (deduped)
  3. Category filter — Keep models relevant to the use case
  4. Quantization selection — Best quant that fits in memory budget
  5. 4D scoring — Q, S, F, C with category-specific weights
  6. Ranking — Top N candidates returned
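
Put together, a runnable toy version of steps 3 through 6 could look like the snippet below, reusing the score and bestQuant sketches from earlier sections. The catalog entries, placeholder dimension values, and rankModels helper are all illustrative, not the deterministic-selector.js API:

// Toy end-to-end run of the pipeline; steps 1-2 (hardware profiling and the
// merged model pool) are stubbed with fixed values.
const pool = [
  { tag: 'qwen2.5-coder:14b', params: 14, categories: ['coding'], quality: 90 },
  { tag: 'codellama:7b',      params: 7,  categories: ['coding'], quality: 70 },
  { tag: 'llava:13b',         params: 13, categories: ['multimodal'], quality: 75 },
];

function rankModels(pool, hw, category, topN = 3) {
  return pool
    .filter(m => m.categories.includes(category))                  // 3. category filter
    .map(m => ({ ...m, quant: bestQuant(m.params, hw.budgetGB) })) // 4. quant selection
    .filter(m => m.quant !== null)                                 // drop models that cannot fit
    .map(m => ({                                                   // 5. 4D scoring
      ...m,
      // Speed/Fit/Context placeholders keep the sketch runnable; the real
      // selector derives them from the hardware profile.
      total: score({ quality: m.quality, speed: 60, fit: 70, context: 80 }, category),
    }))
    .sort((a, b) => b.total - a.total)                             // 6. ranking
    .slice(0, topN);
}

console.log(rankModels(pool, { budgetGB: 15 }, 'coding'));
// -> qwen2.5-coder:14b (Q4_K_M, total 80) ahead of codellama:7b (Q8_0, total 69)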

Examples

Detect your hardware:

llm-checker hw-detect

Get recommendations for all categories:

llm-checker recommend

Full system analysis with compatible models:

llm-checker check

Find the best coding model:

llm-checker recommend --category coding

Search for small, fast models under 5GB:

llm-checker search "7b" --max-size 5 --use-case fast

Get high-quality reasoning models:

llm-checker smart-recommend --use-case reasoning

Development

git clone https://github.com/Pavelevich/llm-checker.git
cd llm-checker
npm install
node bin/enhanced_cli.js hw-detect

Project Structure

src/
  models/
    deterministic-selector.js  # Primary selection algorithm
    scoring-config.js          # Centralized scoring weights
    scoring-engine.js          # Advanced scoring (smart-recommend)
    catalog.json               # Curated model catalog (35+ models)
  ai/
    multi-objective-selector.js  # Multi-objective optimization
    ai-check-selector.js        # LLM-based evaluation
  hardware/
    detector.js                # Hardware detection
    unified-detector.js        # Cross-platform detection
  data/
    model-database.js          # SQLite storage (optional)
    sync-manager.js            # Database sync from Ollama registry
bin/
  enhanced_cli.js              # CLI entry point

License

MIT License — see LICENSE for details.


GitHub · npm · Issues
