Snap-Solver

An AI written-exam solving tool designed for students, test-takers, and self-learners.

Stars: 74

Snap-Solver is a revolutionary AI tool for online exam solving, designed for students, test-takers, and self-learners. With just a keystroke, it automatically captures any question on the screen, analyzes it using AI, and provides detailed answers. Whether it's complex math formulas, physics problems, coding issues, or challenges from other disciplines, Snap-Solver offers clear, accurate, and structured solutions to help you better understand and master the subject matter.

README:

Snap-Solver

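🔍 One-key screenshot, automatic problem solving - online exams have never been this simple

Core Features • Quick Start • User Guide • Architecture • Advanced Configuration • FAQ • Getting Help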

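💫 Project Overview

Snap-Solver is a revolutionary AI written-exam solving tool designed for students, test-takers, and self-learners. Press a hotkey to capture any question on the screen, have an AI model analyze it, and get a detailed answer.

Whether it is a complex math formula, a hard physics problem, a programming question, or a challenge from another discipline, Snap-Solver provides clear, accurate, well-organized solutions that help you understand and master the material.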

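✨ Core Features

📱 Cross-device collaboration
One-key screenshot: press a hotkey to view and analyze the computer screen on a mobile device
LAN sharing: deploy once, access from multiple devices, and study more efficiently

🧠 Multi-model AI support
GPT-4o/o3-mini: OpenAI's strong reasoning capabilities
Claude-3.7: Anthropic's advanced understanding and explanation
DeepSeek-v3/r1: models optimized for Chinese-language scenarios
QVQ-MAX/Qwen-VL-MAX: Chinese-developed models known for visual reasoning

🔍 Accurate recognition
OCR text recognition: accurately captures text in images
Math formula support: recognizes complex mathematical notation precisely via Mathpix

🌐 Global accessibility
VPN/proxy support: custom proxy settings to work around network access restrictions
Multilingual responses: configurable AI response language

💻 Cross-platform compatibility
Desktop support: Windows, macOS, Linux
Mobile access: use directly from a phone or tablet browser

⚙️ Highly customizable
Thinking depth control: adjust how deeply the AI analyzes
Custom prompts: tune prompts for specific subjects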

🚀 Quick Start

📋 Prerequisites

Python 3.x
At least one of the following API keys:
- OpenAI API Key
- Anthropic API Key (recommended ✅)
- DeepSeek API Key
- Alibaba API Key (first choice for users in mainland China)
- Mathpix API Key (recommended for OCR ✅)
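
The "at least one API key" requirement can be checked before launch. The sketch below is illustrative only: the environment variable names are assumptions (Snap-Solver itself takes keys through its settings page), and `configured_providers` and `has_usable_key` are hypothetical helpers. Mathpix is left out to keep the sketch short.

```python
import os

# Hypothetical mapping from provider to environment variable name.
# These names are assumptions for illustration, not Snap-Solver settings.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Alibaba": "DASHSCOPE_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose key is present and non-empty in env."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


def has_usable_key(env=None):
    """True when at least one model provider key is configured."""
    return len(configured_providers(env)) > 0
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment.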

📥 Getting Started

# Start the application
python app.py

📱 Access

Local access: open a browser and go to http://localhost:5000
LAN access: from any device on the same network, go to http://[computer IP]:5000
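
To find the address to type on a phone or tablet, something like the following may help. It is a minimal sketch, not part of Snap-Solver: `lan_ip` and `access_urls` are hypothetical helper names, and the detected address is only a best-effort guess on machines with several network interfaces.

```python
import socket


def lan_ip():
    """Best-effort guess of this machine's LAN address.

    Connecting a UDP socket sends no packets; it only asks the OS which
    local interface it would route through. Falls back to loopback.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def access_urls(ip, port=5000):
    """Build the local and LAN URLs for the given address and port."""
    return {
        "local": f"http://localhost:{port}",
        "lan": f"http://{ip}:{port}",
    }


if __name__ == "__main__":
    for name, url in access_urls(lan_ip()).items():
        print(f"{name}: {url}")
```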

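📖 User Guide

1️⃣ Initial setup

Click the ⚙️ settings icon in the upper-right corner to configure API keys and preferences

2️⃣ Capture and solve

Click the "Screenshot" button → crop the question area → choose an analysis method

3️⃣ View the answer

Watch the AI's analysis and detailed answer in real time, including its reasoning path

🎯 Example use cases

Homework exercises: capture a hard problem from a textbook or assignment and get a step-by-step solution
Debugging: capture a code error message and get suggested fixes
Exam review: analyze the questions you got wrong and understand the solution approach
Literature research: capture a dense passage from a paper and get a simplified explanation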

🔧 Architecture

graph TD
    A[User interface] --> B[Flask web service]
    B --> C{API routes}
    C --> D[Screenshot service]
    C --> E[OCR]
    C --> F[AI analysis]
    E --> |Mathpix API| G[Text extraction]
    F --> |Model selection| H1[OpenAI]
    F --> |Model selection| H2[Anthropic]
    F --> |Model selection| H3[DeepSeek]
    D --> I[Socket.IO real-time communication]
    I --> A

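🧩 Components

Frontend: responsive HTML/CSS/JS interface with mobile support
Backend: Flask + SocketIO, providing a RESTful API and WebSocket
AI interface: multi-model support behind a unified interface
Image processing: efficient screenshot and cropping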
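
One common way to put several model providers behind a unified interface is an abstract base class plus a registry. The sketch below illustrates that pattern only; it is not Snap-Solver's actual code. `ModelClient`, `EchoClient`, `CLIENTS`, and `route` are hypothetical names, and the stand-in adapter makes no network calls.

```python
from abc import ABC, abstractmethod


class ModelClient(ABC):
    """Hypothetical common interface every provider adapter implements."""

    @abstractmethod
    def analyze(self, question_text: str) -> str:
        """Return an answer for the extracted question text."""


class EchoClient(ModelClient):
    """Stand-in adapter; a real one would call the provider's API here."""

    def __init__(self, provider: str):
        self.provider = provider

    def analyze(self, question_text: str) -> str:
        return f"[{self.provider}] would answer: {question_text}"


# Registry keyed by the provider names from the architecture diagram.
CLIENTS = {name: EchoClient(name) for name in ("OpenAI", "Anthropic", "DeepSeek")}


def route(provider: str, question_text: str) -> str:
    """Dispatch to the selected provider's adapter."""
    try:
        client = CLIENTS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None
    return client.analyze(question_text)
```

With this shape, supporting another provider means adding one adapter class and one registry entry; the calling code stays unchanged.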

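⚙️ Advanced Configuration

Model selection and tuning

Model	Strengths	Best for
GPT-4o	Strong all-round capability, multimodal support	Complex subject questions, image understanding
o3-mini	Fast, low cost	Simple questions, quick feedback
Claude-3.7	Detailed thinking process, transparent reasoning	Mathematical proofs, in-depth analysis
DeepSeek	Optimized for Chinese, low latency	Chinese-language exercises, Chinese literature analysis
QVQ-MAX	Multimodal support, reasoning support	Complex visual analysis
Qwen-VL-MAX	Multimodal support	Simple visual analysis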
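
The table can also be kept as data and queried by scenario. This is a hedged sketch: the rows are transcribed from the table, while `ModelInfo`, `MODELS`, and `models_for` are hypothetical names, and plain substring matching is a deliberately simple stand-in for real selection logic.

```python
from typing import List, NamedTuple


class ModelInfo(NamedTuple):
    name: str
    strengths: str
    best_for: str


# Rows transcribed from the model table.
MODELS = [
    ModelInfo("GPT-4o", "strong all-round capability, multimodal support",
              "complex subject questions, image understanding"),
    ModelInfo("o3-mini", "fast, low cost", "simple questions, quick feedback"),
    ModelInfo("Claude-3.7", "detailed thinking process, transparent reasoning",
              "mathematical proofs, in-depth analysis"),
    ModelInfo("DeepSeek", "optimized for Chinese, low latency",
              "Chinese-language exercises, Chinese literature analysis"),
    ModelInfo("QVQ-MAX", "multimodal support, reasoning support",
              "complex visual analysis"),
    ModelInfo("Qwen-VL-MAX", "multimodal support", "simple visual analysis"),
]


def models_for(keyword: str) -> List[str]:
    """Names of models whose 'best for' column mentions the keyword."""
    needle = keyword.lower()
    return [m.name for m in MODELS if needle in m.best_for.lower()]
```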

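🛠️ Tunable parameters

Temperature: trades off creativity against determinism in answers (0.1-1.0)
Max output tokens: controls answer length
Reasoning depth: standard mode (fast) or deep thinking (detailed)
Thinking budget ratio: balances the level of detail between the thinking process and the final answer
System prompt: customizes the AI's baseline behavior and area of expertise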
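
These parameters could be grouped and validated roughly as follows. This is a sketch under stated assumptions: `GenerationSettings` and its field names are hypothetical, the defaults are arbitrary, and treating the thinking budget ratio as a share of the maximum output tokens is one plausible reading, not confirmed behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the tunable parameters."""

    temperature: float = 0.7            # assumed default
    max_output_tokens: int = 4096       # assumed default
    deep_thinking: bool = False         # False = standard (fast) mode
    thinking_budget_ratio: float = 0.5  # assumed: share of tokens for thinking
    system_prompt: str = ""

    def __post_init__(self):
        if not 0.1 <= self.temperature <= 1.0:
            raise ValueError("temperature must be within 0.1-1.0")
        if self.max_output_tokens <= 0:
            raise ValueError("max_output_tokens must be positive")
        if not 0.0 <= self.thinking_budget_ratio <= 1.0:
            raise ValueError("thinking_budget_ratio must be within 0-1")

    def thinking_budget_tokens(self) -> int:
        """Tokens set aside for thinking; zero in standard mode."""
        if not self.deep_thinking:
            return 0
        return int(self.max_output_tokens * self.thinking_budget_ratio)
```

Validating in `__post_init__` rejects out-of-range values such as a temperature of 1.5 when the settings object is created, before any request is sent.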

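❓ FAQ

How do I get the best recognition results?

Make sure the screenshot is sharp and includes the complete question and any necessary context. For math formulas, Mathpix OCR is recommended for more accurate recognition.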

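What if I cannot connect to the service?

1. Check that the firewall allows port 5000
2. Confirm the devices are on the same LAN
3. Try restarting the application
4. Check the console log for error messages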
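
A quick way to test the first two points from another machine is to attempt a TCP connection to port 5000. This is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper name, and the host shown is a placeholder.

```python
import socket


def port_is_open(host: str, port: int = 5000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace "localhost" with the computer's LAN address when testing
    # from another device.
    print("reachable" if port_is_open("localhost") else "not reachable")
```

If the port is reachable from the computer itself but not from another device, the firewall or network segmentation is the likelier cause.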

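Why do API calls fail?

1. The API key may be invalid or the account balance insufficient
2. Network connectivity problems, especially with international APIs
3. Incorrect proxy settings
4. The API service may be temporarily unavailable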
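
Transient causes (network problems, temporary outages) are often handled by retrying with exponential backoff, whereas an invalid key or an empty balance will not be fixed by retrying. The following is a generic sketch of that technique, not Snap-Solver's own logic; `call_with_retry` and its parameters are hypothetical.

```python
import time


def call_with_retry(request, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry on ConnectionError or TimeoutError.

    Waits base_delay, then 2x, 4x, ... between tries and re-raises the
    last error once the attempts are exhausted. Other exceptions (for
    example an authentication failure) propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return request()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.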

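How can I improve the quality of AI answers?

1. Adjust the system prompt and add subject-specific guidance
2. Choose a model that matches the complexity of the question
3. Use "deep thinking" mode for complex questions
4. Make sure the captured question contains complete information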

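🤝 Getting Help

Deployment service: if you are not comfortable with programming and need someone to deploy it for you, contact [email protected]
Bug reports: open an Issue in the GitHub repository
Feature suggestions: improvement ideas are welcome via Issue or email

📜 License

This project is licensed under Apache 2.0.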

For Tasks:

solve math problems • analyze physics questions • debug code errors • review exam mistakes • simplify research readings

For Jobs:

student • teacher • tutor • developer • researcher

Alternative AI tools for Snap-Solver

Similar Open Source Tools

py-xiaozhi

py-xiaozhi is a Python-based XiaoZhi voice client for learning through code and trying out AI XiaoZhi's voice features without dedicated hardware. The repository is based on the xiaozhi-esp32 port. It supports AI voice interaction, visual multimodal capabilities, IoT device integration, online music playback, voice wake-up, automatic conversation mode, a graphical user interface, a command-line mode, cross-platform use, volume control, session management, encrypted audio transmission, automatic captcha handling, automatic MAC address retrieval, code modularization, and stability optimization.

GitHub stars: 2.5k

MaiMBot

MaiMBot is an intelligent QQ group chat bot based on a large language model. It is developed using the nonebot2 framework, utilizes LLM for conversation abilities, MongoDB for data persistence, and NapCat for QQ protocol support. The bot features keyword-triggered proactive responses, dynamic prompt construction, support for images and message forwarding, typo generation, multiple replies, emotion-based emoji responses, daily schedule generation, user relationship management, knowledge base, and group impressions. Work-in-progress features include personality, group atmosphere, image handling, humor, meme functions, and Minecraft interactions. The tool is in active development with plans for GIF compatibility, mini-program link parsing, bug fixes, documentation improvements, and logic enhancements for emoji sending.

GitHub stars: 1.1k

gez

Gez is a high-performance micro frontend framework based on ESM. It uses Rspack compilation and maps modules to URLs with strong caching and content-based hashing. Gez embraces modern micro frontend architecture by leveraging ESM and importmap for dependency management, providing reliable isolation with module scope, seamless integration with any modern frontend framework, intuitive development experience, and optimal performance with zero runtime overhead and reliable caching strategies.

GitHub stars: 584

chatless

Chatless is a modern AI chat desktop application built on Tauri and Next.js. It supports multiple AI providers, can connect to local Ollama models, supports document parsing and knowledge base functions. All data is stored locally to protect user privacy. The application is lightweight, simple, starts quickly, and consumes minimal resources.

GitHub stars: 212

Daily-DeepLearning

Daily-DeepLearning is a repository that covers various computer science topics such as data structures, operating systems, computer networks, Python programming, data science packages like numpy, pandas, matplotlib, machine learning theories, deep learning theories, NLP concepts, machine learning practical applications, deep learning practical applications, and big data technologies like Hadoop and Hive. It also includes coding exercises related to '剑指offer'. The repository provides detailed explanations and examples for each topic, making it a comprehensive resource for learning and practicing different aspects of computer science and data-related fields.

GitHub stars: 666

llm-action

This repository provides a comprehensive guide to large language models (LLMs), covering various aspects such as training, fine-tuning, compression, and applications. It includes detailed tutorials, code examples, and explanations of key concepts and techniques. The repository is maintained by Liguo Dong, an AI researcher and engineer with expertise in LLM research and development.

GitHub stars: 12.9k

CradleAI

CradleAI is an open-source front-end tool designed for non-commercial purposes. It allows users to create and manage characters, engage in AI roleplay chats, publish dynamic content in a social circle, participate in group chats, and manage memories and knowledge. The tool supports features like author notes, voice interactions, multimedia messaging, visual novel mode, rich text formatting, image generation, TTS enhancement, and more. Users can deploy the tool using Github Action for APK builds or EAS Build for Android and iOS platforms. The project is licensed under CC BY-NC 4.0, prohibiting commercial use and emphasizing proper attribution.

GitHub stars: 181

godoos

GodoOS is an efficient intranet office operating system that includes various office tools such as word/excel/ppt/pdf/internal chat/whiteboard/mind map, with native file storage support. The platform interface mimics the Windows style, making it easy to operate while maintaining low resource consumption and high performance. It automatically connects to intranet users without registration, enabling instant communication and file sharing. The flexible and highly configurable app store allows for unlimited expansion.

GitHub stars: 151

chatwiki

ChatWiki is an open-source knowledge base AI question-answering system. It is built on large language model (LLM) and retrieval-augmented generation (RAG) technologies, provides out-of-the-box data processing and model invocation, and helps enterprises quickly build their own knowledge base AI question-answering systems. It offers a dedicated AI question-answering system, easy model integration, data preprocessing, a simple user interface, and adaptability to different business scenarios.

GitHub stars: 415

DocTranslator

DocTranslator is a document translation tool that supports various file formats, compatible with OpenAI format API, and offers batch operations and multi-threading support. Whether for individual users or enterprise teams, DocTranslator helps efficiently complete document translation tasks. It supports formats like txt, markdown, word, csv, excel, pdf (non-scanned), and ppt for AI translation. The tool is deployed using Docker for easy setup and usage.

GitHub stars: 60

get_jobs

Get Jobs is a tool designed to help users find and apply for job positions on various recruitment platforms in China. It features AI job matching, automatic cover letter generation, multi-platform job application, automated filtering of inactive HR and headhunter positions, real-time WeChat message notifications, blacklisted company updates, driver adaptation for Win11, centralized configuration, long-lasting cookie login, XPathHelper plugin, global logging, and more. The tool supports platforms like Boss直聘, 猎聘, 拉勾, 51job, and 智联招聘. Users can configure the tool for customized job searches and applications.

GitHub stars: 3.9k

LLMAI-writer

LLMAI-writer is a powerful AI tool for assisting in novel writing, utilizing state-of-the-art large language models to help writers brainstorm, plan, and create novels. Whether you are an experienced writer or a beginner, LLMAI-writer can help you efficiently complete the writing process.

GitHub stars: 65

vpnfast.github.io

VPNFast is a lightweight and fast VPN service provider that offers secure and private internet access. With VPNFast, users can protect their online privacy, bypass geo-restrictions, and secure their internet connection from hackers and snoopers. The service provides high-speed servers in multiple locations worldwide, ensuring a reliable and seamless VPN experience for users. VPNFast is easy to use, with a user-friendly interface and simple setup process. Whether you're browsing the web, streaming content, or accessing sensitive information, VPNFast helps you stay safe and anonymous online.

GitHub stars: 80

AI-Drug-Discovery-Design

AI-Drug-Discovery-Design is a repository focused on Artificial Intelligence-assisted Drug Discovery and Design. It explores the use of AI technology to accelerate and optimize the drug development process. The advantages of AI in drug design include speeding up research cycles, improving accuracy through data-driven models, reducing costs by minimizing experimental redundancies, and enabling personalized drug design for specific patients or disease characteristics.

GitHub stars: 77

bella-openapi

Bella OpenAPI is an API gateway that provides rich AI capabilities, similar to openrouter. In addition to chat completion ability, it also offers text embedding, ASR, TTS, image-to-image, and text-to-image AI capabilities. It integrates billing, rate limiting, and resource management functions. All integrated capabilities have been validated in large-scale production environments. The tool supports various AI capabilities, metadata management, unified login service, billing and rate limiting, and has been validated in large-scale production environments for stability and reliability. It offers a user-friendly experience with Java-friendly technology stack, convenient cloud-based experience service, and Dockerized deployment.

GitHub stars: 120

For similar tasks

MiniCPM

MiniCPM is a series of open-source on-device large models jointly developed by ModelBest (面壁智能) and the Tsinghua University Natural Language Processing Laboratory. The main language model, MiniCPM-2B, has only 2.4 billion (2.4B) non-embedding parameters and 2.7B parameters in total.
- After SFT, MiniCPM-2B performs similarly to Mistral-7B on public comprehensive evaluation sets (better in Chinese, mathematics, and code), and outperforms models such as Llama2-13B, MPT-30B, and Falcon-40B overall.
- After DPO, MiniCPM-2B also surpasses many representative open-source large models such as Llama2-70B-Chat, Vicuna-33B, Mistral-7B-Instruct-v0.1, and Zephyr-7B-alpha on MTBench, the evaluation set that most closely reflects user experience.
- MiniCPM-V 2.0, an on-device multimodal large model built on MiniCPM-2B, achieves the best performance among models below 7B on multiple benchmarks and surpasses larger models such as Qwen-VL-Chat 9.6B, CogVLM-Chat 17.4B, and Yi-VL 34B on the OpenCompass leaderboard. MiniCPM-V 2.0 also demonstrates leading OCR capabilities, approaching Gemini Pro in scene text recognition.
- After Int4 quantization, MiniCPM can be deployed for inference on mobile phones, with a streaming output speed slightly higher than human speech. MiniCPM-V has also been deployed on mobile phones as a multimodal large model.
- A single 1080/2080 GPU is enough for parameter-efficient fine-tuning, and a single 3090/4090 for full-parameter fine-tuning. A single machine can continuously train MiniCPM, so the cost of secondary development is relatively low.

GitHub stars: 8.3k

SemanticKernel.Assistants

This repository contains an assistant proposal for the Semantic Kernel, allowing the usage of assistants without relying on OpenAI Assistant APIs. It runs locally planners and plugins for the assistants, providing scenarios like Assistant with Semantic Kernel plugins, Multi-Assistant conversation, and AutoGen conversation. The Semantic Kernel is a lightweight SDK enabling integration of AI Large Language Models with conventional programming languages, offering functions like semantic functions, native functions, and embeddings-based memory. Users can bring their own model for the assistants and host them locally. The repository includes installation instructions, usage examples, and information on creating new conversation threads with the assistant.

GitHub stars: 101

AMchat

AMchat is a large language model that integrates advanced math concepts, exercises, and solutions. The model is based on the InternLM2-Math-7B model and is specifically designed to answer advanced math problems. It provides a comprehensive dataset that combines Math and advanced math exercises and solutions. Users can download the model from ModelScope or OpenXLab, deploy it locally or using Docker, and even retrain it using XTuner for fine-tuning. The tool also supports LMDeploy for quantization, OpenCompass for evaluation, and various other features for model deployment and evaluation. The project contributors have provided detailed documentation and guides for users to utilize the tool effectively.

GitHub stars: 153

MathVerse

MathVerse is an all-around visual math benchmark designed to evaluate the capabilities of Multi-modal Large Language Models (MLLMs) in visual math problem-solving. It collects high-quality math problems with diagrams to assess how well MLLMs can understand visual diagrams for mathematical reasoning. The benchmark includes 2,612 problems transformed into six versions each, contributing to 15K test samples. It also introduces a Chain-of-Thought (CoT) Evaluation strategy for fine-grained assessment of output answers.

GitHub stars: 115

Self-Iterative-Agent-System-for-Complex-Problem-Solving

The Self-Iterative Agent System for Complex Problem Solving is a solution developed for the Alibaba Mathematical Competition (AI Challenge). It involves multiple LLMs engaging in multi-round 'self-questioning' to iteratively refine the problem-solving process and select optimal solutions. The system consists of main and evaluation models, with a process that includes detailed problem-solving steps, feedback loops, and iterative improvements. The approach emphasizes communication and reasoning between sub-agents, knowledge extraction, and the importance of Agent-like architectures in complex tasks. While effective, there is room for improvement in model capabilities and error prevention mechanisms.

GitHub stars: 51

LLM4Opt

LLM4Opt is a collection of references and papers focusing on applying Large Language Models (LLMs) for diverse optimization tasks. The repository includes research papers, tutorials, workshops, competitions, and related collections related to LLMs in optimization. It covers a wide range of topics such as algorithm search, code generation, machine learning, science, industry, and more. The goal is to provide a comprehensive resource for researchers and practitioners interested in leveraging LLMs for optimization tasks.

GitHub stars: 125

Awesome-LLM-Strawberry

Awesome LLM Strawberry is a collection of research papers and blogs related to OpenAI Strawberry(o1) and Reasoning. The repository is continuously updated to track the frontier of LLM Reasoning.

GitHub stars: 6.3k

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

GitHub stars: 7.1k

sourcegraph

Sourcegraph is a code search and navigation tool that helps developers read, write, and fix code in large, complex codebases. It provides features such as code search across all repositories and branches, code intelligence for navigation and refactoring, and the ability to fix and refactor code across multiple repositories at once.

GitHub stars: 10.0k

open-webui

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted WebUI designed to operate entirely offline. It supports various LLM runners, including Ollama and OpenAI-compatible APIs.

GitHub stars: 111.1k

ray

Ray is a unified framework for scaling AI and Python applications. It consists of a core distributed runtime and a set of AI libraries for simplifying ML compute, including Data, Train, Tune, RLlib, and Serve. Ray runs on any machine, cluster, cloud provider, and Kubernetes, and features a growing ecosystem of community integrations. With Ray, you can seamlessly scale the same code from a laptop to a cluster, making it easy to meet the compute-intensive demands of modern ML workloads.

GitHub stars: 39.1k

litgpt

LitGPT is a command-line tool designed to easily finetune, pretrain, evaluate, and deploy 20+ LLMs on your own data. It features highly optimized training recipes for the world's most powerful open-source large language models (LLMs).

GitHub stars: 12.7k

khoj

Khoj is an open-source, personal AI assistant that extends your capabilities by creating always-available AI agents. You can share your notes and documents to extend your digital brain, and your AI agents have access to the internet, allowing you to incorporate real-time information. Khoj is accessible on Desktop, Emacs, Obsidian, Web, and Whatsapp, and you can share PDF, markdown, org-mode, notion files, and GitHub repositories. You'll get fast, accurate semantic search on top of your docs, and your agents can create deeply personal images and understand your speech. Khoj is self-hostable and always will be.

GitHub stars: 28.5k

chronon

Chronon is a platform that simplifies and improves ML workflows by providing a central place to define features, ensuring point-in-time correctness for backfills, simplifying orchestration for batch and streaming pipelines, offering easy endpoints for feature fetching, and guaranteeing and measuring consistency. It offers benefits over other approaches by enabling the use of a broad set of data for training, handling large aggregations and other computationally intensive transformations, and abstracting away the infrastructure complexity of data plumbing.

GitHub stars: 766

rag-experiment-accelerator

The RAG Experiment Accelerator is a versatile tool that helps you conduct experiments and evaluations using Azure AI Search and RAG pattern. It offers a rich set of features, including experiment setup, integration with Azure AI Search, Azure Machine Learning, MLFlow, and Azure OpenAI, multiple document chunking strategies, query generation, multiple search types, sub-querying, re-ranking, metrics and evaluation, report generation, and multi-lingual support. The tool is designed to make it easier and faster to run experiments and evaluations of search queries and quality of response from OpenAI, and is useful for researchers, data scientists, and developers who want to test the performance of different search and OpenAI related hyperparameters, compare the effectiveness of various search strategies, fine-tune and optimize parameters, find the best combination of hyperparameters, and generate detailed reports and visualizations from experiment results.

GitHub stars: 242
