handy-ollama
Learn Ollama hands-on and deploy large models on a CPU. Read online: https://datawhalechina.github.io/handy-ollama/
Stars: 910
Handy-Ollama is a tutorial for deploying Ollama with hands-on practice, making the deployment of large language models accessible to everyone. The tutorial covers a wide range of content from basic to advanced usage, providing clear steps and practical tips for beginners and experienced developers to learn Ollama from scratch, deploy large models locally, and develop related applications. It aims to enable users to run large models on consumer-grade hardware, deploy models locally, and manage models securely and reliably.
README:
Learning to deploy Ollama with hands-on practice, making the deployment of large language models accessible to everyone!
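The Handy-Ollama tutorial helps you get started with local deployment of large models, manage and run them locally, and perform large model inference and deployment on a CPU.
This tutorial covers everything from the basics to advanced usage, and uses practical application cases to build a solid understanding of large model deployment and application techniques. It provides clear steps and practical tips, so both newcomers to large model deployment and developers with some experience can learn Ollama from scratch, deploy large models locally, and build related applications.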
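Main contents of this project:
- Introduction to Ollama, installation, and configuration, including installation and configuration on macOS, Windows, Linux, and Docker;
- Importing custom models into Ollama, including importing from GGUF, importing from PyTorch or Safetensors, importing directly from a model, and customizing the prompt;
- The Ollama REST API, including an Ollama API usage guide and using the Ollama API from Python, Java, JavaScript, C++, and other languages;
- Using Ollama in LangChain, including integration in Python and JavaScript;
- Deploying a visual interface for Ollama and application cases, including deploying a visual chat interface with FastAPI and WebUI, as well as local RAG applications, Agent applications, and more.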
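As a rough illustration of the REST API item above, the following minimal sketch builds and sends a non-streaming request to Ollama's `/api/generate` endpoint using only the Python standard library. It is not taken from the tutorial itself. It assumes an Ollama server is already running locally on the default address `http://localhost:11434`. The helper names `build_generate_request` and `generate` are hypothetical and were chosen only for this example.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text from the JSON reply."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    # With "stream": False the server replies with a single JSON object
    # whose "response" field holds the full completion.
    return body["response"]
```

With a server running, a call such as `generate("llama3.2", "Why is the sky blue?")` should return the model's answer as a string. The model name here is only a placeholder, so substitute any model you have already pulled locally. Separating request construction from sending keeps the payload easy to inspect without a live server.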
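Interested learners and developers are warmly welcome to open an issue or submit a pull request so that we can improve this project together!
We firmly believe that every learner who is passionate about large models should have the chance to explore and practice. Whatever your programming language background and whatever computing resources you have, we hope to help you deploy large models on a personal PC. Let's break down technical barriers together and start exploring LLMs!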
Directory structure:
docs ---------------------- Markdown documentation files
notebook ------------------ Notebook source files, plus some Python, Java, and JavaScript source files
images -------------------- Images
Read online: https://datawhalechina.github.io/handy-ollama/
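With the rapid development of large models, more and more open-source large models have appeared, but deploying many of them requires GPU resources. How can the benefits of the large model era reach everyone, so that each person can deploy a large model of their own? Ollama is an open-source tool for deploying and serving large language models, and it can deploy large models with only a CPU. Through the open-source Handy-Ollama tutorial, we hope to help learners get started with Ollama quickly, so that every large model enthusiast, learner, and developer can deploy their own large model locally and go on to build large model applications across many industries.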
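- Those who want to run large models locally without being limited by GPU resources;
- Those who want to run effective large model inference on consumer-grade hardware;
- Those who want to deploy large models locally and develop large model applications;
- Those who want to manage large models locally and keep local models secure and reliable.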
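This project aims to deploy large models locally using a CPU. Although many LLM tutorials already exist, the models in them generally require GPU resources, which is not very friendly to learners with limited resources. This project therefore uses hands-on practice with Ollama to help learners get started quickly with local CPU deployment of large models.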
- [x] 1 Introduction to Ollama @Youdon
- [x] 2 Ollama installation and configuration
- [x] 3 Customizing Ollama
- [x] 4 Ollama REST API
  - [x] Ollama API usage guide @林通 @春阳
  - [x] Using the Ollama API in Python @春阳
  - [x] Using the Ollama API in Java @林通
  - [x] Using the Ollama API in JavaScript @春阳
  - [x] Using the Ollama API in C++ @林通
  - [ ] Using the Ollama API in C# (to be updated)
  - [ ] Using the Ollama API in Go (to be updated)
  - [ ] Using the Ollama API in Rust (to be updated)
  - [ ] Using the Ollama API in Ruby (to be updated)
  - [ ] Using the Ollama API in R (to be updated)
- [x] 5 Using Ollama in LangChain
  - [x] Integration in Python @鑫民
  - [x] Integration in JavaScript @鑫民
- [x] 6 Deploying a visual interface for Ollama
- [ ] 7 Application cases
  - [x] Building a local AI Copilot programming assistant @越
  - [x] Connecting Dify to a local model deployed with Ollama @春阳
  - [x] Building a local RAG application with LangChain @舒凡
  - [x] Building a local RAG application with LlamaIndex @Youdon
  - [x] Implementing a local Agent with LangChain @Youdon
  - [x] Implementing a local Agent with LlamaIndex @Youdon
  - [x] Implementing a local RAG application with DeepSeek R1 and Ollama @Youdon
  - [ ] To be continued...
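Note: for all items marked (to be updated), as well as other related content, interested developers are warmly welcome to open an issue or submit a pull request so that we can improve this project together!
If you would like to get deeply involved, contact us and we will add you to the project's maintainers.
Official Ollama repository: https://github.com/ollama/ollama
Special thanks to the following people who contributed to the tutorial!
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.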
Similar Open Source Tools
ai-tag
AI tag generator that combines 40,000 tags from Bilibili UP main Twelve Today is also very cute with Chinese translations from Novelai, providing Chinese search and tag generation services. It offers a tag community for magicians to directly copy and generate spells. Always free, no ads, no commercial use. The project includes a pure tag parsing library, independent spell parsing library, tag data repository, and a new gallery page with waterfall flow for viewing community images.
ap-plugin
AP-PLUGIN is an AI drawing plugin for the Yunzai series robot framework, allowing you to have a convenient AI drawing experience in the input box. It uses the open source Stable Diffusion web UI as the backend, deploys it for free, and generates a variety of images with richer functions.
ChatGPT-On-CS
ChatGPT-On-CS is an intelligent chatbot tool based on large models, supporting various platforms like WeChat, Taobao, Bilibili, Douyin, Weibo, and more. It can handle text, voice, and image inputs, access external resources through plugins, and customize enterprise AI applications based on proprietary knowledge bases. Users can set custom replies, utilize ChatGPT interface for intelligent responses, send images and binary files, and create personalized chatbots using knowledge base files. The tool also features platform-specific plugin systems for accessing external resources and supports enterprise AI applications customization.
ChatGPT-On-CS
This project is an intelligent dialogue customer service tool based on a large model, which supports access to platforms such as WeChat, Qianniu, Bilibili, Douyin Enterprise, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operation, Xiaohongshu, Zhihu, etc. You can choose GPT3.5/GPT4.0/ Lazy Treasure Box (more platforms will be supported in the future), which can process text, voice and pictures, and access external resources such as operating systems and the Internet through plug-ins, and support enterprise AI applications customized based on their own knowledge base.
dify-helm
Deploy langgenius/dify, an LLM based chat bot app on kubernetes with helm chart.
midjourney-proxy
Midjourney-proxy is a proxy for the Discord channel of MidJourney, enabling API-based calls for AI drawing. It supports Imagine instructions, adding image base64 as a placeholder, Blend and Describe commands, real-time progress tracking, Chinese prompt translation, prompt sensitive word pre-detection, user-token connection to WSS, multi-account configuration, and more. For more advanced features, consider using midjourney-proxy-plus, which includes Shorten, focus shifting, image zooming, local redrawing, nearly all associated button actions, Remix mode, seed value retrieval, account pool persistence, dynamic maintenance, /info and /settings retrieval, account settings configuration, Niji bot robot, InsightFace face replacement robot, and an embedded management dashboard.
do-research-in-AI
This repository is a collection of research lectures and experience sharing posts from frontline researchers in the field of AI. It aims to help individuals upgrade their research skills and knowledge through insightful talks and experiences shared by experts. The content covers various topics such as evaluating research papers, choosing research directions, research methodologies, and tips for writing high-quality scientific papers. The repository also includes discussions on academic career paths, research ethics, and the emotional aspects of research work. Overall, it serves as a valuable resource for individuals interested in advancing their research capabilities in the field of AI.
yu-picture
The 'yu-picture' project is an educational project that provides complete video tutorials, text tutorials, resume writing, interview question solutions, and Q&A services to help you improve your project skills and enhance your resume. It is an enterprise-level intelligent collaborative cloud image library platform based on Vue 3 + Spring Boot + COS + WebSocket. The platform has a wide range of applications, including public image uploading and retrieval, image analysis for administrators, private image management for individual users, and real-time collaborative image editing for enterprises. The project covers file management, content retrieval, permission control, and real-time collaboration, using various programming concepts, architectural design methods, and optimization strategies to ensure high-speed iteration and stable operation.
ImageToolbox
ImageToolbox is a versatile image editing tool designed for efficient photo manipulation. It allows users to crop, apply filters, edit EXIF data, erase backgrounds, and even enhance images with AI. Ideal for both photographers and developers, the tool offers a simple interface with powerful capabilities.
my-neuro
The project aims to create a personalized AI character, a lifelike AI companion - shaping the ideal image of TA in your mind through your data imprint. The project is inspired by neuro sama, hence named my-neuro. The project can train voice, personality, and replace images. It serves as a workspace where you can use packaged tools to step by step draw and realize the ideal AI image in your mind. The deployment of the current document requires less than 6GB of VRAM, compatible with Windows systems, and requires an API-KEY. The project offers features like low latency, real-time interruption, emotion simulation, visual capabilities integration, voice model training support, desktop control, live streaming on platforms like Bilibili, and more. It aims to provide a comprehensive AI experience with features like long-term memory, AI customization, and emotional interactions.
CordysCRM
Cordys CRM is a next-generation open-source AI CRM system that integrates informatization, digitalization, and intelligence into a 'Customer Relationship Management System'. It offers modern user experience, flexible and configurable forms, processes, and permissions to help enterprises achieve sales automation easily. It ensures data security with private deployment, allowing complete control over customer data and business information. With BI capabilities from DataEase and intelligent querying from SQLBot, it enables efficient data analysis and visualization. Additionally, it provides AI capabilities through the MCP Server and MaxKB, facilitating various sales intelligence applications.
MathModelAgent
MathModelAgent is an agent designed specifically for mathematical modeling tasks. It automates the process of mathematical modeling and generates a complete paper that can be directly submitted. The tool features automatic problem analysis, code writing, error correction, and paper writing. It supports various models, offers low costs, and allows customization through prompt injection. The tool is ideal for individuals or teams working on mathematical modeling projects.
Tianji
Tianji is a free, non-commercial artificial intelligence system developed by SocialAI for tasks involving worldly wisdom, such as etiquette, hospitality, gifting, wishes, communication, awkwardness resolution, and conflict handling. It includes four main technical routes: pure prompt, Agent architecture, knowledge base, and model training. Users can find corresponding source code for these routes in the tianji directory to replicate their own vertical domain AI applications. The project aims to accelerate the penetration of AI into various fields and enhance AI's core competencies.
NarratoAI
NarratoAI is an automated video narration tool that provides an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation. It is powered by LLM to enhance efficient content creation. The tool aims to simplify the process of creating film commentary and editing videos by automating various tasks such as script writing and voice-over generation. NarratoAI offers a user-friendly interface for users to easily generate video scripts, edit videos, and customize video parameters. With future plans to optimize story generation processes and support additional large models, NarratoAI is a versatile tool for content creators looking to streamline their video production workflow.
AIMedia
AIMedia is a fully automated AI media software that automatically fetches hot news, generates news, and publishes on various platforms. It supports hot news fetching from platforms like Douyin, NetEase News, Weibo, The Paper, China Daily, and Sohu News. Additionally, it enables AI-generated images for text-only news to enhance originality and reading experience. The tool is currently commercialized with plans to support video auto-generation for platform publishing in the future. It requires a minimum CPU of 4 cores or above, 8GB RAM, and supports Windows 10 or above. Users can deploy the tool by cloning the repository, modifying the configuration file, creating a virtual environment using Conda, and starting the web interface. Feedback and suggestions can be submitted through issues or pull requests.
For similar tasks
ktransformers
KTransformers is a flexible Python-centric framework designed to enhance the user's experience with advanced kernel optimizations and placement/parallelism strategies for Transformers. It provides a Transformers-compatible interface, RESTful APIs compliant with OpenAI and Ollama, and a simplified ChatGPT-like web UI. The framework aims to serve as a platform for experimenting with innovative LLM inference optimizations, focusing on local deployments constrained by limited resources and supporting heterogeneous computing opportunities like GPU/CPU offloading of quantized models.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: it is self-contained, with no need for a DBMS or cloud service; it has an OpenAPI interface that is easy to integrate with existing infrastructure (e.g., Cloud IDE); and it supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.