LangChain-SearXNG
AI Q&A Search Engine ➡️ An open-source AI search engine built on LangChain and SearXNG
Stars: 83
LangChain-SearXNG is an open-source AI search engine built on LangChain and SearXNG. It provides faster and more accurate search and question-answering. Users deploy SearXNG and set up a Python environment to run LangChain-SearXNG. The tool integrates AI models such as OpenAI and ZhipuAI for search queries and offers two search modes, Searxng and ZhipuWebSearch, letting users control the search workflow through request parameters. The v2 version of LangChain-SearXNG improves response speed and content quality over the previous version, and the project provides a detailed configuration guide along with comparisons showing the effectiveness of the different search modes.
README:
Simplified Chinese | English
An open-source AI search engine built on LangChain and SearXNG
🌟🌟🌟
Major update: LangChain-SearXNG now supports Docker deployment, including one-command docker-compose deployment 🚀🔥💥
🌟🌟🌟
🛫 The project supports three deployment methods; pick whichever suits your needs
- docker-compose deployment
- Separate Docker deployments of SearXNG and LangChain-SearXNG
- Manual deployment
Since SearXNG needs access to the public internet, make sure your deployment environment has outbound internet access.
- Pull the complete project code
git clone https://github.com/ptonlix/LangChain-SearXNG.git --recursive
cd LangChain-SearXNG/searxng-docker
# Enter your email; the domain name is optional
vim .env
# Edit the SearXNG configuration file
vim searxng/settings.yml
# Change secret_key
openssl rand -hex 32 # generate a key and fill it in
# Modify limiter and search; keep the other settings from the original file unchanged
# see https://docs.searxng.org/admin/settings/settings.html#settings-use-default-settings
use_default_settings: true
server:
  limiter: false  # can be disabled for a private instance
search:
  formats:
    - html
    - json
- Add the new configuration file settings-pro.yaml
See the configuration file introduction for details:
Configuration file changes
- Start Docker
cd LangChain-SearXNG
docker compose up
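Once the stack is up, you can sanity-check that SearXNG's JSON API is enabled, i.e. that json really is listed under search.formats. This is a minimal sketch; the base URL is an assumption, so point it at wherever your SearXNG instance is reachable (the same address later used for SEARX_HOST):

```python
# check_searxng.py - quick check that the SearXNG JSON API responds
# (the base URL below is an assumption; use your own SearXNG address / SEARX_HOST)
import requests

SEARXNG_URL = "http://localhost:8080"

resp = requests.get(
    f"{SEARXNG_URL}/search",
    params={"q": "LangChain", "format": "json"},
    timeout=10,
)
resp.raise_for_status()  # a 403 here usually means the json format is not enabled
print(f"SearXNG returned {len(resp.json().get('results', []))} results")
```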
Since SearXNG needs access to the public internet, deploying on a server with public internet access is recommended. The example below uses a Tencent Cloud Lighthouse server running CentOS.
Following the searxng-docker tutorial, deploy SearXNG in a container with the steps below.
# pull the code
git clone https://github.com/searxng/searxng-docker.git
cd searxng-docker
# set the domain name and enter your email
vim .env
# the remaining configuration is the same as in the docker-compose deployment above
- Pull the image
docker pull ptonlix/langchain-searxng:v0.1.8
# start the container with the external configuration file mounted
docker run -p 8002:8002 -p 8501:8501 \
-v $(pwd)/settings-pro.yaml:/app/config/settings.yaml \
--name langchain-searxng \
ptonlix/langchain-searxng:v0.1.8
- Configuration file settings-pro.yaml
See the configuration file introduction for details:
Configuration file changes
- Access the WebUI
- Install miniconda
mkdir ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init bash
- Create a virtual environment
# create the environment
conda create -n LangChain-SearXNG python==3.10.11
- Install poetry
# install poetry
curl -sSL https://install.python-poetry.org | python3 -
- Install dependencies
# clone the project code locally
git clone https://github.com/ptonlix/LangChain-SearXNG.git
conda activate LangChain-SearXNG # activate the environment
cd LangChain-SearXNG # enter the project directory
poetry install # install dependencies
OpenAI documentation
ZhipuAI documentation
DeepSeek documentation
LangSmith API
# settings.yaml
Set the following variables in the configuration file or via environment variables. Configuring all three LLM APIs is recommended so you can freely switch between models.
# choose the LLM APIs that fit your environment
# OpenAI LLM API
OPENAI_API_BASE
OPENAI_API_KEY
# ZhipuAI API
ZHIPUAI_API_KEY
ZHIPUAI_API_BASE
# DeepSeek LLM API
DEEPSPEAK_API_KEY
DEEPSPEAK_API_BASE
# LangChain debugging (LangSmith) API
LANGCHAIN_API_KEY
# SearXNG endpoint; no change needed for docker-compose deployments
SEARX_HOST
For details on the configuration file, see: LangChain-SearXNG configuration
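Before starting the service, it can help to verify that the variables you plan to rely on are actually set. A small hedged sketch using the names listed above (the required/optional split is an assumption; adjust it to the providers you actually use):

```python
# check_env.py - verify the variables listed above are present in the environment
import os

REQUIRED = ["ZHIPUAI_API_KEY"]  # assumption: adjust to the providers you actually use
OPTIONAL = ["OPENAI_API_KEY", "OPENAI_API_BASE",
            "DEEPSPEAK_API_KEY", "DEEPSPEAK_API_BASE",
            "LANGCHAIN_API_KEY", "SEARX_HOST"]

missing = [name for name in REQUIRED if not os.getenv(name)]
if missing:
    raise SystemExit(f"Missing required variables: {', '.join(missing)}")
for name in OPTIONAL:
    print(f"{name}: {'set' if os.getenv(name) else 'not set'}")
print("Environment looks OK; run: python -m langchain_searxng")
```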
# Start the project
python -m langchain_searxng
# View the API
Visit http://localhost:8002/docs for API information
# Start the frontend page
cd webui
streamlit run webui.py
- Search Q&A modes
The project's search API has been upgraded to v2; the example request below uses the v2 API.
Request parameters:
{
    "question": "目前中国新能源汽车厂商排行榜是什么", # the question to ask
    "chat_history": [], # chat history
    "network": true, # whether to enable web access
    "conversation_id": "", # conversation UUID
    "llm": "zhipuai", # which LLM to use
    "retriever": "searx" # which retrieval mode to use
}
Two search modes are currently supported: Searxng and Zhipu WebSearch. Which mode is used is controlled mainly by the llm and retriever request parameters (an example request in Python follows the two mode snippets below).
I. Enable Zhipu WebSearch
Corresponding WebUI page ➡️ Zhipu Search
{
    ...
    "llm": "zhipuwebsearch", # the LLM must be zhipuwebsearch (Zhipu's search-tuned model)
    "retriever": "zhipuwebsearch" # the retrieval mode must be zhipuwebsearch
}
II. Enable AI+SearXNG V2
Corresponding WebUI page ➡️ SearXNG Search
{
    ...
    "llm": "deepseek", # optional: default openai; zhipuai and deepseek are also supported
    "retriever": "searx" # optional: default searx
}
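For reference, a request from Python might look like the sketch below. The README does not show the actual route, so the path used here is a placeholder; check http://localhost:8002/docs for the real v2 search endpoint and its streaming (SSE) variant:

```python
# call_search.py - illustrative only; the endpoint path is a placeholder,
# look up the real route at http://localhost:8002/docs before using.
import requests

payload = {
    "question": "目前中国新能源汽车厂商排行榜是什么",
    "chat_history": [],
    "network": True,
    "conversation_id": "",
    "llm": "zhipuai",      # or "openai" / "deepseek"; use "zhipuwebsearch" for Zhipu WebSearch mode
    "retriever": "searx",  # use "zhipuwebsearch" together with the zhipuwebsearch model
}

resp = requests.post("http://localhost:8002/v2/search", json=payload, timeout=120)  # placeholder path
resp.raise_for_status()
print(resp.json())
```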
| 🎨 Capability | AI+SearXNGv1 | AI+SearXNGv2 | Zhipu WebSearch | 360AI Search |
|---|---|---|---|---|
| 🚀 Response speed | 🌟🌟🌟 | 🌟🌟🌟🌟 | 🌟🌟🌟🌟🌟 | 🌟🌟🌟🌟 |
| 📝 Content quality | 🌟🌟🌟 | 🌟🌟🌟🌟 | 🌟🌟🌟 | 🌟🌟🌟🌟🌟 |
| 💦 Streaming response | 1. Search process: supported 2. Search results: supported | 1. Search process: supported 2. Search results: supported | 1. Search process: not supported 2. Search results: supported | 1. Search process: supported 2. Search results: supported |
AI+SearXNGv2 clearly improves on the previous version in both response speed and content quality, moving one step closer to 360AI Search 💪
Detailed evaluation: AI search mode comparison test
This project builds an AI search-engine agent from a SearXNG search Tool combined with LangChain LCEL chains, and exposes the service via FastAPI.
- The search workflow is controlled by the user's request parameters; the main flow splits into web-search Q&A and in-model Q&A.
- In-model Q&A: build a prompt from the user's chat_history and question, feed it to the LLM, and return the generated answer.
- Web-search Q&A: consists of three parts: the condense question chain, search retrieval, and the response synthesizer chain.
  - If the input chat_history is not empty, the condense question chain workflow rewrites the question using the chat context to produce the most suitable search query.
  - The query then enters the search retrieval workflow, which has three stages: searxng search, select search result, and data processing.
    - The LLM picks the most suitable SearXNG search parameters for the query and calls the SearXNG API, typically returning 20 to 30 results.
    - From those results, the LLM further filters down to the ones best suited to answer the query, typically 6.
    - The selected results go through data processing: check accessibility -> fetch HTML -> build Documents -> format, finally producing the Q&A context.
  - The retrieved context, together with the user's chat_history and question, enters the response synthesizer chain workflow, which generates the final search response.
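The flow above maps naturally onto LangChain LCEL. The sketch below is not the project's actual code (the real chains live under langchain_searxng/components); the prompts, the search_retrieval stub, and all names are assumptions made purely to illustrate the three-stage structure:

```python
# Illustrative LCEL sketch of the web-search Q&A flow; not the project's implementation.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")

# 1. condense question chain: rewrite the follow-up into a standalone search query
condense_question = (
    ChatPromptTemplate.from_template(
        "Chat history:\n{chat_history}\n\n"
        "Rewrite the follow-up question as a standalone search query: {question}"
    )
    | llm
    | StrOutputParser()
)

# 2. search retrieval (stubbed): searxng search -> LLM result selection -> fetch HTML -> format
def search_retrieval(query: str) -> str:
    return f"(formatted web context retrieved for: {query})"

# 3. response synthesizer chain: answer the question from the retrieved context
answer_prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context.\n\nContext:\n{context}\n\nQuestion: {question}"
)

web_search_qa = (
    {
        "context": condense_question | RunnableLambda(search_retrieval),
        "question": lambda x: x["question"],
    }
    | answer_prompt
    | llm
    | StrOutputParser()
)

print(web_search_qa.invoke({"question": "latest EV maker rankings in China?", "chat_history": ""}))
```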
Differences between v2 and v1
- v1 mainly fetched data via search and then filtered it down to the best data. If the initial source data was of poor quality, the later stages suffered badly; the amount of base source data was also small, and vectorized filtering took a long time.
- A core principle of v2 is guaranteeing source-data quality, so that search results match the search keywords as closely as possible. The search retrieval workflow was therefore carefully designed to let the LLM help pick the best search results. Because the best data has already been selected, no vectorized filtering is needed and the LLM can generate the answer directly. (Ever-cheaper tokens are also a major trend.)
- v2 also optimizes the web-page loading flow so that search data is fetched as quickly as possible.
- v2 adds search-process visualization: in the streaming response, search progress is displayed dynamically, showing richer content.
├── docs # documentation
├── langchain_searxng
│   ├── components # custom components
│   ├── server # API service
│   ├── settings # configuration service
│   ├── utils
│   ├── constants.py
│   ├── di.py
│   ├── launcher.py
│   ├── main.py
│   ├── paths.py
│   ├── __init__.py
│   ├── __main__.py # entry point
│   └── __version__.py
├── log # log directory
├── webui # frontend pages
- Supports streaming (HTTP SSE) and non-streaming (whole) responses for query results
- Supports switching between web-search QA and direct QA
- Supports token accounting (including embeddings)
- Supports three LLMs: openai, zhipuai, deepseek
- Supports dynamic loading of the configuration file
- Supports ZhipuAI's newly released WebSearch feature
- [x] Build the initial LangChain-SearXNG framework and complete the basic features
- [x] Support dynamic loading of the configuration file to make parameter changes easy
- [x] Improve web-page content scraping
- [x] Handle network access errors for smoother use in mainland China
- [x] Support Zhipu WebSearch
- [x] Upgrade SearXNG search to support faster and more accurate answers
- [x] Build a frontend web demo
- [ ] Dockerize the project for easier deployment and distribution
- [x] Support video search
- [ ] Optimize prompts to support richer output
🎉 If you are also interested in this project, scan the QR code to contact the author
🎉 You are welcome to join the LangChain-X (帝阅 developer community) project group for discussion and exchange
Contributions are welcome; let's build LangChain-SearXNG together. You can help in any useful way:
- Report bugs
- Suggest improvements
- Contribute documentation
- Contribute code
...
👏👏👏
「帝阅」 is an AI-native product for personal knowledge management and creation.
It gives each user a dedicated reading assistant to help them acquire knowledge more efficiently and unleash their creativity,
so that they can better accumulate, manage, and apply knowledge.
LangChain-SearXNG is a sub-project of 帝阅; we decided to open-source it to exchange ideas and learn together with the community.
You are also welcome to try 帝阅 and give us your valuable feedback.
Alternative AI tools for LangChain-SearXNG
Similar Open Source Tools
chatgpt-web
ChatGPT Web is a web application that provides access to the ChatGPT API. It offers two non-official methods to interact with ChatGPT: through the ChatGPTAPI (using the `gpt-3.5-turbo-0301` model) or through the ChatGPTUnofficialProxyAPI (using a web access token). The ChatGPTAPI method is more reliable but requires an OpenAI API key, while the ChatGPTUnofficialProxyAPI method is free but less reliable. The application includes features such as user registration and login, synchronization of conversation history, customization of API keys and sensitive words, and management of users and keys. It also provides a user interface for interacting with ChatGPT and supports multiple languages and themes.
Streamer-Sales
Streamer-Sales is a large model for live streamers that can explain products based on their characteristics and inspire users to make purchases. It is designed to enhance sales efficiency and user experience, whether for online live sales or offline store promotions. The model can deeply understand product features and create tailored explanations in vivid and precise language, sparking user's desire to purchase. It aims to revolutionize the shopping experience by providing detailed and unique product descriptions to engage users effectively.
wechat-bot
WeChat Bot is a simple and easy-to-use WeChat robot based on chatgpt and wechaty. It can help you automatically reply to WeChat messages or manage WeChat groups/friends. The tool requires configuration of AI services such as Xunfei, Kimi, or ChatGPT. Users can customize the tool to automatically reply to group or private chat messages based on predefined conditions. The tool supports running in Docker for easy deployment and provides a convenient way to interact with various AI services for WeChat automation.
langchain4j-aideepin-web
The langchain4j-aideepin-web repository is the frontend project of langchain4j-aideepin, an open-source, offline deployable retrieval enhancement generation (RAG) project based on large language models such as ChatGPT and application frameworks such as Langchain4j. It includes features like registration & login, multi-sessions (multi-roles), image generation (text-to-image, image editing, image-to-image), suggestions, quota control, knowledge base (RAG) based on large models, model switching, and search engine switching.
meet-libai
The 'meet-libai' project aims to promote and popularize the cultural heritage of the Chinese poet Li Bai by constructing a knowledge graph of Li Bai and training a professional AI agent using large models. The project includes features such as data preprocessing, knowledge graph construction, question-answering system development, and visualization exploration of the graph structure. It also provides code implementations for large models and RAG-based retrieval augmentation.
AMchat
AMchat is a large language model that integrates advanced math concepts, exercises, and solutions. The model is based on the InternLM2-Math-7B model and is specifically designed to answer advanced math problems. It provides a comprehensive dataset that combines Math and advanced math exercises and solutions. Users can download the model from ModelScope or OpenXLab, deploy it locally or using Docker, and even retrain it using XTuner for fine-tuning. The tool also supports LMDeploy for quantization, OpenCompass for evaluation, and various other features for model deployment and evaluation. The project contributors have provided detailed documentation and guides for users to utilize the tool effectively.
NarratoAI
NarratoAI is an automated video narration tool that provides an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation. It is powered by LLM to enhance efficient content creation. The tool aims to simplify the process of creating film commentary and editing videos by automating various tasks such as script writing and voice-over generation. NarratoAI offers a user-friendly interface for users to easily generate video scripts, edit videos, and customize video parameters. With future plans to optimize story generation processes and support additional large models, NarratoAI is a versatile tool for content creators looking to streamline their video production workflow.
chatgpt-web-sea
ChatGPT Web Sea is an open-source project based on ChatGPT-web for secondary development. It supports all models that comply with the OpenAI interface standard, allows for model selection, configuration, and extension, and is compatible with OneAPI. The tool includes a Chinese ChatGPT tuning guide, supports file uploads, and provides model configuration options. Users can interact with the tool through a web interface, configure models, and perform tasks such as model selection, API key management, and chat interface setup. The project also offers Docker deployment options and instructions for manual packaging.
Chat-Style-Bot
Chat-Style-Bot is an intelligent chatbot designed to mimic the chatting style of a specified individual. By analyzing and learning from WeChat chat records, Chat-Style-Bot can imitate your unique chatting style and become your personal chat assistant. Whether it's communicating with friends or handling daily conversations, Chat-Style-Bot can provide a natural, personalized interactive experience.
aipan-netdisk-search
Aipan-Netdisk-Search is a free and open-source web project for searching netdisk resources. It utilizes third-party APIs with IP access restrictions, suggesting self-deployment. The project can be easily deployed on Vercel and provides instructions for manual deployment. Users can clone the project, install dependencies, run it in the browser, and access it at localhost:3001. The project also includes documentation for deploying on personal servers using NUXT.JS. Additionally, there are options for donations and communication via WeChat.
moonpalace
MoonPalace is a debugging tool for the API provided by Moonshot AI. It supports all platforms (Mac, Windows, Linux) and is simple to use: just replace 'base_url' with 'http://localhost:9988'. It captures complete requests, including the full context of network errors, and allows quick retrieval and viewing of request information using 'request_id' and 'chatcmpl_id'. It also enables one-click export of structured BadCase reports to help improve the Kimi model's capabilities. MoonPalace is recommended as an API intermediary during coding and debugging, helping to quickly identify and locate issues in API calls and code, and to export request details for submission to Moonshot AI to improve the Kimi model.
fastllm
FastLLM is a high-performance large model inference library implemented in pure C++ with no third-party dependencies. Models of 6-7B size can run smoothly on Android devices. Deployment communication QQ group: 831641348
GitHubSentinel
GitHub Sentinel is an intelligent information retrieval and high-value content mining AI Agent designed for the era of large models (LLMs). It is aimed at users who need frequent and large-scale information retrieval, especially open source enthusiasts, individual developers, and investors. The main features include subscription management, update retrieval, notification system, report generation, multi-model support, scheduled tasks, graphical interface, containerization, continuous integration, and the ability to track and analyze the latest dynamics of GitHub open source projects and expand to other information channels like Hacker News for comprehensive information mining and analysis capabilities.
ChatPilot
ChatPilot is a chat agent tool that enables AgentChat conversations, supports Google search, URL conversation (RAG), and code interpreter functionality, replicates Kimi Chat (file, drag and drop; URL, send out), and supports OpenAI/Azure API. It is based on LangChain and implements ReAct and OpenAI Function Call for agent Q&A dialogue. The tool supports various automatic tools such as online search using Google Search API, URL parsing tool, Python code interpreter, and enhanced RAG file Q&A with query rewriting support. It also allows front-end and back-end service separation using Svelte and FastAPI, respectively. Additionally, it supports voice input/output, image generation, user management, permission control, and chat record import/export.
CareGPT
CareGPT is a medical large language model (LLM) that explores medical data, training, and deployment related research work. It integrates resources, open-source models, rich data, and efficient deployment methods. It supports various medical tasks, including patient diagnosis, medical dialogue, and medical knowledge integration. The model has been fine-tuned on diverse medical datasets to enhance its performance in the healthcare domain.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.