WeKnora

LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.

Stars: 5779

Visit

WeKnora is a document understanding and semantic retrieval framework based on large language models (LLM), designed specifically for scenarios with complex structures and heterogeneous content. The framework adopts a modular architecture, integrating multimodal preprocessing, semantic vector indexing, intelligent recall, and large model generation reasoning to build an efficient and controllable document question-answering process. The core retrieval process is based on the RAG (Retrieval-Augmented Generation) mechanism, combining context-relevant segments with language models to achieve higher-quality semantic answers. It supports various document formats, intelligent inference, flexible extension, efficient retrieval, ease of use, and security and control. Suitable for enterprise knowledge management, scientific literature analysis, product technical support, legal compliance review, and medical knowledge assistance.

README:

| English | 简体中文 | 日本語 |

Overview • Architecture • Key Features • Getting Started • API Reference • Developer Guide

💡 WeKnora - LLM-Powered Document Understanding & Retrieval Framework

📌 Overview

WeKnora is an LLM-powered framework designed for deep document understanding and semantic retrieval, especially for handling complex, heterogeneous documents.

It adopts a modular architecture that combines multimodal preprocessing, semantic vector indexing, intelligent retrieval, and large language model inference. At its core, WeKnora follows the RAG (Retrieval-Augmented Generation) paradigm, enabling high-quality, context-aware answers by combining relevant document chunks with model reasoning.

Website: https://weknora.weixin.qq.com

🔒 Security Notice

Important: Starting from v0.1.3, WeKnora includes login authentication functionality to enhance system security. For production deployments, we strongly recommend:

Deploy WeKnora services in internal/private network environments rather than public internet
Avoid exposing the service directly to public networks to prevent potential information leakage
Configure proper firewall rules and access controls for your deployment environment
Regularly update to the latest version for security patches and improvements

🏗️ Architecture

WeKnora employs a modern modular design to build a complete document understanding and retrieval pipeline. The system primarily includes document parsing, vector processing, retrieval engine, and large model inference as core modules, with each component being flexibly configurable and extendable.

🎯 Key Features

🔍 Precise Understanding: Structured content extraction from PDFs, Word documents, images and more into unified semantic views
🧠 Intelligent Reasoning: Leverages LLMs to understand document context and user intent for accurate Q&A and multi-turn conversations
🔧 Flexible Extension: All components from parsing and embedding to retrieval and generation are decoupled for easy customization
⚡ Efficient Retrieval: Hybrid retrieval strategies combining keywords, vectors, and knowledge graphs
🎯 User-Friendly: Intuitive web interface and standardized APIs for zero technical barriers
🔒 Secure & Controlled: Support for local deployment and private cloud, ensuring complete data sovereignty

📊 Application Scenarios

Scenario	Applications	Core Value
Enterprise Knowledge Management	Internal document retrieval, policy Q&A, operation manual search	Improve knowledge discovery efficiency, reduce training costs
Academic Research Analysis	Paper retrieval, research report analysis, scholarly material organization	Accelerate literature review, assist research decisions
Product Technical Support	Product manual Q&A, technical documentation search, troubleshooting	Enhance customer service quality, reduce support burden
Legal & Compliance Review	Contract clause retrieval, regulatory policy search, case analysis	Improve compliance efficiency, reduce legal risks
Medical Knowledge Assistance	Medical literature retrieval, treatment guideline search, case analysis	Support clinical decisions, improve diagnosis quality

🧩 Feature Matrix

Module	Support	Description
Document Formats	✅ PDF / Word / Txt / Markdown / Images (with OCR / Caption)	Support for structured and unstructured documents with text extraction from images
Embedding Models	✅ Local models, BGE / GTE APIs, etc.	Customizable embedding models, compatible with local deployment and cloud vector generation APIs
Vector DB Integration	✅ PostgreSQL (pgvector), Elasticsearch	Support for mainstream vector index backends, flexible switching for different retrieval scenarios
Retrieval Strategies	✅ BM25 / Dense Retrieval / GraphRAG	Support for sparse/dense recall and knowledge graph-enhanced retrieval with customizable retrieve-rerank-generate pipelines
LLM Integration	✅ Support for Qwen, DeepSeek, etc., with thinking/non-thinking mode switching	Compatible with local models (e.g., via Ollama) or external API services with flexible inference configuration
QA Capabilities	✅ Context-aware, multi-turn dialogue, prompt templates	Support for complex semantic modeling, instruction control and chain-of-thought Q&A with configurable prompts and context windows
E2E Testing	✅ Retrieval+generation process visualization and metric evaluation	End-to-end testing tools for evaluating recall hit rates, answer coverage, BLEU/ROUGE and other metrics
Deployment Modes	✅ Support for local deployment / Docker images	Meets private, offline deployment and flexible operation requirements
User Interfaces	✅ Web UI + RESTful API	Interactive interface and standard API endpoints, suitable for both developers and business users

🚀 Getting Started

🛠 Prerequisites

Make sure the following tools are installed on your system:

📦 Installation

① Clone the repository

# Clone the main repository
git clone https://github.com/Tencent/WeKnora.git
cd WeKnora

② Configure environment variables

# Copy example env file
cp .env.example .env

# Edit .env and set required values
# All variables are documented in the .env.example comments

③ Start the services

# Start all services (Ollama + backend containers)
./scripts/start_all.sh
# Or
make start-all

③ Start the services (backup)

# Start ollama services (Optional)
ollama serve > /dev/null 2>&1 &

# Start the service
docker compose up -d

④ Stop the services

./scripts/start_all.sh --stop
# Or
make stop-all

🌐 Access Services

Once started, services will be available at:

Web UI: http://localhost
Backend API: http://localhost:8080
Jaeger Tracing: http://localhost:16686

🔌 Using WeChat Dialog Open Platform

WeKnora serves as the core technology framework for the WeChat Dialog Open Platform, providing a more convenient usage approach:

Zero-code Deployment: Simply upload knowledge to quickly deploy intelligent Q&A services within the WeChat ecosystem, achieving an "ask and answer" experience
Efficient Question Management: Support for categorized management of high-frequency questions, with rich data tools to ensure accurate, reliable, and easily maintainable answers
WeChat Ecosystem Integration: Through the WeChat Dialog Open Platform, WeKnora's intelligent Q&A capabilities can be seamlessly integrated into WeChat Official Accounts, Mini Programs, and other WeChat scenarios, enhancing user interaction experiences

🔗 Access WeKnora via MCP Server

1️⃣ Clone the repository

git clone https://github.com/Tencent/WeKnora

2️⃣ Configure MCP Server

Configure the MCP client to connect to the server:

{
  "mcpServers": {
    "weknora": {
      "args": [
        "path/to/WeKnora/mcp-server/run_server.py"
      ],
      "command": "python",
      "env":{
        "WEKNORA_API_KEY":"Enter your WeKnora instance, open developer tools, check the request header x-api-key starting with sk",
        "WEKNORA_BASE_URL":"http(s)://your-weknora-address/api/v1"
      }
    }
  }
}

Run directly using stdio command:

pip install weknora-mcp-server
python -m weknora-mcp-server

🔧 Initialization Configuration Guide

To help users quickly configure various models and reduce trial-and-error costs, we've improved the original configuration file initialization method by adding a Web UI interface for model configuration. Before using, please ensure the code is updated to the latest version. The specific steps are as follows: If this is your first time using this project, you can skip steps ①② and go directly to steps ③④.

① Stop the services

./scripts/start_all.sh --stop

② Clear existing data tables (recommended when no important data exists)

make clean-db

③ Compile and start services

./scripts/start_all.sh

④ Access Web UI

http://localhost

On first access, it will automatically redirect to the initialization configuration page. After configuration is complete, it will automatically redirect to the knowledge base page. Please follow the page instructions to complete model configuration.

📱 Interface Showcase

Web UI Interface

Knowledge Upload	Q&A Entry
Rich Text & Image Responses

Knowledge Base Management: Support for dragging and dropping various documents, automatically identifying document structures and extracting core knowledge to establish indexes. The system clearly displays processing progress and document status, achieving efficient knowledge base management.

Document Knowledge Graph

WeKnora supports transforming documents into knowledge graphs, displaying the relationships between different sections of the documents. Once the knowledge graph feature is enabled, the system analyzes and constructs an internal semantic association network that not only helps users understand document content but also provides structured support for indexing and retrieval, enhancing the relevance and breadth of search results.

MCP Server Integration Effects

📘 API Reference

Troubleshooting FAQ: Troubleshooting FAQ

Detailed API documentation is available at: API Docs

🧭 Developer Guide

📁 Directory Structure

WeKnora/
├── cmd/         # Main entry point
├── internal/    # Core business logic
├── config/      # Configuration files
├── migrations/  # DB migration scripts
├── scripts/     # Shell scripts
├── services/    # Microservice logic
├── frontend/    # Frontend app
└── docs/        # Project documentation

🔧 Common Commands

# Wipe all data from DB (use with caution)
make clean-db

🤝 Contributing

We welcome community contributions! For suggestions, bugs, or feature requests, please submit an Issue or directly create a Pull Request.

🎯 How to Contribute

🐛 Bug Fixes: Discover and fix system defects
✨ New Features: Propose and implement new capabilities
📚 Documentation: Improve project documentation
🧪 Test Cases: Write unit and integration tests
🎨 UI/UX Enhancements: Improve user interface and experience

📋 Contribution Process

Fork the project to your GitHub account
Create a feature branch git checkout -b feature/amazing-feature
Commit changes git commit -m 'Add amazing feature'
Push branch git push origin feature/amazing-feature
Create a Pull Request with detailed description of changes

🎨 Code Standards

Follow Go Code Review Comments
Format code using gofmt
Add necessary unit tests
Update relevant documentation

📝 Commit Guidelines

Use Conventional Commits standard:

feat: Add document batch upload functionality
fix: Resolve vector retrieval precision issue
docs: Update API documentation
test: Add retrieval engine test cases
refactor: Restructure document parsing module

📄 License

This project is licensed under the MIT License. You are free to use, modify, and distribute the code with proper attribution.

For Tasks:

Click tags to check more tools for each tasks

find internal documents analyze research reports answer product manual queries search legal contract terms retrieve medical literature

For Jobs:

knowledge manager research analyst technical support specialist legal compliance officer medical researcher

Alternative AI tools for WeKnora

Similar Open Source Tools

WeKnora

github

: 5.8k

instill-core

Instill Core is an open-source orchestrator comprising a collection of source-available projects designed to streamline every aspect of building versatile AI features with unstructured data. It includes Instill VDP (Versatile Data Pipeline) for unstructured data, AI, and pipeline orchestration, Instill Model for scalable MLOps and LLMOps for open-source or custom AI models, and Instill Artifact for unified unstructured data management. Instill Core can be used for tasks such as building, testing, and sharing pipelines, importing, serving, fine-tuning, and monitoring ML models, and transforming documents, images, audio, and video into a unified AI-ready format.

github

: 2.3k

FuzzyAI

The FuzzyAI Fuzzer is a powerful tool for automated LLM fuzzing, designed to help developers and security researchers identify jailbreaks and mitigate potential security vulnerabilities in their LLM APIs. It supports various fuzzing techniques, provides input generation capabilities, can be easily integrated into existing workflows, and offers an extensible architecture for customization and extension. The tool includes attacks like ArtPrompt, Taxonomy-based paraphrasing, Many-shot jailbreaking, Genetic algorithm, Hallucinations, DAN (Do Anything Now), WordGame, Crescendo, ActorAttack, Back To The Past, Please, Thought Experiment, and Default. It supports models from providers like Anthropic, OpenAI, Gemini, Azure, Bedrock, AI21, and Ollama, with the ability to add support for newer models. The tool also supports various cloud APIs and datasets for testing and experimentation.

github

: 411

coreply

Coreply is an open-source Android app that provides texting suggestions while typing, enhancing the typing experience with intelligent, context-aware suggestions. It supports various texting apps and offers real-time AI suggestions, customizable LLM settings, and ensures no data collection. Users can install the app, configure it with an API key, and start receiving suggestions while typing in messaging apps. The tool supports different AI models from providers like OpenAI, Google AI Studio, Openrouter, Groq, and Codestral for chat completion and fill-in-the-middle tasks.

github

: 95

ApeRAG

ApeRAG is a production-ready platform for Retrieval-Augmented Generation (RAG) that combines Graph RAG, vector search, and full-text search with advanced AI agents. It is ideal for building Knowledge Graphs, Context Engineering, and deploying intelligent AI agents for autonomous search and reasoning across knowledge bases. The platform offers features like advanced index types, intelligent AI agents with MCP support, enhanced Graph RAG with entity normalization, multimodal processing, hybrid retrieval engine, MinerU integration for document parsing, production-grade deployment with Kubernetes, enterprise management features, MCP integration, and developer-friendly tools for customization and contribution.

github

: 780

Starmoon

Starmoon is an affordable, compact AI-enabled device that can understand and respond to your emotions with empathy. It offers supportive conversations and personalized learning assistance. The device is cost-effective, voice-enabled, open-source, compact, and aims to reduce screen time. Users can assemble the device themselves using off-the-shelf components and deploy it locally for data privacy. Starmoon integrates various APIs for AI language models, speech-to-text, text-to-speech, and emotion intelligence. The hardware setup involves components like ESP32S3, microphone, amplifier, speaker, LED light, and button, along with software setup instructions for developers. The project also includes a web app, backend API, and background task dashboard for monitoring and management.

github

: 457

glide

Glide is a cloud-native LLM gateway that provides a unified REST API for accessing various large language models (LLMs) from different providers. It handles LLMOps tasks such as model failover, caching, key management, and more, making it easy to integrate LLMs into applications. Glide supports popular LLM providers like OpenAI, Anthropic, Azure OpenAI, AWS Bedrock (Titan), Cohere, Google Gemini, OctoML, and Ollama. It offers high availability, performance, and observability, and provides SDKs for Python and NodeJS to simplify integration.

github

: 110

fastRAG

fastRAG is a research framework designed to build and explore efficient retrieval-augmented generative models. It incorporates state-of-the-art Large Language Models (LLMs) and Information Retrieval to empower researchers and developers with a comprehensive tool-set for advancing retrieval augmented generation. The framework is optimized for Intel hardware, customizable, and includes key features such as optimized RAG pipelines, efficient components, and RAG-efficient components like ColBERT and Fusion-in-Decoder (FiD). fastRAG supports various unique components and backends for running LLMs, making it a versatile tool for research and development in the field of retrieval-augmented generation.

github

: 1.3k

eairp

Next generation artificial intelligent ERP system. On the basis of ERP business, we have expanded GPT-3.5. Individually or company can fine-tune your model through our system. You can provide fully automated business form submission operations through your simple description, and you can chat, interact, and consult information with GPT. You can deploy through Docker to quickly start and use. Completely free project. Enginsh / 简体中文.

github

: 105

dive

Dive is an AI toolkit for Go that enables the creation of specialized teams of AI agents and seamless integration with leading LLMs. It offers a CLI and APIs for easy integration, with features like creating specialized agents, hierarchical agent systems, declarative configuration, multiple LLM support, extended reasoning, model context protocol, advanced model settings, tools for agent capabilities, tool annotations, streaming, CLI functionalities, thread management, confirmation system, deep research, and semantic diff. Dive also provides semantic diff analysis, unified interface for LLM providers, tool system with annotations, custom tool creation, and support for various verified models. The toolkit is designed for developers to build AI-powered applications with rich agent capabilities and tool integrations.

github

: 91

dotclaude

A sophisticated multi-agent configuration system for Claude Code that provides specialized agents and command templates to accelerate code review, refactoring, security audits, tech-lead-guidance, and UX evaluations. It offers essential commands, directory structure details, agent system overview, command templates, usage patterns, collaboration philosophy, sync management, advanced usage guidelines, and FAQ. The tool aims to streamline development workflows, enhance code quality, and facilitate collaboration between developers and AI agents.

github

: 149

chunkr

Chunkr is an open-source document intelligence API that provides a production-ready service for document layout analysis, OCR, and semantic chunking. It allows users to convert PDFs, PPTs, Word docs, and images into RAG/LLM-ready chunks. The API offers features such as layout analysis, OCR with bounding boxes, structured HTML and markdown output, and VLM processing controls. Users can interact with Chunkr through a Python SDK, enabling them to upload documents, process them, and export results in various formats. The tool also supports self-hosted deployment options using Docker Compose or Kubernetes, with configurations for different AI models like OpenAI, Google AI Studio, and OpenRouter. Chunkr is dual-licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) and a commercial license, providing flexibility for different usage scenarios.

github

: 2.1k

zo2

ZO2 (Zeroth-Order Offloading) is an innovative framework designed to enhance the fine-tuning of large language models (LLMs) using zeroth-order (ZO) optimization techniques and advanced offloading technologies. It is tailored for setups with limited GPU memory, enabling the fine-tuning of models with over 175 billion parameters on single GPUs with as little as 18GB of memory. ZO2 optimizes CPU offloading, incorporates dynamic scheduling, and has the capability to handle very large models efficiently without extra time costs or accuracy losses.

github

: 72

evalchemy

Evalchemy is a unified and easy-to-use toolkit for evaluating language models, focusing on post-trained models. It integrates multiple existing benchmarks such as RepoBench, AlpacaEval, and ZeroEval. Key features include unified installation, parallel evaluation, simplified usage, and results management. Users can run various benchmarks with a consistent command-line interface and track results locally or integrate with a database for systematic tracking and leaderboard submission.

github

: 317

BodhiApp

Bodhi App runs Open Source Large Language Models locally, exposing LLM inference capabilities as OpenAI API compatible REST APIs. It leverages llama.cpp for GGUF format models and huggingface.co ecosystem for model downloads. Users can run fine-tuned models for chat completions, create custom aliases, and convert Huggingface models to GGUF format. The CLI offers commands for environment configuration, model management, pulling files, serving API, and more.

github

: 118

SoM-LLaVA

SoM-LLaVA is a new data source and learning paradigm for Multimodal LLMs, empowering open-source Multimodal LLMs with Set-of-Mark prompting and improved visual reasoning ability. The repository provides a new dataset that is complementary to existing training sources, enhancing multimodal LLMs with Set-of-Mark prompting and improved general capacity. By adding 30k SoM data to the visual instruction tuning stage of LLaVA, the tool achieves 1% to 6% relative improvements on all benchmarks. Users can train SoM-LLaVA via command line and utilize the implementation to annotate COCO images with SoM. Additionally, the tool can be loaded in Huggingface for further usage.

github

: 92

For similar tasks

WeKnora

github

: 5.8k

For similar jobs

Mindolph

Mindolph is an open source personal knowledge management software for all desktop platforms. It allows users to create and manage their own files in separate workspaces with saving in their local storage, organize their files as a tree in their workspaces, and have multiple tabs for opening files instead of a single file window. Mindolph supports Mind Map, Markdown, PlantUML, CSV sheet, and plain text file formats. It also has features such as quickly navigating to files and searching text in files under a specific folder, editing mind maps easily and quickly with key shortcuts, supporting themes and providing some pre-defined themes, importing from other mind map formats, and exporting to other file formats.

github

: 167

book

Podwise is an AI knowledge management app designed specifically for podcast listeners. With the Podwise platform, you only need to follow your favorite podcasts, such as "Hardcore Hackers". When a program is released, Podwise will use AI to transcribe, extract, summarize, and analyze the podcast content, helping you to break down the hard-core podcast knowledge. At the same time, it is connected to platforms such as Notion, Obsidian, Logseq, and Readwise, embedded in your knowledge management workflow, and integrated with content from other channels including news, newsletters, and blogs, helping you to improve your second brain 🧠.

github

: 1.0k

sycamore

Sycamore is a conversational search and analytics platform for complex unstructured data, such as documents, presentations, transcripts, embedded tables, and internal knowledge repositories. It retrieves and synthesizes high-quality answers through bringing AI to data preparation, indexing, and retrieval. Sycamore makes it easy to prepare unstructured data for search and analytics, providing a toolkit for data cleaning, information extraction, enrichment, summarization, and generation of vector embeddings that encapsulate the semantics of data. Sycamore uses your choice of generative AI models to make these operations simple and effective, and it enables quick experimentation and iteration. Additionally, Sycamore uses OpenSearch for indexing, enabling hybrid (vector + keyword) search, retrieval-augmented generation (RAG) pipelining, filtering, analytical functions, conversational memory, and other features to improve information retrieval.

github

: 489

deep-seek

DeepSeek is a new experimental architecture for a large language model (LLM) powered internet-scale retrieval engine. Unlike current research agents designed as answer engines, DeepSeek aims to process a vast amount of sources to collect a comprehensive list of entities and enrich them with additional relevant data. The end result is a table with retrieved entities and enriched columns, providing a comprehensive overview of the topic. DeepSeek utilizes both standard keyword search and neural search to find relevant content, and employs an LLM to extract specific entities and their associated contents. It also includes a smaller answer agent to enrich the retrieved data, ensuring thoroughness. DeepSeek has the potential to revolutionize research and information gathering by providing a comprehensive and structured way to access information from the vastness of the internet.

github

: 279

swirl-search

Swirl is an open-source software that allows users to simultaneously search multiple content sources and receive AI-ranked results. It connects to various data sources, including databases, public data services, and enterprise sources, and utilizes AI and LLMs to generate insights and answers based on the user's data. Swirl is easy to use, requiring only the download of a YML file, starting in Docker, and searching with Swirl. Users can add credentials to preloaded SearchProviders to access more sources. Swirl also offers integration with ChatGPT as a configured AI model. It adapts and distributes user queries to anything with a search API, re-ranking the unified results using Large Language Models without extracting or indexing anything. Swirl includes five Google Programmable Search Engines (PSEs) to get users up and running quickly. Key features of Swirl include Microsoft 365 integration, SearchProvider configurations, query adaptation, synchronous or asynchronous search federation, optional subscribe feature, pipelining of Processor stages, results stored in SQLite3 or PostgreSQL, built-in Query Transformation support, matching on word stems and handling of stopwords, duplicate detection, re-ranking of unified results using Cosine Vector Similarity, result mixers, page through all results requested, sample data sets, optional spell correction, optional search/result expiration service, easily extensible Connector and Mixer objects, and a welcoming community for collaboration and support.

github

: 2.7k

databerry

Chaindesk is a no-code platform that allows users to easily set up a semantic search system for personal data without technical knowledge. It supports loading data from various sources such as raw text, web pages, files (Word, Excel, PowerPoint, PDF, Markdown, Plain Text), and upcoming support for web sites, Notion, and Airtable. The platform offers a user-friendly interface for managing datastores, querying data via a secure API endpoint, and auto-generating ChatGPT Plugins for each datastore. Chaindesk utilizes a Vector Database (Qdrant), Openai's text-embedding-ada-002 for embeddings, and has a chunk size of 1024 tokens. The technology stack includes Next.js, Joy UI, LangchainJS, PostgreSQL, Prisma, and Qdrant, inspired by the ChatGPT Retrieval Plugin.

github

: 2.9k

knowledge

Knowledge is a tool for saving, searching, accessing, exploring and chatting with all of your favorite websites, documents and files. Dive into a more interactive learning experience with Knowledge's new Chat feature! Engage in dynamic conversations with your Projects and Sources, leveraging the power of Large Language Models. The Chat feature is designed to transform the way you interact with your data, offering a more engaging and exploratory approach to learning. Unleash the power of context with the built-in Chromium browser. Transform your browsing into knowledge gathering effortlessly.

github

: 1.3k

obsidian-arcana

Arcana is a plugin for Obsidian that offers a collection of AI-powered tools inspired by famous historical figures to enhance creativity and productivity. It includes tools for conversation, text-to-speech transcription, speech-to-text replies, metadata markup, text generation, file moving, flashcard generation, auto tagging, and note naming. Users can interact with these tools using the command palette and sidebar views, with an OpenAI API key required for usage. The plugin aims to assist users in various note-taking and knowledge management tasks within the Obsidian vault environment.

github

: 78