
docutranslate
Document (novel, thesis, subtitle) translation tool (supports pdf/word/excel/json/epub/srt...)
Stars: 71

DocuTranslate is a lightweight, LLM-powered tool for translating local documents. It supports many file formats and most AI platforms, and preserves the original document's formatting where possible, making it suitable for translating papers, novels, subtitles, office documents, and more.
README:
A lightweight local file translation tool based on a large language model.
- ✅ Supports Multiple Formats: can translate various files such as `pdf`, `docx`, `xlsx`, `md`, `txt`, `json`, `epub`, `srt`, and more.
- ✅ Automatic Glossary Generation: supports automatic generation of glossaries to ensure term alignment.
- ✅ PDF Table, Formula, and Code Recognition: with the `docling` and `mineru` PDF parsing engines, it can recognize and translate tables, formulas, and code frequently found in academic papers.
- ✅ JSON Translation: supports specifying the values to be translated in JSON via JSON paths (using `jsonpath-ng` syntax).
- ✅ Word/Excel Format-Preserving Translation: supports translating `docx` and `xlsx` files (currently not `doc` or `xls` files) while preserving the original formatting.
- ✅ Multi-AI Platform Support: supports most AI platforms, enabling high-performance, concurrent AI translation with custom prompts.
- ✅ Asynchronous Support: designed for high-performance scenarios, with complete asynchronous support and service interfaces for parallel multitasking.
- ✅ LAN and Multi-user Support: supports simultaneous use by multiple users on a local area network.
- ✅ Interactive Web Interface: provides an out-of-the-box Web UI and RESTful API for easy integration and use.
- ✅ Small-Footprint, Multi-Platform "Lazy" Packages: Windows and Mac "lazy" packages under 40MB (for versions not using `docling` for local PDF parsing).
QQ Discussion Group: 1047781902
For users who want to get started quickly, we provide all-in-one packages on GitHub Releases. Simply download, unzip, and enter your AI platform API-Key to start using.
- DocuTranslate: Standard version; uses the online `minerU` engine to parse PDF documents. Choose this version if you don't need local PDF parsing (recommended).
- DocuTranslate_full: Full version; includes the built-in `docling` local PDF parsing engine. Choose this version if you need local PDF parsing.
```shell
# Basic installation
pip install docutranslate

# To use docling for local PDF parsing
pip install docutranslate[docling]
```

With `uv`:

```shell
# Initialize environment
uv init

# Basic installation
uv add docutranslate

# Install docling extension
uv add docutranslate[docling]
```
Install from source:

```shell
git clone https://github.com/xunbu/docutranslate.git
cd docutranslate
uv sync
```
The core of the new DocuTranslate is the Workflow. Each workflow is a complete, end-to-end translation pipeline designed specifically for a particular file type. You no longer interact with a monolithic class; instead, you select and configure a suitable workflow based on your file type.
The basic usage process is as follows:
1. Select a Workflow: choose a workflow based on your input file type (e.g., PDF/Word or TXT), such as `MarkdownBasedWorkflow` or `TXTWorkflow`.
2. Build the Configuration: create the corresponding configuration object for the selected workflow (e.g., `MarkdownBasedWorkflowConfig`). This object contains all the necessary sub-configurations, such as:
   - Converter Config: defines how to convert the original file (e.g., PDF) to Markdown.
   - Translator Config: defines which LLM, API Key, target language, etc., to use.
   - Exporter Config: defines specific options for the output format (e.g., HTML).
3. Instantiate the Workflow: create a workflow instance using the configuration object.
4. Execute the Translation: call the workflow's `.read_*()` and `.translate()` / `.translate_async()` methods.
5. Export/Save the Result: call the `.export_to_*()` or `.save_as_*()` methods to get or save the translation result.
| Workflow | Use Case | Input Formats | Output Formats | Core Config Class |
|---|---|---|---|---|
| `MarkdownBasedWorkflow` | Processes rich text documents like PDF, Word, images, etc. The process is: File -> Markdown -> Translate -> Export. | `.pdf`, `.docx`, `.md`, `.png`, `.jpg`, etc. | `.md`, `.zip`, `.html` | `MarkdownBasedWorkflowConfig` |
| `TXTWorkflow` | Processes plain text documents. The process is: txt -> Translate -> Export. | `.txt` and other plain text formats | `.txt`, `.html` | `TXTWorkflowConfig` |
| `JsonWorkflow` | Processes JSON files. The process is: json -> Translate -> Export. | `.json` | `.json`, `.html` | `JsonWorkflowConfig` |
| `DocxWorkflow` | Processes docx files. The process is: docx -> Translate -> Export. | `.docx` | `.docx`, `.html` | `DocxWorkflowConfig` |
| `XlsxWorkflow` | Processes xlsx files. The process is: xlsx -> Translate -> Export. | `.xlsx`, `.csv` | `.xlsx`, `.html` | `XlsxWorkflowConfig` |
| `SrtWorkflow` | Processes srt files. The process is: srt -> Translate -> Export. | `.srt` | `.srt`, `.html` | `SrtWorkflowConfig` |
| `EpubWorkflow` | Processes epub files. The process is: epub -> Translate -> Export. | `.epub` | `.epub`, `.html` | `EpubWorkflowConfig` |
| `HtmlWorkflow` | Processes html files. The process is: html -> Translate -> Export. | `.html`, `.htm` | `.html` | `HtmlWorkflowConfig` |
In the interactive interface, you can export to PDF format.
For ease of use, DocuTranslate provides a full-featured Web interface and RESTful API.
Starting the Service:
```shell
# Start the service, listening on port 8010 by default
docutranslate -i

# Start on a specific port
docutranslate -i -p 8011

# You can also specify the port via an environment variable
export DOCUTRANSLATE_PORT=8011
docutranslate -i
```
- Interactive Interface: after starting the service, visit `http://127.0.0.1:8010` (or your specified port) in your browser.
- API Documentation: the complete API documentation (Swagger UI) is available at `http://127.0.0.1:8010/docs`.
This is the most common use case. We will use the `minerU` engine to convert the PDF to Markdown, and then use an LLM for translation. Here is an example using the asynchronous approach.
```python
import asyncio

from docutranslate.workflow.md_based_workflow import MarkdownBasedWorkflow, MarkdownBasedWorkflowConfig
from docutranslate.converter.x2md.converter_mineru import ConverterMineruConfig
from docutranslate.translator.ai_translator.md_translator import MDTranslatorConfig
from docutranslate.exporter.md.md2html_exporter import MD2HTMLExporterConfig


async def main():
    # 1. Build the translator configuration
    translator_config = MDTranslatorConfig(
        base_url="https://open.bigmodel.cn/api/paas/v4",  # AI platform Base URL
        api_key="YOUR_ZHIPU_API_KEY",  # AI platform API Key
        model_id="glm-4-air",  # Model ID
        to_lang="English",  # Target language
        chunk_size=3000,  # Text chunk size
        concurrent=10,  # Concurrency
        # glossary_generate_enable=True,  # Enable automatic glossary generation
        # glossary_dict={"Jobs": "乔布斯"}  # Pass in a glossary
    )

    # 2. Build the converter configuration (using minerU)
    converter_config = ConverterMineruConfig(
        mineru_token="YOUR_MINERU_TOKEN",  # Your minerU Token
        formula_ocr=True  # Enable formula recognition
    )

    # 3. Build the main workflow configuration
    workflow_config = MarkdownBasedWorkflowConfig(
        convert_engine="mineru",  # Specify the parsing engine
        converter_config=converter_config,  # Pass in the converter configuration
        translator_config=translator_config,  # Pass in the translator configuration
        html_exporter_config=MD2HTMLExporterConfig(cdn=True)  # HTML export configuration
    )

    # 4. Instantiate the workflow
    workflow = MarkdownBasedWorkflow(config=workflow_config)

    # 5. Read the file and execute the translation
    print("Starting to read and translate the file...")
    workflow.read_path("path/to/your/document.pdf")
    await workflow.translate_async()
    # Or use the synchronous method:
    # workflow.translate()
    print("Translation complete!")

    # 6. Save the results
    workflow.save_as_html(name="translated_document.html")
    workflow.save_as_markdown_zip(name="translated_document.zip")
    workflow.save_as_markdown(name="translated_document.md")  # Markdown with embedded images
    print("Files have been saved to the ./output folder.")

    # Or get the content strings directly
    html_content = workflow.export_to_html()
    markdown_content = workflow.export_to_markdown()
    # print(html_content)


if __name__ == "__main__":
    asyncio.run(main())
```
For plain text files, the process is simpler as it doesn't require a document parsing (conversion) step. Here is an example using the asynchronous approach.
```python
import asyncio

from docutranslate.workflow.txt_workflow import TXTWorkflow, TXTWorkflowConfig
from docutranslate.translator.ai_translator.txt_translator import TXTTranslatorConfig
from docutranslate.exporter.txt.txt2html_exporter import TXT2HTMLExporterConfig


async def main():
    # 1. Build the translator configuration
    translator_config = TXTTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="Chinese",
    )

    # 2. Build the main workflow configuration
    workflow_config = TXTWorkflowConfig(
        translator_config=translator_config,
        html_exporter_config=TXT2HTMLExporterConfig(cdn=True)
    )

    # 3. Instantiate the workflow
    workflow = TXTWorkflow(config=workflow_config)

    # 4. Read the file and execute the translation
    workflow.read_path("path/to/your/notes.txt")
    await workflow.translate_async()
    # Or use the synchronous method:
    # workflow.translate()

    # 5. Save the result
    workflow.save_as_txt(name="translated_notes.txt")
    print("TXT file has been saved.")

    # You can also export the translated plain text
    text = workflow.export_to_txt()


if __name__ == "__main__":
    asyncio.run(main())
```
Here is an example using the asynchronous approach. The `json_paths` item in `JsonTranslatorConfig` specifies the JSON paths to be translated (using `jsonpath-ng` syntax). Only values matching these paths will be translated.
```python
import asyncio

from docutranslate.exporter.js.json2html_exporter import Json2HTMLExporterConfig
from docutranslate.translator.ai_translator.json_translator import JsonTranslatorConfig
from docutranslate.workflow.json_workflow import JsonWorkflowConfig, JsonWorkflow


async def main():
    # 1. Build the translator configuration
    translator_config = JsonTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="Chinese",
        json_paths=["$.*", "$.name"]  # jsonpath-ng syntax; values at matching paths will be translated
    )

    # 2. Build the main workflow configuration
    workflow_config = JsonWorkflowConfig(
        translator_config=translator_config,
        html_exporter_config=Json2HTMLExporterConfig(cdn=True)
    )

    # 3. Instantiate the workflow
    workflow = JsonWorkflow(config=workflow_config)

    # 4. Read the file and execute the translation
    workflow.read_path("path/to/your/notes.json")
    await workflow.translate_async()
    # Or use the synchronous method:
    # workflow.translate()

    # 5. Save the result
    workflow.save_as_json(name="translated_notes.json")
    print("JSON file has been saved.")

    # You can also export the translated JSON text
    text = workflow.export_to_json()


if __name__ == "__main__":
    asyncio.run(main())
```
Here is an example using the asynchronous approach.
```python
import asyncio

from docutranslate.exporter.docx.docx2html_exporter import Docx2HTMLExporterConfig
from docutranslate.translator.ai_translator.docx_translator import DocxTranslatorConfig
from docutranslate.workflow.docx_workflow import DocxWorkflowConfig, DocxWorkflow


async def main():
    # 1. Build the translator configuration
    translator_config = DocxTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="Chinese",
        insert_mode="replace",  # Options: "replace", "append", "prepend"
        separator="\n",  # Separator used in "append" and "prepend" modes
    )

    # 2. Build the main workflow configuration
    workflow_config = DocxWorkflowConfig(
        translator_config=translator_config,
        html_exporter_config=Docx2HTMLExporterConfig(cdn=True)
    )

    # 3. Instantiate the workflow
    workflow = DocxWorkflow(config=workflow_config)

    # 4. Read the file and execute the translation
    workflow.read_path("path/to/your/notes.docx")
    await workflow.translate_async()
    # Or use the synchronous method:
    # workflow.translate()

    # 5. Save the result
    workflow.save_as_docx(name="translated_notes.docx")
    print("docx file has been saved.")

    # You can also export the translated docx as binary
    text_bytes = workflow.export_to_docx()


if __name__ == "__main__":
    asyncio.run(main())
```
Here is an example using the asynchronous approach.
```python
import asyncio

from docutranslate.exporter.xlsx.xlsx2html_exporter import Xlsx2HTMLExporterConfig
from docutranslate.translator.ai_translator.xlsx_translator import XlsxTranslatorConfig
from docutranslate.workflow.xlsx_workflow import XlsxWorkflowConfig, XlsxWorkflow


async def main():
    # 1. Build the translator configuration
    translator_config = XlsxTranslatorConfig(
        base_url="https://api.openai.com/v1/",
        api_key="YOUR_OPENAI_API_KEY",
        model_id="gpt-4o",
        to_lang="Chinese",
        insert_mode="replace",  # Options: "replace", "append", "prepend"
        separator="\n",  # Separator used in "append" and "prepend" modes
    )

    # 2. Build the main workflow configuration
    workflow_config = XlsxWorkflowConfig(
        translator_config=translator_config,
        html_exporter_config=Xlsx2HTMLExporterConfig(cdn=True)
    )

    # 3. Instantiate the workflow
    workflow = XlsxWorkflow(config=workflow_config)

    # 4. Read the file and execute the translation
    workflow.read_path("path/to/your/notes.xlsx")
    await workflow.translate_async()
    # Or use the synchronous method:
    # workflow.translate()

    # 5. Save the result
    workflow.save_as_xlsx(name="translated_notes.xlsx")
    print("xlsx file has been saved.")

    # You can also export the translated xlsx as binary
    text_bytes = workflow.export_to_xlsx()


if __name__ == "__main__":
    asyncio.run(main())
```
The translation functionality relies on large language models. You need to obtain a `base_url`, `api_key`, and `model_id` from the respective AI platform.
Recommended models: Volcengine's `doubao-seed-1-6-250615` and `doubao-seed-1-6-flash-250715`, Zhipu's `glm-4-flash`, Alibaba Cloud's `qwen-plus` and `qwen-turbo`, DeepSeek's `deepseek-chat`, etc.
| Platform Name | Get API Key | base_url |
|---|---|---|
| ollama | | http://127.0.0.1:11434/v1 |
| lm studio | | http://127.0.0.1:1234/v1 |
| openrouter | Click to get | https://openrouter.ai/api/v1 |
| openai | Click to get | https://api.openai.com/v1/ |
| gemini | Click to get | https://generativelanguage.googleapis.com/v1beta/openai/ |
| deepseek | Click to get | https://api.deepseek.com/v1 |
| Zhipu AI | Click to get | https://open.bigmodel.cn/api/paas/v4 |
| Tencent Hunyuan | Click to get | https://api.hunyuan.cloud.tencent.com/v1 |
| Alibaba Cloud Bailian | Click to get | https://dashscope.aliyuncs.com/compatible-mode/v1 |
| Volcengine | Click to get | https://ark.cn-beijing.volces.com/api/v3 |
| SiliconFlow | Click to get | https://api.siliconflow.cn/v1 |
| DMXAPI | Click to get | https://www.dmxapi.cn/v1 |
If you choose `mineru` as the document parsing engine (`convert_engine="mineru"`), you need to apply for a free token.
- Visit the minerU official website to register and apply for an API.
- Create a new API Token in the API Token management interface.
Note: minerU tokens have a 14-day validity period. Please re-create them after they expire.
If you choose `docling` as the document parsing engine (`convert_engine="docling"`), it will download the required models from Hugging Face on first use. A better option is to download `docling_artifact.zip` from GitHub Releases and unzip it to your working directory.

Solutions for network issues when downloading `docling` models:
- Set a Hugging Face mirror (recommended):
  - Method A (environment variable): set the system environment variable `HF_ENDPOINT` and restart your IDE or terminal: `HF_ENDPOINT=https://hf-mirror.com`
  - Method B (set in code): add the following code at the beginning of your Python script:

```python
import os
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
```
- Offline use (download the model package in advance):
  - Download `docling_artifact.zip` from GitHub Releases.
  - Unzip it to your project directory.
  - Specify the model path in the configuration (if the model is not in the same directory as the script):
```python
from docutranslate.converter.x2md.converter_docling import ConverterDoclingConfig

converter_config = ConverterDoclingConfig(
    artifact="./docling_artifact",  # Point to the unzipped folder
    code_ocr=True,
    formula_ocr=True
)
```
Q: What if port 8010 is occupied?
A: Use the `-p` parameter to specify a new port, or set the `DOCUTRANSLATE_PORT` environment variable.
Q: Does it support translation of scanned PDFs?
A: Yes. Please use the `mineru` parsing engine, which has powerful OCR capabilities.
Q: Why is the first PDF translation so slow?
A: If you are using the `docling` engine, it needs to download models from Hugging Face on its first run. Please refer to the network-issue solutions above to speed up this process.
Q: How to use it in an intranet (offline) environment?
A: It is entirely possible. You need to meet the following conditions:
- Local LLM: use tools like Ollama or LM Studio to deploy a language model locally, and fill in the local model's `base_url` in `TranslatorConfig`.
- Local PDF parsing engine (only needed for parsing PDFs): use the `docling` engine and follow the "Offline use" instructions above to download the model package in advance.
Q: How does the PDF parsing cache mechanism work?
A: `MarkdownBasedWorkflow` automatically caches the results of document parsing (file-to-Markdown conversion) to avoid repeated parsing that wastes time and resources. The cache is stored in memory by default and records the last 10 parses. You can change the cache size with the `DOCUTRANSLATE_CACHE_NUM` environment variable.
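For example, raising the cache size before starting the service might look like this (the variable name comes from the answer above; the value 20 is an arbitrary example):

```shell
# Keep the 20 most recent parse results instead of the default 10
export DOCUTRANSLATE_CACHE_NUM=20
```

Then start the service as usual with `docutranslate -i`.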
Q: How do I make the software go through a proxy?
A: The software does not use a proxy by default. You can enable it by setting the environment variable `DOCUTRANSLATE_PROXY_ENABLED` to `true`.
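As a sketch, enabling the proxy before starting the service (the variable name and `true` value come from the answer above):

```shell
# Let DocuTranslate route outbound requests through the system proxy
export DOCUTRANSLATE_PROXY_ENABLED=true
```

Afterwards, start the service with `docutranslate -i` as usual.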
If you'd like to support the author, please mention the reason for your appreciation in the donation note. ❤