ocrbase

ocrbase

Easy to use state of the art VLM(ocr+llm) via SDK & API. (đź“„ PDF ->.MD/.JSON). Self-hostable.

Stars: 869

Visit
 screenshot

ocrbase is a tool designed to turn PDFs into structured data at scale. It utilizes the PaddleOCR-VL-1.5 0.9B OCR model for accurate text extraction and allows users to define schemas for structured extraction, receiving JSON outputs. The tool is built for scalability with queue-based scaling using BullMQ and provides real-time updates through WebSocket notifications. Users can self-host the tool on their own infrastructure using the provided Self-Hosting Guide, making it a versatile solution for document processing needs.

README:

ocrbase

Turn PDFs into structured data at scale. Powered by frontier open-weight OCR models.

Features

  • Best-in-class OCR - PaddleOCR-VL-1.5 0.9B for accurate text extraction
  • Structured extraction - Define schemas, get JSON back
  • Built for scale - Queue-based scaling using BullMQ
  • Real-time updates - WebSocket notifications for job progress
  • Self-hostable - Run on your own infrastructure using Self-Hosting Guide

SDK

NOTE: TS SDK is currently moving to ocrbase-typescript

API Docs

  • OpenAPI UI: https://api.ocrbase.dev/openapi
  • OpenAPI JSON: https://api.ocrbase.dev/openapi/json

API Usage

# Parse a document
curl -X POST https://api.ocrbase.dev/v1/parse \
  -H "Authorization: Bearer sk_xxx" \
  -F "[email protected]"

# Extract with schema
curl -X POST https://api.ocrbase.dev/v1/extract \
  -H "Authorization: Bearer sk_xxx" \
  -F "[email protected]" \
  -F "schemaId=inv_schema_123"

NOTE: Jobs are processed asynchronously.

Realtime Updates

# Subscribe to job status updates
wscat -c "wss://api.ocrbase.dev/v1/realtime?job_id=job_xxx" \
  -H "Authorization: Bearer sk_xxx"

Health Checks

  • GET /v1/health/live
  • GET /v1/health/ready

LLM Integration

Best practice: Parse documents with ocrbase before sending to LLMs. Raw PDF binary wastes tokens and produces poor results.

Self-Hosting

See Self-Hosting Guide for deployment instructions.

Requirements: Docker, Bun

Architecture

Architecture Diagram

License

MIT - See LICENSE for details.

Contact

For API access, on-premise deployment, or questions: [email protected]

For Tasks:

Click tags to check more tools for each tasks

For Jobs:

Alternative AI tools for ocrbase

Similar Open Source Tools

No tools available

For similar tasks

For similar jobs