indexify

A realtime serving engine for Data-Intensive Generative AI Applications

Indexify is an open-source engine for building fast data pipelines for unstructured data (video, audio, images, and documents) using reusable extractors for embedding, transformation, and feature extraction. LLM applications can query the transformed, LLM-friendly content via semantic search and SQL queries. Indexify keeps vector databases and structured databases (PostgreSQL) up to date by automatically invoking the pipelines as new data is ingested from external data sources.

**Why use Indexify**

* Makes unstructured data **queryable** with **SQL** and **semantic search**.
* **Real-time** extraction engine keeps indexes **automatically** updated as new data is ingested.
* **Extraction Graphs** describe **data transformation**, **embedding** extraction, and **structured extraction**.
* **Incremental extraction** and **selective deletion** when content is deleted or updated.
* The **Extractor SDK** allows adding new extraction capabilities, with many extractors readily available for **PDF**, **image**, and **video** indexing and extraction.
* Works with **any LLM framework**, including **Langchain**, **DSPy**, etc.
* Runs on your laptop during **prototyping** and scales to **thousands of machines** in the cloud.
* Works with many **blob stores**, **vector stores**, and **structured databases**.
* We have even **open-sourced automation** to deploy to Kubernetes in production.

README:

Indexify

Create and Deploy Durable, Data-Intensive Agentic Workflows

Indexify simplifies building and serving durable, multi-stage workflows as interconnected Python functions and automagically deploys them as APIs.

A workflow encodes data ingestion and transformation stages that can be implemented using Python functions. Each of these functions is a logical compute unit that can be retried upon failure or assigned to specific hardware.
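As a rough mental model, a stage that is retried upon failure behaves like the plain-Python sketch below. This is an illustration only, not the Indexify SDK; the `run_with_retries` helper and the stage function are hypothetical.

```python
import time

def run_with_retries(fn, arg, max_retries=3, delay=0.0):
    """Run one workflow stage, retrying on failure (illustrative, not the Indexify API)."""
    for attempt in range(1, max_retries + 1):
        try:
            return fn(arg)
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(delay)  # back off before the next attempt

# A flaky stage: fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky_upper(text: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return text.upper()

print(run_with_retries(flaky_upper, "indexify"))  # INDEXIFY, after two retries
```

In Indexify the server performs this retry bookkeeping for you and can additionally pin each stage to specific hardware.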


PDF Extraction Demo

To give you a taste of the project, the video above shows Indexify running PDF extraction on a cluster of three machines: top left, a GPU-accelerated machine runs document layout and OCR models on a PDF; bottom left, texts are chunked and images and text are embedded using CLIP and a text embedding model; top right, a function writes the image and text embeddings to ChromaDB. All three functions of the workflow run in parallel, coordinated by the Indexify server.

[!NOTE]
Indexify is the Open-Source core compute engine that powers Tensorlake's Serverless Workflow Engine for processing unstructured data.

💡 Use Cases

Indexify is a versatile data processing framework that supports a wide range of use cases.

⭐ Key Features

  • Dynamic Routing: Route data to different specialized models based on conditional branching logic.
  • Local Inference: Execute LLMs directly within workflow functions using LLamaCPP, vLLM, or Hugging Face Transformers.
  • Distributed Processing: Run functions in parallel across machines so that results across functions can be combined as they complete.
  • Workflow Versioning: Version compute graphs to update previously processed data to reflect the latest functions and models.
  • Resource Allocation: Span workflows across GPU and CPU instances so that functions can be assigned to their optimal hardware.
  • Request Optimization: Maximize GPU utilization by automatically queuing and batching invocations in parallel.
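Dynamic routing, for instance, amounts to a function that inspects its input and decides which downstream function runs next. The sketch below is plain Python, not the Indexify SDK; the two model functions and the size threshold are illustrative assumptions.

```python
def summarize_with_small_model(text: str) -> str:
    # Stand-in for a cheap, specialized model
    return f"small:{text[:10]}"

def summarize_with_large_model(text: str) -> str:
    # Stand-in for a heavier, more capable model
    return f"large:{text[:10]}"

def route(text: str):
    """Pick a downstream function based on a property of the input (conditional branching)."""
    return summarize_with_large_model if len(text) > 100 else summarize_with_small_model

doc = "short note"
print(route(doc)(doc))  # small:short note
```

In Indexify, the same branching decision lives inside a workflow function, and the server dispatches the chosen function, possibly onto different hardware.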

⚙️ Installation

Install Indexify's SDK and CLI into your development environment:

pip install indexify

📚 A Minimal Example

Define a workflow by implementing its data transformation stages as composable Python functions decorated with @indexify_function(). These functions are the nodes of a Graph, Indexify's representation of a compute graph.
Functions serve as discrete units within a Graph, defining the boundaries for retry attempts and resource allocation. They separate computationally heavy tasks, like LLM inference, from lightweight ones, like database writes.
The example below is a pipeline that parses a PDF, chunks its pages, and embeds and stores the chunks.

from typing import List

from pydantic import BaseModel
from indexify import indexify_function, Graph
# `File` wraps raw uploaded data; its import path may vary between SDK versions.
from indexify.functions_sdk.data_objects import File

class Document(BaseModel):
    pages: List[str]

# Parse a PDF and extract its text
@indexify_function()
def process_document(file: File) -> Document:
    # Process a PDF and extract pages
    ...

class TextChunk(BaseModel):
    chunk: str
    page_number: int

# Chunk the pages for embedding and retrieval
@indexify_function()
def chunk_document(document: Document) -> List[TextChunk]:
    # Split the pages
    ...

class ChunkEmbedding(BaseModel):
    chunk: str
    embedding: List[float]

# Embed a single chunk.
# Note: (Automatic Map) Indexify automatically parallelizes a function when it consumes
# an element from a function that produces a List
@indexify_function()
def embed_and_write(chunk: TextChunk) -> ChunkEmbedding:
    # Run an embedding model on the chunk
    # Write the embedding to the database
    ...

# Construct a compute graph connecting the three functions defined above into a
# workflow that runs them as a pipeline
graph = Graph(name="document_ingestion_pipeline", start_node=process_document, description="...")
graph.add_edge(process_document, chunk_document)
graph.add_edge(chunk_document, embed_and_write)
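The automatic-map behavior noted in the comments — fanning a downstream function out over the elements of a returned List — corresponds roughly to this plain-Python pattern (an illustration of the concept, not the SDK's implementation; the chunking and embedding functions are stand-ins):

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_pages(pages):
    # Upstream function produces a List ...
    return [p.strip() for p in pages]

def embed(chunk: str):
    # ... and the downstream function is applied to each element, potentially in parallel.
    return (chunk, len(chunk))

pages = ["  hello world ", " indexify "]
with ThreadPoolExecutor() as pool:
    embeddings = list(pool.map(embed, chunk_pages(pages)))
print(embeddings)  # [('hello world', 11), ('indexify', 8)]
```

Indexify does this fan-out across machines rather than threads, scheduling one task per list element.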

Read the Docs to learn more about how to test, deploy and create API endpoints for Workflows.

📖 Next Steps

🗺️ Roadmap

⏳ Scheduler

  • Function Batching: Process multiple functions in a single batch to improve efficiency.
  • Data Localized Execution: Boost performance by prioritizing execution on machines where intermediate outputs exist already.
  • Reducer Optimizations: Optimize performance by batching the serial execution of reduced function calls.
  • Parallel Scheduling: Reduce latency by enabling parallel execution across multiple machines.
  • Cyclic Graph Support: Enable more flexible agentic behaviors by leveraging cycles in graphs.
  • Ephemeral Graphs: Perform multi-stage inference and retrieval without persisting intermediate outputs.
  • Data Loader Functions: Stream values into graphs over time using the yield keyword.
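The planned data-loader functions map naturally onto ordinary Python generators; since the feature is unreleased, the sketch below only illustrates the streaming pattern the roadmap describes, with hypothetical record names.

```python
def load_records(n: int):
    """Stream values into a graph over time instead of materializing them all at once."""
    for i in range(n):
        yield {"id": i, "payload": f"record-{i}"}

# A downstream stage would consume records as they arrive.
seen = [r["id"] for r in load_records(3)]
print(seen)  # [0, 1, 2]
```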

🛠️ SDK

  • TypeScript SDK: Build an SDK for writing workflows in Typescript.
