island-ai

Stars: 134

island-ai is a TypeScript toolkit tailored for developers engaging with structured outputs from Large Language Models. It offers streamlined processes for handling, parsing, streaming, and leveraging AI-generated data across various applications. The toolkit includes packages like zod-stream for interfacing with LLM streams, stream-hooks for integrating streaming JSON data into React applications, and schema-stream for JSON streaming parsing based on Zod schemas. Additionally, related packages like @instructor-ai/instructor-js focus on data validation and retry mechanisms, enhancing the reliability of data processing workflows.

README:

Island AI

> A TypeScript toolkit for building structured LLM data handling pipelines


Overview

Island AI is a collection of low-level utilities and high-level tools for handling structured data streams from LLMs. The packages range from basic JSON streaming parsers to complete LLM clients, giving you the flexibility to build custom solutions or use pre-built integrations.

Core Packages

1. schema-stream

A foundational streaming JSON parser that enables immediate data access through structured stubs.

Key Features:

  • Streaming JSON parser with typed outputs
  • Default value support
  • Path completion tracking
  • Nested object and array support

import { SchemaStream } from "schema-stream";
import { z } from "zod";

// Define complex nested schemas
const schema = z.object({
  layer1: z.object({
    layer2: z.object({
      value: z.string(),
      layer3: z.object({
        layer4: z.object({
          layer5: z.string()
        })
      })
    })
  }),
  someArray: z.array(z.object({
    someString: z.string(),
    someNumber: z.number()
  }))
});

// Get a readable stream of json (from an api or otherwise)
async function getSomeStreamOfJson(
  jsonString: string
): Promise<{ body: ReadableStream }> {
  const stream = new ReadableStream({
    start(controller) {
      const encoder = new TextEncoder()
      const jsonBytes = encoder.encode(jsonString)

      for (let i = 0; i < jsonBytes.length; ) {
        const chunkSize = Math.floor(Math.random() * 5) + 2
        const chunk = jsonBytes.slice(i, i + chunkSize)
        controller.enqueue(chunk)
        i += chunkSize
      }
      controller.close()
    },
  })

  return { body: stream }
}


// Create parser with completion tracking
const parser = new SchemaStream(schema, {
  onKeyComplete({ completedPaths }) {
    console.log('Completed paths:', completedPaths);
  }
});

// Get the readable stream to parse (the JSON matches the schema above)
const response = await getSomeStreamOfJson(
  `{"someArray": [{"someString": "Hello schema-stream", "someNumber": 42000000}]}`
)

// Parse streaming data: pipe the response body through the parser's transform stream
const stream = parser.parse();
response.body.pipeThrough(stream);

// Get typed results
const reader = stream.readable.getReader();
const decoder = new TextDecoder()
let result = {}
let complete = false

while (true) {
  const { value, done } = await reader.read();
  complete = done
  
  if (complete) break;
  
  result = JSON.parse(decoder.decode(value));
  // result is fully typed based on schema
}
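
The completedPaths value passed to onKeyComplete can be used to decide when a field is safe to read. The sketch below assumes each completed path is an array of object keys and array indices (for example ["someArray", 0, "someString"]). That shape, the Path type, and the isPathComplete helper are illustrative assumptions, not part of the schema-stream API.

```typescript
// Assumed path shape: a list of object keys and array indices.
type Path = ReadonlyArray<string | number | undefined>;

// Returns true when `target` appears in the list of completed paths.
function isPathComplete(
  completedPaths: ReadonlyArray<Path>,
  target: Path
): boolean {
  return completedPaths.some(
    (path) =>
      path.length === target.length &&
      path.every((segment, i) => segment === target[i])
  );
}

// Hypothetical callback payload, for illustration only.
const completedPaths: Path[] = [
  ["layer1", "layer2", "value"],
  ["someArray", 0, "someString"],
];

console.log(isPathComplete(completedPaths, ["layer1", "layer2", "value"])); // true
console.log(isPathComplete(completedPaths, ["someArray", 0, "someNumber"])); // false
```

A check like this could gate UI updates so that a field is rendered only once the parser reports it as complete, rather than while its value is still a default stub.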

2. zod-stream

Extends schema-stream with OpenAI integration and Zod-specific features.

Key Features:

  • OpenAI completion streaming
  • Multiple response modes (TOOLS, FUNCTIONS, JSON, etc.)
  • Schema validation during streaming
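
import ZodStream, { OAIStream, withResponseModel } from "zod-stream";
import OpenAI from "openai";
import { z } from "zod";

// OpenAI client used below
const oai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Placeholder for the text to extract from
const textBlock = "...";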

// Define extraction schema
const ExtractionSchema = z.object({
  users: z.array(z.object({
    name: z.string(),
    handle: z.string(),
    twitter: z.string()
  })).min(3),
  location: z.string(),
  budget: z.number()
});

// Configure OpenAI params with schema
const params = withResponseModel({
  response_model: { 
    schema: ExtractionSchema, 
    name: "Extract" 
  },
  params: {
    messages: [{ role: "user", content: textBlock }],
    model: "gpt-4"
  },
  mode: "TOOLS"
});

// Stream completions
const stream = OAIStream({ 
  res: await oai.chat.completions.create({
    ...params,
    stream: true
  })
});

// Process results
const client = new ZodStream();
const extractionStream = await client.create({
  completionPromise: async () => stream,
  response_model: { 
    schema: ExtractionSchema, 
    name: "Extract" 
  }
});

for await (const data of extractionStream) {
  console.log('Progressive update:', data);
}

3. stream-hooks

React hooks for consuming streaming JSON data with Zod schema validation.

Key Features:

  • Ready-to-use React hooks
  • Automatic schema validation
  • Progress tracking
  • Error handling

import { useJsonStream } from "stream-hooks";
// ExtractionSchema is the Zod schema defined in the zod-stream example above

function DataViewer() {
  const { loading, startStream, data, error } = useJsonStream({
    schema: ExtractionSchema,
    onReceive: (update) => {
      console.log('Progressive update:', update);
    },
  });

  return (
    <div>
      {loading && <div>Loading...</div>}
      {error && <div>Error: {error.message}</div>}
      {data && (
        <pre>{JSON.stringify(data, null, 2)}</pre>
      )}
      <button onClick={() => startStream({
        url: "/api/extract",
        method: "POST",
        body: { text: "..." }
      })}>
        Start Extraction
      </button>
    </div>
  );
}

4. evalz

Structured evaluation tools for assessing LLM outputs across multiple dimensions. Built with TypeScript and integrated with OpenAI and Instructor, it enables both automated evaluation and human-in-the-loop assessment workflows.

Key Features:

  • 🎯 Model-Graded Evaluation: Leverage LLMs to assess response quality
  • 📊 Accuracy Measurement: Compare outputs using semantic and lexical similarity
  • 🔍 Context Validation: Evaluate responses against source materials
  • ⚖️ Composite Assessment: Combine multiple evaluation types with custom weights
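
import {
  createAccuracyEvaluator,
  createContextEvaluator,
  createEvaluator,
  createWeightedEvaluator
} from "evalz";
import OpenAI from "openai";

const oai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Combine different evaluator types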
const compositeEval = createWeightedEvaluator({
  evaluators: {
    entities: createContextEvaluator({ type: "entities-recall" }),
    accuracy: createAccuracyEvaluator({
      weights: { 
        factual: 0.9,   // High weight on exact matches
        semantic: 0.1    // Low weight on similar terms
      }
    }),
    quality: createEvaluator({
      client: oai,
      model: "gpt-4-turbo",
      evaluationDescription: "Rate quality"
    })
  },
  weights: {
    entities: 0.3,
    accuracy: 0.4,
    quality: 0.3
  }
});

// Must provide all required fields for each evaluator type
await compositeEval({
  data: [{
    prompt: "Summarize the earnings call",
    completion: "CEO Jane Smith announced 15% growth",
    expectedCompletion: "The CEO reported strong growth",
    groundTruth: "CEO discussed Q3 performance",
    contexts: [
      "CEO Jane Smith presented Q3 results",
      "Company saw 15% growth in Q3 2023"
    ]
  }]
});
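
One way to picture what the top-level weights do is as a weighted average of the per-evaluator scores. The sketch below is not evalz's implementation: weightedScore and the sample sub-scores are hypothetical, and it assumes each evaluator yields a score between 0 and 1.

```typescript
type Scores = Record<string, number>;

// Combines per-evaluator scores (each assumed to be in [0, 1]) into one score.
function weightedScore(scores: Scores, weights: Scores): number {
  let total = 0;
  let weightSum = 0;
  for (const [name, weight] of Object.entries(weights)) {
    const score = scores[name];
    if (score === undefined) {
      throw new Error(`Missing score for evaluator "${name}"`);
    }
    total += score * weight;
    weightSum += weight;
  }
  if (weightSum === 0) {
    throw new Error("Weights must not sum to zero");
  }
  // Normalize so the result stays in [0, 1] even if weights do not sum to 1.
  return total / weightSum;
}

// Same weights as the composite evaluator above.
const weights = { entities: 0.3, accuracy: 0.4, quality: 0.3 };
// Hypothetical sub-scores, for illustration only.
const scores = { entities: 1.0, accuracy: 0.5, quality: 0.8 };

const combined = weightedScore(scores, weights);
console.log(combined.toFixed(2)); // 0.74
```

With these numbers the accuracy evaluator contributes the most (0.4 × 0.5 = 0.2 of a possible 0.4 is lost there), which is the kind of breakdown a weighted composite makes visible.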

5. llm-polyglot

A universal LLM client that extends the OpenAI SDK to provide consistent interfaces across different providers that may not follow the OpenAI API specification.

Native API Support Status:

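| Provider API | Status | Chat | Basic Stream | Functions/Tool calling | Function streaming | Notes |
|---|---|---|---|---|---|---|
| OpenAI | | | | | | Direct SDK proxy |
| Anthropic | | | | | | Claude models |
| Google | | | | | | Gemini models + context caching |
| Azure | 🚧 | | | | | OpenAI model hosting |
| Cohere | | - | - | - | - | Not supported |
| AI21 | | - | - | - | - | Not supported |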

Stream Types:

  • Basic Stream: Simple text streaming
  • Partial JSON Stream: Progressive JSON object construction during streaming
  • Function Stream: Streaming function/tool calls and their results

OpenAI-Compatible Hosting Providers:

These providers use the OpenAI SDK format, so they work directly with the OpenAI client configuration:

| Provider | How to Use | Available Models |
|---|---|---|
| Together | Use OpenAI client with Together base URL | Mixtral, Llama, OpenChat, Yi, others |
| Anyscale | Use OpenAI client with Anyscale base URL | Mistral, Llama, others |
| Perplexity | Use OpenAI client with Perplexity base URL | pplx-* models |
| Replicate | Use OpenAI client with Replicate base URL | Various open models |

Key Features:

  • OpenAI-compatible interface for non-OpenAI providers
  • Support for major providers:
    • OpenAI (direct SDK proxy)
    • Anthropic (Claude models)
    • Google (Gemini models)
    • Together
    • Microsoft/Azure
    • Anyscale
  • Streaming support across providers
  • Function/tool calling compatibility
  • Context caching for Gemini
  • Structured output support

Basic OpenAI-Style Usage

import { createLLMClient } from "llm-polyglot";

// Create provider-specific client
const anthropicClient = createLLMClient({
  provider: "anthropic"
});

// Use consistent OpenAI-style API
const completion = await anthropicClient.chat.completions.create({
  model: "claude-3-opus-20240229",
  max_tokens: 1000,
  messages: [{ role: "user", content: "Extract data..." }]
});

Streaming with Different Providers

// Anthropic Streaming
const stream = await anthropicClient.chat.completions.create({
  model: "claude-3-opus-20240229",
  max_tokens: 1000,
  stream: true,
  messages: [{ role: "user", content: "Stream some content..." }]
});

let content = "";
for await (const chunk of stream) {
  content += chunk.choices?.[0]?.delta?.content ?? "";
}

// Google/Gemini with Context Caching
const googleClient = createLLMClient({
  provider: "google"
});

// Create a context cache
const cache = await googleClient.cacheManager.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ 
    role: "user", 
    content: "What is the capital of Montana?" 
  }],
  ttlSeconds: 3600, // Cache for 1 hour
  max_tokens: 1000
});

// Use cached context in new completion
const completion = await googleClient.chat.completions.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ 
    role: "user", 
    content: "What state is it in?" 
  }],
  additionalProperties: {
    cacheName: cache.name
  },
  max_tokens: 1000
});
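
The accumulation loop in the Anthropic streaming example can be wrapped in a small reusable helper. This is a minimal sketch that assumes the stream yields OpenAI-style chunks with choices[0].delta.content. ChatChunk, collectContent, and fakeStream are hypothetical names, and fakeStream only stands in for a real provider stream.

```typescript
// Minimal shape of an OpenAI-style streaming chunk; only the fields read here.
interface ChatChunk {
  choices?: Array<{ delta?: { content?: string | null } }>;
}

// Concatenates the text deltas of a chunk stream and reports progress per chunk.
async function collectContent(
  chunks: AsyncIterable<ChatChunk>,
  onUpdate?: (textSoFar: string) => void
): Promise<string> {
  let content = "";
  for await (const chunk of chunks) {
    // Chunks without text (role or tool-call deltas) contribute nothing.
    content += chunk.choices?.[0]?.delta?.content ?? "";
    onUpdate?.(content);
  }
  return content;
}

// Hypothetical stand-in for a provider stream, used only to exercise the helper.
async function* fakeStream(): AsyncGenerator<ChatChunk> {
  yield { choices: [{ delta: { content: "Hel" } }] };
  yield { choices: [{ delta: {} }] };
  yield { choices: [{ delta: { content: "lo" } }] };
}

collectContent(fakeStream(), (text) => console.log("So far:", text)).then(
  (finalText) => console.log("Final:", finalText)
);
```

Because the helper only depends on the chunk shape, the same function should work for any provider client that returns OpenAI-compatible chunks.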

Function/Tool Calling

const completion = await anthropicClient.chat.completions.create({
  model: "claude-3-opus-20240229",
  max_tokens: 1000,
  messages: [{ 
    role: "user", 
    content: "Extract user information..." 
  }],
  tool_choice: {
    type: "function",
    function: { name: "extract_user" }
  },
  tools: [{
    type: "function",
    function: {
      name: "extract_user",
      description: "Extract user information",
      parameters: {
        type: "object",
        properties: {
          name: { type: "string" },
          age: { type: "number" }
        },
        required: ["name", "age"]
      }
    }
  }]
});
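
With a forced tool call, the structured result is carried in the tool call's arguments, which OpenAI-style responses deliver as a JSON string. The sketch below assumes the response follows the OpenAI chat completion shape (choices[0].message.tool_calls). ToolCallCompletion, ExtractedUser, readExtractedUser, and the sample response are hypothetical and exist only for illustration.

```typescript
interface ExtractedUser {
  name: string;
  age: number;
}

// Minimal shape of an OpenAI-style chat completion; only the fields read here.
interface ToolCallCompletion {
  choices: Array<{
    message: {
      tool_calls?: Array<{
        type: "function";
        function: { name: string; arguments: string };
      }>;
    };
  }>;
}

// Finds the extract_user tool call and checks its arguments before returning them.
function readExtractedUser(completion: ToolCallCompletion): ExtractedUser {
  const call = completion.choices[0]?.message.tool_calls?.find(
    (c) => c.function.name === "extract_user"
  );
  if (!call) {
    throw new Error("No extract_user tool call in the response");
  }

  // Arguments arrive as a JSON string and may not match the declared parameters.
  const args: unknown = JSON.parse(call.function.arguments);
  if (
    typeof args !== "object" ||
    args === null ||
    typeof (args as Record<string, unknown>).name !== "string" ||
    typeof (args as Record<string, unknown>).age !== "number"
  ) {
    throw new Error("Tool call arguments do not match the expected shape");
  }
  const { name, age } = args as ExtractedUser;
  return { name, age };
}

// Hypothetical response, for illustration only.
const sample: ToolCallCompletion = {
  choices: [{
    message: {
      tool_calls: [{
        type: "function",
        function: { name: "extract_user", arguments: '{"name":"Jane","age":34}' }
      }]
    }
  }]
};

console.log(readExtractedUser(sample));
```

In a project that already uses Zod, the manual type checks could be replaced by parsing the arguments with a schema.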

// Using with OpenAI-compatible providers:

const client = createLLMClient({
  provider: "openai",
  apiKey: "your_api_key",
  baseURL: "https://api.together.xyz/v1",  // or other provider URLs
});

Provider-Specific Features

  1. Anthropic (Claude)

    • Full function/tool calling support
    • Message streaming
    • OpenAI-compatible responses
  2. Google (Gemini)

    • Context caching for token optimization
    • Streaming support
    • Function calling
    • Optional OpenAI compatibility mode
    • Grounding (i.e. Google Search) support
  3. OpenAI

    • Direct SDK proxy
    • All native OpenAI features supported

zod-stream/schema-stream vs Instructor

The core Island AI packages (schema-stream, zod-stream, stream-hooks) provide low-level utilities for building custom LLM clients and data handling pipelines. For a complete, ready-to-use solution, check out Instructor, which composes some of these tools into a full-featured client.

When to use core packages:

  • You need direct access to the HTTP stream for custom transport (e.g., not using SSE/WebSockets)
  • You want to build a custom LLM client
  • You need fine-grained control over streaming and parsing
  • You're implementing server-side streaming with client-side parsing
  • You need a structured evaluation tool
  • You want to use different LLM providers that don't support the OpenAI SDK format

When to use Instructor:

  • You want a complete solution for structured extraction
  • You're using WebSocket-based streaming from server to client
  • Your requests are made only on the server
  • You need the full async generator pattern for progressive object updates
  • You want OpenAI SDK compatibility out of the box

Transport Patterns

Direct HTTP Streaming

For cases where you need direct control over the HTTP stream, you can use the core packages to build your own streaming endpoints:

import { OAIStream, withResponseModel } from "zod-stream";
import { SchemaStream } from "schema-stream"; // used by the client-side consumer below
import OpenAI from "openai";
import { z } from "zod";

const oai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  organization: process.env.OPENAI_ORG_ID
});

// Define your schema
const schema = z.object({
  content: z.string(),
  users: z.array(z.object({
    name: z.string(),
  })),
});

// API Route Example (Next.js)
export async function POST(request: Request) {
  const { messages } = await request.json();

  // Configure OpenAI parameters with schema
  const params = withResponseModel({
    response_model: { 
      schema: schema, 
      name: "Users extraction and message" 
    },
    params: {
      messages,
      model: "gpt-4",
    },
    mode: "TOOLS",
  });

  // Create streaming completion
  const extractionStream = await oai.chat.completions.create({
    ...params,
    stream: true,
  });

  // Return streaming response
  return new Response(
    OAIStream({ res: extractionStream })
  );
}

// Client-side consumption
async function consumeStream() {
  const response = await fetch('/api/extract', {
    method: 'POST',
    body: JSON.stringify({
      messages: [{ role: 'user', content: 'Your prompt here' }]
    })
  });

  const parser = new SchemaStream(schema);
  const stream = parser.parse();

  response.body
    ?.pipeThrough(stream)
    .pipeTo(new WritableStream({
      write(chunk) {
        const data = JSON.parse(new TextDecoder().decode(chunk));
        // Use partial data as it arrives
        console.log('Partial data:', data);
      }
    }));
}
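
The client-side consumer above can also be written as an async generator, so partial data is handled in a for await loop instead of a WritableStream callback. Like the example above, this sketch assumes every chunk leaving the parser is one complete JSON snapshot. readSnapshots, Snapshot, and fakeParsedStream are hypothetical names, and fakeParsedStream only simulates the parser's output.

```typescript
// Yields each chunk of a byte stream as a parsed JSON snapshot.
async function* readSnapshots<T>(
  readable: ReadableStream<Uint8Array>
): AsyncGenerator<T> {
  const reader = readable.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const result = await reader.read();
      if (result.done) return;
      yield JSON.parse(decoder.decode(result.value)) as T;
    }
  } finally {
    // Release the lock even if the consumer stops iterating early.
    reader.releaseLock();
  }
}

interface Snapshot {
  content: string;
  users: Array<{ name: string }>;
}

// Hypothetical stand-in for the parser output: one encoded snapshot per chunk.
function fakeParsedStream(snapshots: Snapshot[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    start(controller) {
      for (const snapshot of snapshots) {
        controller.enqueue(encoder.encode(JSON.stringify(snapshot)));
      }
      controller.close();
    },
  });
}

async function demo(): Promise<Snapshot[]> {
  const seen: Snapshot[] = [];
  const source = fakeParsedStream([
    { content: "Hel", users: [] },
    { content: "Hello", users: [{ name: "Jane" }] },
  ]);
  for await (const snapshot of readSnapshots<Snapshot>(source)) {
    seen.push(snapshot);
  }
  return seen;
}

demo().then((seen) => console.log("Snapshots received:", seen.length));
```

In the real consumer, the argument to readSnapshots would be the readable side returned by piping response.body through the parser.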

Using Instructor

Instructor provides a high-level client that composes Island AI's core packages into a complete solution for structured extraction. It extends the OpenAI client with streaming and schema validation capabilities.

import Instructor from "@instructor-ai/instructor";
import OpenAI from "openai";
import { z } from "zod";

// Define your extraction schema
const ExtractionSchema = z.object({
  users: z.array(
    z.object({
      name: z.string(),
      handle: z.string(),
      twitter: z.string()
    })
  ).min(3),
  location: z.string(),
  budget: z.number()
});

type Extraction = Partial<z.infer<typeof ExtractionSchema>>;

// Initialize OpenAI client with Instructor
const oai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  organization: process.env.OPENAI_ORG_ID
});

const client = Instructor({
  client: oai,
  mode: "TOOLS"
});

// Stream completions with structured output
const extractionStream = await client.chat.completions.create({
  messages: [{ 
    role: "user", 
    content: "Your text content here..." 
  }],
  model: "gpt-4",
  response_model: { 
    schema: ExtractionSchema, 
    name: "Extract" 
  },
  max_retries: 3,
  stream: true,
  stream_options: {
    include_usage: true
  }
});

// Process streaming results
let extraction: Extraction = {};
for await (const result of extractionStream) {
  extraction = result;
  console.log('Progressive update:', result);
}

console.log('Final extraction:', extraction);
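
The max_retries option above implies a validate-and-retry loop. The following is a generic sketch of that pattern, not Instructor's implementation: withRetries, isExtraction, and fakeModelCall are hypothetical, and it assumes the retry count is applied after one initial attempt, which may differ from Instructor's semantics.

```typescript
type Validator<T> = (value: unknown) => value is T;

// Calls `attempt` until `isValid` accepts the result or the retry budget is spent.
async function withRetries<T>(
  attempt: (attemptNumber: number) => Promise<unknown>,
  isValid: Validator<T>,
  maxRetries: number
): Promise<T> {
  let lastError: unknown = new Error("No attempts were made");
  // One initial attempt plus up to `maxRetries` retries.
  for (let n = 0; n <= maxRetries; n++) {
    try {
      const result = await attempt(n);
      if (isValid(result)) return result;
      lastError = new Error(`Attempt ${n + 1} returned an invalid result`);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

interface Extraction {
  location: string;
  budget: number;
}

const isExtraction = (value: unknown): value is Extraction =>
  typeof value === "object" &&
  value !== null &&
  typeof (value as Extraction).location === "string" &&
  typeof (value as Extraction).budget === "number";

let calls = 0;
// Hypothetical model call: the first two responses are missing `budget`.
const fakeModelCall = async (): Promise<unknown> => {
  calls += 1;
  return calls < 3
    ? { location: "Berlin" }
    : { location: "Berlin", budget: 1000 };
};

const extractionPromise = withRetries(fakeModelCall, isExtraction, 3);
extractionPromise.then((extraction) =>
  console.log(extraction, `after ${calls} calls`)
);
```

A production version would typically feed the validation error back into the next prompt so the model can correct itself, which this sketch leaves out.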

Key Differences

  1. Instructor

    • Provides a complete solution built on top of the OpenAI SDK
    • Handles retries, validation, and streaming automatically
    • Returns an async generator for progressive updates
    • Ideal for WebSocket-based streaming from server to client
    • Simpler integration when you don't need low-level control
  2. Direct HTTP Streaming

    • Gives you direct access to the HTTP stream
    • Allows custom transport mechanisms
    • Enables server-side streaming with client-side parsing
    • More flexible for custom implementations
    • Better for scenarios where you need to minimize server processing

Contributing

We welcome contributions! Check out our issues labeled as good-first-issue or help-wanted.

Documentation

For detailed documentation, visit https://island.hack.dance

License

MIT License - see LICENSE file for details.
