lmstudio.js
LM Studio TypeScript SDK (pre-release public alpha)
Stars: 408
lmstudio.js is a pre-release alpha client SDK for LM Studio that lets you use local LLMs from JavaScript, TypeScript, and Node.js. It is under rapid development, so expect breaking changes; follow LM Studio's announcements on Twitter and Discord. The SDK covers loading models, predicting text, setting up the local LLM server, and more, including custom loading-progress tracking, model unloading, structured output prediction, and cancellation of predictions. A companion CLI tool, lms, manages the local server, and the SDK supports text completion, multi-turn conversation, and prediction statistics.
README:
Use local LLMs in JS/TS/Node
LM Studio Client SDK - Pre-Release
lmstudio.js is in pre-release alpha, and is undergoing rapid and continuous development. Expect breaking changes!
Follow along for our upcoming announcements about lmstudio.js on Twitter and Discord. Read the Docs.
Discuss all things lmstudio.js in #dev-chat in LM Studio's Community Discord server.
npm install @lmstudio/sdk
npx lmstudio install-cli # open a new terminal window after installation...
lms create
import { LMStudioClient } from "@lmstudio/sdk";

const client = new LMStudioClient();

async function main() {
  const modelPath = "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF";
  const llama3 = await client.llm.load(modelPath, { config: { gpuOffload: "max" } });
  const prediction = llama3.respond([
    { role: "system", content: "Always answer in rhymes." },
    { role: "user", content: "Please introduce yourself." },
  ]);
  for await (const text of prediction) {
    process.stdout.write(text);
  }
  const { stats } = await prediction;
  console.log(stats);
}

main();
lms is the CLI tool for LM Studio. It is shipped with the latest versions of LM Studio. To set it up, run:
npx lmstudio install-cli
To check if the bootstrapping was successful, run the following in a 👉 new terminal window 👈:
lms
[!NOTE]
lms is only shipped with the latest versions of LM Studio (v0.2.22 and onwards). Please make sure you have the latest version installed.
Start the server by running:
lms server start
If you are developing a web application and/or need to enable CORS (Cross Origin Resource Sharing), run this instead:
lms server start --cors=true
To start the server on a different port, pass the --port flag:
lms server start --port 12345
This example loads the model "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF" and predicts text with it.
import { LMStudioClient } from "@lmstudio/sdk";
const client = new LMStudioClient();
// Load a model
const llama3 = await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF");
// Create a text completion prediction
const prediction = llama3.complete("The meaning of life is");
// Stream the response
for await (const text of prediction) {
  process.stdout.write(text);
}
[!NOTE]
About process.stdout.write
process.stdout.write is a Node.js-specific function that prints text without appending a newline. In the browser, you might want to do something like:
// Get the element where you want to display the output
const outputElement = document.getElementById("output");
for await (const text of prediction) {
  outputElement.textContent += text;
}
This example shows how to connect to LM Studio running on a different port (e.g., 8080).
import { LMStudioClient } from "@lmstudio/sdk";
const client = new LMStudioClient({
  baseUrl: "ws://127.0.0.1:8080",
});
// client.llm.load(...);
By default, when your client disconnects from LM Studio, all models loaded by that client are unloaded. You can prevent this by setting the noHup option to true.
await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF", {
  noHup: true,
});
// The model stays loaded even after the client disconnects
You can set an identifier for a model when loading it. This identifier can be used to refer to the model later.
await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF", {
  identifier: "my-model",
});
// You can refer to the model later using the identifier
const myModel = await client.llm.get("my-model");
// myModel.complete(...);
By default, the load configuration for a model comes from the preset associated with the model (this can be changed on the "My Models" page in LM Studio). You can override individual load settings via the config option:
const llama3 = await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF", {
  config: {
    contextLength: 1024,
    gpuOffload: 0.5, // Offloads 50% of the computation to the GPU
  },
});
// llama3.complete(...);
The preset determines the default load configuration and the default inference configuration for a model. By default, the preset associated with the model is used (this can be changed on the "My Models" page in LM Studio). You can change the preset used by specifying the preset option.
const llama3 = await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF", {
  preset: "My ChatML",
});
You can track the loading progress of a model by providing an onProgress callback.
const llama3 = await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF", {
  verbose: false, // Disables the default progress logging
  onProgress: progress => {
    console.log(`Progress: ${(progress * 100).toFixed(1)}%`);
  },
});
If you wish to find all models that are available to be loaded, you can use the listDownloadedModels method on the system object.
const downloadedModels = await client.system.listDownloadedModels();
const downloadedLLMs = downloadedModels.filter(model => model.type === "llm");
// Load the first model
const model = await client.llm.load(downloadedLLMs[0].path);
// model.complete(...);
You can cancel a load by using an AbortController.
const controller = new AbortController();

try {
  const llama3 = await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF", {
    signal: controller.signal,
  });
  // llama3.complete(...);
} catch (error) {
  console.error(error);
}
// Somewhere else in your code:
controller.abort();
[!NOTE]
About AbortController
AbortController is a standard JavaScript API that allows you to cancel asynchronous operations. It is supported in modern browsers and Node.js. For more information, see the MDN Web Docs.
You can unload a model by calling the unload method.
const llama3 = await client.llm.load("lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF", {
  identifier: "my-model",
});
// ...Do stuff...
await client.llm.unload("my-model");
Note: by default, all models loaded by a client are unloaded when the client disconnects. Therefore, unless you want to precisely control the lifetime of a model, you do not need to unload models manually.
[!NOTE]
Keeping a Model Loaded After Disconnection
If you wish to keep a model loaded after disconnection, you can set the noHup option to true when loading the model.
To look up an already loaded model by its identifier, use the following:
const myModel = await client.llm.get({ identifier: "my-model" });
// Or just
const myModel = await client.llm.get("my-model");
// myModel.complete(...);
To look up an already loaded model by its path, use the following:
// Matches any quantization
const llama3 = await client.llm.get({ path: "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF" });
// Or if a specific quantization is desired:
const llama3 = await client.llm.get({
  path: "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf",
});
// llama3.complete(...);
If you do not have a specific model in mind and just want to use any loaded model, you can simply pass in an empty object to client.llm.get.
const anyModel = await client.llm.get({});
// anyModel.complete(...);
To list all loaded models, use the client.llm.listLoaded method.
const loadedModels = await client.llm.listLoaded();
if (loadedModels.length === 0) {
  throw new Error("No models loaded");
}
// Use the first one
const firstModel = await client.llm.get({ identifier: loadedModels[0].identifier });
// firstModel.complete(...);
Example loadedModels Response:
[
  {
    "identifier": "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF",
    "path": "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF"
  },
  {
    "identifier": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
    "path": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf"
  }
]
To perform text completion, use the complete method:
const prediction = model.complete("The meaning of life is");
for await (const text of prediction) {
  process.stdout.write(text);
}
By default, the inference parameters from the preset are used for the prediction. You can override them like this:
const prediction = anyModel.complete("Meaning of life is", {
  contextOverflowPolicy: "stopAtLimit",
  maxPredictedTokens: 100,
  prePrompt: "Some pre-prompt",
  stopStrings: ["\n"],
  temperature: 0.7,
});
// ...Do stuff with the prediction...
To perform a conversation, use the respond method:
const prediction = anyModel.respond([
  { role: "system", content: "Answer the following questions." },
  { role: "user", content: "What is the meaning of life?" },
]);
for await (const text of prediction) {
  process.stdout.write(text);
}
Similarly, you can override the inference parameters for the conversation (Note the available options are different from text completion):
const prediction = anyModel.respond(
  [
    { role: "system", content: "Answer the following questions." },
    { role: "user", content: "What is the meaning of life?" },
  ],
  {
    contextOverflowPolicy: "stopAtLimit",
    maxPredictedTokens: 100,
    stopStrings: ["\n"],
    temperature: 0.7,
    inputPrefix: "Q: ",
    inputSuffix: "\nA:",
  },
);
// ...Do stuff with the prediction...
[!IMPORTANT]
Always Provide the Full History/Context
LLMs are stateless. They do not remember or retain information from previous inputs. Therefore, when predicting with an LLM, you should always provide the full history/context.
If you wish to get the prediction statistics, you can await on the prediction object to get a PredictionResult, through which you can access the stats via the stats property.
const prediction = model.complete("The meaning of life is");
for await (const text of prediction) {
  process.stdout.write(text);
}
const { stats } = await prediction;
console.log(stats);
[!NOTE]
No Extra Waiting
When you have already consumed the prediction stream, awaiting on the prediction object will not cause any extra waiting, as the result is cached within the prediction object.
On the other hand, if you only care about the final result, you don't need to iterate through the stream. Instead, you can await on the prediction object directly to get the final result.
const prediction = model.complete("The meaning of life is");
const result = await prediction;
const content = result.content;
const stats = result.stats;

// Or just:
const { content, stats } = await model.complete("The meaning of life is");
console.log(stats);
Example output for stats:
{
  "stopReason": "eosFound",
  "tokensPerSecond": 26.644333102146646,
  "numGpuLayers": 33,
  "timeToFirstTokenSec": 0.146,
  "promptTokensCount": 5,
  "predictedTokensCount": 694,
  "totalTokensCount": 699
}
LM Studio supports structured prediction, which forces the model to produce content that conforms to a specific structure. To enable structured prediction, set the structured field. It is available for both the complete and respond methods.
Here is an example of how to use structured prediction:
const prediction = model.complete("Here is a joke in JSON:", {
  maxPredictedTokens: 100,
  structured: { type: "json" },
});
const result = await prediction;
try {
  // Although the LLM is guaranteed to only produce valid JSON, when it is interrupted, the
  // partial result might not be. Always check for errors. (See caveats below)
  const parsed = JSON.parse(result.content);
  console.info(parsed);
} catch (e) {
  console.error(e);
}
Example output:
{
  "title": "The Shawshank Redemption",
  "genre": ["drama", "thriller"],
  "release_year": 1994,
  "cast": [
    { "name": "Tim Robbins", "role": "Andy Dufresne" },
    { "name": "Morgan Freeman", "role": "Ellis Boyd" }
  ]
}
Sometimes, any JSON is not enough. You might want to enforce a specific JSON schema. You can do this by providing a JSON schema to the structured field. Read more about JSON schema at json-schema.org.
const bookSchema = {
  type: "object",
  properties: {
    bookTitle: { type: "string" },
    author: { type: "string" },
    genre: { type: "string" },
    pageCount: { type: "number" },
  },
  required: ["bookTitle", "author", "genre"],
};

const prediction = model.complete("Books that were turned into movies:", {
  maxPredictedTokens: 100,
  structured: { type: "json", jsonSchema: bookSchema },
});
const result = await prediction;
try {
  const parsed = JSON.parse(result.content);
  console.info(parsed); // see example response below
  console.info("The bookTitle is", parsed.bookTitle);
  console.info("The author is", parsed.author);
  console.info("The genre is", parsed.genre);
  console.info("The pageCount is", parsed.pageCount);
} catch (e) {
  console.error(e);
}
Example response for parsed:
{ "author": "J.K. Rowling", "bookTitle": "Harry Potter and the Philosopher's Stone", "genre": "Fantasy", "pageCount": 320 }
[!IMPORTANT]
Caveats with Structured Prediction
- Although the model is forced to generate predictions that conform to the specified structure, the prediction may be interrupted (for example, if the user stops the prediction). When that happens, the partial result may not conform to the specified structure. Thus, always check the prediction result before using it, for example by wrapping the JSON.parse call inside a try-catch block.
- In certain cases, the model may get stuck. For example, when forced to generate valid JSON, it may generate an opening brace { but never generate a closing brace }. In such cases, the prediction will go on forever until the context length is reached, which can take a long time. Therefore, it is recommended to always set a maxPredictedTokens limit. This also contributes to the point above.
A prediction may be canceled by calling the cancel method on the prediction object.
const prediction = model.complete("The meaning of life is");
// ...Do stuff...
prediction.cancel();
When a prediction is canceled, it will stop normally but with stopReason set to "userStopped". You can detect cancellation like so:
for await (const text of prediction) {
  process.stdout.write(text);
}
const { stats } = await prediction;
if (stats.stopReason === "userStopped") {
  console.log("Prediction was canceled by the user");
}