deepgram-js-sdk

Official JavaScript SDK for Deepgram's automated speech recognition APIs.

Deepgram JavaScript SDK

Official JavaScript SDK for Deepgram. Power your apps with world-class speech and Language AI models.

Migrating from earlier versions
- V2 to V3
- V3.* to V3.4
Installation
- UMD
- ESM
Initialization
- Getting an API Key
Scoped Configuration
- 1. Global Defaults
- 2. Namespace-specific Configurations
- 3. Transport Options
- 4. Examples
Transcription (Synchronous)
- Remote Files
- Local Files
Transcription (Asynchronous / Callbacks)
- Remote Files
- Local Files
Transcription (Live / Streaming)
- Live Audio
Transcribing to captions
Text to Speech
Text Intelligence
Projects
- Get Projects
- Get Project
- Update Project
- Delete Project
Keys
- List Keys
- Get Key
- Create Key
- Delete Key
Members
- Get Members
- Remove Member
Scopes
- Get Member Scopes
- Update Scope
Invitations
- List Invites
- Send Invite
- Delete Invite
- Leave Project
Usage
- Get All Requests
- Get Request
- Summarize Usage
- Get Fields
Billing
- Get All Balances
- Get Balance
On-Prem APIs
- List On-Prem credentials
- Get On-Prem credentials
- Create On-Prem credentials
- Delete On-Prem credentials
Backwards Compatibility
Development and Contributing
- Debugging and making changes locally
Getting Help

Migrating from earlier versions

V2 to V3

We have published a migration guide on our docs, showing how to move from v2 to v3.

V3.* to V3.4

We recommend using only documented interfaces, as we strictly follow semantic versioning (semver) and breaking changes may occur for undocumented interfaces. To ensure compatibility, consider pinning your versions if you need to use undocumented interfaces.

Installation

You can install this SDK directly from npm.

npm install @deepgram/sdk
# - or -
# yarn add @deepgram/sdk

UMD

You can now use plain <script>s to import deepgram from CDNs, like:

<script src="https://cdn.jsdelivr.net/npm/@deepgram/sdk"></script>

or even:

<script src="https://unpkg.com/@deepgram/sdk"></script>

Then you can use it from a global deepgram variable:

<script>
  const { createClient } = deepgram;
  const _deepgram = createClient("deepgram-api-key");

  console.log("Deepgram Instance: ", _deepgram);
  // ...
</script>

ESM

You can now use type="module" <script>s to import deepgram from CDNs, like:

<script type="module">
  import { createClient } from "https://cdn.jsdelivr.net/npm/@deepgram/sdk/+esm";
  const deepgram = createClient("deepgram-api-key");

  console.log("Deepgram Instance: ", deepgram);
  // ...
</script>

Initialization

import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");

const deepgram = createClient(DEEPGRAM_API_KEY);
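
The snippet above leaves it open where DEEPGRAM_API_KEY comes from. One common approach is to read it from the environment and fail early if it is missing. The following is a minimal sketch: the helper name getApiKey and the use of an environment variable called DEEPGRAM_API_KEY are assumptions for illustration, not part of the SDK.

```javascript
// Hypothetical helper (not part of the SDK): read the API key from an
// environment-like object and fail early with a clear message if it is missing.
function getApiKey(env = process.env) {
  const key = (env.DEEPGRAM_API_KEY || "").trim();
  if (!key) {
    throw new Error("DEEPGRAM_API_KEY is not set");
  }
  return key;
}

// Usage (requires @deepgram/sdk to be installed):
// const { createClient } = require("@deepgram/sdk");
// const deepgram = createClient(getApiKey());
```

Keeping the lookup in one function means a missing key surfaces as a single clear error at startup rather than as a failed request later.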

Getting an API Key

🔑 To access the Deepgram API you will need a free Deepgram API Key.

Scoped Configuration

The SDK supports scoped configuration: when you initialize the client, you can configure each namespace of the SDK independently. The namespace configuration works as follows:

1. Global Defaults

- The global namespace serves as the foundational configuration applicable across all other namespaces unless overridden.
- It includes general settings, like URL and headers, that apply to all API calls.
- If no specific configurations are provided for other namespaces, the global defaults are used.

2. Namespace-specific Configurations

- Each namespace (listen, manage, onprem, read, speak) can have its own configuration, which overrides the global settings within that namespace's scope.
- This allows detailed control over the different parts of an application that interact with different Deepgram API endpoints.

3. Transport Options

- Configurations for both fetch and websocket can be specified under each namespace, allowing different transport mechanisms for different operations.
- For example, the fetch configuration can have its own URL and proxy settings distinct from the websocket configuration.
- The generic interfaces define a structure for transport options, which includes a client (like a fetch or WebSocket instance) and associated options (like headers, URL, and proxy settings).

With this system, the defaults provide a foundation, and namespace-specific settings let you control each part of the client's interaction with the API. Keeping each setting local to the namespace that uses it also makes the application's configuration easier to maintain.
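
The precedence rule described above (namespace settings override global defaults) can be illustrated with a small sketch. This is only a model of the rule, not the SDK's actual merge logic, which may differ (for example, in how deeply it merges nested objects). The function name resolveOptions is hypothetical.

```javascript
// Illustration only: shallow-merge per-transport options so that
// namespace-specific values win over global ones.
function resolveOptions(config, namespace, transport) {
  const globalOpts = config.global?.[transport]?.options ?? {};
  const scopedOpts = config[namespace]?.[transport]?.options ?? {};
  return { ...globalOpts, ...scopedOpts };
}

const config = {
  global: {
    fetch: {
      options: {
        url: "https://api.beta.deepgram.com",
        headers: { "x-custom-header": "foo" },
      },
    },
  },
  listen: { fetch: { options: { url: "http://localhost:8080" } } },
};

// "listen" overrides the URL but still inherits the global headers.
const listenFetch = resolveOptions(config, "listen", "fetch");
// "manage" has no overrides, so it falls back to the global defaults.
const manageFetch = resolveOptions(config, "manage", "fetch");
```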

4. Examples

Change the API url used for all SDK methods

Useful for targeting a different API environment (e.g., beta).

import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");

const deepgram = createClient(DEEPGRAM_API_KEY, {
  global: { fetch: { options: { url: "https://api.beta.deepgram.com" } } },
});

Change the API url used for transcription only

Useful for on-prem installations. Only affects requests to /listen endpoints.

import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");

const deepgram = createClient(DEEPGRAM_API_KEY, {
  listen: { fetch: { options: { url: "http://localhost:8080" } } },
});

Override fetch transmitter

Useful for providing a custom HTTP client.

import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");

const yourFetch = async () => {
  // Response is a constructor, so it must be called with `new`
  return new Response("...etc");
};

const deepgram = createClient(DEEPGRAM_API_KEY, {
  global: { fetch: { client: yourFetch } },
});

Proxy requests in the browser

This SDK now works in the browser. If you'd like to make REST-based requests (pre-recorded transcription, on-premise, and management requests), then you'll need to use a proxy as we do not support custom CORS origins on our API. To set up your proxy, you configure the SDK like so:

import { createClient } from "@deepgram/sdk";

const deepgram = createClient("proxy", {
  global: { fetch: { options: { proxy: { url: "http://localhost:8080" } } } },
});

Important: You must pass "proxy" as your API key, and use the proxy to set the Authorization header to your Deepgram API key.

Your proxy service should replace the Authorization header with Authorization: token <DEEPGRAM_API_KEY> and return results verbatim to the SDK.

Check out our example Node-based proxy here: Deepgram Node Proxy.
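
The header replacement described above can be sketched as a small proxy-side function. This is a hedged illustration, not the code of the Deepgram Node Proxy: the function name withDeepgramAuth is hypothetical, and a real proxy also has to forward the request body and response.

```javascript
// Hypothetical proxy-side helper: return a copy of the incoming headers with
// the Authorization header replaced by one carrying the real Deepgram API key.
function withDeepgramAuth(incomingHeaders, apiKey) {
  const headers = {};
  for (const [name, value] of Object.entries(incomingHeaders)) {
    // Drop any client-supplied Authorization header, whatever its casing.
    if (name.toLowerCase() !== "authorization") {
      headers[name] = value;
    }
  }
  headers["Authorization"] = `token ${apiKey}`;
  return headers;
}

// Usage inside a proxy request handler (sketch):
// const upstreamHeaders = withDeepgramAuth(req.headers, process.env.DEEPGRAM_API_KEY);
```

Because the key is added on the server, the browser only ever sees the placeholder "proxy" value.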

Set custom headers for fetch

Useful when every REST request needs to carry extra headers.

import { createClient } from "@deepgram/sdk";

const deepgram = createClient(DEEPGRAM_API_KEY, {
  global: { fetch: { options: { headers: { "x-custom-header": "foo" } } } },
});

Transcription (Synchronous)

Remote Files

const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
  {
    url: "https://dpgr.am/spacewalk.wav",
  },
  {
    model: "nova",
  }
);

See our API reference for more info.
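
Once the call returns, you will usually want the transcript text rather than the whole response. The sketch below assumes the pre-recorded response shape described in the API reference (results.channels[].alternatives[].transcript); the helper name getTranscript is hypothetical.

```javascript
// Hypothetical helper: pull the first channel's top alternative out of a
// pre-recorded transcription result, returning "" if anything is missing.
function getTranscript(result) {
  const alternative = result?.results?.channels?.[0]?.alternatives?.[0];
  return alternative?.transcript ?? "";
}

// Usage with the call shown above:
// if (error) throw error;
// console.log(getTranscript(result));
```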

Local Files

Using a read stream:

import fs from "fs";

const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
  fs.createReadStream("./examples/spacewalk.wav"),
  {
    model: "nova",
  }
);

Using a buffer:

import fs from "fs";

const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
  fs.readFileSync("./examples/spacewalk.wav"),
  {
    model: "nova",
  }
);

See our API reference for more info.

Transcription (Asynchronous / Callbacks)

Remote Files

import { CallbackUrl } from "@deepgram/sdk";

const { result, error } = await deepgram.listen.prerecorded.transcribeUrlCallback(
  {
    url: "https://dpgr.am/spacewalk.wav",
  },
  new CallbackUrl("http://callback/endpoint"),
  {
    model: "nova",
  }
);

See our API reference for more info.

Local Files

Using a read stream:

import fs from "fs";
import { CallbackUrl } from "@deepgram/sdk";

const { result, error } = await deepgram.listen.prerecorded.transcribeFileCallback(
  fs.createReadStream("./examples/spacewalk.wav"),
  new CallbackUrl("http://callback/endpoint"),
  {
    model: "nova",
  }
);

Using a buffer:

import fs from "fs";
import { CallbackUrl } from "@deepgram/sdk";

const { result, error } = await deepgram.listen.prerecorded.transcribeFileCallback(
  fs.readFileSync("./examples/spacewalk.wav"),
  new CallbackUrl("http://callback/endpoint"),
  {
    model: "nova",
  }
);

See our API reference for more info.
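
With callbacks, the transcript arrives later at your callback endpoint, so you need a way to match it to the original request. The sketch below assumes that the immediate response carries a request_id and that the callback payload repeats it under metadata.request_id; check both against the API reference. The class name PendingTranscriptions is hypothetical.

```javascript
// Hypothetical bookkeeping for callback-based transcription: remember each
// request when it is submitted and match it up when the callback arrives.
class PendingTranscriptions {
  constructor() {
    this.pending = new Map();
  }

  // Call after submitting, e.g. track(result.request_id, { file: "spacewalk.wav" })
  track(requestId, info) {
    this.pending.set(requestId, info);
  }

  // Call from your callback endpoint with the parsed JSON body.
  // Returns null for unknown or already-handled request IDs.
  resolve(payload) {
    const requestId = payload?.metadata?.request_id;
    if (!this.pending.has(requestId)) {
      return null;
    }
    const info = this.pending.get(requestId);
    this.pending.delete(requestId);
    const transcript =
      payload?.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
    return { requestId, info, transcript };
  }
}
```

An in-memory Map is enough for a single process; a deployment with several instances would need shared storage instead.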

Transcription (Live / Streaming)

Live Audio

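import { LiveTranscriptionEvents } from "@deepgram/sdk";

const dgConnection = deepgram.listen.live({ model: "nova" });

// Wait for the connection to open before wiring up transcripts and audio
dgConnection.on(LiveTranscriptionEvents.Open, () => {
  // Log each transcript message as it arrives
  dgConnection.on(LiveTranscriptionEvents.Transcript, (data) => {
    console.log(data);
  });

  // `source` is a placeholder for your own audio source
  source.addListener("got-some-audio", async (event) => {
    dgConnection.send(event.raw_audio_data);
  });
});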

To see an example, check out our Node.js example or our Browser example.

See our API reference for more info.
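
The example above logs every transcript message. In practice you often want only finalized text. The sketch below assumes the live message shape from the API reference (is_final and channel.alternatives[].transcript); the helper name createTranscriptCollector is hypothetical.

```javascript
// Hypothetical helper: collect only finalized, non-empty transcript segments
// from live transcription messages.
function createTranscriptCollector() {
  const finals = [];
  return {
    // Pass this as the Transcript event handler.
    handle(data) {
      const text = data?.channel?.alternatives?.[0]?.transcript ?? "";
      if (data?.is_final && text) {
        finals.push(text);
      }
    },
    // Join everything collected so far into one string.
    text() {
      return finals.join(" ");
    },
  };
}

// Usage with the connection shown above:
// const collector = createTranscriptCollector();
// dgConnection.on(LiveTranscriptionEvents.Transcript, collector.handle);
```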

Transcribing to captions

import { webvtt /* , srt */ } from "@deepgram/captions";

const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
  {
    url: "https://dpgr.am/spacewalk.wav",
  },
  {
    model: "nova",
  }
);

const vttOutput = webvtt(result);
// const srtOutput = srt(result);

See our standalone captions library for more information.

Text to Speech

REST

const { result } = await deepgram.speak.request({ text }, { model: "aura-asteria-en" });

WebSocket

import { LiveTTSEvents } from "@deepgram/sdk";

const dgConnection = deepgram.speak.live({ model: "aura-asteria-en" });

dgConnection.on(LiveTTSEvents.Open, () => {
  console.log("Connection opened");

  // Send text data for TTS synthesis
  dgConnection.sendText(text);

  // Send Flush message to the server after sending the text
  dgConnection.flush();

  dgConnection.on(LiveTTSEvents.Close, () => {
    console.log("Connection closed");
  });
});

See our API reference for more info.

Text Intelligence

const text = `The history of the phrase 'The quick brown fox jumps over the
lazy dog'. The earliest known appearance of the phrase was in The Boston
Journal. In an article titled "Current Notes" in the February 9, 1885, edition,
the phrase is mentioned as a good practice sentence for writing students: "A
favorite copy set by writing teachers for their pupils is the following,
because it contains every letter of the alphabet: 'A quick brown fox jumps over
the lazy dog.'" Dozens of other newspapers published the phrase over the
next few months, all using the version of the sentence starting with "A" rather
than "The". The earliest known use of the phrase starting with "The" is from
the 1888 book Illustrative Shorthand by Linda Bronson.[3] The modern form
(starting with "The") became more common even though it is slightly longer than
the original (starting with "A").`;

const { result, error } = await deepgram.read.analyzeText(
  { text },
  { language: "en", topics: true, sentiment: true }
);

See our API reference for more info.
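
To show what can be done with the result, the sketch below reduces it to an overall sentiment and a list of distinct topics. The field names used here (results.sentiments.average.sentiment and results.topics.segments[].topics[].topic) are assumptions about the response shape; verify them against the API reference before relying on them. The helper name summarizeAnalysis is hypothetical.

```javascript
// Hypothetical helper: reduce a text-analysis result to its average sentiment
// and a de-duplicated list of topics. Field names are assumptions (see above).
function summarizeAnalysis(result) {
  const results = result?.results ?? {};
  const sentiment = results.sentiments?.average?.sentiment ?? null;
  const topics = new Set();
  for (const segment of results.topics?.segments ?? []) {
    for (const entry of segment.topics ?? []) {
      topics.add(entry.topic);
    }
  }
  return { sentiment, topics: [...topics] };
}

// Usage with the call shown above:
// if (error) throw error;
// console.log(summarizeAnalysis(result));
```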

Projects

Get Projects

Returns all projects accessible by the API key.

const { result, error } = await deepgram.manage.getProjects();

See our API reference for more info.
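
Every call in this README returns a { result, error } pair. If you prefer exceptions, a small wrapper can convert one style into the other. This is a minimal sketch; the helper name unwrap is hypothetical and not part of the SDK.

```javascript
// Hypothetical helper: turn a { result, error } pair into a return value
// or a thrown Error.
function unwrap({ result, error }) {
  if (error) {
    throw error instanceof Error ? error : new Error(String(error));
  }
  return result;
}

// Usage with any call in this README, for example:
// const projects = unwrap(await deepgram.manage.getProjects());
```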

Get Project

Retrieves a specific project based on the provided project_id.

const { result, error } = await deepgram.manage.getProject(projectId);

See our API reference for more info.

Update Project

Update a project.

const { result, error } = await deepgram.manage.updateProject(projectId, options);

See our API reference for more info.

Delete Project

Delete a project.

const { error } = await deepgram.manage.deleteProject(projectId);

See our API reference for more info.

Keys

List Keys

Retrieves all keys associated with the provided project_id.

const { result, error } = await deepgram.manage.getProjectKeys(projectId);

See our API reference for more info.

Get Key

Retrieves a specific key associated with the provided project_id.

const { result, error } = await deepgram.manage.getProjectKey(projectId, projectKeyId);

See our API reference for more info.

Create Key

Creates an API key with the provided scopes.

const { result, error } = await deepgram.manage.createProjectKey(projectId, options);

See our API reference for more info.
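
The options argument is not spelled out above. The sketch below builds one under the assumption that the Create Key endpoint accepts comment, scopes, and an optional time_to_live_in_seconds; confirm the field names and the available scopes in the API reference. The helper name buildKeyOptions and the scope value in the usage comment are illustrative only.

```javascript
// Hypothetical helper: validate and assemble options for createProjectKey.
// Field names are assumptions based on the Create Key API reference.
function buildKeyOptions({ comment, scopes, ttlSeconds }) {
  if (!comment) {
    throw new Error("comment is required");
  }
  if (!Array.isArray(scopes) || scopes.length === 0) {
    throw new Error("at least one scope is required");
  }
  const options = { comment, scopes };
  if (ttlSeconds !== undefined) {
    options.time_to_live_in_seconds = ttlSeconds;
  }
  return options;
}

// Usage (scope value is an example, not a recommendation):
// const options = buildKeyOptions({ comment: "CI key", scopes: ["usage:write"], ttlSeconds: 3600 });
// const { result, error } = await deepgram.manage.createProjectKey(projectId, options);
```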

Delete Key

Deletes a specific key associated with the provided project_id.

const { error } = await deepgram.manage.deleteProjectKey(projectId, projectKeyId);

See our API reference for more info.

Members

Get Members

Retrieves account objects for all of the accounts in the specified project_id.

const { result, error } = await deepgram.manage.getProjectMembers(projectId);

See our API reference for more info.

Remove Member

Removes the member account for the specified member_id.

const { error } = await deepgram.manage.removeProjectMember(projectId, projectMemberId);

See our API reference for more info.

Scopes

Get Member Scopes

Retrieves scopes of the specified member in the specified project.

const { result, error } = await deepgram.manage.getProjectMemberScopes(projectId, projectMemberId);

See our API reference for more info.

Update Scope

Updates the scope for the specified member in the specified project.

const { result, error } = await deepgram.manage.updateProjectMemberScope(
  projectId,
  projectMemberId,
  options
);

See our API reference for more info.

Invitations

List Invites

Retrieves all invitations associated with the provided project_id.

const { result, error } = await deepgram.manage.getProjectInvites(projectId);

See our API reference for more info.

Send Invite

Sends an invitation to the provided email address.

const { result, error } = await deepgram.manage.sendProjectInvite(projectId, options);

See our API reference for more info.

Delete Invite

Removes the specified invitation from the project.

const { error } = await deepgram.manage.deleteProjectInvite(projectId, email);

See our API reference for more info.

Leave Project

Removes the authenticated user from the project.

const { result, error } = await deepgram.manage.leaveProject(projectId);

See our API reference for more info.

Usage

Get All Requests

Retrieves all requests associated with the provided project_id based on the provided options.

const { result, error } = await deepgram.manage.getProjectUsageRequests(projectId, options);

See our API reference for more info.

Get Request

Retrieves a specific request associated with the provided project_id.

const { result, error } = await deepgram.manage.getProjectUsageRequest(projectId, requestId);

See our API reference for more info.

Summarize Usage

Retrieves usage associated with the provided project_id based on the provided options.

const { result, error } = await deepgram.manage.getProjectUsageSummary(projectId, options);

See our API reference for more info.

Get Fields

Lists the features, models, tags, languages, and processing method used for requests in the specified project.

const { result, error } = await deepgram.manage.getProjectUsageFields(projectId, options);

See our API reference for more info.

Billing

Get All Balances

Retrieves the list of balance info for the specified project.

const { result, error } = await deepgram.manage.getProjectBalances(projectId);

See our API reference for more info.

Get Balance

Retrieves the balance info for the specified project and balance_id.

const { result, error } = await deepgram.manage.getProjectBalance(projectId, balanceId);

See our API reference for more info.

On-Prem APIs

List On-Prem credentials

const { result, error } = await deepgram.onprem.listCredentials(projectId);

Get On-Prem credentials

const { result, error } = await deepgram.onprem.getCredentials(projectId, credentialId);

Create On-Prem credentials

const { result, error } = await deepgram.onprem.createCredentials(projectId, options);

Delete On-Prem credentials

const { result, error } = await deepgram.onprem.deleteCredentials(projectId, credentialId);

Backwards Compatibility

Older SDK versions will receive Priority 1 (P1) bug support only. Security issues, both in our code and dependencies, are promptly addressed. Significant bugs without clear workarounds are also given priority attention.

We strictly follow semver, and will not introduce breaking changes to the publicly documented interfaces of the SDK. If you use internal or undocumented interfaces without pinning your version, you do so at your own risk.

Development and Contributing

Interested in contributing? We ❤️ pull requests!

To make sure our community is safe for all, be sure to review and agree to our Code of Conduct. Then see the Contribution guidelines for more information.

Debugging and making changes locally

If you want to make local changes to the SDK and run the examples/, you'll need to run npm run build first, so that your changes are included in the examples you run.

Getting Help

We love to hear from you, so if you have questions or comments, or find a bug in the project, let us know!
