deepgram-js-sdk
Official JavaScript SDK for Deepgram's automated speech recognition APIs.
Official JavaScript SDK for Deepgram. Power your apps with world-class speech and Language AI models.
- Migrating from earlier versions
- Installation
- Initialization
- Scoped Configuration
- Transcription (Synchronous)
- Transcription (Asynchronous / Callbacks)
- Transcription (Live / Streaming)
- Transcribing to captions
- Text to Speech
- Text Intelligence
- Projects
- Keys
- Members
- Scopes
- Invitations
- Usage
- Billing
- On-Prem APIs
- Backwards Compatibility
- Development and Contributing
- Getting Help
We have published a migration guide on our docs, showing how to move from v2 to v3.
We recommend using only documented interfaces, as we strictly follow semantic versioning (semver) and breaking changes may occur for undocumented interfaces. To ensure compatibility, consider pinning your versions if you need to use undocumented interfaces.
You can install this SDK directly from npm.
npm install @deepgram/sdk
# - or -
# yarn add @deepgram/sdk
You can now use plain <script> tags to import deepgram from CDNs, like:
<script src="https://cdn.jsdelivr.net/npm/@deepgram/sdk"></script>
or even:
<script src="https://unpkg.com/@deepgram/sdk"></script>
Then you can use it from a global deepgram variable:
<script>
const { createClient } = deepgram;
const _deepgram = createClient("deepgram-api-key");
console.log("Deepgram Instance: ", _deepgram);
// ...
</script>
You can now use type="module" <script> tags to import deepgram from CDNs, like:
<script type="module">
import { createClient } from "https://cdn.jsdelivr.net/npm/@deepgram/sdk/+esm";
const deepgram = createClient("deepgram-api-key");
console.log("Deepgram Instance: ", deepgram);
// ...
</script>
import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");
const deepgram = createClient(DEEPGRAM_API_KEY);
🔑 To access the Deepgram API you will need a free Deepgram API Key.
The SDK supports scoped configuration: you can configure each namespace of the SDK when you initialize the client. The namespace configuration works as follows:
- The global namespace serves as the foundational configuration applicable across all other namespaces unless overridden.
- It includes general settings like URL and headers applicable for all API calls.
- If no specific configurations are provided for other namespaces, the global defaults are used.
- Each namespace (listen, manage, onprem, read, speak) can have its specific configurations which override the global settings within their respective scopes.
- This allows for detailed control over different parts of the application interacting with various Deepgram API endpoints.
- Configurations for both fetch and websocket can be specified under each namespace, allowing different transport mechanisms for different operations.
- For example, the fetch configuration can have its own URL and proxy settings distinct from the websocket.
- The generic interfaces define a structure for transport options which include a client (like a fetch or WebSocket instance) and associated options (like headers, URL, proxy settings).
This configuration system enables robust customization where defaults provide a foundation, but every aspect of the client's interaction with the API can be finely controlled and tailored to specific needs through namespace-specific settings. This enhances the maintainability and scalability of the application by localizing configurations to their relevant contexts.
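The override rule described above can be sketched in a few lines. This is only an illustration of "namespace settings win over global settings", not the SDK's internal implementation; the helper names isPlainObject, deepMerge and resolveNamespaceOptions are hypothetical.

```javascript
// Hypothetical helpers, NOT the SDK's internals: they only illustrate how
// namespace settings could override global settings.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Recursively merge `override` on top of `base` without mutating either.
function deepMerge(base, override) {
  const merged = { ...base };
  for (const [key, value] of Object.entries(override ?? {})) {
    merged[key] =
      isPlainObject(value) && isPlainObject(base?.[key])
        ? deepMerge(base[key], value)
        : value;
  }
  return merged;
}

// Effective options for one namespace ("listen", "manage", ...).
function resolveNamespaceOptions(config, namespace) {
  return deepMerge(config.global ?? {}, config[namespace] ?? {});
}

const config = {
  global: {
    fetch: {
      options: {
        url: "https://api.beta.deepgram.com",
        headers: { "x-custom-header": "foo" },
      },
    },
  },
  listen: { fetch: { options: { url: "http://localhost:8080" } } },
};

const listenOptions = resolveNamespaceOptions(config, "listen");
const manageOptions = resolveNamespaceOptions(config, "manage");

console.log(listenOptions.fetch.options.url); // http://localhost:8080
console.log(manageOptions.fetch.options.url); // https://api.beta.deepgram.com
```

In this sketch the listen namespace keeps the global headers but uses its own URL, while manage falls back to the global defaults entirely.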
Useful for targeting a different API environment (e.g. beta).
import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");
const deepgram = createClient(DEEPGRAM_API_KEY, {
global: { fetch: { options: { url: "https://api.beta.deepgram.com" } } },
});
Useful for on-prem installations. Only affects requests to /listen endpoints.
import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");
const deepgram = createClient(DEEPGRAM_API_KEY, {
listen: { fetch: { options: { url: "http://localhost:8080" } } },
});
Useful for providing a custom HTTP client.
import { createClient } from "@deepgram/sdk";
// - or -
// const { createClient } = require("@deepgram/sdk");
// A custom client should be fetch-compatible and resolve to a Response.
const yourFetch = async () => {
  return new Response("...etc");
};
const deepgram = createClient(DEEPGRAM_API_KEY, {
global: { fetch: { client: yourFetch } },
});
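A custom client is often just a thin wrapper around an existing fetch. The sketch below logs each request and then delegates to the wrapped function. It assumes the SDK calls the client with the usual fetch(input, init) arguments; createLoggingFetch is a hypothetical name, not part of the SDK.

```javascript
// Hypothetical wrapper: returns a fetch-compatible function that delegates to
// `baseFetch` (for example globalThis.fetch) and logs method, URL and status.
function createLoggingFetch(baseFetch, log = console.log) {
  return async function loggingFetch(input, init) {
    const method = init?.method ?? "GET";
    // `input` may be a string, a URL object, or a Request object.
    const url = typeof input === "string" ? input : input.url ?? String(input);
    const response = await baseFetch(input, init);
    log(`${method} ${url} -> ${response.status}`);
    return response;
  };
}
```

Under that assumption, the wrapper would be passed the same way as yourFetch above: global: { fetch: { client: createLoggingFetch(fetch) } }.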
This SDK now works in the browser. If you'd like to make REST-based requests (pre-recorded transcription, on-premise, and management requests), then you'll need to use a proxy as we do not support custom CORS origins on our API. To set up your proxy, you configure the SDK like so:
import { createClient } from "@deepgram/sdk";
const deepgram = createClient("proxy", {
global: { fetch: { options: { proxy: { url: "http://localhost:8080" } } } },
});
Important: You must pass "proxy" as your API key, and use the proxy to set the Authorization header to your Deepgram API key.
Your proxy service should replace the Authorization header with Authorization: token <DEEPGRAM_API_KEY> and return results verbatim to the SDK.
Check out our example Node-based proxy here: Deepgram Node Proxy.
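The header replacement a proxy needs to perform can be sketched as a small pure function. This is a minimal illustration, not the code of the Deepgram Node Proxy; withDeepgramAuth is a hypothetical name, and a real proxy would still need to forward the request and stream the response back unchanged.

```javascript
// Hypothetical helper for a proxy: copy the incoming headers, drop whatever
// Authorization header the browser sent, and set the real credential.
function withDeepgramAuth(incomingHeaders, apiKey) {
  const outgoing = {};
  for (const [name, value] of Object.entries(incomingHeaders)) {
    // HTTP header names are case-insensitive, so compare in lower case.
    if (name.toLowerCase() !== "authorization") {
      outgoing[name] = value;
    }
  }
  outgoing["Authorization"] = `token ${apiKey}`;
  return outgoing;
}

const forwarded = withDeepgramAuth(
  { authorization: "token proxy", "content-type": "application/json" },
  "your-deepgram-api-key" // placeholder; load the real key from server-side config
);
console.log(forwarded.Authorization); // token your-deepgram-api-key
```

Keeping the key on the server side means the browser only ever sees the "proxy" placeholder.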
Useful for sending custom headers with every request.
import { createClient } from "@deepgram/sdk";
const deepgram = createClient(DEEPGRAM_API_KEY, {
global: { fetch: { options: { headers: { "x-custom-header": "foo" } } } },
});
const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
{
url: "https://dpgr.am/spacewalk.wav",
},
{
model: "nova",
}
);
See our API reference for more info.
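The calls in this README resolve to an object with result and error fields. If you prefer exceptions, a small helper can convert that shape into a returned value or a thrown error. The unwrap function below is a hypothetical convenience, not part of the SDK, and it assumes error is either an Error or an object with a message field.

```javascript
// Hypothetical helper, not part of the SDK: turn a { result, error } pair
// into either a returned value or a thrown Error.
function unwrap({ result, error }) {
  if (error) {
    throw error instanceof Error
      ? error
      : new Error(error.message ?? String(error));
  }
  return result;
}
```

It would be used by wrapping the awaited call, for example unwrap(await deepgram.listen.prerecorded.transcribeUrl(source, options)), where source and options are the two objects shown above.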
import fs from "fs";
const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
fs.createReadStream("./examples/spacewalk.wav"),
{
model: "nova",
}
);
or
import fs from "fs";
const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
fs.readFileSync("./examples/spacewalk.wav"),
{
model: "nova",
}
);
See our API reference for more info.
import { CallbackUrl } from "@deepgram/sdk";
const { result, error } = await deepgram.listen.prerecorded.transcribeUrlCallback(
{
url: "https://dpgr.am/spacewalk.wav",
},
new CallbackUrl("http://callback/endpoint"),
{
model: "nova",
}
);
See our API reference for more info.
import fs from "fs";
import { CallbackUrl } from "@deepgram/sdk";
const { result, error } = await deepgram.listen.prerecorded.transcribeFileCallback(
fs.createReadStream("./examples/spacewalk.wav"),
new CallbackUrl("http://callback/endpoint"),
{
model: "nova",
}
);
or
import fs from "fs";
import { CallbackUrl } from "@deepgram/sdk";
const { result, error } = await deepgram.listen.prerecorded.transcribeFileCallback(
fs.readFileSync("./examples/spacewalk.wav"),
new CallbackUrl("http://callback/endpoint"),
{
model: "nova",
}
);
See our API reference for more info.
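import { LiveTranscriptionEvents } from "@deepgram/sdk";
// `source` stands in for your own audio source (for example a microphone stream).
const dgConnection = deepgram.listen.live({ model: "nova" });
// Register handlers and start sending audio only once the connection is open.
dgConnection.on(LiveTranscriptionEvents.Open, () => {
  dgConnection.on(LiveTranscriptionEvents.Transcript, (data) => {
    console.log(data);
  });
  source.addListener("got-some-audio", async (event) => {
    dgConnection.send(event.raw_audio_data);
  });
});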
To see an example, check out our Node.js example or our Browser example.
See our API reference for more info.
import { webvtt /* , srt */ } from "@deepgram/captions";
const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
{
url: "https://dpgr.am/spacewalk.wav",
},
{
model: "nova",
}
);
const vttOutput = webvtt(result);
// const srtOutput = srt(result);
See our standalone captions library for more information.
const { result } = await deepgram.speak.request({ text }, { model: "aura-asteria-en" });
import { LiveTTSEvents } from "@deepgram/sdk";
// Live (streaming) text to speech over a persistent connection.
const dgConnection = deepgram.speak.live({ model: "aura-asteria-en" });
dgConnection.on(LiveTTSEvents.Open, () => {
  console.log("Connection opened");
  // Send text data for TTS synthesis
  dgConnection.sendText(text);
  // Send Flush message to the server after sending the text
  dgConnection.flush();
  dgConnection.on(LiveTTSEvents.Close, () => {
    console.log("Connection closed");
  });
});
See our API reference for more info.
const text = `The history of the phrase 'The quick brown fox jumps over the
lazy dog'. The earliest known appearance of the phrase was in The Boston
Journal. In an article titled "Current Notes" in the February 9, 1885, edition,
the phrase is mentioned as a good practice sentence for writing students: "A
favorite copy set by writing teachers for their pupils is the following,
because it contains every letter of the alphabet: 'A quick brown fox jumps over
the lazy dog.'" Dozens of other newspapers published the phrase over the
next few months, all using the version of the sentence starting with "A" rather
than "The". The earliest known use of the phrase starting with "The" is from
the 1888 book Illustrative Shorthand by Linda Bronson.[3] The modern form
(starting with "The") became more common even though it is slightly longer than
the original (starting with "A").`;
const { result, error } = await deepgram.read.analyzeText(
{ text },
{ language: "en", topics: true, sentiment: true }
);
See our API reference for more info.
Returns all projects accessible by the API key.
const { result, error } = await deepgram.manage.getProjects();
See our API reference for more info.
Retrieves a specific project based on the provided project_id.
const { result, error } = await deepgram.manage.getProject(projectId);
See our API reference for more info.
Update a project.
const { result, error } = await deepgram.manage.updateProject(projectId, options);
See our API reference for more info.
Delete a project.
const { error } = await deepgram.manage.deleteProject(projectId);
See our API reference for more info.
Retrieves all keys associated with the provided project_id.
const { result, error } = await deepgram.manage.getProjectKeys(projectId);
See our API reference for more info.
Retrieves a specific key associated with the provided project_id.
const { result, error } = await deepgram.manage.getProjectKey(projectId, projectKeyId);
See our API reference for more info.
Creates an API key with the provided scopes.
const { result, error } = await deepgram.manage.createProjectKey(projectId, options);
See our API reference for more info.
Deletes a specific key associated with the provided project_id.
const { error } = await deepgram.manage.deleteProjectKey(projectId, projectKeyId);
See our API reference for more info.
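The project and key calls can be combined, for example to audit which keys exist in each project. The sketch below takes the client as a parameter so it can be exercised with a stub. The helper name listKeysByProject is hypothetical, and the response fields it reads (result.projects, project_id, result.api_keys) are assumptions based on the REST API's response format, which this README does not show.

```javascript
// Hypothetical helper: collect the keys of every project the API key can see.
// Assumed response shapes (not confirmed by this README):
//   getProjects()      -> result.projects: [{ project_id, ... }]
//   getProjectKeys(id) -> result.api_keys: [...]
async function listKeysByProject(deepgram) {
  const { result: projectsResult, error: projectsError } =
    await deepgram.manage.getProjects();
  if (projectsError) throw projectsError;

  const keysByProject = {};
  for (const project of projectsResult.projects) {
    const { result, error } = await deepgram.manage.getProjectKeys(
      project.project_id
    );
    if (error) throw error;
    keysByProject[project.project_id] = result.api_keys;
  }
  return keysByProject;
}
```

The requests run one after another to keep the example simple; for many projects you may prefer to issue them concurrently.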
Retrieves account objects for all of the accounts in the specified project_id.
const { result, error } = await deepgram.manage.getProjectMembers(projectId);
See our API reference for more info.
Removes member account for specified member_id.
const { error } = await deepgram.manage.removeProjectMember(projectId, projectMemberId);
See our API reference for more info.
Retrieves scopes of the specified member in the specified project.
const { result, error } = await deepgram.manage.getProjectMemberScopes(projectId, projectMemberId);
See our API reference for more info.
Updates the scope for the specified member in the specified project.
const { result, error } = await deepgram.manage.updateProjectMemberScope(
projectId,
projectMemberId,
options
);
See our API reference for more info.
Retrieves all invitations associated with the provided project_id.
const { result, error } = await deepgram.manage.getProjectInvites(projectId);
See our API reference for more info.
Sends an invitation to the provided email address.
const { result, error } = await deepgram.manage.sendProjectInvite(projectId, options);
See our API reference for more info.
Removes the specified invitation from the project.
const { error } = await deepgram.manage.deleteProjectInvite(projectId, email);
See our API reference for more info.
Removes the authenticated user from the project.
const { result, error } = await deepgram.manage.leaveProject(projectId);
See our API reference for more info.
Retrieves all requests associated with the provided project_id based on the provided options.
const { result, error } = await deepgram.manage.getProjectUsageRequests(projectId, options);
See our API reference for more info.
Retrieves a specific request associated with the provided project_id.
const { result, error } = await deepgram.manage.getProjectUsageRequest(projectId, requestId);
See our API reference for more info.
Retrieves usage associated with the provided project_id based on the provided options.
const { result, error } = await deepgram.manage.getProjectUsageSummary(projectId, options);
See our API reference for more info.
Lists the features, models, tags, languages, and processing method used for requests in the specified project.
const { result, error } = await deepgram.manage.getProjectUsageFields(projectId, options);
See our API reference for more info.
Retrieves the list of balance info for the specified project.
const { result, error } = await deepgram.manage.getProjectBalances(projectId);
See our API reference for more info.
Retrieves the balance info for the specified project and balance_id.
const { result, error } = await deepgram.manage.getProjectBalance(projectId, balanceId);
See our API reference for more info.
const { result, error } = await deepgram.onprem.listCredentials(projectId);
const { result, error } = await deepgram.onprem.getCredentials(projectId, credentialId);
const { result, error } = await deepgram.onprem.createCredentials(projectId, options);
const { result, error } = await deepgram.onprem.deleteCredentials(projectId, credentialId);
Older SDK versions will receive Priority 1 (P1) bug support only. Security issues, both in our code and dependencies, are promptly addressed. Significant bugs without clear workarounds are also given priority attention.
We strictly follow semver, and will not introduce breaking changes to the publicly documented interfaces of the SDK. Use internal and undocumented interfaces without pinning your version, at your own risk.
Interested in contributing? We ❤️ pull requests!
To make sure our community is safe for all, be sure to review and agree to our Code of Conduct. Then see the Contribution guidelines for more information.
If you want to make local changes to the SDK and run the examples/, you'll need to npm run build first, to ensure that your changes are included in the examples that are running.
We love to hear from you so if you have questions, comments or find a bug in the project, let us know! You can either: