call-center-ai
Send a phone call from an AI agent, in a single API call. Or call the bot directly from the configured phone number!
Stars: 198
Call Center AI is an AI-powered call center solution leveraging Azure and OpenAI GPT. It allows for AI agent-initiated phone calls or direct calls to the bot from a configured phone number. The bot is customizable for various industries like insurance, IT support, and customer service, with features such as accessing claim information, conversation history, language change, SMS sending, and more. The project is a proof of concept showcasing the integration of Azure Communication Services, Azure Cognitive Services, and Azure OpenAI for an automated call center solution.
README:
AI-powered call center solution with Azure and OpenAI GPT.
Insurance, IT support, customer service, and more. The bot can be customized in a few seconds (really) to fit your needs.
# Ask the bot to call a phone number
data='{
"bot_company": "Contoso",
"bot_name": "Amélie",
"phone_number": "+11234567890",
"task": "Help the customer with their digital workplace. Assistant is working for the IT support department. The objective is to help the customer with their issue and gather information in the claim.",
"agent_phone_number": "+33612345678",
"claim": [
{
"name": "hardware_info",
"type": "text"
},
{
"name": "first_seen",
"type": "datetime"
},
{
"name": "building_location",
"type": "text"
}
]
}'
curl \
--header 'Content-Type: application/json' \
--request POST \
--url https://xxx/call \
--data "$data"
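The same request can also be prepared from Python. The following is a minimal sketch using only the standard library; `build_call_request` is a hypothetical helper name (not part of the project), the `https://xxx` base URL is the placeholder from the example above, and the task text is shortened for readability.

```python
import json
import urllib.request


def build_call_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /call endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/call",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "bot_company": "Contoso",
    "bot_name": "Amélie",
    "phone_number": "+11234567890",
    "task": "Help the customer with their digital workplace.",
    "agent_phone_number": "+33612345678",
    "claim": [
        {"name": "hardware_info", "type": "text"},
        {"name": "first_seen", "type": "datetime"},
        {"name": "building_location", "type": "text"},
    ],
}

request = build_call_request("https://xxx", payload)
# To actually place the call, replace the placeholder URL and run:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it makes the payload easy to inspect before a real phone call is triggered.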
[!NOTE] This project is a proof of concept. It is not intended to be used in production. It demonstrates how Azure Communication Services, Azure Cognitive Services and Azure OpenAI can be combined to build an automated call center solution.
- [x] Access the claim on a public website
- [x] Access to customer conversation history
- [x] Allow user to change the language of the conversation
- [x] Assistant can send SMS to the user for further information
- [x] Bot can be called from a phone number
- [x] Bot uses multiple voice tones (e.g. happy, sad, neutral) to keep the conversation engaging
- [x] Company products (= lexicon) can be understood by the bot (e.g. a name of a specific insurance product)
- [x] Create by itself a todo list of tasks to complete the claim
- [x] Customizable prompts
- [x] Disengaging from a human agent when needed
- [x] Filter out inappropriate content from the LLM, like profanity or competitor company names
- [x] Fine understanding of the customer request with GPT-4o and GPT-4o mini
- [x] Follow a specific data schema for the claim
- [x] Has access to a documentation database (few-shot training / RAG)
- [x] Help the user to find the information needed to complete the claim
- [x] Jailbreak detection
- [x] Lower AI Search cost by using a Redis cache
- [x] Monitoring and tracing with Application Insights
- [x] Perform user tests with feature flags
- [x] Receive SMS during a conversation for explicit wordings
- [x] Record the calls for audit and quality assurance
- [x] Responses are streamed from the LLM to the user, to avoid long pauses
- [x] Send an SMS report after the call
- [x] Take back a conversation after a disengagement
- [ ] Call back the user when needed
- [ ] Simulate an IVR workflow
A French demo is available on YouTube. Do not hesitate to watch the demo at 1.5x speed to get a quick overview of the project.
Main interactions shown in the demo:
- User calls the call center
- The bot answers and the conversation starts
- The bot stores conversation, claim and todo list in the database
Extract of the data stored during the call:
{
"claim": {
"incident_datetime": "2024-10-08T02:00:00",
"incident_description": "La trottinette électrique fait des bruits bizarres et émet de la fumée blanche.",
"incident_location": "46 rue du Charles de Gaulle",
"injuries": "Douleur au genou suite à une chute.",
"involved_parties": "Lesne",
"policy_number": "B02131325XPGOLMP"
},
"messages": [
{
"created_at": "2024-10-08T11:23:41.824758Z",
"action": "call",
"content": "",
"persona": "human",
"style": "none",
"tool_calls": []
},
{
"created_at": "2024-10-08T11:23:55.421654Z",
"action": "talk",
"content": "Bonjour, je m'appelle Amélie, de Contoso Assurance ! Comment puis-je vous aider aujourd'hui ?",
"persona": "assistant",
"style": "cheerful",
"tool_calls": []
},
{
"created_at": "2024-10-08T11:24:19.972737Z",
"action": "talk",
"content": "Oui bien sûr. Bonjour, je vous appelle parce que j'ai un problème avec ma trottinette électrique. Elle marche plus depuis ce matin, elle fait des bruits bizarres et il y a une fumée blanche qui sort de la trottinette.",
"persona": "human",
"style": "none",
"tool_calls": []
}
],
"next": {
"action": "case_closed",
"justification": "The customer provided all necessary information for the claim, and they expressed satisfaction with the assistance received. No further action is required at this time."
},
"synthesis": {
"long": "You reported an issue with your electric scooter, which started making strange noises and emitting white smoke. This incident occurred at 2:00 AM while you were riding it, leading to a fall and resulting in knee pain. The location of the incident was noted, and your policy details were confirmed. I have documented all the necessary information to file your claim. Please take care of your knee, and feel free to reach out if you need further assistance.",
"satisfaction": "high",
"short": "the breakdown of your scooter",
"improvement_suggestions": "Ensure that the assistant provides clear next steps and offers to schedule follow-up calls proactively to enhance customer support."
},
...
}
A report is available at `https://[your_domain]/report/[phone_number]` (like `http://localhost:8080/report/%2B133658471534`). It shows the conversation history, claim data and reminders.
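Note that the `+` of the phone number appears as `%2B` in the example URL, because it must be percent-encoded in the path. A minimal sketch of building such a URL follows; `report_url` is a hypothetical helper name, not a function of the project.

```python
from urllib.parse import quote


def report_url(base_url: str, phone_number: str) -> str:
    """Return the report URL for a phone number, percent-encoding the '+'."""
    return f"{base_url.rstrip('/')}/report/{quote(phone_number, safe='')}"


print(report_url("http://localhost:8080", "+133658471534"))
# prints http://localhost:8080/report/%2B133658471534
```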
---
title: System diagram (C4 model)
---
graph
user(["User"])
agent(["Agent"])
app["Call Center AI"]
app -- Transfer to --> agent
app -. Send voice .-> user
user -- Call --> app
---
title: Claim AI component diagram (C4 model)
---
graph LR
agent(["Agent"])
user(["User"])
subgraph "Claim AI"
ada["Embedding\n(ADA)"]
app["App\n(Functions App)"]
communication_services["Call & SMS gateway\n(Communication Services)"]
db[("Conversations and claims\n(Cosmos DB / SQLite)")]
eventgrid["Broker\n(Event Grid)"]
gpt["LLM\n(GPT-4o)"]
queues[("Queues\n(Azure Storage)")]
redis[("Cache\n(Redis)")]
search[("RAG\n(AI Search)")]
sounds[("Sounds\n(Azure Storage)")]
sst["Speech-to-Text\n(Cognitive Services)"]
translation["Translation\n(Cognitive Services)"]
tts["Text-to-Speech\n(Cognitive Services)"]
end
app -- Respond with text --> communication_services
app -- Ask for translation --> translation
app -- Ask to transfer --> communication_services
app -- Few-shot training --> search
app -- Generate completion --> gpt
app -- Get cached data --> redis
app -- Save conversation --> db
app -- Send SMS report --> communication_services
app -. Watch .-> queues
communication_services -- Generate voice --> tts
communication_services -- Load sound --> sounds
communication_services -- Notifies --> eventgrid
communication_services -- Send SMS --> user
communication_services -- Transfer to --> agent
communication_services -- Transform voice --> sst
communication_services -. Send voice .-> user
eventgrid -- Push to --> queues
search -- Generate embeddings --> ada
user -- Call --> communication_services
sequenceDiagram
autonumber
actor Customer
participant PSTN
participant Text to Speech
participant Speech to Text
actor Human agent
participant Event Grid
participant Communication Services
participant App
participant Cosmos DB
participant OpenAI GPT
participant AI Search
App->>Event Grid: Subscribe to events
Customer->>PSTN: Initiate a call
PSTN->>Communication Services: Forward call
Communication Services->>Event Grid: New call event
Event Grid->>App: Send event to event URL (HTTP webhook)
activate App
App->>Communication Services: Accept the call and give inbound URL
deactivate App
Communication Services->>Speech to Text: Transform speech to text
Communication Services->>App: Send text to the inbound URL
activate App
alt First call
App->>Communication Services: Send static SSML text
else Callback
App->>AI Search: Gather training data
App->>OpenAI GPT: Ask for a completion
OpenAI GPT-->>App: Respond (HTTP/2 SSE)
loop Over buffer
loop Over multiple tools
alt Is this a claim data update?
App->>Cosmos DB: Update claim data
else Does the user want the human agent?
App->>Communication Services: Send static SSML text
App->>Communication Services: Transfer to a human
Communication Services->>Human agent: Call the phone number
else Should we end the call?
App->>Communication Services: Send static SSML text
App->>Communication Services: End the call
end
end
end
App->>Cosmos DB: Persist conversation
end
deactivate App
Communication Services->>PSTN: Send voice
PSTN->>Customer: Forward voice
Prefer using GitHub Codespaces for a quick start. The environment will be set up automatically with all the required tools.
On macOS, with Homebrew, simply type `make brew`.
For other systems, make sure you have the following installed:
- Bash compatible shell, like `bash` or `zsh`
- yq
- Make, `apt install make` (Ubuntu), `yum install make` (CentOS), `brew install make` (macOS)
- Azure CLI
- Azure Functions Core Tools
- Twilio CLI (optional)
Then, Azure resources are needed:
- Prefer to use lowercase and no special characters other than dashes (e.g. `ccai-customer-a`)
- Same name as the resource group
- Enable system managed identity
- From the Communication Services resource
- Allow inbound and outbound communication
- Enable voice (required) and SMS (optional) capabilities
Now that the prerequisites are configured (local + Azure), the deployment can be done.
A pre-built container image is available on GitHub Actions; it will be used to deploy the solution on Azure:
- Latest version from a branch: `ghcr.io/clemlesne/call-center-ai:main`
- Specific tag: `ghcr.io/clemlesne/call-center-ai:0.1.0` (recommended)
Local config file is named `config.yaml`. It will be used by install scripts (incl. Makefile and Bicep) to configure the Azure resources.
Fill the file with the following content (must be customized for your needs):
# config.yaml
conversation:
  initiate:
    # Phone number the bot will transfer the call to if customer asks for a human agent
    agent_phone_number: "+33612345678"
    bot_company: Contoso
    bot_name: Amélie
    lang: {}
communication_services:
  # Phone number purchased from Communication Services
  phone_number: "+33612345678"
sms: {}
prompts:
  llm: {}
  tts: {}
az login
make deploy name=my-rg-name
- Wait for the deployment to finish
- An index named `trainings`
- A semantic search configuration on the index named `default`
make logs name=my-rg-name
[!TIP] To use a Service Principal to authenticate to Azure, you can also add the following in a `.env` file:
AZURE_CLIENT_ID=xxx
AZURE_CLIENT_SECRET=xxx
AZURE_TENANT_ID=xxx
[!TIP] If the application is already deployed on Azure, you can run `make name=my-rg-name sync-local-config` to copy the configuration from the Azure Function App to your local machine.
Local config file is named `config.yaml`:
# config.yaml
resources:
  public_url: https://xxx.blob.core.windows.net/public
conversation:
  initiate:
    agent_phone_number: "+33612345678"
    bot_company: Contoso
    bot_name: Robert
communication_services:
  access_key: xxx
  call_queue_name: call-33612345678
  endpoint: https://xxx.france.communication.azure.com
  phone_number: "+33612345678"
  post_queue_name: post-33612345678
  recording_container_url: https://xxx.blob.core.windows.net/recordings
  resource_id: xxx
  sms_queue_name: sms-33612345678
cognitive_service:
  # Must be of type "AI services multi-service account"
  endpoint: https://xxx.cognitiveservices.azure.com
llm:
  fast:
    mode: azure_openai
    azure_openai:
      context: 16385
      deployment: gpt-4o-mini-2024-07-18
      endpoint: https://xxx.openai.azure.com
      model: gpt-4o-mini
      streaming: true
  slow:
    mode: azure_openai
    azure_openai:
      context: 128000
      deployment: gpt-4o-2024-08-06
      endpoint: https://xxx.openai.azure.com
      model: gpt-4o
      streaming: true
ai_search:
  access_key: xxx
  endpoint: https://xxx.search.windows.net
  index: trainings
ai_translation:
  access_key: xxx
  endpoint: https://xxx.cognitiveservices.azure.com
make deploy-bicep deploy-post name=my-rg-name
- This will deploy the Azure resources without the API server, allowing you to test the bot locally
- Wait for the deployment to finish
Copy `local.example.settings.json` to `local.settings.json`, then fill the required fields:
- `APPLICATIONINSIGHTS_CONNECTION_STRING`, as the connection string of the Application Insights resource
- `AzureWebJobsStorage`, as the connection string of the Azure Storage account
[!IMPORTANT] The tunnel must be run in a separate terminal, because it needs to keep running the whole time
# Log in once
devtunnel login
# Start the tunnel
make tunnel
[!NOTE] To override a specific configuration value, you can use environment variables. For example, to override the `llm.fast.endpoint` value, you can use the `LLM__FAST__ENDPOINT` variable: `LLM__FAST__ENDPOINT=https://xxx.openai.azure.com`
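The naming convention can be illustrated with a short sketch: the variable name is lowercased and split on the double underscore to obtain the path into the configuration. This only illustrates the mapping; `apply_env_overrides` is a hypothetical function, and the project's actual loading mechanism may differ.

```python
import copy


def apply_env_overrides(config: dict, environ: dict, delimiter: str = "__") -> dict:
    """Return a copy of config where NAME__SUB__KEY variables override nested keys.

    Only variables whose full path already exists in config are applied.
    """
    result = copy.deepcopy(config)
    for name, value in environ.items():
        path = name.lower().split(delimiter)
        node = result
        for key in path[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # Path does not exist in the config, ignore the variable
        else:
            if isinstance(node, dict) and path[-1] in node:
                node[path[-1]] = value
    return result


config = {"llm": {"fast": {"endpoint": "https://old.openai.azure.com"}}}
environ = {"LLM__FAST__ENDPOINT": "https://xxx.openai.azure.com", "HOME": "/root"}
print(apply_env_overrides(config, environ)["llm"]["fast"]["endpoint"])
# prints https://xxx.openai.azure.com
```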
[!NOTE] Also, a `local.py` script is available to test the application without the need of a phone call (= without Communication Services). Run the script with: `python3 -m tests.local`
make dev
- Code is automatically reloaded on file changes, no need to restart the server
- The API server is available at `http://localhost:8080`
Call recording is disabled by default. To enable it:
- Create a new container in the Azure Storage account (i.e. `recordings`); this is already done if you deployed the solution on Azure
- Update the feature flag `recording_enabled` in App Configuration to `true`
Training data is stored on AI Search to be retrieved by the bot, on demand.
Required index schema:
| Field Name | Type | Retrievable | Searchable | Dimensions | Vectorizer |
|-|-|-|-|-|-|
| `answer` | `Edm.String` | Yes | Yes | | |
| `context` | `Edm.String` | Yes | Yes | | |
| `created_at` | `Edm.String` | Yes | No | | |
| `document_synthesis` | `Edm.String` | Yes | Yes | | |
| `file_path` | `Edm.String` | Yes | No | | |
| `id` | `Edm.String` | Yes | No | | |
| `question` | `Edm.String` | Yes | Yes | | |
| `vectors` | `Collection(Edm.Single)` | No | Yes | 1536 | OpenAI ADA |
Software to fill the index is included in the Synthetic RAG Index repository.
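Before uploading documents to the index, it can help to check that they match the schema above. The following is a minimal sketch in plain Python; `validate_training_document` and the sample values are hypothetical, only the field names and the 1536 dimensions come from the schema.

```python
EXPECTED_DIMENSIONS = 1536
STRING_FIELDS = (
    "answer",
    "context",
    "created_at",
    "document_synthesis",
    "file_path",
    "id",
    "question",
)


def validate_training_document(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the document fits the schema."""
    problems = []
    for field in STRING_FIELDS:
        if not isinstance(doc.get(field), str):
            problems.append(f"{field} must be a string")
    vectors = doc.get("vectors")
    if not isinstance(vectors, list) or len(vectors) != EXPECTED_DIMENSIONS:
        problems.append(f"vectors must be a list of {EXPECTED_DIMENSIONS} floats")
    elif not all(isinstance(value, float) for value in vectors):
        problems.append("vectors must contain only floats")
    return problems


document = {
    "answer": "Restart the docking station, then reconnect the screen.",
    "context": "IT support knowledge base",
    "created_at": "2024-10-08T11:23:41Z",
    "document_synthesis": "Troubleshooting guide for external screens.",
    "file_path": "guides/screens.pdf",
    "id": "doc-1",
    "question": "Why is my external screen flashing?",
    "vectors": [0.0] * EXPECTED_DIMENSIONS,  # Placeholder embedding
}
print(validate_training_document(document))  # prints []
```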
The bot can be used in multiple languages. It can understand the language the user chose.
See the list of supported languages for the Text-to-Speech service.
# config.yaml
conversation:
  initiate:
    lang:
      default_short_code: fr-FR
      availables:
        - pronunciations_en: ["French", "FR", "France"]
          short_code: fr-FR
          voice: fr-FR-DeniseNeural
        - pronunciations_en: ["Chinese", "ZH", "China"]
          short_code: zh-CN
          voice: zh-CN-XiaoqiuNeural
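As an illustration of how such a configuration could be consumed, the sketch below matches a language name against the `pronunciations_en` values and falls back to `default_short_code`. The matching logic and the `resolve_language` name are assumptions for illustration; how the bot actually uses these fields is not described here.

```python
LANG_CONFIG = {
    "default_short_code": "fr-FR",
    "availables": [
        {
            "pronunciations_en": ["French", "FR", "France"],
            "short_code": "fr-FR",
            "voice": "fr-FR-DeniseNeural",
        },
        {
            "pronunciations_en": ["Chinese", "ZH", "China"],
            "short_code": "zh-CN",
            "voice": "zh-CN-XiaoqiuNeural",
        },
    ],
}


def resolve_language(spoken: str, lang_config: dict) -> dict:
    """Match a language name against pronunciations_en, else return the default."""
    wanted = spoken.strip().lower()
    for lang in lang_config["availables"]:
        if wanted in (name.lower() for name in lang["pronunciations_en"]):
            return lang
    default_code = lang_config["default_short_code"]
    return next(
        lang for lang in lang_config["availables"] if lang["short_code"] == default_code
    )


print(resolve_language("chinese", LANG_CONFIG)["voice"])  # prints zh-CN-XiaoqiuNeural
print(resolve_language("Spanish", LANG_CONFIG)["voice"])  # prints fr-FR-DeniseNeural
```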
If you built and deployed an Azure Speech Custom Neural Voice (CNV), add the field `custom_voice_endpoint_id` to the language configuration:
# config.yaml
conversation:
  initiate:
    lang:
      default_short_code: fr-FR
      availables:
        - pronunciations_en: ["French", "FR", "France"]
          short_code: fr-FR
          voice: xxx
          custom_voice_endpoint_id: xxx
Levels are defined for each category of Content Safety. The higher the score, the more strict the moderation is, from 0 to 7. Moderation is applied on all bot data, including the web page and the conversation. Configure them in Azure OpenAI Content Filters.
Customization of the data schema is fully supported. You can add or remove fields as needed, depending on the requirements.
By default, the schema is composed of:
- `caller_email` (`email`)
- `caller_name` (`text`)
- `caller_phone` (`phone_number`)
Values are validated to ensure the data format conforms to your schema. They can be one of:
- `datetime`
- `email`
- `phone_number` (`E164` format)
- `text`
Finally, an optional description can be provided. The description must be short and meaningful, as it is passed to the LLM.
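To make the four field types concrete, here is a minimal validation sketch. It is not the project's validator: `is_valid` is a hypothetical function, the email pattern is deliberately simplistic, and the datetime check assumes ISO 8601 strings like the ones in the stored claim data.

```python
import re
from datetime import datetime

# E.164: a leading "+", then up to 15 digits, the first one non-zero
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")
# Simplistic email check, for illustration only
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid(value: str, field_type: str) -> bool:
    """Check a claim value against one of the four supported field types."""
    if field_type == "text":
        return True
    if field_type == "datetime":
        try:
            datetime.fromisoformat(value)
            return True
        except ValueError:
            return False
    if field_type == "email":
        return EMAIL_PATTERN.match(value) is not None
    if field_type == "phone_number":
        return E164_PATTERN.match(value) is not None
    raise ValueError(f"Unknown field type: {field_type}")


print(is_valid("+33612345678", "phone_number"))  # prints True
print(is_valid("0612345678", "phone_number"))  # prints False
print(is_valid("2024-10-08T02:00:00", "datetime"))  # prints True
```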
Default schema, for inbound calls, is defined in the configuration:
# config.yaml
conversation:
  default_initiate:
    claim:
      - name: additional_notes
        type: text
        # description: xxx
      - name: device_info
        type: text
        # description: xxx
      - name: incident_datetime
        type: datetime
        # description: xxx
Claim schema can be customized for each call, by adding the `claim` field in the `POST /call` API call.
The objective is a description of what the bot will do during the call. It is used to give a context to the LLM. It should be short, meaningful, and written in English.
This approach is preferred over overriding the LLM prompt.
Default task, for inbound calls, is defined in the configuration:
# config.yaml
conversation:
  initiate:
    task: |
      Help the customer with their insurance claim. Assistant requires data from the customer to fill the claim. The latest claim data will be given. Assistant role is not over until all the relevant data is gathered.
Task can be customized for each call, by adding the `task` field in the `POST /call` API call.
Conversation options are represented as features. They can be configured from App Configuration, without the need to redeploy or restart the application. Once a feature is updated, a delay of 60 seconds is needed to make the change effective.
| Name | Description | Type | Default |
|-|-|-|-|
| `answer_hard_timeout_sec` | The hard timeout for the bot answer in seconds. | `int` | 180 |
| `answer_soft_timeout_sec` | The soft timeout for the bot answer in seconds. | `int` | 30 |
| `callback_timeout_hour` | The timeout for a callback in hours. | `int` | 72 |
| `phone_silence_timeout_sec` | The timeout for phone silence in seconds. | `int` | 1 |
| `recording_enabled` | Whether call recording is enabled. | `bool` | false |
| `slow_llm_for_chat` | Whether to use the slower LLM for chat. | `bool` | true |
| `voice_recognition_retry_max` | The maximum number of retries for voice recognition. | `int` | 2 |
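One common way to get "no restart, but a short delay" behavior is a time-based cache in front of the feature store. The sketch below shows that pattern under stated assumptions: `CachedFeatures` and its `fetch` callable are hypothetical stand-ins, and this is not necessarily how the project reads App Configuration.

```python
import time
from typing import Any, Callable


class CachedFeatures:
    """Serve feature values from a local copy, refreshed at most every ttl_sec."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        ttl_sec: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # Hypothetical callable returning all features
        self._ttl_sec = ttl_sec
        self._clock = clock
        self._values: dict = {}
        self._expires_at: float | None = None

    def get(self, name: str, default: Any) -> Any:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or stale, reload every feature at once
            self._values = self._fetch()
            self._expires_at = now + self._ttl_sec
        return self._values.get(name, default)


remote_store = {"recording_enabled": False, "answer_soft_timeout_sec": 30}
features = CachedFeatures(fetch=lambda: dict(remote_store))
print(features.get("recording_enabled", False))  # prints False
```

With this design, a change in the remote store becomes visible once the cached copy expires, which matches the kind of delay described above.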
To use a model compatible with the OpenAI completion API, you need to create an account and get the following information:
- API key
- Context window size
- Endpoint URL
- Model name
- Streaming capability
Then, add the following in the `config.yaml` file:
# config.yaml
llm:
  fast:
    mode: openai
    openai:
      context: 128000
      endpoint: https://api.openai.com
      model: gpt-4o-mini
      streaming: true
  slow:
    mode: openai
    openai:
      context: 128000
      endpoint: https://api.openai.com
      model: gpt-4o
      streaming: true
To use Twilio for SMS, you need to create an account and get the following information:
- Account SID
- Auth Token
- Phone number
Then, add the following in the `config.yaml` file:
# config.yaml
sms:
  mode: twilio
  twilio:
    account_sid: xxx
    auth_token: xxx
    phone_number: "+33612345678"
Note that prompt examples contain `{xxx}` placeholders. These placeholders are replaced by the bot with the corresponding data. For example, `{bot_name}` is internally replaced by the bot name.
Be sure to write all the TTS prompts in English. This language is used as a pivot language for the conversation translation.
# config.yaml
prompts:
  tts:
    hello_tpl: |
      Hello, I'm {bot_name}, from {bot_company}! I'm an IT support specialist.
      Here's how I work: when I'm working, you'll hear a little music; then, at the beep, it's your turn to speak. You can speak to me naturally, I'll understand.
      Examples:
      - "I've got a problem with my computer, it won't turn on".
      - "The external screen is flashing, I don't know why".
      What's your problem?
  llm:
    default_system_tpl: |
      Assistant is called {bot_name} and is in a call center for the company {bot_company} as an expert with 20 years of experience in IT service.
      # Context
      Today is {date}. Customer is calling from {phone_number}. Call center number is {bot_phone_number}.
    chat_system_tpl: |
      # Objective
      Provide internal IT support to employees. Assistant requires data from the employee to provide IT support. The assistant's role is not over until the issue is resolved or the request is fulfilled.
      # Rules
      - Answers in {default_lang}, even if the customer speaks another language
      - Cannot talk about any topic other than IT support
      - Is polite, helpful, and professional
      - Rephrase the employee's questions as statements and answer them
      - Use additional context to enhance the conversation with useful details
      - When the employee says a word and then spells out letters, this means that the word is written in the way the employee spelled it (e.g. "I work in Paris PARIS", "My name is John JOHN", "My email is Clemence CLEMENCE at gmail GMAIL dot com COM")
      - You work for {bot_company}, not someone else
      # Required employee data to be gathered by the assistant
      - Department
      - Description of the IT issue or request
      - Employee name
      - Location
      # General process to follow
      1. Gather information to know the employee's identity (e.g. name, department)
      2. Gather details about the IT issue or request to understand the situation (e.g. description, location)
      3. Provide initial troubleshooting steps or solutions
      4. Gather additional information if needed (e.g. error messages, screenshots)
      5. Be proactive and create reminders for follow-up or further assistance
      # Support status
      {claim}
      # Reminders
      {reminders}
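The `{xxx}` placeholders in the templates above use the same syntax as Python format strings. As a minimal sketch, assuming a `str.format`-style substitution (the project's actual rendering code is not shown here), the first line of `hello_tpl` could be filled like this:

```python
hello_tpl = "Hello, I'm {bot_name}, from {bot_company}! I'm an IT support specialist."

# Values taken from the configuration examples of this README
greeting = hello_tpl.format(bot_name="Amélie", bot_company="Contoso")
print(greeting)
# prints Hello, I'm Amélie, from Contoso! I'm an IT support specialist.
```

With this kind of substitution, a literal brace in a template has to be doubled (`{{` or `}}`) so that it is not read as a placeholder.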
At the time of development, no LLM framework was available to handle all of these features: streaming capability with multi-tools, backup models on availability issues, callback mechanisms in the triggered tools. So, the OpenAI SDK is used directly and some algorithms are implemented to handle reliability.
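The "backup model" idea can be sketched independently of any SDK. In the example below, each backend is a hypothetical callable that takes a prompt and returns an iterator of text chunks; the function moves on to the next backend when one fails before producing its first chunk. This is an illustration of the pattern, not the project's implementation.

```python
from typing import Callable, Iterable, Iterator, Sequence

Backend = Callable[[str], Iterable[str]]


def stream_with_fallback(backends: Sequence[Backend], prompt: str) -> Iterator[str]:
    """Yield chunks from the first backend that starts streaming successfully."""
    last_error = None
    for backend in backends:
        try:
            stream = iter(backend(prompt))
            first = next(stream)
        except StopIteration:
            return  # Backend answered with an empty stream
        except Exception as error:
            last_error = error  # Backend unavailable, try the next one
            continue
        yield first
        yield from stream
        return
    raise RuntimeError("All backends failed") from last_error


def unavailable(prompt: str) -> Iterable[str]:
    raise ConnectionError("primary model unavailable")


def backup(prompt: str) -> Iterable[str]:
    yield "Hello"
    yield ", how can I help?"


print("".join(stream_with_fallback([unavailable, backup], "Hi")))
# prints Hello, how can I help?
```

Falling back only before the first chunk avoids replaying words the caller has already heard; an error in the middle of a stream is left to propagate in this sketch.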
Similar Open Source Tools
call-center-ai
Call Center AI is an AI-powered call center solution that leverages Azure and OpenAI GPT. It is a proof of concept demonstrating the integration of Azure Communication Services, Azure Cognitive Services, and Azure OpenAI to build an automated call center solution. The project showcases features like accessing claims on a public website, customer conversation history, language change during conversation, bot interaction via phone number, multiple voice tones, lexicon understanding, todo list creation, customizable prompts, content filtering, GPT-4 Turbo for customer requests, specific data schema for claims, documentation database access, SMS report sending, conversation resumption, and more. The system architecture includes components like RAG AI Search, SMS gateway, call gateway, moderation, Cosmos DB, event broker, GPT-4 Turbo, Redis cache, translation service, and more. The tool can be deployed remotely using GitHub Actions and locally with prerequisites like Azure environment setup, configuration file creation, and resource hosting. Advanced usage includes custom training data with AI Search, prompt customization, language customization, moderation level customization, claim data schema customization, OpenAI compatible model usage for the LLM, and Twilio integration for SMS.
claim-ai-phone-bot
AI-powered call center solution with Azure and OpenAI GPT. The bot can answer calls, understand the customer's request, and provide relevant information or assistance. It can also create a todo list of tasks to complete the claim, and send a report after the call. The bot is customizable, and can be used in multiple languages.
Lumos
Lumos is a Chrome extension powered by a local LLM co-pilot for browsing the web. It allows users to summarize long threads, news articles, and technical documentation. Users can ask questions about reviews and product pages. The tool requires a local Ollama server for LLM inference and embedding database. Lumos supports multimodal models and file attachments for processing text and image content. It also provides options to customize models, hosts, and content parsers. The extension can be easily accessed through keyboard shortcuts and offers tools for automatic invocation based on prompts.
client-python
The Mistral Python Client is a tool inspired by cohere-python that allows users to interact with the Mistral AI API. It provides functionalities to access and utilize the AI capabilities offered by Mistral. Users can easily install the client using pip and manage dependencies using poetry. The client includes examples demonstrating how to use the API for various tasks, such as chat interactions. To get started, users need to obtain a Mistral API Key and set it as an environment variable. Overall, the Mistral Python Client simplifies the integration of Mistral AI services into Python applications.
langchainrb
Langchain.rb is a Ruby library that makes it easy to build LLM-powered applications. It provides a unified interface to a variety of LLMs, vector search databases, and other tools, making it easy to build and deploy RAG (Retrieval Augmented Generation) systems and assistants. Langchain.rb is open source and available under the MIT License.
langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
julep
Julep is an advanced platform for creating stateful and functional AI apps powered by large language models. It offers features like statefulness by design, automatic function calling, production-ready deployment, cron-like asynchronous functions, 90+ built-in tools, and the ability to switch between different LLMs easily. Users can build AI applications without the need to write code for embedding, saving, and retrieving conversation history, and can connect to third-party applications using Composio. Julep simplifies the process of getting started with AI apps, whether they are conversational, functional, or agentic.
syncode
SynCode is a novel framework for the grammar-guided generation of Large Language Models (LLMs) that ensures syntactically valid output with respect to defined Context-Free Grammar (CFG) rules. It supports general-purpose programming languages like Python, Go, SQL, JSON, and more, allowing users to define custom grammars using EBNF syntax. The tool compares favorably to other constrained decoders and offers features like fast grammar-guided generation, compatibility with HuggingFace Language Models, and the ability to work with various decoding strategies.
scylla
Scylla is an intelligent proxy pool tool designed for humanities, enabling users to extract content from the internet and build their own Large Language Models in the AI era. It features automatic proxy IP crawling and validation, an easy-to-use JSON API, a simple web-based user interface, HTTP forward proxy server, Scrapy and requests integration, and headless browser crawling. Users can start using Scylla with just one command, making it a versatile tool for various web scraping and content extraction tasks.
json-repair
JSON Repair is a toolkit designed to address JSON anomalies that can arise from Large Language Models (LLMs). It offers a comprehensive solution for repairing JSON strings, ensuring accuracy and reliability in your data processing. With its user-friendly interface and extensive capabilities, JSON Repair empowers developers to seamlessly integrate JSON repair into their workflows.
langcorn
LangCorn is an API server that enables you to serve LangChain models and pipelines with ease, leveraging the power of FastAPI for a robust and efficient experience. It offers features such as easy deployment of LangChain models and pipelines, ready-to-use authentication functionality, high-performance FastAPI framework for serving requests, scalability and robustness for language processing applications, support for custom pipelines and processing, well-documented RESTful API endpoints, and asynchronous processing for faster response times.
redis-vl-python
The Python Redis Vector Library (RedisVL) is a tailor-made client for AI applications leveraging Redis. It enhances applications with Redis' speed, flexibility, and reliability, incorporating capabilities like vector-based semantic search, full-text search, and geo-spatial search. The library bridges the gap between the emerging AI-native developer ecosystem and the capabilities of Redis by providing a lightweight, elegant, and intuitive interface. It abstracts the features of Redis into a grammar that is more aligned to the needs of today's AI/ML Engineers or Data Scientists.
CredSweeper
CredSweeper is a tool designed to detect credentials like tokens, passwords, and API keys in directories or files. It helps users identify potential exposure of sensitive information by scanning lines, filtering, and utilizing an AI model. The tool reports lines containing possible credentials, their location, and the expected type of credential.
magentic
Easily integrate Large Language Models into your Python code. Simply use the `@prompt` and `@chatprompt` decorators to create functions that return structured output from the LLM. Mix LLM queries and function calling with regular Python code to create complex logic.
vim-ai
vim-ai is a plugin that adds Artificial Intelligence (AI) capabilities to Vim and Neovim. It allows users to generate code, edit text, and have interactive conversations with GPT models powered by OpenAI's API. The plugin uses OpenAI's API to generate responses, requiring users to set up an account and obtain an API key. It supports various commands for text generation, editing, and chat interactions, providing a seamless integration of AI features into the Vim text editor environment.
For similar tasks
gemini-ai
Gemini AI is a Ruby Gem designed to provide low-level access to Google's generative AI services through Vertex AI, Generative Language API, or AI Studio. It allows users to interact with Gemini to build abstractions on top of it. The Gem provides functionalities for tasks such as generating content, embeddings, predictions, and more. It supports streaming capabilities, server-sent events, safety settings, system instructions, JSON format responses, and tools (functions) calling. The Gem also includes error handling, development setup, publishing to RubyGems, updating the README, and references to resources for further learning.
ai-artifacts
AI Artifacts is an open source tool that replicates Anthropic's Artifacts UI in the Claude chat app. It utilizes E2B's Code Interpreter SDK and Core SDK for secure AI code execution in a cloud sandbox environment. Users can run AI-generated code in various languages such as Python, JavaScript, R, and Nextjs apps. The tool also supports running AI-generated Python in Jupyter notebook, Next.js apps, and Streamlit apps. Additionally, it offers integration with Vercel AI SDK for tool calling and streaming responses from the model.
For similar jobs
llmops-promptflow-template
LLMOps with Prompt flow is a template and guidance for building LLM-infused apps using Prompt flow. It provides centralized code hosting, lifecycle management, variant and hyperparameter experimentation, A/B deployment, many-to-many dataset/flow relationships, multiple deployment targets, comprehensive reporting, BYOF capabilities, configuration-based development, local prompt experimentation and evaluation, endpoint testing, and optional Human-in-loop validation. The tool is customizable to suit various application needs.
azure-search-vector-samples
This repository provides code samples in Python, C#, REST, and JavaScript for vector support in Azure AI Search. It includes demos for various languages showcasing vectorization of data, creating indexes, and querying vector data. Additionally, it offers tools like Azure AI Search Lab for experimenting with AI-enabled search scenarios in Azure and templates for deploying custom chat-with-your-data solutions. The repository also features documentation on vector search, hybrid search, creating and querying vector indexes, and REST API references for Azure AI Search and Azure OpenAI Service.
geti-sdk
The Intel® Geti™ SDK is a Python package that enables teams to rapidly develop AI models by easing the complexities of model development and enhancing collaboration between teams. It provides tools to interact with an Intel® Geti™ server via the REST API, allowing for project creation, downloading, uploading, deploying for local inference with OpenVINO, setting project and model configuration, launching and monitoring training jobs, and media upload and prediction. The SDK also includes tutorial-style Jupyter notebooks demonstrating its usage.
booster
Booster is a powerful inference accelerator designed for scaling large language models within production environments or for experimental purposes. It is built with performance and scaling in mind, supporting various CPUs and GPUs, including Nvidia CUDA, Apple Metal, and OpenCL cards. The tool can split large models across multiple GPUs, offering fast inference on machines with beefy GPUs. It supports both regular FP16/FP32 models and quantised versions, along with popular LLM architectures. Additionally, Booster features proprietary Janus Sampling for code generation and non-English languages.
xFasterTransformer
xFasterTransformer is an optimized solution for Large Language Models (LLMs) on the X86 platform, providing high performance and scalability for inference on mainstream LLMs. It offers C++ and Python APIs for easy integration, along with example code and benchmark scripts. Users can prepare models in a different format, convert them, and use the APIs for tasks like encoding input prompts, generating token ids, and serving inference requests. The tool supports various data types and models, and can run in single or multi-rank modes using MPI. A web demo based on Gradio is available for popular LLMs like ChatGLM and Llama2. Benchmark scripts help evaluate model inference performance quickly, and MLServer enables serving with REST and gRPC interfaces.
amazon-transcribe-live-call-analytics
The Amazon Transcribe Live Call Analytics (LCA) with Agent Assist Sample Solution is designed to help contact centers assess and optimize caller experiences in real time. It leverages Amazon machine learning services like Amazon Transcribe, Amazon Comprehend, and Amazon SageMaker to transcribe and extract insights from contact center audio. The solution provides real-time supervisor and agent assist features, integrates with existing contact centers, and offers a scalable, cost-effective approach to improve customer interactions. The end-to-end architecture includes features like live call transcription, call summarization, AI-powered agent assistance, and real-time analytics. The solution is event-driven, ensuring low latency and seamless processing flow from ingested speech to live webpage updates.
ai-lab-recipes
This repository contains recipes for building and running containerized AI and LLM applications with Podman. It provides model servers that serve machine-learning models via an API, allowing developers to quickly prototype new AI applications locally. The recipes include components like model servers and AI applications for tasks such as chat, summarization, and object detection. Images for sample applications and models are available on `quay.io`, and the repository also supports bootable containers for AI training on Linux.
XLearning
XLearning is a scheduling platform for big data and artificial intelligence, supporting various machine learning and deep learning frameworks. It runs on Hadoop YARN and integrates frameworks like TensorFlow, MXNet, Caffe, Theano, PyTorch, Keras, and XGBoost. XLearning offers scalability, compatibility with native framework code, support for multiple deep learning frameworks, unified data management based on HDFS, and visualization display. It provides functions for data input/output strategies, container management, TensorBoard service, and resource usage metrics display. Compilation requires JDK >= 1.7 and Maven >= 3.3; deployment is on CentOS 7.2 with Java >= 1.7 and Hadoop 2.6, 2.7, or 2.8.