008

Open-source event-driven AI powered Softphone

Stars: 75

008 is an open-source event-driven AI powered WebRTC Softphone compatible with macOS, Windows, and Linux. It is also accessible on the web. The name '008' or 'agent 008' reflects our ambition: beyond crafting the premier Open Source Softphone, we aim to introduce a programmable, event-driven AI agent. This agent utilizes embedded artificial intelligence models operating directly on the softphone, ensuring efficiency and reduced operational costs.

README:

008 Event-driven AI powered Open Source Softphone

008 is an open-source event-driven AI powered WebRTC Softphone compatible with macOS, Windows, and Linux.
It is also accessible on the web (though official support for browser-related issues is not provided).

The name '008' or 'agent 008' reflects our ambition: beyond crafting the premier Open Source Softphone, we aim to introduce a programmable, event-driven AI agent. This agent utilizes embedded artificial intelligence models operating directly on the softphone, ensuring efficiency and reduced operational costs.

Here are the planned features in our roadmap

📣 Want to run a quick test without having to install a SIP server?

Download

You can download the latest version from the Releases page.

Setup

This project is a WebRTC softphone that communicates via SIP over WebSocket. Leading PBX systems such as Asterisk and FreeSWITCH support WebSocket connections. If your provider does not offer this feature, consider using a SIP proxy such as Kamailio, OpenSIPS, or Routr.

Configuration

The softphone is internally configured using a JSON definition (see details below). The configuration file can be loaded from either a server or a local file. 008 reads the file only once. To apply new settings, you must reload the configuration file as if it were new by clicking the green button in the configuration tab. To do so, follow these steps:

Go to Settings -> Configuration (Gear Icon).
Fill in the 'Settings' input and 'Basic Auth' fields if needed.
Apply the changes by clicking the green button.

{
  "sipUri": "sip:[email protected]",
  "sipPassword": "securepass",
  "sipUser": "JohnDoe",
  "wsUri": "wss://example.com:8089/ws",
  "allowVideo": true,
  "allowTransfer": true,
  "allowBlindTransfer": true,
  "allowAutoanswer": false,
  "autoanswer": 5,
  "statuses": [
    { "value": "online", "text": "Online", "color": "#057e74" },
    { "value": "away", "text": "Away", "color": "#ff00ff" },
    { "value": "offline", "text": "Offline", "color": "#A9A9A9" }
  ],
  "numbers": [
    {
      "number": "+34917370224",
      "tags": ["Main"]
    },
    {
      "number": "+34917370225",
      "tags": ["Sec"]
    }
  ],
  "webhooks": [
    {
      "label": "mywebhook",
      "endpoint": "https://example.com/webhook"
    }
  ],
  "size": {
    "width": 360,
    "height": 500
  },
  "avatar": "https://example.com/avatar.jpg",
  "nickname": "John Doe", // used as Basic Auth user
  "qTts": true, // enable transcription
  "qSummarization": true // enable summarization
}
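
The `//` comments in the sample above are explanatory, and this README does not say whether 008's loader tolerates them. If you keep an annotated copy of the configuration, one option is to strip the comments before publishing the file. The following is a minimal Python sketch under that assumption; `strip_line_comments` is a hypothetical helper, not part of 008.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            # Copy string contents verbatim so URLs like "wss://..." survive.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip the comment up to (but not including) the newline.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


annotated = """{
  "wsUri": "wss://example.com:8089/ws", // WebSocket endpoint
  "qTts": true // enable transcription
}"""
config = json.loads(strip_line_comments(annotated))
```

The scanner tracks whether it is inside a string, so the `//` in `wss://` and `https://` values is left untouched.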

Quick test

Do you want to test it without having to install your SIP server? We have you covered! Set https://raw.githubusercontent.com/kunzite-app/008/master/packages/008/web/cfgDemo008.json as your testing configuration. Then, call the number 008.

Autoanswer

Autoanswer can be enabled via two options:

Set allowAutoanswer to true and adjust autoanswer to the desired wait time (in seconds).
Have the incoming request include the X-Autoanswer header with the desired wait time. This setting will override any prior setup.
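
The precedence between the two options can be sketched as follows. This only illustrates the rule described above and is not 008's actual code; `autoanswer_delay` and the plain-dict header lookup are assumptions made for the example.

```python
from typing import Optional


def autoanswer_delay(config: dict, headers: dict) -> Optional[float]:
    """Return the autoanswer wait time in seconds, or None if disabled.

    The X-Autoanswer header, when present, overrides the configuration.
    """
    header_value = headers.get("X-Autoanswer")
    if header_value is not None:
        return float(header_value)
    if config.get("allowAutoanswer"):
        return float(config["autoanswer"])
    return None
```

With the sample configuration (`allowAutoanswer` set to false), a call is answered automatically only when the incoming request carries the header.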

Numbers or Caller IDs

When numbers are specified under the numbers field, two headers, P-Asserted-Identity and X-Number, are added to the SIP request.
This helps identify the desired outgoing number or Caller ID in your PBX system.

Events

One of the standout features is the event system. Every time an event is triggered, the corresponding data is dispatched to the designated webhooks or integrations in the configuration via a REST POST request.
Most of these events also trigger the AI models that enhance the softphone in the Commercial version.
Below, you'll find a detailed description of each event and sample payloads that you can expect at your endpoint.
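
To try the events locally, a small receiver is enough. The following is a minimal sketch using only the Python standard library. It assumes the request body is JSON shaped like the samples in this section and that any 2xx reply counts as a successful delivery; `describe_event`, `WebhookHandler`, `serve`, and port 8000 are names and values chosen for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def describe_event(payload: dict) -> str:
    """Return a one-line summary of an event payload, keyed on its type."""
    event_type = payload.get("type", "unknown")
    data = payload.get("data", {})
    if event_type == "status:change":
        return f"status changed to {data.get('status')}"
    if event_type.startswith("phone:") and "cdr" in data:
        cdr = data["cdr"]
        return f"{event_type}: {cdr.get('from')} -> {cdr.get('to')} ({cdr.get('status')})"
    return f"unhandled event: {event_type}"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            # Reject bodies that are not valid JSON.
            self.send_response(400)
            self.end_headers()
            return
        print(describe_event(payload))
        # Reply with 204 No Content to acknowledge the delivery (assumed to be enough).
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Call `serve()` and point a webhook's `endpoint` at the machine running it. A real endpoint would also need HTTPS and some form of authentication.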

status:change

Triggered when the user changes their status in the settings. ⚠️ This event does not reflect the phone's current network connectivity.

{
  "type": "status:change",
  "data": {
    "status": "online",
    "context": {}
  }
}

contact:click

Triggered when the contact link within the session screen is clicked. This link is available only if the contact can be found in the softphone's contacts.

{
  "type": "contact:click",
  "data": {
    "contact": {
      "id": 1,
      "name": "John Doe",
      "phones": ["+1223456869"]
    },
    "context": {}
  }
}

phone:ringing

Triggered after a call is placed or received; the direction field indicates which.

{
  "type": "phone:ringing",
  "data": {
    "cdr": {
      "id": "uuid",
      "direction": "inbound|outbound",
      "from": "extension1",
      "to": "extension2",
      "headers": {},
      "video": false,
      "status": "ringing",
      "date": "ISO 8601 date",
      "wait": 0,
      "total": 0,
      "duration": 0
    },
    "context": {}
  }
}

phone:accepted

Triggered once the call is accepted.

{
  "type": "phone:accepted",
  "data": {
    "cdr": {
      "id": "uuid",
      "direction": "inbound|outbound",
      "from": "extension1",
      "to": "extension2",
      "headers": {},
      "video": false,
      "status": "answered",
      "date": "ISO 8601 date",
      "wait": 1,
      "total": 1,
      "duration": 0
    },
    "context": {}
  }
}

phone:terminated

Triggered upon call termination. The status field can have one of two possible values at this point: missed or answered.

{
  "type": "phone:terminated",
  "data": {
    "cdr": {
      "id": "uuid",
      "direction": "inbound|outbound",
      "from": "extension1",
      "to": "extension2",
      "headers": {},
      "video": false,
      "status": "missed|answered",
      "date": "ISO 8601 date",
      "wait": 1,
      "total": 2,
      "duration": 1
    },
    "context": {}
  }
}

phone:recording

Triggered when the recording is ready. It is sent as a base64-encoded WebM file.

{
  "type": "phone:recording",
  "data": {
    "id": "uuid", // the call id
    "audio": {
      "blob": "base64 webm audio file"
    },
    "context": {}
  }
}
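
On the receiving side, the blob has to be decoded before it can be played or stored. The following is a minimal sketch assuming the payload shape shown above; `decode_recording` is a hypothetical helper, and the file-name sanitization is a precaution added for this example because call IDs taken from SIP headers may contain characters such as `@`.

```python
import base64


def decode_recording(payload: dict) -> tuple:
    """Return (file_name, audio_bytes) for a phone:recording payload."""
    data = payload["data"]
    # Keep only characters that are safe in a file name.
    safe_id = "".join(c if c.isalnum() or c in "-_." else "_" for c in data["id"])
    audio = base64.b64decode(data["audio"]["blob"])
    return f"{safe_id}.webm", audio
```

The returned bytes can then be written to disk, for example with `pathlib.Path(name).write_bytes(audio)`.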

phone:transcript

Triggered when the transcription is ready.

{
  "type": "phone:transcript",
  "data": {
    "id": "uuid", // the call id
    "transcription": [
      {
        "channel": "remote|local",
        "start": 0,
        "end": 0,
        "text": ""
      }
    ],
    "context": {}
  }
}
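
The segments can be merged into a readable conversation by ordering them by start time. This is a minimal sketch assuming the payload shape shown above and that `start` and `end` are expressed in seconds, which the README does not state; `format_transcript` is a hypothetical helper.

```python
def format_transcript(payload: dict) -> str:
    """Render transcription segments as one line per segment, ordered by start."""
    segments = sorted(payload["data"]["transcription"], key=lambda s: s["start"])
    lines = []
    for seg in segments:
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {seg['channel']}: {seg['text']}")
    return "\n".join(lines)
```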

phone:summarization

Triggered when the summarization is ready.

{
  "type": "phone:summarization",
  "data": {
    "id": "uuid", // the call id
    "summarization": "text",
    "context": {}
  }
}

Context

All events come with a context field. This includes various account details that help identify who is sending the event, among other common settings:

{
  "nickname": "John Doe",
  "sipUri": "sip:[email protected]",
  "sipUser": "JohnDoe",
  "language": "en",
  "device": "default",
  "status": "online",
  "size": { "width": 360, "height": 500 }
}

Retry

If the HTTP call fails, the softphone retries the request up to 5 times, gradually increasing the delay between attempts up to 2.5 minutes.
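
The exact retry schedule is not documented here. As an illustration only, a capped exponential backoff that fits the description (5 retries, delays growing up to 150 seconds) could look like the sketch below; the 10-second base delay and the doubling factor are assumptions, not values taken from 008.

```python
def retry_delays(attempts: int = 5, base: float = 10.0,
                 factor: float = 2.0, cap: float = 150.0) -> list:
    """Return the wait in seconds before each retry, capped at `cap`."""
    return [min(base * factor ** i, cap) for i in range(attempts)]
```

Because a delivery may be retried, a webhook endpoint should tolerate receiving the same event more than once.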

CDR payload

| Field | Info |
| --- | --- |
| id | ID obtained from SIP headers `X-Call-ID` or `Call-ID` in that order |
| direction | Determines the direction of the call: `inbound` or `outbound` |
| from | The initiator of the call. It is derived from the `P-Asserted-Identity` which is a `Number` if outbound or displayName if inbound |
| to | The receiver of the call. Calculated as the opposite of the `from` field |
| video | Indicates if the call used video: `true` or `false` |
| status | Possible statuses: `ringing` or `answered` or `missed` |
| date | ISO 8601 date format |
| wait | Number of seconds waited before the call is answered |
| duration | Duration of the call in seconds after it is answered |
| total | Total duration of the call in seconds. Calculated by adding the `wait` and `duration` values |
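
A receiver can map the `cdr` object onto a typed record and check the relationship between the timing fields. The following is a minimal sketch based on the fields listed above; the `Cdr` class and its method names are hypothetical, and `from`/`to` are renamed because `from` is a reserved word in Python.

```python
from dataclasses import dataclass


@dataclass
class Cdr:
    id: str
    direction: str
    caller: str   # "from" in the payload
    callee: str   # "to" in the payload
    video: bool
    status: str
    date: str
    wait: int
    duration: int
    total: int

    @classmethod
    def from_payload(cls, cdr: dict) -> "Cdr":
        """Build a Cdr from the "cdr" object of a phone:* event."""
        return cls(
            id=cdr["id"], direction=cdr["direction"],
            caller=cdr["from"], callee=cdr["to"],
            video=cdr["video"], status=cdr["status"], date=cdr["date"],
            wait=cdr["wait"], duration=cdr["duration"], total=cdr["total"],
        )

    def is_consistent(self) -> bool:
        """Check that total equals wait plus duration, as the table states."""
        return self.total == self.wait + self.duration
```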

Community VS Commercial

We offer a commercial version that incorporates embedded AI models and provides integrations with widely recognized CRMs, Helpdesk, and analytics software. If you're interested, please contact us.

| Feature | Community | Commercial |
| --- | --- | --- |
| Support | `Github` | `Dedicated` |
| Desktop Softphone | 🟢 | 🟢 |
| Mobile Softphone | sources | 🟢 |
| Events | 🟢 | 🟢 |
| Integrations | 🔴 | 🟢 |
| AI Speech2Text | 🟢 | 🟢 |
| AI Summarization | 🟢 | 🟢 |
| AI Sentiment Analysis | 🔴 | 🟢 |
| AI KPI insights | 🔴 | 🟢 |
| Programmable conversational agent | `ChatGPT` | `ChatGPT` `embedded` |

Contributing

Every kind of contribution helps improve 008. How can you participate? All your ideas and code are welcome:

⭐ this repo! It helps us a lot.
Report bugs
Contribute to 008's code

License

Released under the AGPL-3.0 license.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

If you wish to use our software in a manner that does not allow for AGPL-3.0 compliance (e.g., incorporating our software into proprietary software), you can obtain a commercial license. This commercial license provides more flexibility in terms of integration and redistribution, but comes with its own terms and conditions. If you require a commercial license, please send us an email directly for more information and pricing details.

For Tasks:

make calls receive calls transfer calls record calls transcribe calls

For Jobs:

customer service representative sales representative technical support specialist call center agent help desk technician

Alternative AI tools for 008

Similar Open Source Tools

AICentral

AI Central is a powerful tool designed to take control of your AI services with minimal overhead. It is built on Asp.Net Core and dotnet 8, offering fast web-server performance. The tool enables advanced Azure APIm scenarios, PII stripping logging to Cosmos DB, token metrics through Open Telemetry, and intelligent routing features. AI Central supports various endpoint selection strategies, proxying asynchronous requests, custom OAuth2 authorization, circuit breakers, rate limiting, and extensibility through plugins. It provides an extensibility model for easy plugin development and offers enriched telemetry and logging capabilities for monitoring and insights.

github

: 76

sparrow

Sparrow is an innovative open-source solution for efficient data extraction and processing from various documents and images. It seamlessly handles forms, invoices, receipts, and other unstructured data sources. Sparrow stands out with its modular architecture, offering independent services and pipelines all optimized for robust performance. One of the critical functionalities of Sparrow is its pluggable architecture. You can easily integrate and run data extraction pipelines using tools and frameworks like LlamaIndex, Haystack, or Unstructured. Sparrow enables local LLM data extraction pipelines through Ollama or Apple MLX. The Sparrow solution includes an API that helps process and transform your data into structured output, ready to be integrated with custom workflows. With Sparrow Agents you can build independent LLM agents and use the API to invoke them from your system.

List of available agents:

llamaindex - RAG pipeline with LlamaIndex for PDF processing
vllamaindex - RAG pipeline with LlamaIndex multimodal for image processing
vprocessor - RAG pipeline with OCR and LlamaIndex for image processing
haystack - RAG pipeline with Haystack for PDF processing
fcall - Function call pipeline
unstructured-light - RAG pipeline with Unstructured and LangChain, supports PDF and image processing
unstructured - RAG pipeline with Weaviate vector DB query, Unstructured and LangChain, supports PDF and image processing
instructor - RAG pipeline with Unstructured and Instructor libraries, supports PDF and image processing. Works great for JSON response generation

github

: 4.5k

firecrawl

Firecrawl is an API service that takes a URL, crawls it, and converts it into clean markdown. It crawls all accessible subpages and provides clean markdown for each, without requiring a sitemap. The API is easy to use and can be self-hosted. It also integrates with Langchain and Llama Index. The Python SDK makes it easy to crawl and scrape websites in Python code.

github

: 34.1k

VectorETL

VectorETL is a lightweight ETL framework designed to assist Data & AI engineers in processing data for AI applications quickly. It streamlines the conversion of diverse data sources into vector embeddings and storage in various vector databases. The framework supports multiple data sources, embedding models, and vector database targets, simplifying the creation and management of vector search systems for semantic search, recommendation systems, and other vector-based operations.

github

: 72

chat-ui

A chat interface using open source models, eg OpenAssistant or Llama. It is a SvelteKit app and it powers the HuggingChat app on hf.co/chat.

github

: 8.5k

openmacro

Openmacro is a multimodal personal agent that allows users to run code locally. It acts as a personal agent capable of completing and automating tasks autonomously via self-prompting. The tool provides a CLI natural-language interface for completing and automating tasks, analyzing and plotting data, browsing the web, and manipulating files. Currently, it supports API keys for models powered by SambaNova, with plans to add support for other hosts like OpenAI and Anthropic in future versions.

github

: 62

firecrawl-mcp-server

Firecrawl MCP Server is a Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities. It supports features like scrape, crawl, search, extract, and batch scrape. It provides web scraping with JS rendering, URL discovery, web search with content extraction, automatic retries with exponential backoff, credit usage monitoring, comprehensive logging system, support for cloud and self-hosted FireCrawl instances, mobile/desktop viewport support, and smart content filtering with tag inclusion/exclusion. The server includes configurable parameters for retry behavior and credit usage monitoring, rate limiting and batch processing capabilities, and tools for scraping, batch scraping, checking batch status, searching, crawling, and extracting structured information from web pages.

github

: 116

mistreevous

Mistreevous is a library written in TypeScript for Node and browsers, used to declaratively define, build, and execute behaviour trees for creating complex AI. It allows defining trees with JSON or a minimal DSL, providing in-browser editor and visualizer. The tool offers methods for tree state, stepping, resetting, and getting node details, along with various composite, decorator, leaf nodes, callbacks, guards, and global functions/subtrees. Version history includes updates for node types, callbacks, global functions, and TypeScript conversion.

github

: 82

ruby-openai

Use the OpenAI API with Ruby! 🤖🩵 Stream text with GPT-4, transcribe and translate audio with Whisper, or create images with DALL·E.

github

: 3.0k

pipecat-flows

Pipecat Flows is a framework designed for building structured conversations in AI applications. It allows users to create both predefined conversation paths and dynamically generated flows, handling state management and LLM interactions. The framework includes a Python module for building conversation flows and a visual editor for designing and exporting flow configurations. Pipecat Flows is suitable for scenarios such as customer service scripts, intake forms, personalized experiences, and complex decision trees.

github

: 222

promptic

Promptic is a tool designed for LLM app development, providing a productive and pythonic way to build LLM applications. It leverages LiteLLM, allowing flexibility to switch LLM providers easily. Promptic focuses on building features by providing type-safe structured outputs, easy-to-build agents, streaming support, automatic prompt caching, and built-in conversation memory.

github

: 223

functionary

Functionary is a language model that interprets and executes functions/plugins. It determines when to execute functions, whether in parallel or serially, and understands their outputs. Function definitions are given as JSON Schema Objects, similar to OpenAI GPT function calls. It offers documentation and examples on functionary.meetkai.com. The newest model, meetkai/functionary-medium-v3.1, is ranked 2nd in the Berkeley Function-Calling Leaderboard. Functionary supports models with different context lengths and capabilities for function calling and code interpretation. It also provides grammar sampling for accurate function and parameter names. Users can deploy Functionary models serverlessly using Modal.com.

github

: 1.5k

aiavatarkit

AIAvatarKit is a tool for building AI-based conversational avatars quickly. It supports various platforms like VRChat and cluster, along with real-world devices. The tool is extensible, allowing unlimited capabilities based on user needs. It requires VOICEVOX API, Google or Azure Speech Services API keys, and Python 3.10. Users can start conversations out of the box and enjoy seamless interactions with the avatars.

github

: 303

structured-logprobs

This Python library enhances OpenAI chat completion responses by providing detailed information about token log probabilities. It works with OpenAI Structured Outputs to ensure model-generated responses adhere to a JSON Schema. Developers can analyze and incorporate token-level log probabilities to understand the reliability of structured data extracted from OpenAI models.

github

: 155

scylla

Scylla is an intelligent proxy pool tool designed for humanities, enabling users to extract content from the internet and build their own Large Language Models in the AI era. It features automatic proxy IP crawling and validation, an easy-to-use JSON API, a simple web-based user interface, HTTP forward proxy server, Scrapy and requests integration, and headless browser crawling. Users can start using Scylla with just one command, making it a versatile tool for various web scraping and content extraction tasks.

github

: 3.9k

For similar jobs

bolna

Bolna is an open-source platform for building voice-driven conversational applications using large language models (LLMs). It provides a comprehensive set of tools and integrations to handle various aspects of voice-based interactions, including telephony, transcription, LLM-based conversation handling, and text-to-speech synthesis. Bolna simplifies the process of creating voice agents that can perform tasks such as initiating phone calls, transcribing conversations, generating LLM-powered responses, and synthesizing speech. It supports multiple providers for each component, allowing users to customize their setup based on their specific needs. Bolna is designed to be easy to use, with a straightforward local setup process and well-documented APIs. It is also extensible, enabling users to integrate with other telephony providers or add custom functionality.

github

: 205

claim-ai-phone-bot

AI-powered call center solution with Azure and OpenAI GPT. The bot can answer calls, understand the customer's request, and provide relevant information or assistance. It can also create a todo list of tasks to complete the claim, and send a report after the call. The bot is customizable, and can be used in multiple languages.

github

: 65

call-center-ai

Call Center AI is an AI-powered call center solution that leverages Azure and OpenAI GPT. It is a proof of concept demonstrating the integration of Azure Communication Services, Azure Cognitive Services, and Azure OpenAI to build an automated call center solution. The project showcases features like accessing claims on a public website, customer conversation history, language change during conversation, bot interaction via phone number, multiple voice tones, lexicon understanding, todo list creation, customizable prompts, content filtering, GPT-4 Turbo for customer requests, specific data schema for claims, documentation database access, SMS report sending, conversation resumption, and more. The system architecture includes components like RAG AI Search, SMS gateway, call gateway, moderation, Cosmos DB, event broker, GPT-4 Turbo, Redis cache, translation service, and more. The tool can be deployed remotely using GitHub Actions and locally with prerequisites like Azure environment setup, configuration file creation, and resource hosting. Advanced usage includes custom training data with AI Search, prompt customization, language customization, moderation level customization, claim data schema customization, OpenAI compatible model usage for the LLM, and Twilio integration for SMS.

github

: 119

air724ug-forwarder

Air724UG forwarder is a tool designed to forward SMS, notify incoming calls, and manage voice messages. It provides a convenient way to handle communication tasks on Air724UG devices. The tool streamlines the process of receiving and managing messages, ensuring users stay connected and informed.

github

: 278

Callytics

Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers. By processing both the audio and text of each call, it provides insights such as sentiment analysis, topic detection, conflict detection, profanity word detection, and summary. These cutting-edge techniques help businesses optimize customer interactions, identify areas for improvement, and enhance overall service quality. When an audio file is placed in the .data/input directory, the entire pipeline automatically starts running, and the resulting data is inserted into the database. This is only a v1.1.0 version; many new features will be added, models will be fine-tuned or trained from scratch, and various optimization efforts will be applied.

github

: 63

ChatFAQ

ChatFAQ is an open-source comprehensive platform for creating a wide variety of chatbots: generic ones, business-trained, or even capable of redirecting requests to human operators. It includes a specialized NLP/NLG engine based on a RAG architecture and customized chat widgets, ensuring a tailored experience for users and avoiding vendor lock-in.

github

: 128

anything-llm

AnythingLLM is a full-stack application that enables you to turn any document, resource, or piece of content into context that any LLM can use as references during chatting. This application allows you to pick and choose which LLM or Vector Database you want to use as well as supporting multi-user management and permissions.

github

: 42.1k