ai-audio-startups

Community list of startups working with AI in audio and music technology

Stars: 1465

The 'ai-audio-startups' repository is a community list of startups working with AI for audio and music tech. It includes a comprehensive collection of tools and platforms that leverage artificial intelligence to enhance various aspects of music creation, production, source separation, analysis, recommendation, health & wellbeing, radio/podcast, hearing, sound detection, speech transcription, synthesis, enhancement, and manipulation. The repository serves as a valuable resource for individuals interested in exploring innovative AI applications in the audio and music industry.

README:

ai-audio-startups

Community list of startups working with AI for audio and music tech

Music

Creation & Production

Microphone Studio - Multi-track recording without expensive studio equipment
TuneFlow - Generate lyrics, melody, drum beats and a lot more, while still being able to edit and mix as in any professional DAW.
CassetteAI - AI powered music production platform: Make lyrics, beats & vocals with AI then mix & publish straight from Cassette.
AIVA - The Artificial Intelligence composing emotional soundtrack music.
beatoven.ai - A simplified music creation tool that helps you create music for your videos and podcasts.
Infinite Album - Adaptive AI music for gamers who livestream.
Epidemic Sound - High quality music and sound effects for all your content, all rights included.
Wonder - Dynascore: The world’s first Dynamic Music Engine.
Amper (Acquired by Shutterstock) - AI Music Composition Tools for Content Creators.
mayk.it - your virtual music studio.
boomy - Make instant music, Share it with the world.
enote - Intelligent Sheet Music
Qosmo - Qosmo is a group of artists, researchers, designers, and programmers.
AI Music (Acquired by Apple) - Our music helps brands enable deeper connections with their audiences.
Splash HQ - The next generation of music producers
musico - AI-driven software engine that generates music. It can react to gesture, movement, code or other sound.
Yousician - The largest music educator on the planet.
Tape It - App for songwriting & audio recording.
sessionwire - All-in-one online collaboration platform that delivers a seamless studio experience.
Aflorithmic - Professional audio, voice, sound and music to scale.
Audio Design Desk - The Audio Solution for Video Editors.
Never Before Heard Sounds - A music studio powered by AI.
NeuralDSP - Empowers music players by democratizing the access to world-class sound, through an intuitive software/hardware ecosystem.
Neutone - AI audio plugin & community bridging the gap between AI research and creativity.
RoEx - AI Powered Mixing Services for Musicians, Producers and Content Creators.
LANDR - Online music software for creators: music mastering, digital music distribution, rent-to-own plugins, free sample packs, collaboration tools.
Accusonus (Acquired by Meta) - Audio and Video Editing Software For Creators
Moises - The Musician’s App.
Waveshaper (Previously Tonz) - Real-time neural signal processing
Sonible - Audio Soft & Hardware made in Austria.
Accentize - Intelligent audio tools
AI Mastering - AI-powered online audio mastering service.
Splice - Music-creation technology platform that automates the process of making and sharing music.
AudioStellar - Open source data-driven experimental sampler.
chord.ai - Chords and beats for any song!
DoReMIR - Sing and play into a single mic to get a lead sheet, with lyrics & chords!
mubert - Instantly generate tracks perfectly tailored to your content on any platform.
Evoke Music - Find the right music for your videos, podcasts, and business.
Klangio - Our innovative apps enable you to create sheet music easy and fast!
XLN Audio - VST plugin developer of Addictive Drums, Addictive Keys, RC-20 and XO.
Laplacian Audio - Formerly 'Definite Technologies', developing VST/AU/AUv3 that uses AI in order to process/generate sound.
Lifescore - Adaptive AI music platform. Real time Cellular Composition from high quality audio samples.
WaveAI - AI-based musical assistance, including a lyrics writing assistant.
Humtap - A platform for real-time music, audio & video creation.
Voctro Labs - Synthetic Singing for creative media applications.
Loudly - Music solutions for the digital universe, makers of Soundtracks, AI Studio, Music Maker JAM
DeepMusic - AI music creation and production.
Soundraw - Freely customize high quality royalty-free music
BandLab - The cloud platform where musicians and fans create music, collaborate, and engage with each other across the globe.
Setmixer - Help artists record, mix and master their live shows using a combination of embedded software, signal processing, AI.
okio - Open source generative tools for music
Audialab - Ethical audio AI plugins, tools, and community designed to empower real artists with AI, not displace them.
suno - Create music and speech with AI
Lemonaide Music - Generative Music tools integrated with DAWs and 100% royalty free.
tuney.io - Ethical Music AI for Creative Media.
KORUS AI - AI music creation platform and your personal music producer exploring the Universe of Sound.
TRINITI - Gives you new ways to create and express yourself through music.
voice swap - Change your singing voice using AI.
mix audio - AI music for your creativity and productivity.
Audiogen - Generate sounds, sound effects, music, samples, ambience and more with AI.
Wavtool - web based DAW with AI assistants and support for local VST plug-ins
Wavacity - A port of the Audacity® audio editor to the web browser.

Source separation

TuneFlow - A free DAW that offers high quality vocal, drums, melody, bass stem separation, all-in-one audio separation, editing and vocal/instrument to MIDI transcription.
Spliter.ai - AI Audio Processing
Gaudio - Redefine your audio experience in music/video streaming and virtual/augmented reality.
AudioShake - An On-Demand Stem Creation Platform for the Music Industry.
Audionamix - Audio separation solutions for the entertainment industry to unlock every ounce of potential from classic content.
vocali.se - Separate vocals and music from any song, in seconds!
lalal.ai - High-quality stem splitting based on the world's #1 AI-powered technology.
VocalRemover - Separate voice from music out of a song free with powerful AI algorithms.
PhonicMind - Separate vocals, drums, bass and other instruments out of your songs with our HiFi AI.
EasySplitter - AI-Based Vocal Remover Online for DJ Singers
Remover.studio - Vocal Remover & Online Karaoke
MVSep - Free separation of songs with many different algorithms (Demucs, MDX, UVR etc)
MuzLab - Remove vocals from songs and split drums, bass and other instruments out of music.
Fadr - Remove stems, convert to midi, and create high-quality remixes and mashups using AI tools!

Analysis / Recommendation

AIMS - AI-powered music similarity search & auto-tagging for anyone who makes music discovery their business.
FeedForward - The intuitive audio search engine for audio & sound catalogues.
Aimi - Discover the artists who freed their music from the shackles of songs and playlists.
Utopia Music - Fair Pay for Every Play
Musiio (Acquired by SoundCloud) - Use Artificial Intelligence to help automate your workflows.
niland (Acquired by Spotify) - Build AI Powered Music Apps
cyanite - AI for Music tagging and similarity search
musicube (Acquired by SongTradr) - B2B AI music metadata services like auto-tagging, metadata enrichment and semantic search
Musixmatch - Algorithms and tools for music discovery, recommendation, and search based on lyrics.
hoopr - Find the best music, tell better stories, grow your audience. AI-powered engine that helps find the right soundtrack.
Pex - Music identification and copyright compliance. Audio fingerprinting, cover song identification in large scale.
SONOTELLER - AI music analysis including song lyrics summarization, themes extraction and musical features.

Health & Wellbeing

Endel - Personalized soundscapes to help you focus, relax, and sleep.
Lucid - Transforming music into medicine, using AI to compose and curate a personalized therapeutic music experience
Wavepaths - Music for Psychedelic Therapy
Suki - AI-powered voice solutions for healthcare.
audEERING - Technology that can detect emotions and health information from the voice.
brain.fm - Music to Focus Better
SPOKE - Lo-fi & Lyricism-led Mindfulness music episodes
sona - music as medicine. research-based music for anxiety made by Grammy-winning producers.
Novoic - Using speech to detect neurological diseases.
Ubenwa - Infant health analysis based on cry signals.

Radio / Podcast

faidr - Your favorite radio, interruption free.
fathom - The search engine for podcasts.
Nomono - A self-contained recording kit for capturing interviews in the field.
Descript - All-in-one audio & video editing, as easy as a doc.
auphonic - Automatic audio post production web service for podcasts, broadcasters, radio shows, movies, screencasts and more.
SimonSays - Edit Video 5x Faster Built For Teams
Podcastle - Studio-quality recording, AI-powered editing, and seamless exporting – easy to use and FREE
cleanvoice - Removes filler sounds, stuttering and mouth sounds from your podcast or audio recording
Super Hi-Fi - Artificial Intelligence Powered Music Experiences

Hearing

Whisper.ai - Smarter than your average hearing aid.
Eargo - A Revolutionary New Hearing Aid.
Concha Labs - Helping you hear more clearly

Sound detection

Audio Analytic - Creating exceptional human experiences through a greater sense of hearing.
SoundEye - Advanced sound recognition solutions capable of classifying sounds such as screaming, gunshot, coughing, and crying
cochl - A next-generation sound AI platform that understands any sounds like a human.
Josh.ai - a voice-controlled home automation system.
SEE SOUND - The world’s first smart home hearing system
Epigos.ai - AI models that can be used to extract hidden data from audio sources.
HyperSurfaces - Seamlessly merging the physical and data worlds without the need for keyboards, buttons or touch screens.
HyperSentience - HyperSentience delivers context awareness to phones, VR/AR headsets, smart watches, speakers and laptops.
Circulr Sound - Smart audio wearables
Securaxis - We turn sounds into information.
Deeply - We add meaning to every sound in the world using advanced deep learning technology for sound event detection and context recognition

Speech

Transcription

Ava - Professional and AI-Based Captions for Deaf and HoH (Transcription & Diarization)
verbit - Professional AI-Based Transcription & Captioning
otter - Everything hybrid teams need for productive, collaborative meetings.
Trint - Audio Transcription Software - Speech to Text to Magic
Rev - 99% accurate captions, transcripts, and subtitles.
voiceitt - An app for people with non-standard speech
deepgram.com - Better voice applications with faster, more accurate transcription through AI Speech Recognition
fireflies.ai - AI assistant for your meetings
SoapBox - Speech technology that makes kids heard.
Amberscript - SaaS solutions that automatically transform audio and video into text and subtitles using speech recognition.
Speaksee - Live captions of what’s being said during in-person group meetings.
Speechmatics - Autonomous Speech Recognition technology that understands every voice.
sonix - Automated transcription in 35+ languages.
Picovoice - End-to-end Edge Voice AI, On-device voice recognition
BoldVoice - Speak English clearly and confidently
Gladia - Power your product with cutting-edge AI transcription, translation and audio intelligence using a single API.
Podsqueeze - Re-purpose your audio or video podcast into transcript, show notes, blog post, video clips and other assets to publish and promote your show.

Synthesis (TTS)

adauris.ai - Transforming written content into engaging audio with seamless distribution.
Aflorithmic - Professional audio, voice, sound and music to scale.
Sonantic (Acquired by Spotify) - Deliver compelling, lifelike performances with fully expressive AI-generated voices.
kroop AI - Harness synthetic media generation and detection with endless possibilities.
dubverse - Make your content multilingual at a click of a button and reach more people.
Resemble.ai - Generate AI Voices that sound real.
Replica - AI voice actors for games, film & the metaverse.
Respeecher - Voice Cloning for Content Creators.
amai - Ultra realistic text to speech voice engines.
AssemblyAI - Transcribe and understand audio with a single AI-powered API.
DAISYS - New voices that sound like real people
WellSaid - Text-to-speech technology that creates life-like synthetic voices, from the voices of real people.
Deepsync - Generate audio content that exactly sounds like you.
coqui.ai - Providing open speech tech for everyone
Voiseed - AI-based voice engine able to mimic the emotions and prosody of human speech.
Speechki - NLP-based text and audio editing platform with hundreds of AI voices built in.
MiSynth - A brain-controlled instrument that uses synaptic technology and BCIs to turn imagined sounds into a synthesized MIDI instrument.
ElevenLabs - Developing the most compelling AI speech software for publishers and creators
Wondercraft - Wondercraft enables users to generate podcasts using Text-to-Speech technology.
play.ht - Building the future of content creation based on generative machine learning models.
Revocalize.ai - Generate studio-quality AI Voices and train AI voice models from the web dashboard or the VST plugin.
morpheme.ai - Our Actor-First, Digital-Double Voices are powered by the latest AI technology, ensuring that they are efficient, authentic, and ethical.

Enhancement & Manipulation

Meaning - Streaming real-time voice and accent conversion.
krisp - An AI-powered software solution for effective online meetings.
voicemod - Free real-time voice changer.
audo - Noise cancellation products for creators, developers, and virtual meetings.
AudioTelligence - Our software transforms the clarity and intelligibility of speech in challenging acoustic environments.
immersitech.io - We don’t make audio. We make audio better.
utterly - Noise removal for meetings and audio.
claerity.ai - Cutting-edge AI to eliminate all background noise on video conference calls.
Neural Love - Set of AI-powered tools to enhance audio quality.
HeardThat - A smartphone app that turns your smartphone into a sophisticated speech-enhancement device.
Chatable - A smartphone app that removes disruptive background noise
BdSound - Intelligent Audio Solution for audio and voice-enabled products.
echosonic - Revolutionizing microphone by bringing Machine Learning capabilities into it.
Insoundz - Generative AI Audio Enhancement

Contributing

Fork the repo, edit the README, and then make a PR.
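
Before opening a PR, it may help to check that a new entry follows the pattern the list appears to use. The sketch below is only an illustration: the "Name - Description" convention is inferred from how the entries read here, not from a documented rule, and `ENTRY_RE`, `parse_entry`, and `find_malformed` are hypothetical helper names, not part of the repository. It is meant for entry lines only, so section headings would also be reported as non-matching.

```python
import re

# Assumed convention (inferred from the list, not documented by the repo):
# "<Name> - <Description>", e.g. "Name (Acquired by X) - Description".
ENTRY_RE = re.compile(r"^(?P<name>\S.*?) - (?P<description>\S.*)$")


def parse_entry(line: str):
    """Return (name, description) if the line matches the assumed pattern, else None."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("description")


def find_malformed(lines):
    """Return the non-empty lines that do not match the assumed entry pattern."""
    return [line for line in lines if line.strip() and parse_entry(line) is None]


if __name__ == "__main__":
    # Hypothetical entries, used only to exercise the check.
    sample = [
        "ExampleCo - Hypothetical AI mastering service.",  # well-formed
        "ExampleCo- Missing space before the hyphen.",     # malformed
    ]
    for bad_line in find_malformed(sample):
        print("Check formatting:", bad_line)
```

Running the check over only the lines you added keeps the output focused on your own change.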

For Tasks:

create music tracks, separate audio stems, analyze music similarity, transcribe audio content, enhance speech clarity

For Jobs:

music producer, audio engineer, sound designer, podcast editor, healthcare technologist

Alternative AI tools for ai-audio-startups

Similar Open Source Tools

awesome-ai-tools

Awesome AI Tools is a curated list of popular tools and resources for artificial intelligence enthusiasts. It includes a wide range of tools such as machine learning libraries, deep learning frameworks, data visualization tools, and natural language processing resources. Whether you are a beginner or an experienced AI practitioner, this repository aims to provide you with a comprehensive collection of tools to enhance your AI projects and research. Explore the list to discover new tools, stay updated with the latest advancements in AI technology, and find the right resources to support your AI endeavors.

github: 1.6k

awesome-generative-ai

Awesome Generative AI is a curated list of modern Generative Artificial Intelligence projects and services. Generative AI technology creates original content like images, sounds, and texts using machine learning algorithms trained on large data sets. It can produce unique and realistic outputs such as photorealistic images, digital art, music, and writing. The repo covers a wide range of applications in art, entertainment, marketing, academia, and computer science.

github: 10.3k

llms-tools

The 'llms-tools' repository is a comprehensive collection of AI tools, open-source projects, and research related to Large Language Models (LLMs) and Chatbots. It covers a wide range of topics such as AI in various domains, open-source models, chats & assistants, visual language models, evaluation tools, libraries, devices, income models, text-to-image, computer vision, audio & speech, code & math, games, robotics, typography, bio & med, military, climate, finance, and presentation. The repository provides valuable resources for researchers, developers, and enthusiasts interested in exploring the capabilities of LLMs and related technologies.

github: 278

ai-audio-datasets

AI Audio Datasets List (AI-ADL) is a comprehensive collection of datasets consisting of speech, music, and sound effects, used for Generative AI, AIGC, AI model training, and audio applications. It includes datasets for speech recognition, speech synthesis, music information retrieval, music generation, audio processing, sound synthesis, and more. The repository provides a curated list of diverse datasets suitable for various AI audio tasks.

github: 487

awesome-ai

Awesome AI is a curated list of artificial intelligence resources including courses, tools, apps, and open-source projects. It covers a wide range of topics such as machine learning, deep learning, natural language processing, robotics, conversational interfaces, data science, and more. The repository serves as a comprehensive guide for individuals interested in exploring the field of artificial intelligence and its applications across various domains.

github: 453

MediaAI

MediaAI is a repository containing lectures and materials for Aalto University's AI for Media, Art & Design course. The course is a hands-on, project-based crash course focusing on deep learning and AI techniques for artists and designers. It covers common AI algorithms & tools, their applications in art, media, and design, and provides hands-on practice in designing, implementing, and using these tools. The course includes lectures, exercises, and a final project based on students' interests. Students can complete the course without programming by creatively utilizing existing tools like ChatGPT and DALL-E. The course emphasizes collaboration, peer-to-peer tutoring, and project-based learning. It covers topics such as text generation, image generation, optimization, and game AI.

github: 61

start-llms

This repository is a comprehensive guide for individuals looking to start and improve their skills in Large Language Models (LLMs) without an advanced background in the field. It provides free resources, online courses, books, articles, and practical tips to become an expert in machine learning. The guide covers topics such as terminology, transformers, prompting, retrieval augmented generation (RAG), and more. It also includes recommendations for podcasts, YouTube videos, and communities to stay updated with the latest news in AI and LLMs.

github: 789

foundations-of-gen-ai

This repository contains code for the O'Reilly Live Online Training for 'Transformer Architectures for Generative AI'. The course provides a deep understanding of transformer architectures and their impact on natural language processing (NLP) and vision tasks. Participants learn to harness transformers to tackle problems in text, image, and multimodal AI through theory and practical exercises.

github: 74

start-machine-learning

Start Machine Learning in 2024 is a comprehensive guide for beginners to advance in machine learning and artificial intelligence without any prior background. The guide covers various resources such as free online courses, articles, books, and practical tips to become an expert in the field. It emphasizes self-paced learning and provides recommendations for learning paths, including videos, podcasts, and online communities. The guide also includes information on building language models and applications, practicing through Kaggle competitions, and staying updated with the latest news and developments in AI. The goal is to empower individuals with the knowledge and resources to excel in machine learning and AI.

github: 4.6k

ai_gallery

AI Gallery is a showcase site built using React and Nextjs for static site generation, featuring interactive visualizations of classic algorithms, classic games implementation, and various interesting widgets. The project utilizes AI assistance from Claude 3.5 and GPT-4 to create components and enhance the development process. It aims to continually add more components with AI assistance, providing a platform for contributors to leverage AI in frontend development.

github: 656

orate

Orate is an AI toolkit designed for speech processing tasks. It allows users to generate realistic, human-like speech and transcribe audio using a unified API that integrates with popular AI providers such as OpenAI, ElevenLabs, and AssemblyAI. The toolkit can be easily installed using npm or other package managers. For more details, visit the website.

github: 363

awesome-generative-ai

A curated list of Generative AI projects, tools, artworks, and models

github: 2.7k

oreilly-retrieval-augmented-gen-ai

This repository focuses on Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs). It provides code and resources to augment LLMs with real-time data for dynamic, context-aware applications. The content covers topics such as semantic search, fine-tuning embeddings, building RAG chatbots, evaluating LLMs, and using knowledge graphs in RAG. Prerequisites include Python skills, knowledge of machine learning and LLMs, and introductory experience with NLP and AI models.

github: 61

awesome-llms-fine-tuning

This repository is a curated collection of resources for fine-tuning Large Language Models (LLMs) like GPT, BERT, RoBERTa, and their variants. It includes tutorials, papers, tools, frameworks, and best practices to aid researchers, data scientists, and machine learning practitioners in adapting pre-trained models to specific tasks and domains. The resources cover a wide range of topics related to fine-tuning LLMs, providing valuable insights and guidelines to streamline the process and enhance model performance.

github: 119

LLM-Codec

This repository provides an LLM-driven audio codec model, LLM-Codec, for building multi-modal LLMs (text and audio modalities). The model enables frozen LLMs to perform multiple audio tasks in a few-shot style without parameter updates. It compresses the audio modality into a well-trained LLM's token space, treating the audio representation as a 'foreign language' that LLMs can learn with minimal examples. The proposed approach supports tasks such as speech emotion classification, audio classification, text-to-speech generation, and speech enhancement, demonstrating feasibility and effectiveness in simple scenarios. The LLM-Codec model is open-sourced to facilitate research on few-shot audio task learning and multi-modal LLMs.

github: 103

For similar jobs

metavoice-src

MetaVoice-1B is a 1.2B parameter base model trained on 100K hours of speech for TTS (text-to-speech). It has been built with the following priorities:

* Emotional speech rhythm and tone in English.
* Zero-shot cloning for American & British voices, with 30s reference audio.
* Support for (cross-lingual) voice cloning with finetuning.
  * We have had success with as little as 1 minute training data for Indian speakers.
* Synthesis of arbitrary length text

github: 3.1k

suno-api

Suno AI API is an open-source project that allows developers to integrate the music generation capabilities of Suno.ai into their own applications. The API provides a simple and convenient way to generate music, lyrics, and other audio content using Suno.ai's powerful AI models. With Suno AI API, developers can easily add music generation functionality to their apps, websites, and other projects.

github: 1.7k

bark.cpp

Bark.cpp is a C/C++ implementation of the Bark model, a real-time, multilingual text-to-speech generation model. It supports AVX, AVX2, and AVX512 for x86 architectures, and is compatible with both CPU and GPU backends. Bark.cpp also supports mixed F16/F32 precision and 4-bit, 5-bit, and 8-bit integer quantization. It can be used to generate realistic-sounding audio from text prompts.

github: 696

NSMusicS

NSMusicS is local music software that is expected to support multiple platforms, with AI capabilities and multimodal features. Its goal is to integrate various functions (such as artificial intelligence, streaming, music library management, and cross-platform support), similar to Navidrome but with more features. It aims to become a plugin-integrated application that covers almost all music functions.

github: 713

ai-voice-cloning

This repository provides a tool for AI voice cloning, allowing users to generate synthetic speech that closely resembles a target speaker's voice. The tool is designed to be user-friendly and accessible, with a graphical user interface that guides users through the process of training a voice model and generating synthetic speech. The tool also includes a variety of features that allow users to customize the generated speech, such as the pitch, volume, and speaking rate. Overall, this tool is a valuable resource for anyone interested in creating realistic and engaging synthetic speech.

github: 268

RVC_CLI

**RVC_CLI: Retrieval-based Voice Conversion Command Line Interface**

This command-line interface (CLI) provides a comprehensive set of tools for voice conversion, enabling you to modify the pitch, timbre, and other characteristics of audio recordings. It leverages advanced machine learning models to achieve realistic and high-quality voice conversions.

**Key Features:**

* **Inference:** Convert the pitch and timbre of audio in real-time or process audio files in batch mode.
* **TTS Inference:** Synthesize speech from text using a variety of voices and apply voice conversion techniques.
* **Training:** Train custom voice conversion models to meet specific requirements.
* **Model Management:** Extract, blend, and analyze models to fine-tune and optimize performance.
* **Audio Analysis:** Inspect audio files to gain insights into their characteristics.
* **API:** Integrate the CLI's functionality into your own applications or workflows.

**Applications:**

The RVC_CLI finds applications in various domains, including:

* **Music Production:** Create unique vocal effects, harmonies, and backing vocals.
* **Voiceovers:** Generate voiceovers with different accents, emotions, and styles.
* **Audio Editing:** Enhance or modify audio recordings for podcasts, audiobooks, and other content.
* **Research and Development:** Explore and advance the field of voice conversion technology.

**For Jobs:**

* Audio Engineer
* Music Producer
* Voiceover Artist
* Audio Editor
* Machine Learning Engineer

**AI Keywords:**

* Voice Conversion
* Pitch Shifting
* Timbre Modification
* Machine Learning
* Audio Processing

**For Tasks:**

* Convert Pitch
* Change Timbre
* Synthesize Speech
* Train Model
* Analyze Audio

github: 71

openvino-plugins-ai-audacity

OpenVINO™ AI Plugins for Audacity* are a set of AI-enabled effects, generators, and analyzers for Audacity®. These AI features run 100% locally on your PC -- no internet connection necessary! OpenVINO™ is used to run AI models on supported accelerators found on the user's system such as CPU, GPU, and NPU.

* **Music Separation**: Separate a mono or stereo track into individual stems -- Drums, Bass, Vocals, & Other Instruments.
* **Noise Suppression**: Removes background noise from an audio sample.
* **Music Generation & Continuation**: Uses MusicGen LLM to generate snippets of music, or to generate a continuation of an existing snippet of music.
* **Whisper Transcription**: Uses whisper.cpp to generate a label track containing the transcription or translation for a given selection of spoken audio or vocals.

github: 885

WavCraft

WavCraft is an LLM-driven agent for audio content creation and editing. It uses an LLM to connect various audio expert models and DSP functions. With WavCraft, users can edit the content of given audio clip(s) conditioned on text input, create an audio clip from text input, prompt a script setting and let the model do the scriptwriting and create the sound, and check whether an audio file was synthesized by WavCraft.

github: 347