ai-audio-startups
Community list of startups working with AI in audio and music technology
Stars: 1465
The 'ai-audio-startups' repository is a community list of startups working with AI for audio and music tech. It includes a comprehensive collection of tools and platforms that leverage artificial intelligence to enhance various aspects of music creation, production, source separation, analysis, recommendation, health & wellbeing, radio/podcast, hearing, sound detection, speech transcription, synthesis, enhancement, and manipulation. The repository serves as a valuable resource for individuals interested in exploring innovative AI applications in the audio and music industry.
README:
Community list of startups working with AI for audio and music tech
- Microphone Studio - Multi-track recording without expensive studio equipment
- TuneFlow - Generate lyrics, melody, drum beats and a lot more, while still being able to edit and mix like in any professional DAW.
- CassetteAI - AI powered music production platform: Make lyrics, beats & vocals with AI then mix & publish straight from Cassette.
- AIVA - The Artificial Intelligence composing emotional soundtrack music.
- beatoven.ai - A simplified music creation tool that helps you create music for your videos and podcasts.
- Infinite Album - Adaptive AI music for gamers who livestream.
- Epidemic Sound - High quality music and sound effects for all your content, all rights included.
- Wonder - Dynascore: The world’s first Dynamic Music Engine.
- Amper (Acquired by Shutterstock) - AI Music Composition Tools for Content Creators.
- mayk.it - your virtual music studio.
- boomy - Make instant music, Share it with the world.
- enote - Intelligent Sheet Music
- Qosmo - Qosmo is a group of artists, researchers, designers, and programmers.
- AI Music (Acquired by Apple) - Our music helps brands enable deeper connections with their audiences.
- Splash HQ - The next generation of music producers
- musico - AI-driven software engine that generates music. It can react to gesture, movement, code or other sound.
- Yousician - The largest music educator on the planet.
- Tape It - App for songwriting & audio recording.
- sessionwire - All-in-one online collaboration platform that delivers a seamless studio experience.
- Aflorithmic - Professional audio, voice, sound and music to scale.
- Audio Design Desk - The Audio Solution for Video Editors.
- Never Before Heard Sounds - A music studio powered by AI.
- NeuralDSP - Empowers music players by democratizing the access to world-class sound, through an intuitive software/hardware ecosystem.
- Neutone - AI audio plugin & community bridging the gap between AI research and creativity.
- RoEx - AI Powered Mixing Services for Musicians, Producers and Content Creators.
- LANDR - Online music software for creators: music mastering, digital music distribution, rent-to-own plugins, free sample packs, collaboration tools.
- Accusonus (Acquired by Meta) - Audio and Video Editing Software For Creators
- Moises - The Musician’s App.
- Waveshaper (Previously Tonz) - Real-time neural signal processing
- Sonible - Audio Soft & Hardware made in Austria.
- Accentize - Intelligent audio tools
- AI Mastering - AI-powered online audio mastering service.
- Splice - Music-creation technology platform that automates the process of making and sharing music.
- AudioStellar - Open source data-driven experimental sampler.
- chord.ai - Chords and beats for any song!
- DoReMIR - Sing and play into a single mic to get a lead sheet, with lyrics & chords!
- mubert - Instantly generate tracks perfectly tailored to your content on any platform.
- Evoke Music - Find the right music for your videos, podcasts, and business.
- Klangio - Our innovative apps enable you to create sheet music easy and fast!
- XLN Audio - VST plugin developer of Addictive Drums, Addictive Keys, RC-20 and XO.
- Laplacian Audio - Formerly 'Definite Technologies', developing VST/AU/AUv3 plugins that use AI to process and generate sound.
- Lifescore - Adaptive AI music platform. Real time Cellular Composition from high quality audio samples.
- WaveAI - AI-based musical assistance, including a lyrics writing assistant.
- Humtap - A platform for real-time music, audio & video creation.
- Voctro Labs - Synthetic Singing for creative media applications.
- Loudly - Music solutions for the digital universe, makers of Soundtracks, AI Studio, Music Maker JAM
- DeepMusic - AI music creation and production.
- Soundraw - Freely customize high quality royalty-free music
- BandLab - The cloud platform where musicians and fans create music, collaborate, and engage with each other across the globe.
- Setmixer - Helps artists record, mix and master their live shows using a combination of embedded software, signal processing, and AI.
- okio - Open source generative tools for music
- Audialab - Ethical audio AI plugins, tools, and community designed to empower real artists with AI, not displace them.
- suno - Create music and speech with AI
- Lemonaide Music - Generative Music tools integrated with DAWs and 100% royalty free.
- tuney.io - Ethical Music AI for Creative Media.
- KORUS AI - AI music creation platform and your personal music producer exploring the Universe of Sound.
- TRINITI - Gives you new ways to create and express yourself through music.
- voice swap - Change your singing voice using AI.
- mix audio - AI music for your creativity and productivity.
- Audiogen - Generate sounds, sound effects, music, samples, ambience and more with AI.
- Wavtool - Web-based DAW with AI assistants and support for local VST plug-ins.
- Wavacity - A port of the Audacity® audio editor to the web browser.
- TuneFlow - A free DAW that offers high quality vocal, drums, melody, bass stem separation, all-in-one audio separation, editing and vocal/instrument to MIDI transcription.
- Spliter.ai - AI Audio Processing
- Gaudio - Redefine your audio experience in music/video streaming and virtual/augmented reality.
- AudioShake - An On-Demand Stem Creation Platform for the Music Industry.
- Audionamix - Audio separation solutions for the entertainment industry to unlock every ounce of potential from classic content.
- vocali.se - Separate vocals and music from any song, in seconds!
- lalal.ai - High-quality stem splitting based on the world's #1 AI-powered technology.
- VocalRemover - Separate voice from music out of a song free with powerful AI algorithms.
- PhonicMind - Separate vocals, drums, bass and other instruments out of your songs with our HiFi AI.
- EasySplitter - AI-Based Vocal Remover Online for DJ Singers
- Remover.studio - Vocal Remover & Online Karaoke
- MVSep - Free separation of songs with many different algorithms (Demucs, MDX, UVR etc)
- MuzLab - Remove vocals from songs and split drums, bass and other instruments out of music.
- Fadr - Remove stems, convert to midi, and create high-quality remixes and mashups using AI tools!
- AIMS - AI-powered music similarity search & auto-tagging for anyone who makes music discovery their business.
- FeedForward - The intuitive audio search engine for audio & sound catalogues.
- Aimi - Discover the artists who freed their music from the shackles of songs and playlists.
- Utopia Music - Fair Pay for Every Play
- Musiio (Acquired by SoundCloud) - Use Artificial Intelligence to help automate your workflows.
- niland (Acquired by Spotify) - Build AI Powered Music Apps
- cyanite - AI for Music tagging and similarity search
- musicube (Acquired by SongTradr) - B2B AI music metadata services like auto-tagging, metadata enrichment and semantic search
- Musixmatch - Algorithms and tools for music discovery, recommendation, and search based on lyrics.
- hoopr - Find the best music, tell better stories, grow your audience. AI-powered engine that helps find the right soundtrack.
- Pex - Music identification and copyright compliance. Audio fingerprinting, cover song identification in large scale.
- SONOTELLER - AI music analysis including song lyrics summarization, themes extraction and musical features.
- Endel - Personalized soundscapes to help you focus, relax, and sleep.
- Lucid - Transforming music into medicine, using AI to compose and curate a personalized therapeutic music experience
- Wavepaths - Music for Psychedelic Therapy
- Suki - AI-powered voice solutions for healthcare.
- audEERING - Technology that can detect emotions and health information from the voice.
- brain.fm - Music to Focus Better
- SPOKE - Lo-fi & Lyricism-led Mindfulness music episodes
- sona - music as medicine. research-based music for anxiety made by Grammy-winning producers.
- Novoic - Using speech to detect neurological diseases.
- Ubenwa - Infant health analysis based on cry signals.
- faidr - Your favorite radio, interruption free.
- fathom - The search engine for podcasts.
- Nomono - A self-contained recording kit for capturing interviews in the field.
- Descript - All-in-one audio & video editing, as easy as a doc.
- auphonic - Automatic audio post production web service for podcasts, broadcasters, radio shows, movies, screencasts and more.
- SimonSays - Edit Video 5x Faster Built For Teams
- Podcastle - Studio-quality recording, AI-powered editing, and seamless exporting – easy to use and FREE
- cleanvoice - Removes filler sounds, stuttering and mouth sounds from your podcast or audio recording
- Super Hi-Fi - Artificial Intelligence Powered Music Experiences
- Whisper.ai - Smarter than your average hearing aid.
- Eargo - A Revolutionary New Hearing Aid.
- Concha Labs - Helping you hear more clearly
- Audio Analytic - Creating exceptional human experiences through a greater sense of hearing.
- SoundEye - Advanced sound recognition solutions capable of classifying sounds such as screaming, gunshot, coughing, and crying
- cochl - A next-generation sound AI platform that understands any sound like a human.
- Josh.ai - a voice-controlled home automation system.
- SEE SOUND - The world’s first smart home hearing system
- Epigos.ai - AI models that can be used to extract hidden data from audio sources.
- HyperSurfaces - Seamlessly merging the physical and data worlds without the need for keyboards, buttons or touch screens.
- HyperSentience - HyperSentience delivers context awareness to phones, VR/AR headsets, smart watches, speakers and laptops.
- Circulr Sound - Smart audio wearables
- Securaxis - We turn sounds into information.
- Deeply - We add meaning to every sound in the world using advanced deep learning technology for sound event detection and context recognition
- Ava - Professional and AI-Based Captions for Deaf and HoH (Transcription & Diarization)
- verbit - Professional AI-Based Transcription & Captioning
- otter - Everything hybrid teams need for productive, collaborative meetings.
- Trint - Audio Transcription Software - Speech to Text to Magic
- Rev - 99% accurate captions, transcripts, and subtitles.
- voiceitt - An app for people with non-standard speech
- deepgram.com - Better voice applications with faster, more accurate transcription through AI Speech Recognition
- fireflies.ai - AI assistant for your meetings
- SoapBox - Speech technology that makes kids heard.
- Amberscript - SaaS solutions that automatically transform audio and video into text and subtitles using speech recognition.
- Speaksee - Live captions of what’s being said during in-person group meetings.
- Speechmatics - Autonomous Speech Recognition technology that understands every voice.
- sonix - Automated transcription in 35+ languages.
- Picovoice - End-to-end Edge Voice AI, On-device voice recognition
- BoldVoice - Speak English clearly and confidently
- Gladia - Power your product with cutting-edge AI transcription, translation and audio intelligence using a single API.
- Podsqueeze - Re-purpose your audio or video podcast into transcript, show notes, blog post, video clips and other assets to publish and promote your show.
- adauris.ai - Transforming written content into engaging audio with seamless distribution.
- Aflorithmic - Professional audio, voice, sound and music to scale.
- Sonantic (Acquired by Spotify) - Deliver compelling, lifelike performances with fully expressive AI-generated voices.
- kroop AI - Harness synthetic media generation and detection with endless possibilities.
- dubverse - Make your content multilingual at a click of a button and reach more people.
- Resemble.ai - Generate AI Voices that sound real.
- Replica - AI voice actors for games, film & the metaverse.
- Respeecher - Voice Cloning for Content Creators.
- amai - Ultra realistic text to speech voice engines.
- AssemblyAI - Transcribe and understand audio with a single AI-powered API.
- DAISYS - New voices that sound like real people
- WellSaid - Text-to-speech technology that creates life-like synthetic voices, from the voices of real people.
- Deepsync - Generate audio content that exactly sounds like you.
- coqui.ai - Providing open speech tech for everyone
- Voiseed - AI-based Voice Engine able to mimic the emotions and prosody of human speech.
- Speechki - Improved NLP-based text and audio editing platform with hundreds of AI voices inside.
- MiSynth - A brain-controlled instrument that uses synaptic technology and BCIs to turn imagined sounds into a synthesized MIDI instrument.
- ElevenLabs - Developing the most compelling AI speech software for publishers and creators
- Wondercraft - Wondercraft enables users to generate podcasts using Text-to-Speech technology.
- play.ht - Building the future of content creation based on generative machine learning models.
- Revocalize.ai - Generate studio-quality AI Voices and train AI voice models from the web dashboard or the VST plugin.
- morpheme.ai - Our Actor-First, Digital-Double Voices are powered by the latest AI technology, ensuring that they are efficient, authentic, and ethical.
- Meaning - Streaming real-time voice and accent conversion.
- krisp - An AI-powered software solution for effective online meetings.
- voicemod - Free real-time voice changer.
- audo - Noise cancellation products for creators, developers, and virtual meetings.
- AudioTelligence - Our software transforms the clarity and intelligibility of speech in challenging acoustic environments.
- immersitech.io - We don’t make audio. We make audio better.
- utterly - Noise removal for meetings and audio.
- claerity.ai - Cutting-edge AI to eliminate all background noise on video conference calls.
- Neural Love - Set of AI-powered tools to enhance audio quality.
- HeardThat - A smartphone app that turns your smartphone into a sophisticated speech-enhancement device.
- Chatable - A smartphone app that removes disruptive background noise
- BdSound - Intelligent Audio Solution for audio and voice-enabled products.
- echosonic - Revolutionizing the microphone by bringing machine learning capabilities into it.
- Insoundz - Generative AI Audio Enhancement
Fork the repo, edit the README, and then make a PR.
Alternative AI tools for ai-audio-startups
Similar Open Source Tools
awesome-ai-tools
Awesome AI Tools is a curated list of popular tools and resources for artificial intelligence enthusiasts. It includes a wide range of tools such as machine learning libraries, deep learning frameworks, data visualization tools, and natural language processing resources. Whether you are a beginner or an experienced AI practitioner, this repository aims to provide you with a comprehensive collection of tools to enhance your AI projects and research. Explore the list to discover new tools, stay updated with the latest advancements in AI technology, and find the right resources to support your AI endeavors.
awesome-generative-ai
Awesome Generative AI is a curated list of modern Generative Artificial Intelligence projects and services. Generative AI technology creates original content like images, sounds, and texts using machine learning algorithms trained on large data sets. It can produce unique and realistic outputs such as photorealistic images, digital art, music, and writing. The repo covers a wide range of applications in art, entertainment, marketing, academia, and computer science.
awesome-ai
Awesome AI is a curated list of artificial intelligence resources including courses, tools, apps, and open-source projects. It covers a wide range of topics such as machine learning, deep learning, natural language processing, robotics, conversational interfaces, data science, and more. The repository serves as a comprehensive guide for individuals interested in exploring the field of artificial intelligence and its applications across various domains.
lsp-ai
LSP-AI is an open source language server designed to enhance software engineers' productivity by integrating AI-powered functionality into various text editors. It serves as a backend for completion with large language models and offers features like unified AI capabilities, simplified plugin development, enhanced collaboration, broad compatibility with editors supporting Language Server Protocol, flexible LLM backend support, and commitment to staying updated with the latest advancements in LLM-driven software development. The tool aims to centralize open-source development work, provide a collaborative platform for developers, and offer a future-ready solution for AI-powered assistants in text editors.
awesome-openvino
Awesome OpenVINO is a curated list of AI projects based on the OpenVINO toolkit, offering a rich assortment of projects, libraries, and tutorials covering various topics like model optimization, deployment, and real-world applications across industries. It serves as a valuable resource continuously updated to maximize the potential of OpenVINO in projects, featuring projects like Stable Diffusion web UI, Visioncom, FastSD CPU, OpenVINO AI Plugins for GIMP, and more.
Midori-AI
Midori AI is a cutting-edge initiative dedicated to advancing the field of artificial intelligence through research, development, and community engagement. They focus on creating innovative AI solutions, exploring novel approaches, and empowering users to harness the power of AI. Key areas of focus include cluster-based AI, AI setup assistance, AI development for Discord bots, model serving and hosting, novel AI memory architectures, and Carly - a fully simulated human with advanced AI capabilities. They have also developed the Midori AI Subsystem to streamline AI workloads by providing simplified deployment, standardized configurations, isolation for AI systems, and a growing library of backends and tools.
obsidian-systemsculpt-ai
SystemSculpt AI is a comprehensive AI-powered plugin for Obsidian, integrating advanced AI capabilities into note-taking, task management, knowledge organization, and content creation. It offers modules for brain integration, chat conversations, audio recording and transcription, note templates, and task generation and management. Users can customize settings, utilize AI services like OpenAI and Groq, and access documentation for detailed guidance. The plugin prioritizes data privacy by storing sensitive information locally and offering the option to use local AI models for enhanced privacy.
awesome-ai-devtools
Awesome AI-Powered Developer Tools is a curated list of AI-powered developer tools that leverage AI to assist developers in tasks such as code completion, refactoring, debugging, documentation, and more. The repository includes a wide range of tools, from IDEs and Git clients to assistants, agents, app generators, UI generators, snippet generators, documentation tools, code generation tools, agent platforms, OpenAI plugins, search tools, and testing tools. These tools are designed to enhance developer productivity and streamline various development tasks by integrating AI capabilities.
MediaAI
MediaAI is a repository containing lectures and materials for Aalto University's AI for Media, Art & Design course. The course is a hands-on, project-based crash course focusing on deep learning and AI techniques for artists and designers. It covers common AI algorithms & tools, their applications in art, media, and design, and provides hands-on practice in designing, implementing, and using these tools. The course includes lectures, exercises, and a final project based on students' interests. Students can complete the course without programming by creatively utilizing existing tools like ChatGPT and DALL-E. The course emphasizes collaboration, peer-to-peer tutoring, and project-based learning. It covers topics such as text generation, image generation, optimization, and game AI.
CodeFuse-muAgent
CodeFuse-muAgent is a Multi-Agent framework designed to streamline Standard Operating Procedure (SOP) orchestration for agents. It integrates toolkits, code libraries, knowledge bases, and sandbox environments for rapid construction of complex Multi-Agent interactive applications. The framework enables efficient execution and handling of multi-layered and multi-dimensional tasks.
FedML
FedML is a unified and scalable machine learning library for running training and deployment anywhere at any scale. It is highly integrated with FEDML Nexus AI, a next-gen cloud service for LLMs & Generative AI. FEDML Nexus AI provides holistic support of three interconnected AI infrastructure layers: user-friendly MLOps, a well-managed scheduler, and high-performance ML libraries for running any AI jobs across GPU Clouds.
oreilly-hands-on-gpt-llm
This repository contains code for the O'Reilly Live Online Training for Deploying GPT & LLMs. Learn how to use GPT-4, ChatGPT, OpenAI embeddings, and other large language models to build applications for experimenting and production. Gain practical experience in building applications like text generation, summarization, question answering, and more. Explore alternative generative models such as Cohere and GPT-J. Understand prompt engineering, context stuffing, and few-shot learning to maximize the potential of GPT-like models. Focus on deploying models in production with best practices and debugging techniques. By the end of the training, you will have the skills to start building applications with GPT and other large language models.
knowledge
Knowledge is a tool for saving, searching, accessing, exploring and chatting with all of your favorite websites, documents and files. Dive into a more interactive learning experience with Knowledge's new Chat feature! Engage in dynamic conversations with your Projects and Sources, leveraging the power of Large Language Models. The Chat feature is designed to transform the way you interact with your data, offering a more engaging and exploratory approach to learning. Unleash the power of context with the built-in Chromium browser. Transform your browsing into knowledge gathering effortlessly.
ServerlessLLM
ServerlessLLM is a fast, affordable, and easy-to-use library designed for multi-LLM serving, optimized for environments with limited GPU resources. It supports loading various leading LLM inference libraries, achieving fast load times, and reducing model switching overhead. The library facilitates easy deployment via Ray Cluster and Kubernetes, integrates with the OpenAI Query API, and is actively maintained by contributors.
intro-to-intelligent-apps
This repository introduces and helps organizations get started with building AI Apps and incorporating Large Language Models (LLMs) into them. The workshop covers topics such as prompt engineering, AI orchestration, and deploying AI apps. Participants will learn how to use Azure OpenAI, Langchain/ Semantic Kernel, Qdrant, and Azure AI Search to build intelligent applications.
For similar jobs
metavoice-src
MetaVoice-1B is a 1.2B parameter base model trained on 100K hours of speech for TTS (text-to-speech). It has been built with the following priorities:
- Emotional speech rhythm and tone in English.
- Zero-shot cloning for American & British voices, with 30s reference audio.
- Support for (cross-lingual) voice cloning with finetuning.
  - We have had success with as little as 1 minute training data for Indian speakers.
- Synthesis of arbitrary length text
suno-api
Suno AI API is an open-source project that allows developers to integrate the music generation capabilities of Suno.ai into their own applications. The API provides a simple and convenient way to generate music, lyrics, and other audio content using Suno.ai's powerful AI models. With Suno AI API, developers can easily add music generation functionality to their apps, websites, and other projects.
bark.cpp
Bark.cpp is a C/C++ implementation of the Bark model, a real-time, multilingual text-to-speech generation model. It supports AVX, AVX2, and AVX512 for x86 architectures, and is compatible with both CPU and GPU backends. Bark.cpp also supports mixed F16/F32 precision and 4-bit, 5-bit, and 8-bit integer quantization. It can be used to generate realistic-sounding audio from text prompts.
NSMusicS
NSMusicS is local music software that aims to support multiple platforms with AI capabilities and multimodal features. Its goal is to integrate functions such as artificial intelligence, streaming, music library management, and cross-platform support; it can be understood as similar to Navidrome but with more features. It aims to become a plugin-integrated application that covers almost all music functions.
ai-voice-cloning
This repository provides a tool for AI voice cloning, allowing users to generate synthetic speech that closely resembles a target speaker's voice. The tool is designed to be user-friendly and accessible, with a graphical user interface that guides users through the process of training a voice model and generating synthetic speech. The tool also includes a variety of features that allow users to customize the generated speech, such as the pitch, volume, and speaking rate. Overall, this tool is a valuable resource for anyone interested in creating realistic and engaging synthetic speech.
RVC_CLI
**RVC_CLI: Retrieval-based Voice Conversion Command Line Interface**
This command-line interface (CLI) provides a comprehensive set of tools for voice conversion, enabling you to modify the pitch, timbre, and other characteristics of audio recordings. It leverages advanced machine learning models to achieve realistic and high-quality voice conversions.
**Key Features:**
- **Inference:** Convert the pitch and timbre of audio in real-time or process audio files in batch mode.
- **TTS Inference:** Synthesize speech from text using a variety of voices and apply voice conversion techniques.
- **Training:** Train custom voice conversion models to meet specific requirements.
- **Model Management:** Extract, blend, and analyze models to fine-tune and optimize performance.
- **Audio Analysis:** Inspect audio files to gain insights into their characteristics.
- **API:** Integrate the CLI's functionality into your own applications or workflows.
**Applications:** The RVC_CLI finds applications in various domains, including:
- **Music Production:** Create unique vocal effects, harmonies, and backing vocals.
- **Voiceovers:** Generate voiceovers with different accents, emotions, and styles.
- **Audio Editing:** Enhance or modify audio recordings for podcasts, audiobooks, and other content.
- **Research and Development:** Explore and advance the field of voice conversion technology.
**For Jobs:** Audio Engineer, Music Producer, Voiceover Artist, Audio Editor, Machine Learning Engineer
**AI Keywords:** Voice Conversion, Pitch Shifting, Timbre Modification, Machine Learning, Audio Processing
**For Tasks:** Convert Pitch, Change Timbre, Synthesize Speech, Train Model, Analyze Audio
openvino-plugins-ai-audacity
OpenVINO™ AI Plugins for Audacity* are a set of AI-enabled effects, generators, and analyzers for Audacity®. These AI features run 100% locally on your PC -- no internet connection necessary! OpenVINO™ is used to run AI models on supported accelerators found on the user's system such as CPU, GPU, and NPU.
- **Music Separation**: Separate a mono or stereo track into individual stems -- Drums, Bass, Vocals, & Other Instruments.
- **Noise Suppression**: Removes background noise from an audio sample.
- **Music Generation & Continuation**: Uses MusicGen LLM to generate snippets of music, or to generate a continuation of an existing snippet of music.
- **Whisper Transcription**: Uses whisper.cpp to generate a label track containing the transcription or translation for a given selection of spoken audio or vocals.
WavCraft
WavCraft is an LLM-driven agent for audio content creation and editing. It uses an LLM to connect various audio expert models and DSP functions. With WavCraft, users can edit the content of given audio clip(s) conditioned on text input, create an audio clip from text input, prompt a script setting and let the model do the scriptwriting and create the sound, and check whether an audio file was synthesized by WavCraft.