FunAudioLLM-APP

Stars: 169

FunAudioLLM-APP is a repository hosting two applications: Voice Chat for interactive AI-driven dialogues and Voice Translation for real-time language translation. The project leverages advanced audio understanding and speech generation models to enhance audio experiences. Users can visit the FunAudioLLM Homepage, CosyVoice Paper, and FunAudioLLM Technical Report for more details. The applications aim to break down language barriers and provide a natural chatting experience in various settings.

README:

funaudiollm-app repo

Welcome to the funaudiollm-app repository! This project hosts two applications that use advanced audio understanding and speech generation models:

Voice Chat: This application provides an interactive, natural chatting experience, making it easier to adopt AI-driven dialogue in various settings.

Voice Translation: A real-time voice translation tool. This application translates spoken language on the fly, so that speakers of different languages can communicate with each other.

For details, visit the FunAudioLLM Homepage, the CosyVoice Paper, and the FunAudioLLM Technical Report.

For CosyVoice, visit CosyVoice repo and CosyVoice space.

For SenseVoice, visit SenseVoice repo and SenseVoice space.

Install

Clone and install

  • Clone the repo and submodules
git clone --recursive URL
# If cloning the submodules fails because of a network error, rerun the following commands until they succeed
cd funaudiollm-app
git submodule update --init --recursive
  • Prepare the environments for the submodules by following the instructions in the CosyVoice and SenseVoice repos. If you have already set up these resources elsewhere, you can instead modify the resource path configuration in app.py (lines 15-18).

  • Install the Python dependencies:

pip install -r requirements.txt
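
The advice to rerun the submodule update "until success" can be automated. The following is a minimal sketch, not part of the repository: the helper name `retry_command`, the attempt count, and the delay are all illustrative assumptions, and it presumes `git` is on the PATH and the script is run from the repository root.

```python
import subprocess
import time


def retry_command(cmd, attempts=5, delay_seconds=3.0, runner=subprocess.run):
    """Run cmd until it exits with status 0 or the attempts are used up.

    Returns the 1-based number of the attempt that succeeded.
    Raises RuntimeError if every attempt fails.
    `runner` is injectable so the helper can be exercised without git.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        # Wait before the next try, but not after the final failure.
        if attempt < attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"command failed after {attempts} attempts: {cmd}")


# Example call (hypothetical usage, run from the repository root):
# retry_command(["git", "submodule", "update", "--init", "--recursive"])
```

A bounded number of attempts is used here instead of an endless loop so that a persistent failure, such as a wrong submodule URL, surfaces as an error.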

Basic Usage

Prepare

A DashScope API token.

A PEM file.
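
Before launching either application, it can help to confirm that both prerequisites are in place. The sketch below is an illustration only and is not part of the repository: the function name `check_prerequisites` and the `pem_path` argument are hypothetical, since the README does not say where the PEM file must live. The only name taken from the README is the `DS_API_TOKEN` environment variable used in the launch commands.

```python
import os
from pathlib import Path


def check_prerequisites(pem_path, env=None):
    """Return a list of problems; an empty list means both items are present.

    pem_path: location of the PEM file (hypothetical; adjust to your setup).
    env: mapping to read DS_API_TOKEN from; defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []
    if not env.get("DS_API_TOKEN"):
        problems.append("DS_API_TOKEN is not set")
    if not Path(pem_path).is_file():
        problems.append(f"PEM file not found: {pem_path}")
    return problems
```

Note that the launch commands pass `DS_API_TOKEN` inline on the `sudo` command line, so a check like this only reflects the environment of the process that runs it.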

Voice chat

cd voice_chat
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt

Then open https://YOUR-IP-ADDRESS:60001/ in a browser.

Voice translation

cd voice_translation
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt

Then open https://YOUR-IP-ADDRESS:60002/ in a browser.