
achatbot
An open-source chat bot architecture for voice/vision (and multimodal) assistants, runnable locally (CPU/GPU bound) or remotely (I/O bound).
Stars: 74

achatbot is a factory tool that allows users to create chat bots with various functionalities such as llm (language models), asr (automatic speech recognition), tts (text-to-speech), vad (voice activity detection), ocr (optical character recognition), and object detection. The tool provides a structured project with features like chat bots for cmd, grpc, and http servers. It supports various chat bot processors, transport connectors, and AI modules for different tasks. Users can run chat bots locally or deploy them on cloud services like vercel, Cloudflare, AWS Lambda, or Docker. The tool also includes UI components for easy deployment and service architecture diagrams for reference.
README:
achatbot factory: create chat bots with llm (tools), asr, tts, vad, ocr, object detection, etc.
🌿 Features
- demo
- podcast AI Podcast: https://podcast-997.pages.dev/ :)
# need GOOGLE_API_KEY in environment variables
# default language: English
# website
python -m demo.content_parser_tts instruct-content-tts \
  "https://en.wikipedia.org/wiki/Large_language_model"
python -m demo.content_parser_tts instruct-content-tts \
  --role-tts-voices zh-CN-YunjianNeural \
  --role-tts-voices zh-CN-XiaoxiaoNeural \
  --language zh \
  "https://en.wikipedia.org/wiki/Large_language_model"
# pdf
# https://web.stanford.edu/~jurafsky/slp3/ed3bookaug20_2024.pdf 600 pages is ok~ :)
python -m demo.content_parser_tts instruct-content-tts \
  "/Users/wuyong/Desktop/Speech and Language Processing.pdf"
python -m demo.content_parser_tts instruct-content-tts \
  --role-tts-voices zh-CN-YunjianNeural \
  --role-tts-voices zh-CN-XiaoxiaoNeural \
  --language zh \
  "/Users/wuyong/Desktop/Speech and Language Processing.pdf"
- cmd chat bots:
- local-terminal-chat(be/fe)
- remote-queue-chat(be/fe)
- grpc-terminal-chat(be/fe)
- grpc-speaker
- http fastapi_daily_bot_serve (with chat bots pipeline)
- bots with config, see notebooks
- supported transport connectors:
- [x] pipe(UNIX socket),
- [x] grpc,
- [x] queue (redis),
- [ ] websocket
- [ ] TCP/IP socket
- chat bot processors:
- aggregators (llm use, assistant message),
- ai_frameworks
- [x] langchain: RAG
- [ ] llamaindex: RAG
- [ ] autogen: multi agents
- realtime voice inference(RTVI),
- transport:
- webRTC: (daily,livekit KISS)
- [x] daily: audio, video(image)
- [x] livekit: audio, video(image)
- [x] agora: audio, video(image)
- [x] small_webrtc: audio, video(image)
- [x] Websocket server
- ai processor: llm, tts, asr etc..
- llm_processor:
- [x] openai(use openai sdk)
- [x] google gemini(use google-generativeai sdk)
- [x] litellm(use openai input/output format proxy sdk)
- core module:
- local llm:
- [x] llama-cpp (support text,vision with function-call model)
- [x] llm_llamacpp_generator
- [x] fastdeploy:
- [x] llm_fastdeploy_vision_ernie4v
- [x] llm_fastdeploy_generator
- [x] tensorrt_llm:
- [x] llm_trtllm_generator
- [x] llm_trtllm_runner_generator
- [x] sglang:
- [x] llm_sglang_generator
- [x] vllm:
- [x] llm_vllm_generator
- [x] llm_vllm_vision_skyworkr1v
- [x] transformers(manual, pipeline) (support text; vision,vision+image; speech,voice; vision+voice)
- [x] llm_transformers_manual_vision_llama
- [x] llm_transformers_manual_vision_molmo
- [x] llm_transformers_manual_vision_qwen
- [x] llm_transformers_manual_vision_deepseek
- [x] llm_transformers_manual_vision_janus_flow
- [x] llm_transformers_manual_vision_janus
- [x] llm_transformers_manual_vision_smolvlm
- [x] llm_transformers_manual_vision_gemma
- [x] llm_transformers_manual_vision_fastvlm
- [x] llm_transformers_manual_vision_kimi
- [x] llm_transformers_manual_vision_mimo
- [x] llm_transformers_manual_vision_keye
- [x] llm_transformers_manual_vision_glm4v
- [x] llm_transformers_manual_vision_skyworkr1v
- [x] llm_transformers_manual_image_janus_flow
- [x] llm_transformers_manual_image_janus
- [x] llm_transformers_manual_speech_llasa
- [x] llm_transformers_manual_speech_step
- [x] llm_transformers_manual_voice_glm
- [x] llm_transformers_manual_vision_voice_minicpmo, llm_transformers_manual_voice_minicpmo,llm_transformers_manual_audio_minicpmo,llm_transformers_manual_text_speech_minicpmo,llm_transformers_manual_instruct_speech_minicpmo,llm_transformers_manual_vision_minicpmo
- [x] llm_transformers_manual_qwen2_5omni, llm_transformers_manual_qwen2_5omni_audio_asr,llm_transformers_manual_qwen2_5omni_vision,llm_transformers_manual_qwen2_5omni_speech,llm_transformers_manual_qwen2_5omni_vision_voice,llm_transformers_manual_qwen2_5omni_text_voice,llm_transformers_manual_qwen2_5omni_audio_voice
- [x] llm_transformers_manual_kimi_voice,llm_transformers_manual_kimi_audio_asr,llm_transformers_manual_kimi_text_voice
- [x] llm_transformers_manual_vita_text llm_transformers_manual_vita_audio_asr llm_transformers_manual_vita_tts llm_transformers_manual_vita_text_voice llm_transformers_manual_vita_voice
- [x] llm_transformers_manual_phi4_vision_speech,llm_transformers_manual_phi4_audio_asr,llm_transformers_manual_phi4_audio_translation,llm_transformers_manual_phi4_vision,llm_transformers_manual_phi4_audio_chat
- [x] llm_transformers_manual_vision_speech_gemma3n,llm_transformers_manual_vision_gemma3n,llm_transformers_manual_gemma3n_audio_asr,llm_transformers_manual_gemma3n_audio_translation
- remote api llm: personal-ai (like openai api, other ai providers)
- AI modules:
- functions:
- [x] search: search,search1,serper
- [x] weather: openweathermap
- speech:
- [x] asr:
- [x] whisper_asr, whisper_timestamped_asr, whisper_faster_asr, whisper_transformers_asr, whisper_mlx_asr
- [x] whisper_groq_asr
- [x] sense_voice_asr
- [x] minicpmo_asr (whisper)
- [x] qwen2_5omni_asr (whisper)
- [x] kimi_asr (whisper)
- [x] vita_asr (sensevoice-small)
- [x] phi4_asr (conformer)
- [x] gemma3n_asr (matformer)
- [x] audio_stream: daily_room_audio_stream(in/out), pyaudio_stream(in/out)
- [x] detector: porcupine_wakeword,pyannote_vad,webrtc_vad,silero_vad,webrtc_silero_vad,fsmn_vad
- [x] player: stream_player
- [x] recorder: rms_recorder, wakeword_rms_recorder, vad_recorder, wakeword_vad_recorder
- [x] tts:
- [x] tts_edge
- [x] tts_g
- [x] tts_coqui
- [x] tts_chat
- [x] tts_cosy_voice,tts_cosy_voice2
- [x] tts_f5
- [x] tts_openvoicev2
- [x] tts_kokoro,tts_onnx_kokoro
- [x] tts_fishspeech
- [x] tts_llasa
- [x] tts_minicpmo
- [x] tts_zonos
- [x] tts_step
- [x] tts_spark
- [x] tts_orpheus
- [x] tts_mega3
- [x] tts_vita
- [x] vad_analyzer:
- [x] daily_webrtc_vad_analyzer
- [x] silero_vad_analyzer
- vision
- [x] OCR(Optical Character Recognition):
- [x] Detector:
- [x] YOLO (You Only Look Once)
- [ ] RT-DETR v2 (RealTime End-to-End Object Detection with Transformers)
- gen module configs (*.yaml, for local/test/prod) from env with the .env file; you can also use HfArgumentParser to parse a module's args from the local cmd (see the sketch after this feature list)
- deploy to cloud ☁️ serverless:
- vercel (frontend ui pages)
- Cloudflare(frontend ui pages), personal ai workers
- fastapi-daily-chat-bot on cerebrium (provider aws)
- fastapi-daily-chat-bot on leptonai
- fastapi-daily-chat-bot on modal
- aws lambda + api Gateway
- docker -> k8s/k3s
- etc...
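To make the config-from-env idea above concrete, here is a small illustrative sketch; only `HfArgumentParser` (from `transformers`) and the env var names used elsewhere in this README are taken as given, while the dataclass fields and the output filename are assumptions, not achatbot's actual schema.

```python
# Illustrative sketch: build a module config (*.yaml) from env vars, with
# HfArgumentParser letting the same fields be overridden from the cmd line.
# Requires: pip install transformers pyyaml
import os
from dataclasses import asdict, dataclass, field

import yaml
from transformers import HfArgumentParser


@dataclass
class ASRArgs:
    # hypothetical fields; defaults come from env vars shown in this README
    tag: str = field(default=os.getenv("ASR_TAG", "sense_voice_asr"))
    lang: str = field(default=os.getenv("ASR_LANG", "zn"))
    model_name_or_path: str = field(
        default=os.getenv(
            "ASR_MODEL_NAME_OR_PATH", "~/.achatbot/models/FunAudioLLM/SenseVoiceSmall"
        )
    )


if __name__ == "__main__":
    # cmd-line flags (e.g. --tag whisper_asr) override the env-derived defaults
    (asr_args,) = HfArgumentParser(ASRArgs).parse_args_into_dataclasses()
    with open("asr_local.yaml", "w") as f:
        yaml.safe_dump({"asr": asdict(asr_args)}, f)
```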
🌻 Service Deployment Architecture
- [x] ui/web-client-ui: deploy to Cloudflare Pages with vite, access https://chat-client-weedge.pages.dev/
- [x] ui/educator-client: deploy to Cloudflare Pages with vite, access https://educator-client.pages.dev/
- [x] chat-bot-rtvi-web-sandbox: use this web sandbox to test config and actions with DailyRTVIGeneralBot
- [x] vite-react-rtvi-web-voice: rtvi web voice chat bots with different cctv roles etc.; you can DIY your own role by changing the system prompt with DailyRTVIGeneralBot; deploy to Cloudflare Pages with vite, access https://role-chat.pages.dev/
- [x] vite-react-web-vision: deploy to Cloudflare Pages with vite, access https://vision-weedge.pages.dev/
- [x] nextjs-react-web-storytelling: deploy to Cloudflare Pages worker with nextjs, access https://storytelling.pages.dev/
- [x] websocket-demo: websocket audio chat bot demo
- [x] webrtc-demo: webrtc audio chat bot demo
- [x] webrtc websocket voice avatar:
- [x] webrtc+websocket lam audio2expression avatar bot demo intro: native js logic, get audio to play and print expression from the websocket pb avatar_data_frames Message
- [x] lam_audio2expression_avatar_ts: http signaling service, using vite+ts+gaussian-splat-renderer-for-lam to play audio and render expression from the websocket pb avatar_data_frames Message
- [x] lam_audio2expression_avatar_ts_v2: websocket signaling service, using vite+ts+gaussian-splat-renderer-for-lam to play audio and render expression from the websocket pb avatar_data_frames Message, access https://avatar-2lm.pages.dev/
- [x] deploy/modal(KISS) 👍🏻
- [x] deploy/leptonai(KISS)👍🏻
- [x] deploy/cerebrium/fastapi-daily-chat-bot :)
- [x] deploy/aws/fastapi-daily-chat-bot :|
- [x] deploy/docker/fastapi-daily-chat-bot 🏃
[!NOTE]
python --version >= 3.10 (with asyncio task support).
If you install achatbot[tts_openvoicev2], you also need to install melo-tts: pip install git+https://github.com/myshell-ai/MeloTTS.git
If some other nested event-loop code runs together with the achatbot lib, add the following code (PS: cmd/bots/base.py already does this):
import nest_asyncio
nest_asyncio.apply()
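For context, a tiny standalone example of the nested-loop situation the note refers to; the coroutines are made up for illustration, and after `nest_asyncio.apply()` re-entering the already-running loop no longer raises.

```python
# Illustrative only: nested event-loop usage that fails without nest_asyncio.
import asyncio

import nest_asyncio

nest_asyncio.apply()  # patch the loop (and asyncio.run) to be re-entrant


async def inner():
    await asyncio.sleep(0.1)
    return "ok"


async def outer():
    # Without apply(), this nested asyncio.run() raises
    # "RuntimeError: asyncio.run() cannot be called from a running event loop".
    return asyncio.run(inner())


print(asyncio.run(outer()))  # -> ok
```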
[!TIP] use uv + pip to install the required dependencies quickly, e.g.:
uv pip install achatbot
uv pip install "achatbot[fastapi_bot_server]"
python3 -m venv .venv_achatbot
source .venv_achatbot/bin/activate
pip install achatbot
# optional-dependencies e.g.
pip install "achatbot[fastapi_bot_server]"
git clone --recursive https://github.com/ai-bot-pro/chat-bot.git
cd chat-bot
python3 -m venv .venv_achatbot
source .venv_achatbot/bin/activate
bash scripts/pypi_achatbot.sh dev
# optional-dependencies e.g.
pip install "dist/achatbot-{$version}-py3-none-any.whl[fastapi_bot_server]"
# install dependencies (replace $version) (if using cpu (default), install the lite_avatar extra)
pip install "dist/achatbot-{$version}-py3-none-any.whl[fastapi_bot_server,livekit,livekit-api,daily,agora,silero_vad_analyzer,sense_voice_asr,openai_llm_processor,google_llm_processor,litellm_processor,together_ai,tts_edge,lite_avatar]"
# install dependencies (replace $version) (if using gpu (cuda), install the lite_avatar_gpu extra)
pip install "dist/achatbot-{$version}-py3-none-any.whl[fastapi_bot_server,livekit,livekit-api,daily,agora,silero_vad_analyzer,sense_voice_asr,openai_llm_processor,google_llm_processor,litellm_processor,together_ai,tts_edge,lite_avatar_gpu]"
# download model weights
huggingface-cli download weege007/liteavatar --local-dir ./models/weege007/liteavatar
huggingface-cli download FunAudioLLM/SenseVoiceSmall --local-dir ./models/FunAudioLLM/SenseVoiceSmall
# run local lite-avatar chat bot
python -m src.cmd.bots.main -f config/bots/daily_liteavatar_echo_bot.json
python -m src.cmd.bots.main -f config/bots/daily_liteavatar_chat_bot.json
More details: https://github.com/ai-bot-pro/achatbot/pull/161
# install dependencies (replace $version)
pip install "dist/achatbot-{$version}-py3-none-any.whl[fastapi_bot_server,silero_vad_analyzer,sense_voice_asr,openai_llm_processor,google_llm_processor,litellm_processor,together_ai,tts_edge,lam_audio2expression_avatar]"
pip install spleeter==2.4.2
pip install typing_extensions==4.14.0 aiortc==1.13.0 transformers==4.36.2 protobuf==5.29.4
# download model weights
wget https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/aigc3d/data/LAM/LAM_audio2exp_streaming.tar -P ./models/LAM_audio2exp/
tar -xzvf ./models/LAM_audio2exp/LAM_audio2exp_streaming.tar -C ./models/LAM_audio2exp && rm ./models/LAM_audio2exp/LAM_audio2exp_streaming.tar
git clone --depth 1 https://www.modelscope.cn/AI-ModelScope/wav2vec2-base-960h.git ./models/facebook/wav2vec2-base-960h
huggingface-cli download FunAudioLLM/SenseVoiceSmall --local-dir ./models/FunAudioLLM/SenseVoiceSmall
# run http signaling service + webrtc + websocket local lam_audio2expression-avatar chat bot
python -m src.cmd.webrtc_websocket.fastapi_ws_signaling_bot_serve -f config/bots/small_webrtc_fastapi_websocket_avatar_echo_bot.json
python -m src.cmd.webrtc_websocket.fastapi_ws_signaling_bot_serve -f config/bots/small_webrtc_fastapi_websocket_avatar_chat_bot.json
# run http signaling service + webrtc + websocket voice avatar agent web ui
cd ui/webrtc_websocket/lam_audio2expression_avatar_ts && npm install && npm run dev
# run websocket signaling service + webrtc + websocket local lam_audio2expression-avatar chat bot
python -m src.cmd.webrtc_websocket.fastapi_ws_signaling_bot_serve_v2 -f config/bots/small_webrtc_fastapi_websocket_avatar_echo_bot.json
python -m src.cmd.webrtc_websocket.fastapi_ws_signaling_bot_serve_v2 -f config/bots/small_webrtc_fastapi_websocket_avatar_chat_bot.json
# run websocket signaling service + webrtc + websocket voice avatar agent web ui
cd ui/webrtc_websocket/lam_audio2expression_avatar_ts_v2 && npm install && npm run dev
More details: https://github.com/ai-bot-pro/achatbot/pull/164 | online lam_audio2expression avatar: https://avatar-2lm.pages.dev/
HTTP signaling service + webrtc + websocket transports I/O bridge:
Websocket signaling service + webrtc + websocket transports I/O bridge:
Websocket signaling service + websocket + webrtc-queue transports I/O bridge:
Local/Global Scheduler + webrtc-queue bots:
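As a rough picture of the websocket leg of these bridges, here is a generic sketch; it is not achatbot's `fastapi_ws_signaling_bot_serve` module, and the route, port, and pass-through "pipeline" are placeholders.

```python
# Generic sketch of a websocket transport bridge: accept a client, read audio
# bytes in, hand them to some bot pipeline, and stream the result back out.
# Requires: pip install fastapi uvicorn
import uvicorn
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()


@app.websocket("/ws")
async def ws_bridge(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            chunk = await ws.receive_bytes()   # audio frames from the client
            result = chunk                     # placeholder: vad -> asr -> llm -> tts
            await ws.send_bytes(result)        # audio / avatar frames back out
    except WebSocketDisconnect:
        pass


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```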
| Chat Bot | optional-dependencies | Colab | Device | Pipeline Desc |
| --- | --- | --- | --- | --- |
| daily_bot<br />livekit_bot<br />agora_bot | e.g.: agora_channel_audio_stream / daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq/together api llm(text), tts_edge | | CPU (free, 2 cores) | e.g.: daily/livekit room in stream -> silero (vad) -> sense_voice (asr) -> groq/together (llm) -> edge (tts) -> daily/livekit room out stream |
| generate_audio2audio | remote_queue_chat_bot_be_worker | | T4 (free) | e.g.: pyaudio in stream -> silero (vad) -> sense_voice (asr) -> qwen (llm) -> cosy_voice (tts) -> pyaudio out stream |
| daily_describe_vision_tools_bot<br />livekit_describe_vision_tools_bot<br />agora_describe_vision_tools_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, deepgram_asr, google_gemini, tts_edge | | CPU (free, 2 cores) | e.g.: daily/livekit room in stream -> silero (vad) -> deepgram (asr) -> google gemini -> edge (tts) -> daily/livekit room out stream |
| daily_describe_vision_bot<br />livekit_describe_vision_bot<br />agora_describe_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, llm_transformers_manual_vision_qwen, tts_edge | achatbot_vision_qwen_vl.ipynb<br />achatbot_vision_janus.ipynb<br />achatbot_vision_minicpmo.ipynb<br />achatbot_kimivl.ipynb<br />achatbot_phi4_multimodal.ipynb | Qwen2-VL-2B-Instruct: T4 (free)<br />Qwen2-VL-7B-Instruct: L4<br />Llama-3.2-11B-Vision-Instruct: L4<br />allenai/Molmo-7B-D-0924: A100 | e.g.: daily/livekit room in stream -> silero (vad) -> sense_voice (asr) -> qwen-vl (llm) -> edge (tts) -> daily/livekit room out stream |
| daily_chat_vision_bot<br />livekit_chat_vision_bot<br />agora_chat_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, llm_transformers_manual_vision_qwen, tts_edge | | Qwen2-VL-2B-Instruct: T4 (free)<br />Qwen2-VL-7B-Instruct: L4<br />Llama-3.2-11B-Vision-Instruct: L4<br />allenai/Molmo-7B-D-0924: A100 | e.g.: daily/livekit room in stream -> silero (vad) -> sense_voice (asr) -> llm answer guide qwen-vl (llm) -> edge (tts) -> daily/livekit room out stream |
| daily_chat_tools_vision_bot<br />livekit_chat_tools_vision_bot<br />agora_chat_tools_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq api llm(text), tools: llm_transformers_manual_vision_qwen, tts_edge | | Qwen2-VL-2B-Instruct: T4 (free)<br />Qwen2-VL-7B-Instruct: L4<br />Llama-3.2-11B-Vision-Instruct: L4<br />allenai/Molmo-7B-D-0924: A100 | e.g.: daily/livekit room in stream -> silero (vad) -> sense_voice (asr) -> llm with tools qwen-vl -> edge (tts) -> daily/livekit room out stream |
| daily_annotate_vision_bot<br />livekit_annotate_vision_bot<br />agora_annotate_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, vision_yolo_detector, tts_edge | | T4 (free) | e.g.: daily/livekit room in stream -> vision_yolo_detector -> edge (tts) -> daily/livekit room out stream |
| daily_detect_vision_bot<br />livekit_detect_vision_bot<br />agora_detect_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, vision_yolo_detector, tts_edge | | T4 (free) | e.g.: daily/livekit room in stream -> vision_yolo_detector -> edge (tts) -> daily/livekit room out stream |
| daily_ocr_vision_bot<br />livekit_ocr_vision_bot<br />agora_ocr_vision_bot | e.g.: daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, vision_transformers_got_ocr, tts_edge | | T4 (free) | e.g.: daily/livekit room in stream -> silero (vad) -> sense_voice (asr) -> vision_transformers_got_ocr -> edge (tts) -> daily/livekit room out stream |
| daily_month_narration_bot | e.g.: daily_room_audio_stream, groq/together api llm(text), hf_sd / together api (image), tts_edge | | when using an sd model with diffusers: T4 (free) cpu+cuda (slow), L4 cpu+cuda, A100 all cuda | e.g.: daily room in stream -> together (llm) -> hf sd gen image model -> edge (tts) -> daily room out stream |
| daily_storytelling_bot | e.g.: daily_room_audio_stream, groq/together api llm(text), hf_sd / together api (image), tts_edge | | cpu (2 cores); when using an sd model with diffusers: T4 (free) cpu+cuda (slow), L4 cpu+cuda, A100 all cuda | e.g.: daily room in stream -> together (llm) -> hf sd gen image model -> edge (tts) -> daily room out stream |
| websocket_server_bot<br />fastapi_websocket_server_bot | e.g.: websocket_server, sense_voice_asr, groq/together api llm(text), tts_edge | | cpu (2 cores) | e.g.: websocket protocol in stream -> silero (vad) -> sense_voice (asr) -> together (llm) -> edge (tts) -> websocket protocol out stream |
| daily_natural_conversation_bot | e.g.: daily_room_audio_stream, sense_voice_asr, groq/together api llm (NLP task), gemini-1.5-flash (chat), tts_edge | | cpu (2 cores) | e.g.: daily room in stream -> together (llm NLP task) -> gemini-1.5-flash model (chat) -> edge (tts) -> daily room out stream |
| fastapi_websocket_moshi_bot | e.g.: websocket_server, moshi opus stream voice llm | | L4/A100 | websocket protocol in stream -> silero (vad) -> moshi opus stream voice llm -> websocket protocol out stream |
| daily_asr_glm_voice_bot<br />daily_glm_voice_bot | e.g.: daily_room_audio_stream, glm voice llm | | T4/L4/A100 | e.g.: daily room in stream -> glm4-voice -> daily room out stream |
| daily_freeze_omni_voice_bot | e.g.: daily_room_audio_stream, freezeOmni voice llm | | L4/A100 | e.g.: daily room in stream -> freezeOmni-voice -> daily room out stream |
| daily_asr_minicpmo_voice_bot<br />daily_minicpmo_voice_bot<br />daily_minicpmo_vision_voice_bot | e.g.: daily_room_audio_stream, minicpmo llm | | T4: MiniCPM-o-2_6-int4<br />L4/A100: MiniCPM-o-2_6 | e.g.: daily room in stream -> minicpmo -> daily room out stream |
| livekit_asr_qwen2_5omni_voice_bot<br />livekit_qwen2_5omni_voice_bot<br />livekit_qwen2_5omni_vision_voice_bot | e.g.: livekit_room_audio_stream, qwen2.5omni llm | | A100 | e.g.: livekit room in stream -> qwen2.5omni -> livekit room out stream |
| livekit_asr_kimi_voice_bot<br />livekit_kimi_voice_bot | e.g.: livekit_room_audio_stream, kimi audio llm | | A100 | e.g.: livekit room in stream -> Kimi-Audio -> livekit room out stream |
| livekit_asr_vita_voice_bot<br />livekit_vita_voice_bot | e.g.: livekit_room_audio_stream, vita audio llm | | L4/A100 | e.g.: livekit room in stream -> VITA-Audio -> livekit room out stream |
| daily_phi4_voice_bot<br />daily_phi4_vision_speech_bot | e.g.: daily_room_audio_stream, phi4-multimodal llm | | L4/A100 | e.g.: daily room in stream -> phi4-multimodal -> edge (tts) -> daily room out stream |
| daliy_multi_mcp_bot<br />livekit_multi_mcp_bot<br />agora_multi_mcp_bot | e.g.: agora_channel_audio_stream / daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq/together api llm(text), mcp, tts_edge | | CPU (free, 2 cores) | e.g.: agora/daily/livekit room in stream -> silero (vad) -> sense_voice (asr) -> groq/together (llm) -> mcp server tools -> edge (tts) -> daily/livekit room out stream |
| daily_liteavatar_chat_bot<br />daily_liteavatar_echo_bot<br />livekit_musetalk_chat_bot<br />livekit_musetalk_echo_bot | e.g.: agora_channel_audio_stream / daily_room_audio_stream / livekit_room_audio_stream, sense_voice_asr, groq/together api llm(text), tts_edge, avatar | achatbot_avatar_musetalk.ipynb | CPU/T4/L4 | e.g.: agora/daily/livekit room in stream -> silero (vad) -> sense_voice (asr) -> groq/together (llm) -> edge (tts) -> avatar -> daily/livekit room out stream |
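The "Pipeline Desc" column above always follows the same shape; here is a compact, purely illustrative sketch of that shape, where the callables are placeholders rather than achatbot's processor API.

```python
# Conceptual shape of one table row's pipeline:
# room/websocket in stream -> vad -> asr -> llm -> tts -> out stream.
from typing import Callable, Iterable, Iterator


def run_pipeline(
    frames_in: Iterable[bytes],
    vad: Callable[[bytes], bool],
    asr: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> Iterator[bytes]:
    for frame in frames_in:
        if not vad(frame):       # e.g. silero: drop non-speech frames
            continue
        text = asr(frame)        # e.g. sense_voice
        reply = llm(text)        # e.g. groq/together chat completion
        yield tts(reply)         # e.g. edge tts -> audio frames out
```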
🌑 Run local chat bots
[!NOTE]
To run the src code, replace achatbot with src in the module path; you don't need to set ACHATBOT_PKG=1, e.g.:
TQDM_DISABLE=True \
python -m src.cmd.local-terminal-chat.generate_audio2audio > log/std_out.log
PyAudio needs python3-pyaudio; e.g. on ubuntu: apt-get install python3-pyaudio, on macos: brew install portaudio; see: https://pypi.org/project/PyAudio/
llm llama-cpp-python initially installs the CPU pre-built wheel; if you want to use another backend (e.g. cuda), see: https://github.com/abetlen/llama-cpp-python#installation-configuration
Installing pydub requires ffmpeg; see: https://www.ffmpeg.org/download.html
- run pip install "achatbot[local_terminal_chat_bot]" to install the dependencies for the local terminal chat bot;
- create the achatbot data dirs in $HOME: mkdir -p ~/.achatbot/{log,config,models,records,videos};
- cp .env.example .env, then check .env and add key/value env params;
- select a model ckpt to download:
- vad model ckpt (default vad ckpt model: silero vad)
# vad pyannote segmentation ckpt
huggingface-cli download pyannote/segmentation-3.0 --local-dir ~/.achatbot/models/pyannote/segmentation-3.0 --local-dir-use-symlinks False
- asr model ckpt (default whisper ckpt model: base size)
# asr openai whisper ckpt
wget https://openaipublic.azureedge.net/main/whisper/models/ed3a0b6b1c0edf879ad9b11b1af5a0e6ab5db9205f891f668f8b0e6c6326e34e/base.pt -O ~/.achatbot/models/base.pt
# asr hf openai whisper ckpt for the transformers pipeline to load
huggingface-cli download openai/whisper-base --local-dir ~/.achatbot/models/openai/whisper-base --local-dir-use-symlinks False
# asr hf faster whisper (CTranslate2)
huggingface-cli download Systran/faster-whisper-base --local-dir ~/.achatbot/models/Systran/faster-whisper-base --local-dir-use-symlinks False
# asr SenseVoice ckpt
huggingface-cli download FunAudioLLM/SenseVoiceSmall --local-dir ~/.achatbot/models/FunAudioLLM/SenseVoiceSmall --local-dir-use-symlinks False
- llm model ckpt (default llamacpp ckpt (ggml) model: qwen-2 instruct 1.5B size)
# llm llamacpp Qwen2-Instruct
huggingface-cli download Qwen/Qwen2-1.5B-Instruct-GGUF qwen2-1_5b-instruct-q8_0.gguf --local-dir ~/.achatbot/models --local-dir-use-symlinks False
# llm llamacpp Qwen1.5-chat
huggingface-cli download Qwen/Qwen1.5-7B-Chat-GGUF qwen1_5-7b-chat-q8_0.gguf --local-dir ~/.achatbot/models --local-dir-use-symlinks False
# llm llamacpp phi-3-mini-4k-instruct
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-gguf Phi-3-mini-4k-instruct-q4.gguf --local-dir ~/.achatbot/models --local-dir-use-symlinks False
- tts model ckpt:
# tts chatTTS
huggingface-cli download 2Noise/ChatTTS --local-dir ~/.achatbot/models/2Noise/ChatTTS --local-dir-use-symlinks False
# tts coquiTTS
huggingface-cli download coqui/XTTS-v2 --local-dir ~/.achatbot/models/coqui/XTTS-v2 --local-dir-use-symlinks False
# tts cosy voice
git lfs install
git clone https://www.modelscope.cn/iic/CosyVoice-300M.git ~/.achatbot/models/CosyVoice-300M
git clone https://www.modelscope.cn/iic/CosyVoice-300M-SFT.git ~/.achatbot/models/CosyVoice-300M-SFT
git clone https://www.modelscope.cn/iic/CosyVoice-300M-Instruct.git ~/.achatbot/models/CosyVoice-300M-Instruct
#git clone https://www.modelscope.cn/iic/CosyVoice-ttsfrd.git ~/.achatbot/models/CosyVoice-ttsfrd
- run the local terminal chat bot with env; e.g.
- use default env params to run the local chat bot
ACHATBOT_PKG=1 TQDM_DISABLE=True \
python -m achatbot.cmd.local-terminal-chat.generate_audio2audio > ~/.achatbot/log/std_out.log
🌒 Run remote http fastapi daily chat bots
- run pip install "achatbot[fastapi_daily_bot_server]" to install the dependencies for the http fastapi daily chat bot;
- run the cmd below to start the http server; see the api docs at http://0.0.0.0:4321/docs
ACHATBOT_PKG=1 python -m achatbot.cmd.http.server.fastapi_daily_bot_serve
- run a chat bot processor, e.g.
- run a daily langchain rag bot api, with ui/educator-client
[!NOTE] you need to process youtube audio and save it to a local file with pytube; run pip install "achatbot[pytube,deep_translator]" to install the dependencies, transcribe/translate to text, then chunk into the vector store, and run the langchain rag bot api; run the data process:
ACHATBOT_PKG=1 python -m achatbot.cmd.bots.rag.data_process.youtube_audio_transcribe_to_tidb
or download the processed data from the hf dataset weege007/youtube_videos, then chunk it into the vector store.
curl -XPOST "http://0.0.0.0:4321/bot_join/chat-bot/DailyLangchainRAGBot" \
  -H "Content-Type: application/json" \
  -d $'{"config":{"llm":{"model":"llama-3.1-70b-versatile","messages":[{"role":"system","content":""}],"language":"zh"},"tts":{"tag":"cartesia_tts_processor","args":{"voice_id":"eda5bbff-1ff1-4886-8ef1-4e69a77640a0","language":"zh"}},"asr":{"tag":"deepgram_asr_processor","args":{"language":"zh","model":"nova-2"}}}}' | jq .
- run a simple daily chat bot api, with ui/web-client-ui (default language: zh)
curl -XPOST "http://0.0.0.0:4321/bot_join/DailyBot" \
  -H "Content-Type: application/json" \
  -d '{}' | jq .
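The same two calls can be made from Python with requests; the payloads below are copied from the curl examples above.

```python
# Python equivalent of the curl examples above (requires: pip install requests).
import requests

BASE = "http://0.0.0.0:4321"

# simple DailyBot join with an empty config
print(requests.post(f"{BASE}/bot_join/DailyBot", json={}).json())

# DailyLangchainRAGBot join with an explicit processor config
config = {
    "llm": {
        "model": "llama-3.1-70b-versatile",
        "messages": [{"role": "system", "content": ""}],
        "language": "zh",
    },
    "tts": {
        "tag": "cartesia_tts_processor",
        "args": {"voice_id": "eda5bbff-1ff1-4886-8ef1-4e69a77640a0", "language": "zh"},
    },
    "asr": {"tag": "deepgram_asr_processor", "args": {"language": "zh", "model": "nova-2"}},
}
print(
    requests.post(
        f"{BASE}/bot_join/chat-bot/DailyLangchainRAGBot", json={"config": config}
    ).json()
)
```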
🌓 Run remote rpc chat bot worker
- run pip install "achatbot[remote_rpc_chat_bot_be_worker]" to install the dependencies for the rpc chat bot BE worker; e.g.:
- use default env params to run the rpc chat bot BE worker
ACHATBOT_PKG=1 RUN_OP=be TQDM_DISABLE=True \
TTS_TAG=tts_edge \
python -m achatbot.cmd.grpc.terminal-chat.generate_audio2audio > ~/.achatbot/log/be_std_out.log
- run pip install "achatbot[remote_rpc_chat_bot_fe]" to install the dependencies for the rpc chat bot FE;
ACHATBOT_PKG=1 RUN_OP=fe \
TTS_TAG=tts_edge \
python -m achatbot.cmd.grpc.terminal-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log
🌔 Run remote queue chat bot worker
- run pip install "achatbot[remote_queue_chat_bot_be_worker]" to install the dependencies for the queue chat bot BE worker; e.g.:
- use default env params to run
ACHATBOT_PKG=1 REDIS_PASSWORD=$redis_pwd RUN_OP=be TQDM_DISABLE=True \
python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/be_std_out.log
- sense_voice (asr) -> qwen (llm) -> cosy_voice (tts)
you can log in to redislabs and create a 30M free database; set REDIS_HOST, REDIS_PORT and REDIS_PASSWORD to run, e.g.:
ACHATBOT_PKG=1 RUN_OP=be \
TQDM_DISABLE=True \
REDIS_PASSWORD=$redis_pwd \
REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
REDIS_PORT=14241 \
ASR_TAG=sense_voice_asr \
ASR_LANG=zn \
ASR_MODEL_NAME_OR_PATH=~/.achatbot/models/FunAudioLLM/SenseVoiceSmall \
N_GPU_LAYERS=33 FLASH_ATTN=1 \
LLM_MODEL_NAME=qwen \
LLM_MODEL_PATH=~/.achatbot/models/qwen1_5-7b-chat-q8_0.gguf \
TTS_TAG=tts_cosy_voice \
python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/be_std_out.log
- run pip install "achatbot[remote_queue_chat_bot_fe]" to install the required packages to run the queue chat bot frontend; e.g.:
- use default env params to run (default recorder: vad_recorder)
ACHATBOT_PKG=1 RUN_OP=fe \
REDIS_PASSWORD=$redis_pwd \
REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
REDIS_PORT=14241 \
python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log
- with wake word
ACHATBOT_PKG=1 RUN_OP=fe \
REDIS_PASSWORD=$redis_pwd \
REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
REDIS_PORT=14241 \
RECORDER_TAG=wakeword_rms_recorder \
python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log
- the default pyaudio player stream uses the tts tag's output sample info (rate, channels, ...), e.g. (the BE uses the tts_cosy_voice out stream info):
ACHATBOT_PKG=1 RUN_OP=fe \
REDIS_PASSWORD=$redis_pwd \
REDIS_HOST=redis-14241.c256.us-east-1-2.ec2.redns.redis-cloud.com \
REDIS_PORT=14241 \
TTS_TAG=tts_cosy_voice \
python -m achatbot.cmd.remote-queue-chat.generate_audio2audio > ~/.achatbot/log/fe_std_out.log
remote_queue_chat_bot_be_worker in colab examples:
- sense_voice (asr) -> qwen (llm) -> cosy_voice (tts)
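Conceptually, the FE and the BE worker meet in Redis; below is a minimal sketch of that hand-off with redis-py, where the key names and payloads are illustrative, not achatbot's internal wire format.

```python
# Minimal sketch of the FE <-> BE hand-off over Redis lists (redis-py).
# Key names and payloads are illustrative, not achatbot's internal format.
import os

import redis

r = redis.Redis(
    host=os.getenv("REDIS_HOST", "localhost"),
    port=int(os.getenv("REDIS_PORT", "6379")),
    password=os.getenv("REDIS_PASSWORD"),
)

# FE side: push a recorded audio chunk for the BE worker
r.rpush("chat:audio_in", b"...pcm bytes...")

# BE worker side: block until a chunk arrives, run the pipeline, push the reply
_key, chunk = r.blpop("chat:audio_in")
reply_audio = chunk  # placeholder for sense_voice (asr) -> qwen (llm) -> cosy_voice (tts)
r.rpush("chat:audio_out", reply_audio)

# FE side: play whatever the worker produced
_key, out_chunk = r.blpop("chat:audio_out")
```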
🌕 Run remote grpc tts speaker bot
- run pip install "achatbot[remote_grpc_tts_server]" to install the dependencies for the grpc tts speaker bot server;
ACHATBOT_PKG=1 python -m achatbot.cmd.grpc.speaker.server.serve
- run pip install "achatbot[remote_grpc_tts_client]" to install the dependencies for the grpc tts speaker bot client;
ACHATBOT_PKG=1 TTS_TAG=tts_edge IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_g IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_coqui IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_chat IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_cosy_voice IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_fishspeech IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_f5 IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_openvoicev2 IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_kokoro IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_onnx_kokoro IS_RELOAD=1 KOKORO_ESPEAK_NG_LIB_PATH=/usr/local/lib/libespeak-ng.1.dylib KOKORO_LANGUAGE=cmn python -m src.cmd.grpc.speaker.client
ACHATBOT_PKG=1 TTS_TAG=tts_cosy_voice2 \
COSY_VOICE_MODELS_DIR=./models/FunAudioLLM/CosyVoice2-0.5B \
COSY_VOICE_REFERENCE_AUDIO_PATH=./test/audio_files/asr_example_zh.wav \
IS_RELOAD=1 python -m src.cmd.grpc.speaker.client
📹 Multimodal Interaction
- stream-ocr (realtime-object-detection)
- Embodied Intelligence: Robots that touch the world, perceive and move
achatbot is released under the BSD 3-Clause license. (Additional code in this distribution is covered by the MIT and Apache open source licenses.) However, you may have other legal obligations that govern your use of content, such as the terms of service for third-party models.
Alternative AI tools for achatbot
Similar Open Source Tools

achatbot
achatbot is a factory tool that allows users to create chat bots with various functionalities such as llm (language models), asr (automatic speech recognition), tts (text-to-speech), vad (voice activity detection), ocr (optical character recognition), and object detection. The tool provides a structured project with features like chat bots for cmd, grpc, and http servers. It supports various chat bot processors, transport connectors, and AI modules for different tasks. Users can run chat bots locally or deploy them on cloud services like vercel, Cloudflare, AWS Lambda, or Docker. The tool also includes UI components for easy deployment and service architecture diagrams for reference.

llama.cpp
The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware - locally and in the cloud. It provides a Plain C/C++ implementation without any dependencies, optimized for Apple silicon via ARM NEON, Accelerate and Metal frameworks, and supports various architectures like AVX, AVX2, AVX512, and AMX. It offers integer quantization for faster inference, custom CUDA kernels for NVIDIA GPUs, Vulkan and SYCL backend support, and CPU+GPU hybrid inference. llama.cpp is the main playground for developing new features for the ggml library, supporting various models and providing tools and infrastructure for LLM deployment.

llama.cpp
llama.cpp is a C/C++ implementation for inference of LLaMA-family (and many other) large language models. It provides a command-line interface and server for tasks such as text generation and question answering, is highly optimized for performance, and runs on a wide variety of hardware, including CPUs and GPUs.

VideoLLaMA2
VideoLLaMA 2 is a project focused on advancing spatial-temporal modeling and audio understanding in video-LLMs. It provides tools for multi-choice video QA, open-ended video QA, and video captioning. The project offers model zoo with different configurations for visual encoder and language decoder. It includes training and evaluation guides, as well as inference capabilities for video and image processing. The project also features a demo setup for running a video-based Large Language Model web demonstration.

intel-extension-for-transformers
Intel® Extension for Transformers is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms, including Intel Gaudi2, Intel CPU, and Intel GPU. The toolkit provides the below key features and examples: * Seamless user experience of model compressions on Transformer-based models by extending [Hugging Face transformers](https://github.com/huggingface/transformers) APIs and leveraging [Intel® Neural Compressor](https://github.com/intel/neural-compressor) * Advanced software optimizations and unique compression-aware runtime (released with NeurIPS 2022's paper [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) and [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114), and NeurIPS 2021's paper [Prune Once for All: Sparse Pre-Trained Language Models](https://arxiv.org/abs/2111.05754)) * Optimized Transformer-based model packages such as [Stable Diffusion](examples/huggingface/pytorch/text-to-image/deployment/stable_diffusion), [GPT-J-6B](examples/huggingface/pytorch/text-generation/deployment), [GPT-NEOX](examples/huggingface/pytorch/language-modeling/quantization#2-validated-model-list), [BLOOM-176B](examples/huggingface/pytorch/language-modeling/inference#BLOOM-176B), [T5](examples/huggingface/pytorch/summarization/quantization#2-validated-model-list), [Flan-T5](examples/huggingface/pytorch/summarization/quantization#2-validated-model-list), and end-to-end workflows such as [SetFit-based text classification](docs/tutorials/pytorch/text-classification/SetFit_model_compression_AGNews.ipynb) and [document level sentiment analysis (DLSA)](workflows/dlsa) * [NeuralChat](intel_extension_for_transformers/neural_chat), a customizable chatbot framework to create your own chatbot within minutes by leveraging a rich set of [plugins](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/docs/advanced_features.md) such as [Knowledge Retrieval](./intel_extension_for_transformers/neural_chat/pipeline/plugins/retrieval/README.md), [Speech Interaction](./intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/README.md), [Query Caching](./intel_extension_for_transformers/neural_chat/pipeline/plugins/caching/README.md), and [Security Guardrail](./intel_extension_for_transformers/neural_chat/pipeline/plugins/security/README.md). This framework supports Intel Gaudi2/CPU/GPU. * [Inference](https://github.com/intel/neural-speed/tree/main) of Large Language Model (LLM) in pure C/C++ with weight-only quantization kernels for Intel CPU and Intel GPU (TBD), supporting [GPT-NEOX](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptneox), [LLAMA](https://github.com/intel/neural-speed/tree/main/neural_speed/models/llama), [MPT](https://github.com/intel/neural-speed/tree/main/neural_speed/models/mpt), [FALCON](https://github.com/intel/neural-speed/tree/main/neural_speed/models/falcon), [BLOOM-7B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/bloom), [OPT](https://github.com/intel/neural-speed/tree/main/neural_speed/models/opt), [ChatGLM2-6B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/chatglm), [GPT-J-6B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptj), and [Dolly-v2-3B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptneox). Support AMX, VNNI, AVX512F and AVX2 instruction set. 
We've boosted the performance of Intel CPUs, with a particular focus on the 4th generation Intel Xeon Scalable processor, codenamed [Sapphire Rapids](https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html).

gpt_academic
GPT Academic is a powerful tool that leverages the capabilities of large language models (LLMs) to enhance academic research and writing. It provides a user-friendly interface that allows researchers, students, and professionals to interact with LLMs and utilize their abilities for various academic tasks. With GPT Academic, users can access a wide range of features and functionalities, including: * **Summarization and Paraphrasing:** GPT Academic can summarize complex texts, articles, and research papers into concise and informative summaries. It can also paraphrase text to improve clarity and readability. * **Question Answering:** Users can ask GPT Academic questions related to their research or studies, and the tool will provide comprehensive and well-informed answers based on its knowledge and understanding of the relevant literature. * **Code Generation and Explanation:** GPT Academic can generate code snippets and provide explanations for complex coding concepts. It can also help debug code and suggest improvements. * **Translation:** GPT Academic supports translation of text between multiple languages, making it a valuable tool for researchers working with international collaborations or accessing resources in different languages. * **Citation and Reference Management:** GPT Academic can help users manage their citations and references by automatically generating citations in various formats and providing suggestions for relevant references based on the user's research topic. * **Collaboration and Note-Taking:** GPT Academic allows users to collaborate on projects and take notes within the tool. They can share their work with others and access a shared workspace for real-time collaboration. * **Customizable Interface:** GPT Academic offers a customizable interface that allows users to tailor the tool to their specific needs and preferences. They can choose from a variety of themes, adjust the layout, and add or remove features to create a personalized workspace. Overall, GPT Academic is a versatile and powerful tool that can significantly enhance the productivity and efficiency of academic research and writing. It empowers users to leverage the capabilities of LLMs and unlock new possibilities for academic exploration and knowledge creation.

chatgpt-auto-continue
ChatGPT Auto-Continue is a userscript that automatically continues generating ChatGPT responses when chats cut off. It relies on the powerful chatgpt.js library and is easy to install and use. Simply install Tampermonkey and ChatGPT Auto-Continue, and visit chat.openai.com as normal. Multi-reply conversations will automatically continue generating when cut-off!

UMOE-Scaling-Unified-Multimodal-LLMs
Uni-MoE is a MoE-based unified multimodal model that can handle diverse modalities including audio, speech, image, text, and video. The project focuses on scaling Unified Multimodal LLMs with a Mixture of Experts framework. It offers enhanced functionality for training across multiple nodes and GPUs, as well as parallel processing at both the expert and modality levels. The model architecture involves three training stages: building connectors for multimodal understanding, developing modality-specific experts, and incorporating multiple trained experts into LLMs using the LoRA technique on mixed multimodal data. The tool provides instructions for installation, weights organization, inference, training, and evaluation on various datasets.

VideoRefer
VideoRefer Suite is a tool designed to enhance the fine-grained spatial-temporal understanding capabilities of Video Large Language Models (Video LLMs). It consists of three primary components: Model (VideoRefer) for perceiving, reasoning, and retrieval for user-defined regions at any specified timestamps, Dataset (VideoRefer-700K) for high-quality object-level video instruction data, and Benchmark (VideoRefer-Bench) to evaluate object-level video understanding capabilities. The tool can understand any object within a video.

LLaMA-Factory
LLaMA Factory is a unified framework for fine-tuning 100+ large language models (LLMs) with various methods, including pre-training, supervised fine-tuning, reward modeling, PPO, DPO and ORPO. It features integrated algorithms like GaLore, BAdam, DoRA, LongLoRA, LLaMA Pro, LoRA+, LoftQ and Agent tuning, as well as practical tricks like FlashAttention-2, Unsloth, RoPE scaling, NEFTune and rsLoRA. LLaMA Factory provides experiment monitors like LlamaBoard, TensorBoard, Wandb, MLflow, etc., and supports faster inference with OpenAI-style API, Gradio UI and CLI with vLLM worker. Compared to ChatGLM's P-Tuning, LLaMA Factory's LoRA tuning offers up to 3.7 times faster training speed with a better Rouge score on the advertising text generation task. By leveraging 4-bit quantization technique, LLaMA Factory's QLoRA further improves the efficiency regarding the GPU memory.

LLMs-Zero-to-Hero
LLMs-Zero-to-Hero is a repository dedicated to training large language models (LLMs) from scratch, covering topics such as dense models, MOE models, pre-training, supervised fine-tuning, direct preference optimization, reinforcement learning from human feedback, and deploying large models. The repository provides detailed learning notes for different chapters, code implementations, and resources for training and deploying LLMs. It aims to guide users from being beginners to proficient in building and deploying large language models.

InternLM
InternLM is a powerful language model series with features such as 200K context window for long-context tasks, outstanding comprehensive performance in reasoning, math, code, chat experience, instruction following, and creative writing, code interpreter & data analysis capabilities, and stronger tool utilization capabilities. It offers models in sizes of 7B and 20B, suitable for research and complex scenarios. The models are recommended for various applications and exhibit better performance than previous generations. InternLM models may match or surpass other open-source models like ChatGPT. The tool has been evaluated on various datasets and has shown superior performance in multiple tasks. It requires Python >= 3.8, PyTorch >= 1.12.0, and Transformers >= 4.34 for usage. InternLM can be used for tasks like chat, agent applications, fine-tuning, deployment, and long-context inference.

chatgpt-infinity
ChatGPT Infinity is a free and powerful add-on that makes ChatGPT generate infinite answers on any topic. It offers customizable topic selection, multilingual support, adjustable response interval, and auto-scroll feature for a seamless chat experience.

vectordb-recipes
This repository contains examples, applications, starter code, & tutorials to help you kickstart your GenAI projects. * These are built using LanceDB, a free, open-source, serverless vectorDB that **requires no setup**. * It **integrates into python data ecosystem** so you can simply start using these in your existing data pipelines in pandas, arrow, pydantic etc. * LanceDB has **native Typescript SDK** using which you can **run vector search** in serverless functions! This repository is divided into 3 sections: - Examples - Get right into the code with minimal introduction, aimed at getting you from an idea to PoC within minutes! - Applications - Ready to use Python and web apps using applied LLMs, VectorDB and GenAI tools - Tutorials - A curated list of tutorials, blogs, Colabs and courses to get you started with GenAI in greater depth.

ms-copilot-play
Microsoft Copilot Play is a Cloudflare Worker service that accelerates Microsoft Copilot functionalities in China. It allows high-speed access to Microsoft Copilot features like chatting, notebook, plugins, image generation, and sharing. The service filters out meaningless requests used for statistics, saving up to 80% of Cloudflare Worker requests. Users can deploy the service easily with Cloudflare Worker, ensuring fast and unlimited access with no additional operations. The service leverages the power of Microsoft Copilot, based on OpenAI GPT-4, and utilizes Bing search to answer questions.

mnn-llm
MNN-LLM is a high-performance inference engine for large language models (LLMs) on mobile and embedded devices. It provides optimized implementations of popular LLM models, such as ChatGPT, BLOOM, and GPT-3, enabling developers to easily integrate these models into their applications. MNN-LLM is designed to be efficient and lightweight, making it suitable for resource-constrained devices. It supports various deployment options, including mobile apps, web applications, and embedded systems. With MNN-LLM, developers can leverage the power of LLMs to enhance their applications with natural language processing capabilities, such as text generation, question answering, and dialogue generation.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.