june
Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit
Stars: 640
june-va is a local voice chatbot that combines Ollama for language model capabilities, Hugging Face Transformers for speech recognition, and the Coqui TTS Toolkit for text-to-speech synthesis. It provides a flexible, privacy-focused solution for voice-assisted interactions on your local machine, ensuring that no data is sent to external servers. The tool supports four interaction modes: text input/output, voice input/text output, text input/audio output, and voice input/audio output. Users can customize its behavior with a JSON configuration file that has attributes for the language model, speech-to-text model, and text-to-speech model, and can use voice cloning with text-to-speech models that support it.
README:
june is a local voice chatbot that combines the power of Ollama (for language model capabilities), Hugging Face Transformers (for speech recognition), and the Coqui TTS Toolkit (for text-to-speech synthesis). It provides a flexible, privacy-focused solution for voice-assisted interactions on your local machine, ensuring that no data is sent to external servers.
- Text Input/Output: Provide text inputs to the assistant and receive text responses.
- Voice Input/Text Output: Use your microphone to give voice inputs, and receive text responses from the assistant.
- Text Input/Audio Output: Provide text inputs and receive both text and synthesised audio responses from the assistant.
- Voice Input/Audio Output (Default): Use your microphone for voice inputs, and receive responses in both text and synthesised audio form.
- Ollama
- Python 3.10 or greater (with pip)
- Python development package (e.g. apt install python3-dev for Debian), only for GNU/Linux
- PortAudio development package (e.g. apt install portaudio19-dev for Debian), only for GNU/Linux
- PortAudio (e.g. brew install portaudio using Homebrew), only for macOS
- Microsoft Visual C++ 14.0 or greater, only for Windows
To install june directly from the GitHub repository:
pip install git+https://github.com/mezbaul-h/june.git@master
Alternatively, you can clone the repository and install it locally:
git clone https://github.com/mezbaul-h/june.git
cd june
pip install .
Pull the language model (default is llama3.1:8b-instruct-q4_0) with Ollama first, if you haven't already:
ollama pull llama3.1:8b-instruct-q4_0
Next, run the program (with default configuration):
june-va
This will use llama3.1:8b-instruct-q4_0 for LLM capabilities, openai/whisper-small.en for speech recognition, and tts_models/en/ljspeech/glow-tts for audio synthesis.
You can also customize the behaviour of the program with a JSON configuration file:
june-va --config path/to/config.json
Note: The configuration file is optional. To learn more about the structure of the config file, see the Customization section.
The application can be customised using a configuration file. The config file must be a JSON file. The default configuration is as follows:
{
"llm": {
"disable_chat_history": false,
"model": "llama3.1:8b-instruct-q4_0"
},
"stt": {
"device": "torch device identifier (`cuda` if available; otherwise `cpu`)",
"generation_args": {
"batch_size": 8
},
"model": "openai/whisper-small.en"
},
"tts": {
"device": "torch device identifier (`cuda` if available; otherwise `cpu`)",
"model": "tts_models/en/ljspeech/glow-tts"
}
}
When you use a configuration file, it overrides the default configuration but does not overwrite it. So you can partially modify the configuration if you desire. For instance, if you do not wish to use speech recognition and only want to provide prompts through text, you can disable that by using a config file with the following configuration:
{
"stt": null
}
Similarly, you can disable the audio synthesiser, or both, to only use the virtual assistant in text mode.
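The four interaction modes can be read as two switches: whether speech recognition is on and whether audio synthesis is on. The sketch below builds the partial config for each combination. The helper name mode_config is hypothetical, and setting "tts" to null is an assumption based on the sentence above rather than something the README shows explicitly.

```python
import json


def mode_config(voice_input: bool, audio_output: bool) -> dict:
    """Return a partial config that disables the pipelines a mode does not need.

    Hypothetical helper, not part of june itself.
    """
    config = {}
    if not voice_input:
        config["stt"] = None  # serialised as JSON null, as in the example above
    if not audio_output:
        config["tts"] = None  # assumption: tts is disabled the same way as stt
    return config


# Text input / text output: both pipelines disabled.
print(json.dumps(mode_config(voice_input=False, audio_output=False), indent=2))
```

Saving the printed JSON to a file and passing it with --config would then select that mode; the default mode (voice in, audio out) needs no overrides at all.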
If you only want to modify the device on which you want to load a particular type of model, without changing the other default attributes of the model, you could use:
{
"tts": {
"device": "cpu"
}
}
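To make the "overrides but does not overwrite" behaviour concrete, here is a minimal sketch of a recursive merge that would produce it. This is an illustration only: june's actual merge logic may differ, merge_config is a hypothetical name, and the device values in DEFAULTS are placeholders for whatever torch device is detected at runtime.

```python
import copy

# Defaults as documented above; "cuda" stands in for the auto-detected device.
DEFAULTS = {
    "llm": {"disable_chat_history": False, "model": "llama3.1:8b-instruct-q4_0"},
    "stt": {
        "device": "cuda",
        "generation_args": {"batch_size": 8},
        "model": "openai/whisper-small.en",
    },
    "tts": {"device": "cuda", "model": "tts_models/en/ljspeech/glow-tts"},
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied key by key."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them so untouched keys survive.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and null replace the default outright.
            merged[key] = copy.deepcopy(value)
    return merged


effective = merge_config(DEFAULTS, {"tts": {"device": "cpu"}})
print(effective["tts"])
```

With this approach, changing only tts.device leaves tts.model at its default, while {"stt": null} replaces the whole stt section and so disables it.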
- llm.device: Torch device identifier (e.g., cpu, cuda, mps) on which the pipeline will be allocated.
- llm.disable_chat_history: Boolean indicating whether to disable or enable chat history. Enabling chat history will make interactions more dynamic, as the model will have access to previous contexts, but it will consume more processing power. Disabling it will result in less interactive conversations but will use fewer processing resources.
- llm.model: Name of the text-generation model tag on Ollama. Ensure this is a valid model tag that exists on your machine.
- llm.system_prompt: Give a system prompt to the model. If the underlying model does not support a system prompt, an error will be raised.
- stt.device: Torch device identifier (e.g., cpu, cuda, mps) on which the pipeline will be allocated.
- stt.generation_args: Object containing generation arguments accepted by Hugging Face's speech recognition pipeline.
- stt.model: Name of the speech recognition model on Hugging Face. Ensure this is a valid model ID that exists on Hugging Face.
- tts.device: Torch device identifier (e.g., cpu, cuda, mps) on which the pipeline will be allocated.
- tts.generation_args: Object containing generation arguments accepted by Coqui's TTS API.
- tts.model: Name of the text-to-speech model supported by Coqui's TTS Toolkit. Ensure this is a valid model ID.
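Because a misspelled attribute in a JSON file is easy to miss, a quick check against the documented attribute names can help before launching the program. The sketch below is a hypothetical helper, not something june provides; the attribute sets are taken from the list above and the default configuration, and june itself may accept or reject keys differently.

```python
# Attribute names per section, as documented for the config file.
KNOWN_ATTRIBUTES = {
    "llm": {"device", "disable_chat_history", "model", "system_prompt"},
    "stt": {"device", "generation_args", "model"},
    "tts": {"device", "generation_args", "model"},
}


def unknown_attributes(config: dict) -> list[str]:
    """Return dotted names of sections or attributes that are not documented."""
    problems = []
    for section, value in config.items():
        if section not in KNOWN_ATTRIBUTES:
            problems.append(section)
        elif value is None:
            continue  # null disables the section, which is allowed
        elif not isinstance(value, dict):
            problems.append(section)  # a section must be an object or null
        else:
            for name in value:
                if name not in KNOWN_ATTRIBUTES[section]:
                    problems.append(f"{section}.{name}")
    return problems


print(unknown_attributes({"tts": {"devcie": "cpu"}, "stt": None}))
```

An empty list means every key in the file matches a documented attribute; it does not check that the values themselves are valid.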
After seeing the [system]> Listening for sound... message, you can speak directly into the microphone. Unlike typical voice assistants, there's no wake command required. Simply start speaking, and the tool will automatically detect and process your voice input. Once you finish speaking, maintain silence for 3 seconds to allow the assistant to process your voice input.
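The "speak, then stay silent for 3 seconds" rule is a common end-of-utterance heuristic. The following is a generic sketch of that idea, not june's actual implementation: it assumes audio arrives as fixed-length frames that have already been reduced to a loudness level, and the function name, the threshold, and the frame length are all illustrative.

```python
def end_of_utterance(frame_levels, frame_seconds, threshold, silence_seconds=3.0):
    """Return the index of the frame where the utterance is considered finished.

    frame_levels: loudness per frame (e.g. RMS), in any consistent unit.
    Returns None if no speech followed by enough silence is found.
    """
    heard_speech = False
    quiet_frames = 0
    for index, level in enumerate(frame_levels):
        if level >= threshold:
            heard_speech = True
            quiet_frames = 0  # any sound resets the silence timer
        elif heard_speech:
            quiet_frames += 1
            if quiet_frames * frame_seconds >= silence_seconds:
                return index
    return None


# Two loud frames followed by six quiet half-second frames (3 seconds).
levels = [0.0, 0.0, 0.8, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(end_of_utterance(levels, frame_seconds=0.5, threshold=0.1))
```

Silence before the first loud frame is ignored, which matches the behaviour described above: the tool waits until you start speaking, and only a pause after speech ends the input.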
Many of the models (e.g., tts_models/multilingual/multi-dataset/xtts_v2) supported by Coqui's TTS Toolkit support voice cloning. You can use your own speaker profile with a small audio clip (approximately 1 minute for most models). Once you have the clip, you can instruct the assistant to use it with a custom configuration like the following:
{
"tts": {
"model": "tts_models/multilingual/multi-dataset/xtts_v2",
"generation_args": {
"language": "en",
"speaker_wav": "/path/to/your/target/voice.wav"
}
}
}
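If you switch between several speaker clips, generating this configuration can be more convenient than editing it by hand. The sketch below simply reproduces the JSON above for a given clip path; voice_clone_config is a hypothetical helper, and the model name and generation_args keys are copied from the example rather than checked against Coqui's API.

```python
import json


def voice_clone_config(speaker_wav: str, language: str = "en") -> str:
    """Return the voice-cloning config shown above as a JSON string."""
    config = {
        "tts": {
            "model": "tts_models/multilingual/multi-dataset/xtts_v2",
            "generation_args": {
                "language": language,
                "speaker_wav": speaker_wav,  # path to your own audio clip
            },
        }
    }
    return json.dumps(config, indent=2)


print(voice_clone_config("/path/to/your/target/voice.wav"))
```

Write the returned string to a file and pass that file to june-va with --config.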
You can integrate a remotely hosted Ollama instance with june instead of using a local instance. Here's how to do it:
- Set the OLLAMA_HOST environment variable to the appropriate URL of your remote Ollama instance.
- Run the program as usual.
To use a remote Ollama instance, you would use a command like this:
OLLAMA_HOST=http://localhost:11434 june-va
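The same thing can be done from a Python launcher script, which is handy on shells that do not support the inline VAR=value syntax. This is a minimal sketch: the function names are hypothetical, and it assumes june-va is on your PATH and reads OLLAMA_HOST from its environment as described above.

```python
import os
import subprocess


def env_for_remote_ollama(url, base_env=None):
    """Return a copy of the environment with OLLAMA_HOST pointing at url."""
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_HOST"] = url
    return env


def run_june(url, config_path=None):
    """Start june-va against the given Ollama URL (blocks until it exits)."""
    command = ["june-va"]
    if config_path is not None:
        command += ["--config", config_path]
    return subprocess.run(command, env=env_for_remote_ollama(url), check=False)


# Example (not executed here): run_june("http://localhost:11434")
print(env_for_remote_ollama("http://localhost:11434", {})["OLLAMA_HOST"])
```

Only the child process sees the modified environment, so your shell's own OLLAMA_HOST setting is left untouched.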
Alternative AI tools for june
Similar Open Source Tools
lexido
Lexido is an innovative assistant for the Linux command line, designed to boost your productivity and efficiency. Powered by Gemini Pro 1.0 and utilizing the free API, Lexido offers smart suggestions for commands based on your prompts and importantly your current environment. Whether you're installing software, managing files, or configuring system settings, Lexido streamlines the process, making it faster and more intuitive.
langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
shellChatGPT
ShellChatGPT is a shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS, featuring integration with LocalAI, Ollama, Gemini, Mistral, Groq, and GitHub Models. It provides text and chat completions, vision, reasoning, and audio models, voice-in and voice-out chatting mode, text editor interface, markdown rendering support, session management, instruction prompt manager, integration with various service providers, command line completion, file picker dialogs, color scheme personalization, stdin and text file input support, and compatibility with Linux, FreeBSD, MacOS, and Termux for a responsive experience.
bedrock-claude-chat
This repository is a sample chatbot using the Anthropic company's LLM Claude, one of the foundational models provided by Amazon Bedrock for generative AI. It allows users to have basic conversations with the chatbot, personalize it with their own instructions and external knowledge, and analyze usage for each user/bot on the administrator dashboard. The chatbot supports various languages, including English, Japanese, Korean, Chinese, French, German, and Spanish. Deployment is straightforward and can be done via the command line or by using AWS CDK. The architecture is built on AWS managed services, eliminating the need for infrastructure management and ensuring scalability, reliability, and security.
allms
allms is a versatile and powerful library designed to streamline the process of querying Large Language Models (LLMs). Developed by Allegro engineers, it simplifies working with LLM applications by providing a user-friendly interface, asynchronous querying, automatic retrying mechanism, error handling, and output parsing. It supports various LLM families hosted on different platforms like OpenAI, Google, Azure, and GCP. The library offers features for configuring endpoint credentials, batch querying with symbolic variables, and forcing structured output format. It also provides documentation, quickstart guides, and instructions for local development, testing, updating documentation, and making new releases.
patchwork
PatchWork is an open-source framework designed for automating development tasks using large language models. It enables users to automate workflows such as PR reviews, bug fixing, security patching, and more through a self-hosted CLI agent and preferred LLMs. The framework consists of reusable atomic actions called Steps, customizable LLM prompts known as Prompt Templates, and LLM-assisted automations called Patchflows. Users can run Patchflows locally in their CLI/IDE or as part of CI/CD pipelines. PatchWork offers predefined patchflows like AutoFix, PRReview, GenerateREADME, DependencyUpgrade, and ResolveIssue, with the flexibility to create custom patchflows. Prompt templates are used to pass queries to LLMs and can be customized. Contributions to new patchflows, steps, and the core framework are encouraged, with chat assistants available to aid in the process. The roadmap includes expanding the patchflow library, introducing a debugger and validation module, supporting large-scale code embeddings, parallelization, fine-tuned models, and an open-source GUI. PatchWork is licensed under AGPL-3.0 terms, while custom patchflows and steps can be shared using the Apache-2.0 licensed patchwork template repository.
LLM-Finetuning-Toolkit
LLM Finetuning toolkit is a config-based CLI tool for launching a series of LLM fine-tuning experiments on your data and gathering their results. It allows users to control all elements of a typical experimentation pipeline - prompts, open-source LLMs, optimization strategy, and LLM testing - through a single YAML configuration file. The toolkit supports basic, intermediate, and advanced usage scenarios, enabling users to run custom experiments, conduct ablation studies, and automate fine-tuning workflows. It provides features for data ingestion, model definition, training, inference, quality assurance, and artifact outputs, making it a comprehensive tool for fine-tuning large language models.
mem0
Mem0 is a tool that provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI experiences across applications. It offers persistent memory for users, sessions, and agents, self-improving personalization, a simple API for easy integration, and cross-platform consistency. Users can store memories, retrieve memories, search for related memories, update memories, get the history of a memory, and delete memories using Mem0. It is designed to enhance AI experiences by enabling long-term memory storage and retrieval.
neo4j-graphrag-python
The Neo4j GraphRAG package for Python is an official repository that provides features for creating and managing vector indexes in Neo4j databases. It aims to offer developers a reliable package with long-term commitment, maintenance, and fast feature updates. The package supports various Python versions and includes functionalities for creating vector indexes, populating them, and performing similarity searches. It also provides guidelines for installation, examples, and development processes such as installing dependencies, making changes, and running tests.
aimeos-typo3
Aimeos is a professional, full-featured, and high-performance e-commerce extension for TYPO3. It can be installed in an existing TYPO3 website within 5 minutes and can be adapted, extended, overwritten, and customized to meet specific needs.
HippoRAG
HippoRAG is a novel retrieval augmented generation (RAG) framework inspired by the neurobiology of human long-term memory that enables Large Language Models (LLMs) to continuously integrate knowledge across external documents. It provides RAG systems with capabilities that usually require a costly and high-latency iterative LLM pipeline for only a fraction of the computational cost. The tool facilitates setting up retrieval corpus, indexing, and retrieval processes for LLMs, offering flexibility in choosing different online LLM APIs or offline LLM deployments through LangChain integration. Users can run retrieval on pre-defined queries or integrate directly with the HippoRAG API. The tool also supports reproducibility of experiments and provides data, baselines, and hyperparameter tuning scripts for research purposes.
shell-ai
Shell-AI (`shai`) is a CLI utility that enables users to input commands in natural language and receive single-line command suggestions. It leverages natural language understanding and interactive CLI tools to enhance command line interactions. Users can describe tasks in plain English and receive corresponding command suggestions, making it easier to execute commands efficiently. Shell-AI supports cross-platform usage and is compatible with Azure OpenAI deployments, offering a user-friendly and efficient way to interact with the command line.
chatgpt-vscode
ChatGPT-VSCode is a Visual Studio Code integration that allows users to prompt OpenAI's GPT-4, GPT-3.5, GPT-3, and Codex models within the editor. It offers features like using improved models via OpenAI API Key, Azure OpenAI Service deployments, generating commit messages, storing conversation history, explaining and suggesting fixes for compile-time errors, viewing code differences, and more. Users can customize prompts, quick fix problems, save conversations, and export conversation history. The extension is designed to enhance developer experience by providing AI-powered assistance directly within VS Code.
code2prompt
Code2Prompt is a powerful command-line tool that generates comprehensive prompts from codebases, designed to streamline interactions between developers and Large Language Models (LLMs) for code analysis, documentation, and improvement tasks. It bridges the gap between codebases and LLMs by converting projects into AI-friendly prompts, enabling users to leverage AI for various software development tasks. The tool offers features like holistic codebase representation, intelligent source tree generation, customizable prompt templates, smart token management, Gitignore integration, flexible file handling, clipboard-ready output, multiple output options, and enhanced code readability.
rclip
rclip is a command-line photo search tool powered by the OpenAI's CLIP neural network. It allows users to search for images using text queries, similar image search, and combining multiple queries. The tool extracts features from photos to enable searching and indexing, with options for previewing results in supported terminals or custom viewers. Users can install rclip on Linux, macOS, and Windows using different installation methods. The repository follows the Conventional Commits standard and welcomes contributions from the community.
For similar tasks
serverless-rag-demo
The serverless-rag-demo repository showcases a solution for building a Retrieval Augmented Generation (RAG) system using Amazon Opensearch Serverless Vector DB, Amazon Bedrock, Llama2 LLM, and Falcon LLM. The solution leverages generative AI powered by large language models to generate domain-specific text outputs by incorporating external data sources. Users can augment prompts with relevant context from documents within a knowledge library, enabling the creation of AI applications without managing vector database infrastructure. The repository provides detailed instructions on deploying the RAG-based solution, including prerequisites, architecture, and step-by-step deployment process using AWS Cloudshell.
llm
The 'llm' package for Emacs provides an interface for interacting with Large Language Models (LLMs). It abstracts functionality to a higher level, concealing API variations and ensuring compatibility with various LLMs. Users can set up providers like OpenAI, Gemini, Vertex, Claude, Ollama, GPT4All, and a fake client for testing. The package allows for chat interactions, embeddings, token counting, and function calling. It also offers advanced prompt creation and logging capabilities. Users can handle conversations, create prompts with placeholders, and contribute by creating providers.
parakeet
Parakeet is a Go library for creating GenAI apps with Ollama. It enables the creation of generative AI applications that can generate text-based content. The library provides tools for simple completion, completion with context, chat completion, and more. It also supports function calling with tools and Wasm plugins. Parakeet allows users to interact with language models and create AI-powered applications easily.
sparkle
Sparkle is a tool that streamlines the process of building AI-driven features in applications using Large Language Models (LLMs). It guides users through creating and managing agents, defining tools, and interacting with LLM providers like OpenAI. Sparkle allows customization of LLM provider settings, model configurations, and provides a seamless integration with Sparkle Server for exposing agents via an OpenAI-compatible chat API endpoint.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
agentcloud
AgentCloud is an open-source platform that enables companies to build and deploy private LLM chat apps, empowering teams to securely interact with their data. It comprises three main components: Agent Backend, Webapp, and Vector Proxy. To run this project locally, clone the repository, install Docker, and start the services. The project is licensed under the GNU Affero General Public License, version 3 only. Contributions and feedback are welcome from the community.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (containing a demo web application, Power BI reports, Synapse resources, AML Notebooks, etc.) that can be deployed in a customer's subscription using the CAPE tool within a few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.
executorch
ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices. Key value propositions of ExecuTorch are:
- Portability: Compatibility with a wide variety of computing platforms, from high-end mobile phones to highly constrained embedded systems and microcontrollers.
- Productivity: Enabling developers to use the same toolchains and SDK from PyTorch model authoring and conversion, to debugging and deployment to a wide variety of platforms.
- Performance: Providing end users with a seamless and high-performance experience due to a lightweight runtime and utilizing full hardware capabilities such as CPUs, NPUs, and DSPs.
autogen
AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks. AutoGen agents are customizable, conversable, and seamlessly allow human participation. They can operate in various modes that employ combinations of LLMs, human inputs, and tools.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:
- Self-contained, with no need for a DBMS or cloud service.
- OpenAPI interface, easy to integrate with existing infrastructure (e.g. Cloud IDE).
- Supports consumer-grade GPUs.