Best AI Tools for Enabling Voice Commands
20 - AI tool Sites
Voqal
Voqal is a natural-speech programming assistant for software developers. It integrates models such as GPT-4o and Gemini 1.5 Flash to enable voice-based coding, navigation, execution, debugging, and refactoring. Voqal supports multiple spoken languages and offers a hands-free coding experience for developers who want a more intuitive way to interact with their IDEs. The platform provides guides on setting up Voqal, using its basic and advanced features, and customizing it to suit individual coding styles.
PayGenie
PayGenie is an AI-powered invoicing assistant that lets users create and manage invoices using voice commands and automation. It aims to save time and reduce errors by automating invoice creation. The tool provides customizable templates, real-time insights, and smart time tracking to streamline the invoicing process. Access is currently through a waitlist.
Dubbing AI
Dubbing AI is a free AI voice changer that alters your voice in real time as you speak. It offers a variety of voice effects and filters for customizing your voice, and it can also be used to create funny or unique voiceovers for videos and presentations.
Picovoice
Picovoice is an on-device Voice AI and local LLM platform designed for enterprises. It offers a range of voice AI and LLM solutions, including speech-to-text, noise suppression, speaker recognition, speech-to-index, wake word detection, and more. Picovoice empowers developers to build virtual assistants and AI-powered products with compliance, reliability, and scalability in mind. The platform allows enterprises to process data locally without relying on third-party remote servers, ensuring data privacy and security. With a focus on cutting-edge AI technology, Picovoice enables users to stay ahead of the curve and adapt quickly to changing customer needs.
Spoken AI
Spoken AI is an innovative AI tool that enables users to interact with technology through voice commands. It leverages cutting-edge natural language processing and machine learning algorithms to understand and respond to spoken language. With Spoken AI, users can perform various tasks hands-free, such as setting reminders, sending messages, playing music, and getting weather updates. The application aims to enhance user experience by providing a seamless and intuitive way to engage with devices using voice input.
Vapi
Vapi is a Voice AI tool designed specifically for developers. It enables developers to interact with their code using voice commands, making the coding process more efficient and hands-free. With Vapi, developers can perform various tasks such as writing code, debugging, and running tests simply by speaking. The tool is equipped with advanced natural language processing capabilities to accurately interpret and execute voice commands. Vapi aims to revolutionize the way developers work by providing a seamless and intuitive coding experience.
Meta AI
Meta AI is an advanced artificial intelligence tool that enables users to learn, create, and explore the world around them. With features like AI Studio for creating custom AIs and Llama for building the future of AI, Meta AI offers cutting-edge technology to bring visions to life. Users can engage with AI characters, identify objects, and have conversations using voice commands. The platform is designed to make AI more accessible and engaging for everyone, with a focus on open collaboration and innovation.
SpeakStruct
SpeakStruct is an AI-powered application that enables professionals, businesses, and developers to effortlessly convert voice input into structured formats using customizable templates. The platform leverages advanced AI and natural language processing to ensure high accuracy in voice transcription and data structuring, making it ideal for various industries such as sales & marketing, customer support, product & engineering, financial/mortgage advisors, and healthcare professionals. SpeakStruct's flexible template builder allows users to tailor the application to their specific needs, capturing voice input from any channel and transforming it into a consistent, structured format.
Witlingo
Witlingo is a multi-channel, multi-lingual community engagement and communication platform that focuses on senior living. It offers a generative AI home product that simplifies pricing, facilitates testimonials, and enables easy login. The platform allows users to send notifications and receive responses via text, phone, and smart speakers in over 20 languages. Witlingo aims to voice-enable the world by providing digital audio glossaries and voicebot services.
AviaryAI
AviaryAI is an AI tool that offers outbound AI voice agents, real-time translation, and a knowledge base tailored for the financial services industry. It aims to help credit unions, insurance companies, and banks enhance customer interactions, streamline processes, and drive revenue through generative AI technology. AviaryAI is backed by Y Combinator and emphasizes secure, compliant, and ethical AI development. With a focus on deep domain expertise and quick implementation, AviaryAI enables organizations to maximize outreach, save time, and improve multilingual communication.
Ascenscia
Ascenscia is a specialized AI voice assistant designed to streamline lab digitization processes. It integrates with laboratory software and machines to enable hands-free interactions, automating data collection, optimizing workflows, and accelerating R&D cycles. Ascenscia offers features such as data accessibility, data capturing, inventory access, and additional task management. The application is designed for scientific labs, addressing concerns with precision, safety, and adaptability. It boasts high accuracy in understanding scientific terminologies, end-to-end data encryption, multi-lingual support, and customization options for different lab workflows.
Voicemy.ai
Voicemy.ai is an AI application that allows users to create AI voices and songs. Users can clone voices of famous personalities, compose melodies, and convert text into spoken words using chosen voice models. The platform aims to inspire creativity and enable users to share their passion with the world.
SpeakShift
SpeakShift is a language translation business that provides a comprehensive suite of software and solutions that enable real-time translation of speech, video, and live streaming presentations. Their AI-powered voice translation technology enables seamless communication between people who speak different languages. SpeakShift's video dubbing services make it easy to create multilingual content that resonates with viewers worldwide. Their perception-enabled language analytics technology provides real-time insights about the language used in your content.
Lingvanex
Lingvanex is a cloud-based machine translation and speech recognition platform that provides businesses with a variety of tools to translate text, documents, and speech in over 100 languages. The platform is powered by artificial intelligence (AI) and machine learning (ML) technologies, which enable it to deliver high-quality translations that are both accurate and fluent. Lingvanex also offers a variety of features that make it easy for businesses to integrate translation and speech recognition into their workflows, including APIs, SDKs, and plugins for popular programming languages and platforms.
International Institute of Business Analysis (IIBA)
The International Institute of Business Analysis (IIBA) is a global standards body for business analysis. Its website offers resources, certifications, and best practices for business analysts, and provides a curated body of knowledge to advance careers in business analysis and enable successful organizational change. The website features tools, templates, and expert insights to enhance business analysis practices and drive value through artificial intelligence integration.
Media Monk
Media Monk is an AI-powered ecosystem that positions itself as the Swiss Army knife of AI business tools for marketers. It offers a suite of tools for inbound marketing, outbound marketing, client education, client communication, and extensions. The platform uses AI to help marketers increase brand visibility, produce content at scale, elevate brand messaging, and streamline sales and marketing tasks. Its features let users automate content creation, optimize content distribution, track content performance, and improve search engine optimization. The platform also offers AI-powered targeted outreach, including email and phone outreach, to deliver personalized messages across multiple channels.
Merton
Merton is an AI-powered communication tool designed to provide a voice to the voiceless. It enables voice-impaired users to express their needs, thoughts, and feelings naturally and swiftly through a user-friendly interface. The application features an AI-powered Communication Board that predicts users' next phrases, a Pain Tracker for pinpointing areas of pain using eye movements, and prioritizes user privacy. Merton significantly enhances communication for individuals with limited or no motor functions, improving caregiving processes and response times.
ZeroBot
ZeroBot is the internet's leading voice-enabled chatbot. It allows users to have conversations with AI agents that are tailored to their specific needs. ZeroBot is powered by the Groq LPU™ Inference Engine, which provides instant and smooth chat experiences. With ZeroBot, users can create and speak with AI agents anywhere, anytime.
VirtualFantasy.ai
VirtualFantasy.ai is an AI-powered virtual companion platform that utilizes advanced artificial intelligence algorithms to provide users with personalized assistance and companionship. The platform offers a wide range of features such as virtual conversations, emotional support, task reminders, entertainment recommendations, and personalized insights. VirtualFantasy.ai aims to enhance users' daily lives by offering a virtual companion that can engage in meaningful interactions and provide support whenever needed.
Replica Studios
Replica Studios is an AI tool that provides cutting-edge text-to-speech and speech-to-speech solutions in multiple languages for creative professionals. It offers fully licensed AI models safe for commercial use, allowing users to customize voices for various creative and professional use cases, such as gaming, animation, film, audiobooks, e-learning, and social media. The tool enables users to generate voice overs and dialogue instantly, manage scripts, and create unique voices using Voice Lab. Replica Studios prioritizes ethical voice AI by collaborating with voice actors and ensuring commercial use compliance.
20 - Open Source AI Tools
recognizer
Recognizer is a Python library for speech recognition. It provides a simple interface to transcribe speech from audio files or live audio input. The library supports multiple speech recognition engines, including Google Speech Recognition, Sphinx, and Wit.ai. Recognizer is easy to use and can be integrated into various applications to enable voice commands, transcription, and speech-to-text functionality.
talk-to-chatgpt
Talk-To-ChatGPT is a Google Chrome and Microsoft Edge extension that lets users talk to ChatGPT by voice: speech recognition captures the user's input, and text-to-speech reads the AI's responses aloud, making interactions more natural and engaging. It also supports ElevenLabs API integration for creating custom text-to-speech voices. The extension provides settings for voice, language, and more, and can be installed from the Chrome and Edge web stores or manually. The project has been discontinued in view of upcoming desktop apps from OpenAI, but it has been used to help people with disabilities and elderly users interact with ChatGPT.
alan-sdk-ios
Alan AI SDK for iOS is a powerful tool that allows developers to quickly create AI agents for their iOS apps. With Alan AI Platform, users can easily design, embed, and host conversational experiences in their applications. The platform offers a web-based IDE called Alan AI Studio for creating dialog scenarios, lightweight SDKs for embedding AI agents, and a backend powered by top-notch speech recognition and natural language understanding technologies. Alan AI enables human-like conversations and actions through voice commands, with features like on-the-fly updates, dialog flow testing, and analytics.
Open-LLM-VTuber
Open-LLM-VTuber is a project in the early stages of development that lets users talk to Large Language Models (LLMs) by voice and receive responses through a Live2D talking face. The project aims to provide a minimum viable prototype for offline use on macOS, Linux, and Windows, with features such as long-term memory using MemGPT, customizable LLM backends, and a choice of speech recognition and text-to-speech providers. Users can configure the project to chat with LLMs, choose different backend services, and use Live2D models for visual representation. The project supports perpetual chat, offline operation, and GPU acceleration on macOS, addressing limitations of existing solutions on that platform.
tb1
A Telegram bot for accessing Google Gemini, MS Bing, etc. The bot responds to the keywords 'bot' and 'google' to provide information. It can handle voice messages, text files, images, and links. It can generate images based on descriptions, extract text from images, and summarize content. The bot can interact with various AI models and perform tasks like voice control, text-to-speech, and text recognition. It supports long texts, large responses, and file transfers. Users can interact with the bot using voice commands and text. The bot can be customized for different AI providers and has features for both users and administrators.
DeepBattler
DeepBattler is a tool designed for Hearthstone Battlegrounds players, providing real-time strategic advice and insights to improve gameplay experience. It integrates with the Hearthstone Deck Tracker plugin and offers voice-assisted guidance. The tool is powered by a large language model (LLM) and can match the strength of top players on EU servers. Users can set up the tool by adding dependencies, configuring the plugin path, and launching the LLM agent. DeepBattler is licensed for personal, educational, and non-commercial use, with guidelines on non-commercial distribution and acknowledgment of external contributions.
Simulator-Controller
Simulator Controller is a modular administration and controller application for Sim Racing, featuring a comprehensive plugin automation framework for external controller hardware. It includes voice chat capable Assistants like Virtual Race Engineer, Race Strategist, Race Spotter, and Driving Coach. The tool offers features for setup, strategy development, monitoring races, and more. Developed in AutoHotkey, it supports various simulation games and integrates with third-party applications for enhanced functionality.
AIlice
AIlice is a fully autonomous, general-purpose AI agent that aims to create a standalone artificial intelligence assistant, similar to JARVIS, based on open-source LLMs. AIlice pursues this goal by building a "text computer" that uses a Large Language Model (LLM) as its core processor. Currently, AIlice demonstrates proficiency in a range of tasks, including thematic research, coding, system management, literature reviews, and complex hybrid tasks that go beyond these basic capabilities. According to the project, AIlice has reached near-perfect performance in everyday tasks using GPT-4 and is making strides towards practical application with the latest open-source models. The project's long-term goal is self-evolution of AI agents: agents that autonomously build their own feature expansions and new types of agents, bringing the LLM's knowledge and reasoning capabilities into the real world.
big-AGI
big-AGI is an AI suite designed for professionals seeking function, form, simplicity, and speed. It offers best-in-class Chats, Beams, and Calls with AI personas, visualizations, coding, drawing, side-by-side chatting, and more, all wrapped in a polished UX. The tool is powered by the latest models from 12 vendors and open-source servers, providing users with advanced AI capabilities and a seamless user experience. With continuous updates and enhancements, big-AGI aims to stay ahead of the curve in the AI landscape, catering to the needs of both developers and AI enthusiasts.
june
june-va is a local voice chatbot that combines Ollama for language model capabilities, Hugging Face Transformers for speech recognition, and the Coqui TTS Toolkit for text-to-speech synthesis. It provides a flexible, privacy-focused solution for voice-assisted interactions on your local machine, ensuring that no data is sent to external servers. The tool supports four interaction modes: text input/text output, voice input/text output, text input/audio output, and voice input/audio output. Users can customize its behavior with a JSON configuration file that holds the language model, speech-to-text model, and text-to-speech model settings, and can use voice conversion features for voice cloning.
ASR-LLM-TTS
ASR-LLM-TTS is a repository that provides detailed tutorials for setting up a speech-interaction environment: installing Anaconda and ffmpeg, creating virtual environments, and installing the necessary libraries such as PyTorch, torchaudio, edge-tts, funasr, and more. It also introduces features like voiceprint recognition, custom wake words, and conversation history memory. The repository combines CosyVoice for speech synthesis, SenseVoice for speech recognition, and Qwen2.5 for dialogue understanding. It offers multiple speech synthesis methods, including CosyVoice, pyttsx3, and edge-tts, and provides scripts for interactive inference. The repository aims to enable real-time speech interaction and multi-modal interactions involving audio and video.
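The pipeline described above (speech recognition, then a language model, then speech synthesis, gated by a wake word and backed by conversation history) can be sketched roughly as follows. This is an illustrative sketch only, not the repository's code: `make_assistant`, the wake word, and the three stub functions are hypothetical stand-ins for real models such as SenseVoice, Qwen2.5, and CosyVoice.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Tuple

Turn = Tuple[str, str]  # (user query, assistant reply)

# Hypothetical wake word; a real system would let the user configure this.
WAKE_WORD = "hello assistant"


def make_assistant(
    asr: Callable[[bytes], str],
    llm: Callable[[str, List[Turn]], str],
    tts: Callable[[str], bytes],
    wake_word: str = WAKE_WORD,
    max_turns: int = 5,
):
    """Wire ASR -> LLM -> TTS together with a bounded conversation history."""
    history: Deque[Turn] = deque(maxlen=max_turns)

    def handle(audio: bytes) -> Optional[bytes]:
        text = asr(audio).strip()
        # Ignore any utterance that does not start with the wake word.
        if not text.lower().startswith(wake_word):
            return None
        query = text[len(wake_word):].strip(" ,.")
        # The LLM sees earlier turns, which is the "conversation memory".
        reply = llm(query, list(history))
        history.append((query, reply))
        return tts(reply)

    return handle, history


# Stub components (assumptions): real ones would call actual models.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(query: str, history: List[Turn]) -> str:
    return f"[turn {len(history) + 1}] you said: {query}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    handle, _ = make_assistant(fake_asr, fake_llm, fake_tts)
    print(handle(b"what time is it"))                   # no wake word
    print(handle(b"Hello assistant, what time is it"))  # answered
```

Passing the three stages in as plain callables keeps the loop independent of any particular model, so a different synthesis backend could be swapped in without touching the wake-word or history logic.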
AGiXT
AGiXT is a dynamic Artificial Intelligence Automation Platform engineered to orchestrate efficient AI instruction management and task execution across a multitude of providers. Our solution infuses adaptive memory handling with a broad spectrum of commands to enhance AI's understanding and responsiveness, leading to improved task completion. The platform's smart features, like Smart Instruct and Smart Chat, seamlessly integrate web search, planning strategies, and conversation continuity, transforming the interaction between users and AI. By leveraging a powerful plugin system that includes web browsing and command execution, AGiXT stands as a versatile bridge between AI models and users. With an expanding roster of AI providers, code evaluation capabilities, comprehensive chain management, and platform interoperability, AGiXT is consistently evolving to drive a multitude of applications, affirming its place at the forefront of AI technology.
wingman-ai
Wingman AI allows you to use your voice to talk to various AI providers and LLMs, process your conversations, and ultimately trigger actions such as pressing buttons or reading answers. Our Wingmen are like characters and your interface to this world, and you can easily control their behavior and characteristics, even if you're not a developer. AI is complex and it scares people. It's also not just ChatGPT. We want to make it as easy as possible for you to get started. That's what Wingman AI is all about. It's a framework that allows you to build your own Wingmen and use them in your games and programs. The idea is simple, but the possibilities are endless. For example, you could:
* Role play with an AI while playing for more immersion. Have air traffic control (ATC) in Star Citizen or Flight Simulator. Talk to Shadowheart in Baldur's Gate 3 and have her respond in her own (cloned) voice.
* Get live data such as trade information, build guides, or wiki content and have it read to you in-game by a character and voice you control.
* Execute keystrokes in games/applications and create complex macros. Trigger them in natural conversations with no need for exact phrases. The AI understands the context of your dialog and is quite smart in recognizing your intent. Say "It's raining! I can't see a thing!" and have it trigger a command you simply named WipeVisors.
* Automate tasks on your computer.
* Improve accessibility.
* ... and much more.
whisper_dictation
Whisper Dictation is a fast, offline, privacy-focused tool for voice typing, AI voice chat, voice control, and translation. It allows hands-free operation, launching and controlling apps, and communicating with OpenAI ChatGPT or a local chat server. The tool can also speak answers out loud and draw pictures. Inspired by the Star Trek series, it includes client and server versions and is designed to keep data confidential and off the internet. The project is optimized for dictation and translation tasks, with voice control capabilities and AI image generation using the stable-diffusion API.
awesome-mcp-servers
Awesome MCP Servers is a curated list of Model Context Protocol (MCP) servers that enable AI models to securely interact with local and remote resources through standardized server implementations. The list includes production-ready and experimental servers that extend AI capabilities through file access, database connections, API integrations, and other contextual services.
Awesome-AITools
This repo collects AI-related utilities, grouped into the following categories:
* ChatGPT and other closed-source LLMs
* AI Search engine
* Open Source LLMs
* GPT/LLMs Applications
* LLM training platform
* Applications that integrate multiple LLMs
* AI Agent
* Writing
* Programming Development
* Translation
* AI Conversation or AI Voice Conversation
* Image Creation
* Speech Recognition
* Text To Speech
* Voice Processing
* AI generated music or sound effects
* Speech translation
* Video Creation
* Video Content Summary
* OCR (Optical Character Recognition)
Linly-Talker
Linly-Talker is an innovative digital human conversation system that integrates the latest artificial intelligence technologies, including Large Language Models (LLM) 🤖, Automatic Speech Recognition (ASR) 🎙️, Text-to-Speech (TTS) 🗣️, and voice cloning technology 🎤. This system offers an interactive web interface through the Gradio platform 🌐, allowing users to upload images 📷 and engage in personalized dialogues with AI 💬.
13 - OpenAI GPTs
Your Lingo AI Coach
Welcome! I'm a voice-focused language teacher for interactive speaking practice. To enable voice, download the app and tap the headphone button next to my chat window. Then choose your preferred voice. When you're ready, tell me what language you'd like to learn. It's FREE!
Cyber Guardian
I'm your personal cybersecurity advisor, here to help you stay safe online.
AI Use Case Analyst for Sales & Marketing
Enables sales & marketing leadership to identify high-value AI use cases
Agenda Writing for Sales Professionals
Enables salespeople to write best practice sales agendas
Terpene Tracker GPT
Web-enabled cannabis and terpene profile analyzer with image recognition
The Amazonian Interview Coach
A role-play enabled Amazon/AWS interview coach specializing in STAR format and Leadership Principles.
AI Chat Gbt
Discover the revolutionary power of AI Chat Gbt, a platform that enables natural language conversations with advanced artificial intelligence. Engage in dialogue, ask questions, and receive intelligent responses to enhance your interactive communication experience.
Chatjpd
Discover the revolutionary power of Chatjpd, a platform that enables natural language conversations with advanced artificial intelligence. Engage in dialogue, ask questions, and receive intelligent responses to enhance your interactive communication experience.
Chatgp3
Discover the revolutionary power of Chatgp3, a platform that enables natural language conversations with advanced artificial intelligence. Engage in dialogue, ask questions, and receive intelligent responses to enhance your interactive communication experience.
Chhatgpt
Discover the revolutionary power of Chhatgpt, a platform that enables natural language conversations with advanced artificial intelligence. Engage in dialogue, ask questions, and receive intelligent responses to enhance your interactive communication experience.