Best AI tools for< Create Voice Commands >
20 - AI tool Sites
Dubbing AI
Dubbing AI is a free real-time AI voice changer that allows you to change your voice in real-time while speaking. It offers a variety of voice effects and filters that you can use to customize your voice. You can also use Dubbing AI to create funny or unique voiceovers for your videos or presentations.
PayGenie
The website offers an AI-powered invoicing assistant that enables users to create and manage invoices effortlessly using voice commands and automation. It aims to save users time by automating invoice creation and reducing errors. The tool provides customizable templates, real-time insights, and smart time tracking features to streamline the invoicing process. Users can join the waitlist to experience the future of invoicing with AI-driven automation.
Whisprlist
Whisprlist is an AI-powered task management application that allows users to create and organize tasks effortlessly through voice commands. The application provides personalized assistance by understanding user commands and generating daily agenda emails to help users stay focused and productive. With a simple pricing structure and a range of features, Whisprlist aims to streamline task management for individuals and teams.
Glimmer AI
Glimmer AI is a cutting-edge platform that revolutionizes the way presentations are created and delivered. Leveraging the power of GPT-3 and DALL·E 2, Glimmer AI empowers users to generate visually captivating presentations based on their text and voice commands. With its intuitive interface and seamless workflow, Glimmer AI simplifies the presentation process, enabling users to focus on delivering impactful messages.
Hints
Hints is a sales AI assistant that helps sales reps to get more hours in a day while keeping CRM data accurate automatically. It works with Salesforce, Hubspot, and Pipedrive. With Hints, sales reps can log and retrieve CRM data on any device with chat and voice, get guidance on their next steps, and reminders of what's missing. Hints can also help sales reps to create complex CRM updates in seconds, find duplicates, suggest actions, automatically create associations, and look up sales data through chat and voice commands. Hints can assist sales reps in building the perfect sales process for their team and provides fast onboarding for new sales reps.
Meta AI
Meta AI is an advanced artificial intelligence tool that enables users to learn, create, and explore the world around them. With features like AI Studio for creating custom AIs and Llama for building the future of AI, Meta AI offers cutting-edge technology to bring visions to life. Users can engage with AI characters, identify objects, and have conversations using voice commands. The platform is designed to make AI more accessible and engaging for everyone, with a focus on open collaboration and innovation.
Bibit AI
Bibit AI is a real estate marketing AI designed to enhance the efficiency and effectiveness of real estate marketing and sales. It can help create listings, descriptions, and property content, and offers a host of other features. Bibit AI is the world's first AI for Real Estate. We are transforming the real estate industry by boosting efficiency and simplifying tasks like listing creation and content generation.
Wispr Flow
Wispr Flow is an AI-powered voice dictation tool that allows users to write 3 times faster using their voice in over 100 languages. It offers features like AI commands, auto-edits, and context-awareness, making it a game-changer for professionals in various fields. Trusted by professionals, Wispr Flow runs on a private cloud ensuring data security and offers a whispering mode for discreet dictation. The tool adapts to different writing styles and applications, enhancing productivity and accuracy for users.
Ad-Free AI Chat
Ad-Free AI Chat is an Android app available in the Play Store. It offers ad-free GPT voice chat, games, language learning, and research assistance. Users can create their own commands to automate tasks and use the app with Android Auto. The app supports ChatGPT 4, ChatGPT 3.5, and Google Gemini Pro.
Wispr Flow
Wispr Flow is an AI-powered voice dictation tool that allows users to write 3x faster in any application by using their voice. It offers features like AI commands, auto-edits, and support for over 100 languages. The tool adapts to the user's voice and style based on the application being used, making it a valuable productivity tool for professionals across various industries.
AI VC
AI VC was an AI tool designed to provide virtual consultations to users. It allowed users to have virtual conversations with an AI-powered virtual consultant to seek advice or information on various topics. The tool aimed to simulate a real-life conversation experience and provide personalized recommendations based on user input. Users could interact with the virtual consultant through text or voice commands, making it a convenient and user-friendly platform for seeking guidance and support.
ToDoIt
ToDoIt is a voice and AI-powered to-do list application that helps users manage their tasks efficiently using natural language. Users can create tasks in less than 10 seconds by speaking, receive task recommendations based on their inputs, and enjoy smart task automation for improved productivity. The app offers different pricing plans with features like AI voice transcription, AI-powered task recommendations, and unlimited task recommendation refreshes. ToDoIt prioritizes user privacy and security by securely storing data and deleting audio files after transcription. Users can leave feedback through Insighto and benefit from the app's responsive web version.
Voice AI Note
Voice AI Note is a web-based application that allows users to quickly and easily create voice notes using advanced AI. With Voice AI Note, you can create voice notes that are fluent, accurate, and sound natural. The application is easy to use and requires no prior experience with AI or voice recording. Simply enter the text you want to convert to speech, and Voice AI Note will do the rest.
Voicemy.ai
Voicemy.ai is an AI application that allows users to create AI voices and songs. Users can clone voices of famous personalities, compose melodies, and convert text into spoken words using chosen voice models. The platform aims to inspire creativity and enable users to share their passion with the world.
MetaVoice Studio
MetaVoice Studio is an AI-powered voice-over platform that allows users to create high-quality, realistic voice-overs for videos, presentations, and other projects. With MetaVoice Studio, users can choose from a variety of voices, languages, and accents to create voice-overs that sound natural and engaging. The platform also offers a range of editing tools that allow users to fine-tune their voice-overs to perfection.
Millis AI
Millis AI is an instant, natural, and affordable voice AI platform designed for developers to create cutting-edge voice agents with low latency. The platform offers optimized conversation flow handling, affordable accessibility, seamless integration, and scalable expertise. With rates starting at $0.06/min, Millis AI enables users to build human-like voice agents that can manage interruptions and understand human intent. The platform also provides DevOps engineers' expertise in scaling systems for enterprise-level applications.
PlayAI
PlayAI is an AI tool designed for businesses and developers to create voice interfaces effortlessly. The platform allows users to generate conversational agents by simply tapping or clicking, enabling them to shuffle, share, and clone voices. PlayAI offers a user-friendly interface for building agents, making it easy to customize and deploy voice interactions. With a focus on simplicity and efficiency, PlayAI aims to revolutionize the way businesses and developers engage with their audience through voice technology.
VoiceDub
VoiceDub is an AI-powered application that allows users to create voice covers of their favorite songs. It offers a wide range of AI voices to choose from, as well as the ability to clone your own voice. VoiceDub also includes a text-to-speech feature, which allows users to generate studio-quality vocals from text. The application is easy to use and produces high-quality results, making it a great choice for musicians, singers, and content creators.
LazyBird
LazyBird is an AI Voice-Over Generator that provides realistic voices with natural intonations, offering the best AI voice-over experience to captivate your audience. Users can easily create voice-overs by uploading scripts, selecting voices, editing timing, and exporting the final result. With a wide range of characters, accents, and tones to choose from, LazyBird allows users to find the perfect voice for their content. Additionally, users can sync their video and audio files with AI-generated voice-overs, access a rich library of stock videos and images, and enjoy features like granular word-level control, 60+ natural-sounding voices, 100+ languages and accents, advanced audio timeline, and more.
SRVO
SRVO is a voice over service that provides high-quality, professional voice overs for a variety of purposes, including commercials, e-learning, and audiobooks. With a team of experienced voice actors and a state-of-the-art recording studio, SRVO can create custom voice overs that meet the specific needs of each client. SRVO also offers a variety of additional services, such as scriptwriting, audio editing, and mixing.
20 - Open Source AI Tools
alan-sdk-ios
Alan AI SDK for iOS is a powerful tool that allows developers to quickly create AI agents for their iOS apps. With Alan AI Platform, users can easily design, embed, and host conversational experiences in their applications. The platform offers a web-based IDE called Alan AI Studio for creating dialog scenarios, lightweight SDKs for embedding AI agents, and a backend powered by top-notch speech recognition and natural language understanding technologies. Alan AI enables human-like conversations and actions through voice commands, with features like on-the-fly updates, dialog flow testing, and analytics.
Simulator-Controller
Simulator Controller is a modular administration and controller application for Sim Racing, featuring a comprehensive plugin automation framework for external controller hardware. It includes voice chat capable Assistants like Virtual Race Engineer, Race Strategist, Race Spotter, and Driving Coach. The tool offers features for setup, strategy development, monitoring races, and more. Developed in AutoHotkey, it supports various simulation games and integrates with third-party applications for enhanced functionality.
talon-ai-tools
Control large language models and AI tools through voice commands using the Talon Voice dictation engine. This tool is designed to help users quickly edit text, code by voice, reduce keyboard use for those with health issues, and speed up workflow by using AI commands across the desktop. It prompts and extends tools like Github Copilot and OpenAI API for text and image generation. Users can set up the tool by downloading the repo, obtaining an OpenAI API key, and customizing the endpoint URL for preferred models. The tool can be used without an OpenAI key and can be exclusively used with Copilot for those not needing LLM integration.
talk-to-chatgpt
Talk-To-ChatGPT is a Google Chrome and Microsoft Edge extension that enables users to interact with the ChatGPT AI using voice commands for speech recognition and text-to-speech responses. The tool enhances the conversational experience by allowing users to speak to the AI and receive spoken responses, making interactions more natural and engaging. It also supports ElevenLabs API integration for creating custom voices for text-to-speech. The extension provides settings for voice, language, and more, and can be installed from the Chrome and Edge web stores or manually. While the project has been discontinued due to upcoming desktop apps from OpenAI, it has been used to assist individuals with disabilities and the elderly in interacting with ChatGPT.
Open-LLM-VTuber
Open-LLM-VTuber is a project in early stages of development that allows users to interact with Large Language Models (LLM) using voice commands and receive responses through a Live2D talking face. The project aims to provide a minimum viable prototype for offline use on macOS, Linux, and Windows, with features like long-term memory using MemGPT, customizable LLM backends, speech recognition, and text-to-speech providers. Users can configure the project to chat with LLMs, choose different backend services, and utilize Live2D models for visual representation. The project supports perpetual chat, offline operation, and GPU acceleration on macOS, addressing limitations of existing solutions on macOS.
tb1
A Telegram bot for accessing Google Gemini, MS Bing, etc. The bot responds to the keywords 'bot' and 'google' to provide information. It can handle voice messages, text files, images, and links. It can generate images based on descriptions, extract text from images, and summarize content. The bot can interact with various AI models and perform tasks like voice control, text-to-speech, and text recognition. It supports long texts, large responses, and file transfers. Users can interact with the bot using voice commands and text. The bot can be customized for different AI providers and has features for both users and administrators.
rosa
ROSA is an AI Agent designed to interact with ROS-based robotics systems using natural language queries. It can generate system reports, read and parse ROS log files, adapt to new robots, and run various ROS commands using natural language. The tool is versatile for robotics research and development, providing an easy way to interact with robots and the ROS environment.
OSHW-SenseCAP-Watcher
SenseCAP Watcher is a monitoring device built on ESP32S3 with Himax WiseEye2 HX6538 AI chip, excelling in image and vector data processing. It features a camera, microphone, and speaker for visual, auditory, and interactive capabilities. With LLM-enabled SenseCraft suite, it understands commands, perceives surroundings, and triggers actions. The repository provides firmware, hardware documentation, and applications for the Watcher, along with detailed guides for setup, task assignment, and firmware flashing.
live2d-TTS-LLM-GPT-SoVITS-Vtuber
This repository is a modification based on the pixi-live2d-display project. It provides a platform for TTS (Text-to-Speech) functionality and a large model voice chat page. Users can install node.js, run the provided commands, and access the specified URLs to utilize the features.
gp.nvim
Gp.nvim (GPT prompt) Neovim AI plugin provides a seamless integration of GPT models into Neovim, offering features like streaming responses, extensibility via hook functions, minimal dependencies, ChatGPT-like sessions, instructable text/code operations, speech-to-text support, and image generation directly within Neovim. The plugin aims to enhance the Neovim experience by leveraging the power of AI models in a user-friendly and native way.
Top-AI-Tools
Top AI Tools is a comprehensive, community-curated directory that aims to catalog and showcase the most outstanding AI-powered products. This index is not exhaustive, but rather a compilation of our research and contributions from the community.
big-AGI
big-AGI is an AI suite designed for professionals seeking function, form, simplicity, and speed. It offers best-in-class Chats, Beams, and Calls with AI personas, visualizations, coding, drawing, side-by-side chatting, and more, all wrapped in a polished UX. The tool is powered by the latest models from 12 vendors and open-source servers, providing users with advanced AI capabilities and a seamless user experience. With continuous updates and enhancements, big-AGI aims to stay ahead of the curve in the AI landscape, catering to the needs of both developers and AI enthusiasts.
ASR-LLM-TTS
ASR-LLM-TTS is a repository that provides detailed tutorials for setting up the environment, including installing anaconda, ffmpeg, creating virtual environments, and installing necessary libraries such as pytorch, torchaudio, edge-tts, funasr, and more. It also introduces features like voiceprint recognition, custom wake words, and conversation history memory. The repository combines CosyVoice for speech synthesis, SenceVoice for speech recognition, and QWen2.5 for dialogue understanding. It offers multiple speech synthesis methods including CoosyVoice, pyttsx3, and edgeTTS, with scripts for interactive inference provided. The repository aims to enable real-time speech interaction and multi-modal interactions involving audio and video.
weixin-dyh-ai
WeiXin-Dyh-AI is a backend management system that supports integrating WeChat subscription accounts with AI services. It currently supports integration with Ali AI, Moonshot, and Tencent Hyunyuan. Users can configure different AI models to simulate and interact with AI in multiple modes: text-based knowledge Q&A, text-to-image drawing, image description, text-to-voice conversion, enabling human-AI conversations on WeChat. The system allows hierarchical AI prompt settings at system, subscription account, and WeChat user levels. Users can configure AI model types, providers, and specific instances. The system also supports rules for allocating models and keys at different levels. It addresses limitations of WeChat's messaging system and offers features like text-based commands and voice support for interactions with AI.
ChatGPT-OpenAI-Smart-Speaker
ChatGPT Smart Speaker is a project that enables speech recognition and text-to-speech functionalities using OpenAI and Google Speech Recognition. It provides scripts for running on PC/Mac and Raspberry Pi, allowing users to interact with a smart speaker setup. The project includes detailed instructions for setting up the required hardware and software dependencies, along with customization options for the OpenAI model engine, language settings, and response randomness control. The Raspberry Pi setup involves utilizing the ReSpeaker hardware for voice feedback and light shows. The project aims to offer an advanced smart speaker experience with features like wake word detection and response generation using AI models.
AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.
AICoverGen
AICoverGen is an autonomous pipeline designed to create covers using any RVC v2 trained AI voice from YouTube videos or local audio files. It caters to developers looking to incorporate singing functionality into AI assistants/chatbots/vtubers, as well as individuals interested in hearing their favorite characters sing. The tool offers a WebUI for easy conversions, cover generation from local audio files, volume control for vocals and instrumentals, pitch detection method control, pitch change for vocals and instrumentals, and audio output format options. Users can also download and upload RVC models via the WebUI, run the pipeline using CLI, and access various advanced options for voice conversion and audio mixing.
20 - OpenAI Gpts
Voice Memo
Record your thoughts with ChatGPT Voice Conversations 💡. Get started by clicking the 🎧 icon right to the chat input. Available on mobile only. Ask 'how do you work?' to learn more.
Bring Your Writing Voice to Every Task
This GPT will help you recreate your writing voice across multiple tasks. All you need is a prior writing sample (email, blog, article, tweet) and a new task.
Vedic Voice
A scholar in Hindu literature providing positive, brief insights against negativity.
Commerce Cloud Guru
Professional voice for SFCC B2C Commerce Cloud expertise. 🔒 Unlock the full potential of B2C Commerce Cloud
The Master in Brand Identity - GetMax
Guiding startups to creating unique brand/product voice & tone for content marketing.
Transcript to Social Post
Transforms transcripts (from Whatsapp voice memos) into engaging social media content.
Text Playground
Best AI-powered Text Playground!! I am your go-to assistant for text-to other media conversions. Flawelessly convert any text to voice, image, or video!! I am here to help. Ask me anything!!
CliniType EHR
Voice-to-text, Vision-to-text transcription, Transcript-to-‘Clinical format’ integrated with CDS. Writes clinical notes, referral letter, generate PDF,prepare discharge summary. (Ultimate aid for clinicians)
English Mentor
I assist with English learning, mind maps, voice conversations, and writing.
Racon Gunner Scribe
Expert in TTRPG blogging, crafting visually enriched, SEO-optimized content in Racon Gunner's voice.
Slogan Expert
Hi there! 👋 I'm your Slogan Expert Jason. ✍️ Need a catchy tagline in any language? I'm your guy! 💡 Let's connect and give your brand a voice that stands out. 🚀 Keep in touch for top-notch slogan advice! 📣
Confident Communicator
Generates, elevates, and transforms all types of communications, empowering you to effortlessly create messages in your style, invent new voices, or tap into its collection of learned tones.