Best AI tools for Allow Microphone Access
20 - AI Tool Sites

Ermine.ai
Ermine.ai is an AI-powered tool for local audio recording and transcription. It transcribes audio in real time using a transcription model that is loaded and initialized in the user's browser. The tool currently supports English transcription only and requires the Chrome browser. Ermine.ai aims to provide a seamless and efficient transcription experience, with the promise of faster sessions on subsequent uses.

GPT4Audio
GPT4Audio is an AI-based desktop application that offers speech-to-text and text-to-speech capabilities. It allows users to transcribe and translate audio files from multiple languages, as well as dictate text and generate audio recordings in real time. The application also includes an Article Wizard feature that can help users create homework essays, marketing content, articles, or blogs quickly and easily.

Screen Story
Screen Story is a Mac screen recorder tool designed to capture and record screens with ease. It allows users to create high-quality videos, demos, GIFs, and tutorials without the need for video editing skills. The application offers features like automatic zoom, smooth cursor movement, offline recording, webcam and microphone support, and a simple editing interface. Screen Story is trusted by entrepreneurs, designers, marketers, and developers for its efficiency and user-friendly design patterns.

Auxillary
Auxillary is an AI-powered chatbot copilot that seamlessly integrates with your SaaS platform, empowering users to interact with your product through natural conversation. It offers a range of capabilities, including answering queries, executing actions, providing guidance, and enhancing user experience. With Auxillary, users can navigate complex tasks, access information quickly, and receive proactive assistance, all within a user-friendly chat interface. It simplifies workflows, streamlines processes, and delivers personalized experiences, making it an invaluable tool for businesses looking to enhance their SaaS platform.

Santa Cat
Santa Cat is an AI-powered virtual assistant designed to bring holiday cheer and festive fun to users. It allows you to engage in interactive conversations with a virtual Santa Cat, creating a unique and enjoyable experience. Developed by Daily About & Help, Santa Cat is the purr-fect companion for spreading joy and laughter during the holiday season.

TypeflowAI
TypeflowAI is a platform that allows users to create AI tools without coding in minutes. Users can easily build, share, and embed AI tools into their websites to enhance SEO, increase traffic, and generate more leads. The platform offers features such as creating dynamic lead magnets, AI quizzes, calculators, and more, along with customization options to fit users' branding. With flexible pricing plans and integrations with popular apps, TypeflowAI simplifies the process of creating AI tools and lead magnets for businesses and individuals.

Momento AI
Momento AI is an innovative AI application that allows users to create their very own AI clone. With Momento AI, users can engage in home chats, explore various topics, and create personalized content. The application provides a seamless experience for users to interact with their AI clone and enhance their digital presence. Momento AI leverages advanced AI technology to deliver a unique and personalized experience to users.

CreateApp.ai
CreateApp.ai is an AI-powered app development platform that allows users to develop apps in days, not months. It is trusted by leading companies and startup incubators. CreateApp.ai's first step towards its vision is CreatePrototype.ai, which allows users to describe their idea in plain English and build an app prototype in minutes. CreateApp.ai is coming soon, and users can sign up for early access. With CreateApp.ai, users can develop apps in plain English, without any tech knowledge required. CreateApp.ai takes care of everything, from app design and development to app maintenance. CreateApp.ai is the easiest way to build apps.

MyFaceSwap
MyFaceSwap is a free online AI tool that specializes in face swapping and lip syncing for videos and shorts. Users can easily upload images and videos to swap faces, create lip sync videos, and generate entertaining content. The platform ensures privacy and data security by deleting uploaded photos after video creation. MyFaceSwap enables users to unleash their creativity, make stunning videos, and become the star of any movie or music video.

Digital First AI
Digital First AI is an AI-powered marketing workflow platform that offers personalized marketing strategies, practical tactics, and targeted campaigns. The platform enables users to transform their marketing strategies into action with AI agents, maximizing ROI and outpacing the competition. From data mining to strategy creation, Digital First AI provides a secure data room, customizable AI-powered workflows, scalable content production, and insight generation for trend analysis. The platform is designed to streamline marketing processes, enhance creativity, and drive data-driven decision-making for businesses and marketing agencies.

IG Lead Gen
IG Lead Gen is an AI-powered tool designed to automate Instagram lead generation for B2B founders. It offers custom lead filtering based on metrics like Follower count, Following count, Age of Lead, Verification Status, and Link in Bio. The tool utilizes proprietary AI technology to identify and scrape active Instagram users likely to convert to customers. Users can effortlessly export leads in various formats through the advanced dashboard. IG Lead Gen aims to streamline the process of generating targeted leads, saving time, and enabling users to focus on growing their business.

Charly AI Solutions
Charly AI Solutions is a leading AI automation company that offers custom chatbots, phone assistants, and other AI solutions to empower businesses. Their AI applications include Humanizer Pro for humanizing AI-generated texts, Recruiter Pro for ranking job compatibility, Cooking Pro for cooking assistance, and Script Pro for generating and analyzing YouTube scripts. The company also provides services to automate tasks using generative artificial intelligence and offers personalized AI solutions for enhanced efficiency and customer relationships.

Runner H
Runner H is an AI tool that enables users to create, run, and scale web automations effortlessly. It offers a platform for building super intelligence through VLMs, LLMs, and Agents API Beta. Users can join the API beta to access advanced features and functionalities. The tool aims to put AI to work for users, providing a seamless experience for automating tasks and processes.

Chatterbot
Chatterbot is an AI chatbot application designed for individuals and businesses to improve productivity and customer support. It offers features such as chat history tracking, bot persona customization, support for GPT-3.5 and GPT-4, data import from various sources, privacy and security measures, a no-code platform, multilingual support, branding customization, and access control. The application lets users apply AI chatbots to customer service, sales assistance, personal assistance, onboarding, and training.

CFR Explorer
CFR Explorer is an AI-powered tool that allows users to ask questions about regulations in Title 14 and receive answers from AI. Users can search for specific regulations, such as requirements for general aviation pilots or VFR weather requirements for Class C airspace. The tool is currently in beta, aiming to gather feedback for system improvement. Users are advised not to share private information in queries, and the tool's creators are not liable for the content generated.

Osher.ai
Osher.ai is a personal AI for businesses that allows users to interact with websites, intranets, knowledge bases, process documents, spreadsheets, and procedures. It can be used to train custom AIs on internal knowledge bases, process documents, and files. Osher.ai also offers private and public AIs, and users can customize their AIs' personality, purpose, and welcome message.

Midjourney
Midjourney is a free online AI image generator that allows users to create high-quality images from simple text prompts. It is powered by advanced machine learning algorithms that can understand the meaning of words and convert them into realistic and visually appealing images. Midjourney is easy to use and does not require any special hardware or software. Users simply need to enter a text description of the image they want to generate and Midjourney will create it in a matter of seconds.

Engage AI
Engage AI is a generative AI tool that helps businesses increase their LinkedIn engagement and lead generation. It offers a range of features to help users create personalized and engaging content, including the ability to generate comments, connection requests, and profile content. Engage AI also provides insights into LinkedIn trends and best practices, and offers a variety of resources to help users get the most out of the platform.

RoboCoder
RoboCoder is an AI tool that leverages GPT-4 Turbo to convert specifications into code, making programming easier. By integrating with VS Code's APIs, RoboCoder can open and edit files, serving as an AI collaborator for developers. Users can access the VS Code extension for $5 per month and communicate directly with OpenAI by providing their API key. RoboCoder is designed to streamline the coding process and enhance productivity for programmers.

Engage AI
Engage AI is a generative AI tool designed for LinkedIn users to enhance their engagement, content creation, and prospect nurturing. It helps users automate and personalize interactions with prospects, optimize their profiles, and generate more leads on LinkedIn. The tool leverages AI technology to provide meaningful recommendations, eliminate distractions, and assist in relationship-building on the platform.

20 - Open Source AI Tools

ChatGPT-OpenAI-Smart-Speaker
ChatGPT Smart Speaker is a project that enables speech recognition and text-to-speech functionalities using OpenAI and Google Speech Recognition. It provides scripts for running on PC/Mac and Raspberry Pi, allowing users to interact with a smart speaker setup. The project includes detailed instructions for setting up the required hardware and software dependencies, along with customization options for the OpenAI model engine, language settings, and response randomness control. The Raspberry Pi setup involves utilizing the ReSpeaker hardware for voice feedback and light shows. The project aims to offer an advanced smart speaker experience with features like wake word detection and response generation using AI models.

org-ai
org-ai is a minor mode for Emacs org-mode that provides access to generative AI models, including OpenAI API (ChatGPT, DALL-E, other text models) and Stable Diffusion. Users can use ChatGPT to generate text, have speech input and output interactions with AI, generate images and image variations using Stable Diffusion or DALL-E, and use various commands outside org-mode for prompting using selected text or multiple files. The tool supports syntax highlighting in AI blocks, auto-fill paragraphs on insertion, and offers block options for ChatGPT, DALL-E, and other text models. Users can also generate image variations, use global commands, and benefit from Noweb support for named source blocks.

OSHW-SenseCAP-Watcher
SenseCAP Watcher is a monitoring device built on the ESP32S3 with a Himax WiseEye2 HX6538 AI chip, which excels at image and vector data processing. It features a camera, microphone, and speaker for visual, auditory, and interactive capabilities. With the LLM-enabled SenseCraft suite, it understands commands, perceives its surroundings, and triggers actions. The repository provides firmware, hardware documentation, and applications for the Watcher, along with detailed guides for setup, task assignment, and firmware flashing.

june
june-va is a local voice chatbot that combines Ollama for language model capabilities, Hugging Face Transformers for speech recognition, and the Coqui TTS Toolkit for text-to-speech synthesis. It provides a flexible, privacy-focused solution for voice-assisted interactions on your local machine, ensuring that no data is sent to external servers. The tool supports four interaction modes: text input/text output, voice input/text output, text input/audio output, and voice input/audio output. Users can customize its behavior with a JSON configuration file that sets attributes for the language model, speech-to-text model, and text-to-speech model, and can use voice conversion features for voice cloning.

Linly-Talker
Linly-Talker is an innovative digital human conversation system that integrates the latest artificial intelligence technologies, including Large Language Models (LLM) 🤖, Automatic Speech Recognition (ASR) 🎙️, Text-to-Speech (TTS) 🗣️, and voice cloning technology 🎤. This system offers an interactive web interface through the Gradio platform 🌐, allowing users to upload images 📷 and engage in personalized dialogues with AI 💬.

Open-LLM-VTuber
Open-LLM-VTuber is a project in the early stages of development that allows users to interact with Large Language Models (LLMs) using voice commands and receive responses through a Live2D talking face. The project aims to provide a minimum viable prototype for offline use on macOS, Linux, and Windows, with features like long-term memory using MemGPT, customizable LLM backends, speech recognition, and text-to-speech providers. Users can configure the project to chat with LLMs, choose different backend services, and use Live2D models for visual representation. The project supports perpetual chat, offline operation, and GPU acceleration on macOS, addressing limitations of existing solutions on that platform.

gpt-home
GPT Home is a project that allows users to build their own home assistant using Raspberry Pi and OpenAI API. It serves as a guide for setting up a smart home assistant similar to Google Nest Hub or Amazon Alexa. The project integrates various components like OpenAI, Spotify, Philips Hue, and OpenWeatherMap to provide a personalized home assistant experience. Users can follow the detailed instructions provided to build their own version of the home assistant on Raspberry Pi, with optional components for customization. The project also includes system configurations, dependencies installation, and setup scripts for easy deployment. Overall, GPT Home offers a DIY solution for creating a smart home assistant using Raspberry Pi and OpenAI technology.

keras-llm-robot
The Keras-llm-robot Web UI project is an open-source tool designed for offline deployment and testing of various open-source models from the Hugging Face website. It allows users to combine multiple models through configuration to achieve functionalities like multimodal, RAG, Agent, and more. The project consists of three main interfaces: chat interface for language models, configuration interface for loading models, and tools & agent interface for auxiliary models. Users can interact with the language model through text, voice, and image inputs, and the tool supports features like model loading, quantization, fine-tuning, role-playing, code interpretation, speech recognition, image recognition, network search engine, and function calling.

llms-tools
The 'llms-tools' repository is a comprehensive collection of AI tools, open-source projects, and research related to Large Language Models (LLMs) and Chatbots. It covers a wide range of topics such as AI in various domains, open-source models, chats & assistants, visual language models, evaluation tools, libraries, devices, income models, text-to-image, computer vision, audio & speech, code & math, games, robotics, typography, bio & med, military, climate, finance, and presentation. The repository provides valuable resources for researchers, developers, and enthusiasts interested in exploring the capabilities of LLMs and related technologies.

amazon-transcribe-live-call-analytics
The Amazon Transcribe Live Call Analytics (LCA) with Agent Assist Sample Solution is designed to help contact centers assess and optimize caller experiences in real time. It leverages Amazon machine learning services like Amazon Transcribe, Amazon Comprehend, and Amazon SageMaker to transcribe and extract insights from contact center audio. The solution provides real-time supervisor and agent assist features, integrates with existing contact centers, and offers a scalable, cost-effective approach to improve customer interactions. The end-to-end architecture includes features like live call transcription, call summarization, AI-powered agent assistance, and real-time analytics. The solution is event-driven, ensuring low latency and seamless processing flow from ingested speech to live webpage updates.

simple-openai
Simple-OpenAI is a Java library that provides a simple way to interact with the OpenAI API. It offers consistent interfaces for various OpenAI services like Audio, Chat Completion, Image Generation, and more. The library uses CleverClient for HTTP communication, Jackson for JSON parsing, and Lombok to reduce boilerplate code. It supports asynchronous requests and provides methods for synchronous calls as well. Users can easily create objects to communicate with the OpenAI API and perform tasks like text-to-speech, transcription, image generation, and chat completions.

MooER
MooER (摩耳) is an LLM-based speech recognition and translation model developed by Moore Threads. It transcribes speech into text (ASR) and translates speech into other languages (AST) in an end-to-end manner. The model was originally trained on 5K hours of data, and a version trained on 80K hours is now also available. MooER is described as the first LLM-based speech model for which both training and inference run on Chinese domestic GPUs. The repository includes pretrained models, inference code, and a Gradio demo.

voice-chat-ai
Voice Chat AI is a project that allows users to interact with different AI characters using speech. Users can choose from various characters with unique personalities and voices, and have conversations or role play with them. The project supports OpenAI, xAI, or Ollama language models for chat, and provides text-to-speech synthesis using XTTS, OpenAI TTS, or ElevenLabs. Users can seamlessly integrate visual context into conversations by having the AI analyze their screen. The project offers easy configuration through environment variables and can be run via WebUI or Terminal. It also includes a huge selection of built-in characters for engaging conversations.

videokit
VideoKit is a full-featured user-generated content solution for Unity Engine, enabling video recording, camera streaming, microphone streaming, social sharing, and conversational interfaces. It is cross-platform, with C# source code available for inspection. Users can share media, save to camera roll, pick from camera roll, stream camera preview, record videos, remove background, caption audio, and convert text commands. VideoKit requires Unity 2022.3+ and supports Android, iOS, macOS, Windows, and WebGL platforms.

10 - OpenAI GPTs

Art MaGPT
I allow users to remake images with a similar concept to their uploaded image, without the risk of copyright infringement. I will transform your images into unique art pieces in various art styles. Upload an image to get started.

Teacher Bot
The ultimate assistant for our hard-working teachers: it supports lesson planning, adapting lesson plans for kids with different special needs, creating picture and illustration files for decorating your classroom, grading from photos, and more!

Uncrop.AI
Uncrop.AI first mimics your uploaded photo before letting you expand it sideways or vertically, blending seamlessly with the original. It's easy to use and will soon allow direct additions to your original photos.

Future Alloy Oracle
High Entropy Alloys & AI-human interactions expert with a hint of sci-fi fun.

Pet Breed Mixer
Allows users to upload pictures of their pets and witness fascinating visualizations of potential crossbreeds with other species or different breeds.

MagicUnprotect
This GPT allows you to interact with the Unprotect DB to retrieve knowledge about malware evasion techniques.

WM ACC Score
This custom GPT, ACC Score, allows you to input a thought and analyze where on the ACC spectrum it falls.

MultiAgent Wizard
Automatically creates new agents for specific tasks, and allows them to collaborate to complete tasks.

Ai PDF
Ai PDF is a GPT (it uses the popular Ai PDF plugin) that allows you to chat with and ask questions about your PDF documents and have them explained to you by ChatGPT. We also include page references to help you fact-check all answers.