Best AI tools for <control media playback>
20 - AI tool Sites
Controlla Voice
Controlla Voice is a web-based application that allows users to control their computer using their voice. The application uses speech recognition technology to convert spoken words into commands that can be executed by the computer. Controlla Voice can be used to perform a variety of tasks, including opening and closing programs, navigating the web, and controlling media playback.
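The step from recognized speech to an executable command can be illustrated with a minimal sketch. It assumes the speech recognizer has already produced a text transcript. The `COMMANDS` table, the action names, and `parse_command` are hypothetical and are not taken from Controlla Voice itself.

```python
import re

# Hypothetical phrase-to-action table; not Controlla Voice's actual command set.
COMMANDS = {
    "play": "media.play",
    "pause": "media.pause",
    "next track": "media.next",
    "previous track": "media.previous",
    "volume up": "media.volume_up",
    "volume down": "media.volume_down",
}


def parse_command(transcript):
    """Return the action for the longest known phrase in the transcript, or None."""
    # Normalize to lowercase words so punctuation and spacing do not matter.
    words = re.findall(r"[a-z']+", transcript.lower())
    text = " " + " ".join(words) + " "
    # Try longer phrases first so "next track" is preferred over shorter overlaps.
    for phrase in sorted(COMMANDS, key=len, reverse=True):
        # Padding with spaces matches whole words only ("display" is not "play").
        if " " + phrase + " " in text:
            return COMMANDS[phrase]
    return None
```

For example, `parse_command("Please pause the music.")` yields the hypothetical action name `media.pause`, which a real application would then pass to whatever controls the media player.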
Personamo Workflow
Personamo Workflow is an AI-powered application that allows users to control and customize their content feeds using LEGO-like blocks. Users can adjust the signal-to-noise ratio of their feed, prioritize content, and filter out unwanted information from various sources such as news sites, blogs, and subreddits. The application aims to provide a personalized and efficient way for users to consume and organize information in one place.
IC Light AI
IC Light AI is a free online tool for controlling and manipulating image lighting. Users can relight images using text prompts or background images to achieve consistent illumination in their photos, particularly portraits, and can adjust lighting precisely to match new environments. The tool uses deep learning techniques to produce the relit results.
UnrealPhotoshoot
UnrealPhotoshoot is an AI-powered tool that allows users to generate hyper-realistic person images. With a few clicks, users can specify the person's appearance, outfit, pose, and location. This tool is ideal for marketing campaigns, e-commerce, and social media.
Blimey
Blimey is an AI-powered 3D scene builder that allows users to generate realistic images from scratch. With Blimey, users can control the composition, colors, and camera angles of their scenes, resulting in images that are tailored to their specific needs. Blimey is perfect for creating images for marketing, advertising, social media, and more.
MaskMyPrompt
MaskMyPrompt is an AI tool designed to anonymize prompts before sending them to ChatGPT. It ensures that your prompt data remains private by masking names and sensitive information. The tool is programmed by Mike Ushakov and ChatGPT, leveraging the power of Transformers.js. Users can reach out for support, bug reports, feature requests, or new data types via email or Twitter.
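The idea of masking a prompt before it leaves the user's machine can be sketched as follows. This is an illustration under stated assumptions, not MaskMyPrompt's implementation: the tool is described as using Transformers.js, whereas this sketch swaps in two simple regular expressions, and `PATTERNS`, `mask_prompt`, and `unmask` are hypothetical names.

```python
import re

# Hypothetical patterns for illustration; a real anonymizer would likely use
# a named-entity model rather than regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def mask_prompt(prompt):
    """Replace sensitive spans with numbered placeholders; return text and mapping."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            # Number placeholders in the order they are created.
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = match.group(0)
            return placeholder
        prompt = pattern.sub(substitute, prompt)
    return prompt, mapping


def unmask(text, mapping):
    """Restore the original values in a response that reuses the placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text
```

The mapping stays local, so only the placeholder version of the prompt would be sent to the model, and `unmask` can restore the original values in the reply.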
Omnifact
Omnifact is a privacy-first generative AI tool designed for businesses. It allows teams to harness the power of generative AI while maintaining control over their data. Omnifact's advanced data masking and customizable content filtering ensure that no sensitive or confidential information is shared with third-party LLM providers.
RenderNet
RenderNet is an AI-powered image generation tool that allows users to create images with consistent character designs. It is designed to help artists and designers create high-quality images for games, animations, and other projects.
ChatTTS
ChatTTS is a natural and expressive text-to-speech tool designed for dialogue applications. It supports mixed-language input and multiple speakers, with fine-grained control over prosodic elements such as laughter, pauses, and intonation. It supports both English and Chinese, and is open-source and customizable, with pretrained models available for further research and development.
Higgsfield
Higgsfield is a foundational video model company that aims to democratize social media content creation. It is training a foundational video model focused on personalization, control, and realistic human characters and motion. Diffuse, its video creation app, lets anyone create personalized content from a single selfie and is powered by a preview version of that model. Higgsfield aims to give creators control over every aspect of video production, including aesthetics, style, motion, and mood.
Roundabout
Roundabout is a micro influencer marketing platform that helps businesses reach their target audience and save time by providing solutions for brand reach, agencies, and content creation. With top features like finding, controlling, and converting influencers, Roundabout offers a transparent and flexible process for successful influencer campaigns. The platform allows users to work with niche creators, plan and track campaigns across various industries, and benefit from AI-powered tools. Roundabout values data transparency, offers flexible pricing, and provides support for content creation and performance tracking.
Dubformer
Dubformer is an AI-powered dubbing and video localization provider that offers a secure and end-to-end solution for the media industry. With a focus on quality and speed, Dubformer's technology enables the creation of realistic and natural-sounding voice-overs in multiple languages, making video content more accessible and engaging for diverse audiences. The platform combines AI-driven processes with human quality control to ensure broadcast-quality results. Dubformer's services include AI dubbing, accurate and culturally sensitive translations, AI mixing for immersive soundscapes, and AI-powered subtitles and closed captions.
Bark
Bark is a parental control app that uses AI to monitor your child's online activity and alert you to potential dangers. It can scan texts, social media, emails, and other online activity for threats like cyberbullying, pornography, and self-harm. Bark also offers features like screen time management, website and app blocking, and location tracking.
Stylar
Stylar is an AI-powered image generation and design tool that gives users control over image composition and style. Key features include predefined styles for design customization; layering, positioning, and sketching tools; and a user-friendly interface for all skill levels.
Nullface AI
Nullface AI is an AI-powered platform that allows users to easily generate faceless videos for social media. Users can share their ideas and let the AI do the rest, creating content that is simple, fun, and automatic. The platform leverages sophisticated AI algorithms to craft video content tailored to align with specific preferences and content strategies. Nullface AI offers features such as AI-powered audio, imagery, and subtitles all in one platform, providing comprehensive control over both audio and visual elements. Users can personalize prompts and select different voices for narration, ensuring alignment with brand tone and audience preferences. The platform also allows users to preview and approve videos before they go live, offering complete control over video privacy settings and the ability to download videos for local storage or sharing across various platforms.
Distillery
Distillery is an open-source AI text-to-image generator that allows users to create visual assets, learn about AI, train AI with a single image, control generation with 25+ parameters, and create anything they can imagine. It is perfect for artists, designers, and anyone who wants to bring their imaginative concepts to life.
AIProfilePic.art
AIProfilePic.art is an AI-powered tool that allows users to create profile pictures from their own photos. With just a few clicks, users can generate up to 200 high-resolution profile pictures in a variety of art styles. AIProfilePic.art combines AI generation with AI-backed quality control, so every photo goes through quality checks, minimizing the chance of unusable avatars.
Photogenic AI
Photogenic AI is a personal AI photo studio that allows users to create an infinite variety of photos using their own AI persona. With Photogenic AI, users can take hundreds of photos, change anything they like, and have finetuned control over the results. Users can also upload photos to copy the style of any photo they like. Photogenic AI is perfect for anyone who wants to create high-quality photos for social media, marketing, or personal use.
OpenArt
OpenArt is a free AI image/art generator that allows users to create unique and original images from scratch or by modifying existing ones. It offers a variety of features, including the ability to generate images in different styles, control the colors and composition of the image, and even create your own AI models. OpenArt is a powerful tool that can be used for a variety of creative projects, from generating images for social media to creating concept art for video games.
Recraft
Recraft is a generative AI design tool that allows users to create and edit digital illustrations, vector art, icons, and 3D graphics in a uniform brand style. It offers a range of features such as the ability to turn a single image into a stylized set, iterate with ease using simple visual controls, play with styles and evolve designs, control color with precision, iterate endlessly, start with text and end with art, edit and repaint with lasso, work easily on an infinite canvas, and explore the community for inspiration. Recraft is trusted by brands and creators alike and is used for a variety of tasks such as creating social media graphics, marketing materials, website design, and product design.
20 - Open Source AI Tools
AIRAVAT
AIRAVAT is a multifunctional Android Remote Access Tool (RAT) with a GUI-based Web Panel that does not require port forwarding. It allows users to access various features on the victim's device, such as reading files, downloading media, retrieving system information, managing applications, SMS, call logs, contacts, notifications, keylogging, admin permissions, phishing, audio recording, music playback, device control (vibration, torch light, wallpaper), executing shell commands, clipboard text retrieval, URL launching, and background operation. The tool requires a Firebase account and tools like ApkEasy Tool or ApkTool M for building. Users can set up Firebase, host the web panel, modify Instagram.apk for RAT functionality, and connect the victim's device to the web panel. The tool is intended for educational purposes only, and users are solely responsible for its use.
TeroSubtitler
Tero Subtitler is an open source, cross-platform, and free subtitle editing software with a user-friendly interface. It offers fully fledged editing with SMPTE and MEDIA modes, support for various subtitle formats, multi-level undo/redo, search and replace, auto-backup, source and transcription modes, translation memory, audiovisual preview, timeline with waveform visualizer, manipulation tools, formatting options, quality control features, translation and transcription capabilities, validation tools, automation for correcting errors, and more. It also includes features like exporting subtitles to MP3, importing/exporting Blu-ray SUP format, generating blank video, generating video with hardcoded subtitles, video dubbing, and more. The tool utilizes powerful multimedia playback engines like mpv, advanced audio/video manipulation tools like FFmpeg, tools for automatic transcription like whisper.cpp/Faster-Whisper, auto-translation API like Google Translate, and ElevenLabs TTS for video dubbing.
lively
Lively Wallpaper is a tool that allows users to set animated desktop wallpapers, bringing their desktop to life. It supports various types of wallpapers including video/GIF, webpage, and application/games. Users can also use any wallpaper as a screensaver, control Lively with command line arguments, and leverage the Lively API for developers to create interactive wallpapers. The tool offers features such as minimal webpage renderer, hardware-accelerated video playback, and integration with Machine Learning inference for dynamic wallpapers. Lively is designed for Windows, is fully open-source and free, and supports Shadertoy.com URLs as wallpapers.
awesome-mobile-robotics
The 'awesome-mobile-robotics' repository is a curated list of important content related to Mobile Robotics and AI. It includes resources such as courses, books, datasets, software and libraries, podcasts, conferences, journals, companies and jobs, laboratories and research groups, and miscellaneous resources. The repository covers a wide range of topics in the field of Mobile Robotics and AI, providing valuable information for enthusiasts, researchers, and professionals in the domain.
BeamNGpy
BeamNGpy is an official Python library providing an API to interact with BeamNG.tech, a video game focused on academia and industry. It allows remote control of vehicles, AI-controlled vehicles, dynamic sensor models, access to road network and scenario objects, and multiple clients. The library comes with low-level functions and higher-level interfaces for complex actions. BeamNGpy requires BeamNG.tech for usage and offers compatibility information for different versions. It also provides troubleshooting tips and encourages user contributions.
LLMTSCS
LLMLight is a novel framework that employs Large Language Models (LLMs) as decision-making agents for Traffic Signal Control (TSC). The framework leverages the advanced generalization capabilities of LLMs to engage in a reasoning and decision-making process akin to human intuition for effective traffic control. LLMLight has been demonstrated to be remarkably effective, generalizable, and interpretable against various transportation-based and RL-based baselines on nine real-world and synthetic datasets.
MediaAI
MediaAI is a repository containing lectures and materials for Aalto University's AI for Media, Art & Design course. The course is a hands-on, project-based crash course focusing on deep learning and AI techniques for artists and designers. It covers common AI algorithms & tools, their applications in art, media, and design, and provides hands-on practice in designing, implementing, and using these tools. The course includes lectures, exercises, and a final project based on students' interests. Students can complete the course without programming by creatively utilizing existing tools like ChatGPT and DALL-E. The course emphasizes collaboration, peer-to-peer tutoring, and project-based learning. It covers topics such as text generation, image generation, optimization, and game AI.
home-llm
Home LLM is a project that provides the necessary components to control your Home Assistant installation with a completely local Large Language Model acting as a personal assistant. The goal is to provide a drop-in solution to be used as a "conversation agent" component by Home Assistant. The 2 main pieces of this solution are Home LLM and Llama Conversation. Home LLM is a fine-tuning of the Phi model series from Microsoft and the StableLM model series from StabilityAI. The model is able to control devices in the user's house as well as perform basic question and answering. The fine-tuning dataset is a custom synthetic dataset designed to teach the model function calling based on the device information in the context. Llama Conversation is a custom component that exposes the locally running LLM as a "conversation agent" in Home Assistant. This component can be interacted with in a few ways: using a chat interface, integrating with Speech-to-Text and Text-to-Speech addons, or running the oobabooga/text-generation-webui project to provide access to the LLM via an API interface.
venom
Venom is a high-performance system, developed with JavaScript, for creating WhatsApp bots. It supports building any kind of interaction, such as customer service, media sending, and AI-based sentence recognition, and works with all types of design architecture for WhatsApp.
WeeaBlind
Weeablind is a program that uses modern AI speech synthesis, diarization, language identification, and voice cloning to dub multi-lingual media and anime. It aims to create a pleasant alternative for folks facing accessibility hurdles such as blindness, dyslexia, learning disabilities, or simply those that don't enjoy reading subtitles. The program relies on state-of-the-art technologies such as ffmpeg, pydub, Coqui TTS, speechbrain, and pyannote.audio to analyze and synthesize speech that stays in-line with the source video file. Users have the option of dubbing every subtitle in the video, setting the start and end times, dubbing only foreign-language content, or full-blown multi-speaker dubbing with speaking rate and volume matching.
ScreenAgent
ScreenAgent is a project focused on creating an environment for Visual Language Model agents (VLM Agent) to interact with real computer screens. The project includes designing an automatic control process for agents to interact with the environment and complete multi-step tasks. It also involves building the ScreenAgent dataset, which collects screenshots and action sequences for various daily computer tasks. The project provides a controller client code, configuration files, and model training code to enable users to control a desktop with a large model.
krita-ai-diffusion
Krita-AI-Diffusion is a plugin for Krita that allows users to generate images from within the program. It offers a variety of features, including inpainting, outpainting, generating images from scratch, refining existing content, live painting, and control over image creation. The plugin is designed to fit into an interactive workflow where AI generation is used as just another tool while painting. It is meant to synergize with traditional tools and the layer stack.
promptpanel
Prompt Panel is a tool designed to accelerate the adoption of AI agents by providing a platform where users can run large language models across any inference provider, create custom agent plugins, and use their own data safely. The tool allows users to break free from walled-gardens and have full control over their models, conversations, and logic. With Prompt Panel, users can pair their data with any language model, online or offline, and customize the system to meet their unique business needs without any restrictions.
FlowTest
FlowTestAI describes itself as the world's first GenAI-powered open-source Integrated Development Environment (IDE) for crafting, visualizing, and managing API-first workflows. It operates as a desktop app that interacts with the local file system, ensuring privacy and enabling collaboration via version control systems. It offers platform-specific binaries for macOS, with versions for Windows and Linux in development. It also features a CLI for running API workflows from the command line, facilitating automation and CI/CD processes.
OpenNARS-for-Applications
OpenNARS-for-Applications is an implementation of a Non-Axiomatic Reasoning System, a general-purpose reasoner that adapts under the Assumption of Insufficient Knowledge and Resources. The system combines the logic and conceptual ideas of OpenNARS, event handling and procedure learning capabilities of ANSNA and 20NAR1, and the control model from ALANN. It is written in C, offers improved reasoning performance, and has been compared with Reinforcement Learning and means-end reasoning approaches. The system has been used in real-world applications such as assisting first responders, real-time traffic surveillance, and experiments with autonomous robots. It has been developed with a pragmatic mindset focusing on effective implementation of existing theory.
biniou
biniou is a self-hosted webui for various GenAI (generative artificial intelligence) tasks. It allows users to generate multimedia content using AI models and chatbots on their own computer, even without a dedicated GPU. The tool can work offline once deployed and required models are downloaded. It offers a wide range of features for text, image, audio, video, and 3D object generation and modification. Users can easily manage the tool through a control panel within the webui, with support for various operating systems and CUDA optimization. biniou is powered by Huggingface and Gradio, providing a cross-platform solution for AI content generation.
Neurite
Neurite is an innovative project that combines chaos theory and graph theory to create a digital interface that explores hidden patterns and connections for creative thinking. It offers a unique workspace blending fractals with mind mapping techniques, allowing users to navigate the Mandelbrot set in real-time. Nodes in Neurite represent various content types like text, images, videos, code, and AI agents, enabling users to create personalized microcosms of thoughts and inspirations. The tool supports synchronized knowledge management through bi-directional synchronization between mind-mapping and text-based hyperlinking. Neurite also features FractalGPT for modular conversation with AI, local AI capabilities for multi-agent chat networks, and a Neural API for executing code and sequencing animations. The project is actively developed with plans for deeper fractal zoom, advanced control over node placement, and experimental features.
geti-sdk
The Intel® Geti™ SDK is a python package that enables teams to rapidly develop AI models by easing the complexities of model development and enhancing collaboration between teams. It provides tools to interact with an Intel® Geti™ server via the REST API, allowing for project creation, downloading, uploading, deploying for local inference with OpenVINO, setting project and model configuration, launching and monitoring training jobs, and media upload and prediction. The SDK also includes tutorial-style Jupyter notebooks demonstrating its usage.
OpenDAN-Personal-AI-OS
OpenDAN is an open source Personal AI OS that consolidates various AI modules for personal use. It empowers users to create powerful AI agents like assistants, tutors, and companions. The OS allows agents to collaborate, integrate with services, and control smart devices. OpenDAN offers features like rapid installation, AI agent customization, connectivity via Telegram/Email, building a local knowledge base, distributed AI computing, and more. It aims to simplify life by putting AI in users' hands. The project is in early stages with ongoing development and future plans for user and kernel mode separation, home IoT device control, and an official OpenDAN SDK release.
awesome-hallucination-detection
This repository provides a curated list of papers, datasets, and resources related to the detection and mitigation of hallucinations in large language models (LLMs). Hallucinations refer to the generation of factually incorrect or nonsensical text by LLMs, which can be a significant challenge for their use in real-world applications. The resources in this repository aim to help researchers and practitioners better understand and address this issue.
20 - OpenAI Gpts
🤖 SmartLink Integrator 🌎
Your AI bridge to the Internet of Things! Easily connect, control, and automate your smart devices with voice or text commands. 🏠💎
Sim-Low
Meal planner with 1) Calories Control 2) Family/Personal Plan 3) Nutritional Summaries 4) Shopping Lists
Addiction Assistant
A mentor for those struggling with control over their substance use, offering guidance, resources, and support for sobriety. In case of relapse, it provides practical steps and resources, including web links, phone numbers, and emails.
Project Controlling Advisor
Provides financial oversight and project cost control support.
Hierarchical Topic Exploration
Explore any topic with an advanced hierarchical interactive mapping with streamlined control. Begin with !start [topic].
BITE Model Analyzer by Dr. Steven Hassan
Discover if your group, relationship, or organization uses specific methods to recruit and maintain control over people.
AI Powerplayed
Navigate the intricate world of corporate politics as Sam Alterman, a visionary tech leader ousted from his CEO role. Outmaneuver everyone and reclaim control of the leading AI company. This interactive game blends strategy, negotiation, and alliances in the high-stakes world of tech. Type Start to begin.
Punaises de Lit
Bed bug expert: identification advice and steps to take in case of an infestation.