witsy

Witsy: desktop AI assistant / universal MCP client

Stars: 1460

Visit

Witsy is a generative AI desktop application that supports various models like OpenAI, Ollama, Anthropic, MistralAI, Google, Groq, and Cerebras. It offers features such as chat completion, image generation, scratchpad for content creation, prompt anywhere functionality, AI commands for productivity, expert prompts for specialization, LLM plugins for additional functionalities, read aloud capabilities, chat with local files, transcription/dictation, Anthropic Computer Use support, local history of conversations, code formatting, image copy/download, and more. Users can interact with the application to generate content, boost productivity, and perform various AI-related tasks.

README:

Witsy

Desktop AI Assistant
Universal MCP Client

Downloads

Download Witsy from witsyai.com or from the releases page.

On macOS you can also brew install --cask witsy.

What is Witsy?

Witsy is a BYOK (Bring Your Own Keys) AI application: it means you need to have API keys for the LLM providers you want to use. Alternatively, you can use Ollama to run models locally on your machine for free and use them in Witsy.

It is the first of very few (only?) universal MCP clients:
Witsy allows you to run MCP servers with virtually any LLM!

Supported AI Providers

Capability	Providers
Chat	OpenAI, Anthropic, Google (Gemini), xAI (Grok), Meta (Llama), Ollama, LM Studio, MistralAI, DeepSeek, OpenRouter, Groq, Cerebras, Azure OpenAI, any provider who supports the OpenAI API standard
Image Creation	OpenAI (DALL-E), Google (Imagen), xAI (Grok), Replicate, fal.ai, HuggingFace, Stable Diffusion WebUI
Video Creation	Replicate, fal.ai
Text-to-Speech	OpenAI, ElevenLabs, Groq
Speech-to-Text	OpenAI (Whisper), fal.ai, Fireworks.ai, Gladia, Groq, nVidia, Speechmatics, Local Whisper, Soniox (realtime and async) any provider who supports the OpenAI API standard
Search Engines	Tavily, Brave, Exa, Local Google Search
MCP Repositories	Smithery.ai
Embeddings	OpenAI, Ollama

Non-exhaustive feature list:

Chat completion with vision models support (describe an image)
Text-to-image and text-to video
Image-to-image (image editing) and image-to-video
LLM plugins to augment LLM: execute python code, search the Internet...
Anthropic MCP server support
Scratchpad to interactively create the best content with any model!
Prompt anywhere allows to generate content directly in any application
AI commands runnable on highlighted text in almost any application
Experts prompts to specialize your bot on a specific topic
Long-term memory plugin to increase relevance of LLM answers
Read aloud of assistant messages
Read aloud of any text in other applications
Chat with your local files and documents (RAG)
Transcription/Dictation (Speech-to-Text)
Realtime Chat aka Voice Mode
Anthropic Computer Use support
Local history of conversations (with automatic titles)
Formatting and copy to clipboard of generated code
Conversation PDF export
Image copy and download

Prompt Anywhere

Generate content in any application:

From any editable content in any application
Hit the Prompt anywhere shortcut (Shift+Control+Space / ^⇧Space)
Enter your prompt in the window that pops up
Watch Witsy enter the text directly in your application!

On Mac, you can define an expert that will automatically be triggered depending on the foreground application. For instance, if you have an expert used to generate linux commands, you can have it selected if you trigger Prompt Anywhere from the Terminal application!

AI Commands

AI commands are quick helpers accessible from a shortcut that leverage LLM to boost your productivity:

Select any text in any application
Hit the AI command shorcut (Alt+Control+Space / ⌃⌥Space)
Select one of the commands and let LLM do their magic!

You can also create custom commands with the prompt of your liking!

Commands inspired by https://the.fibery.io/@public/Public_Roadmap/Roadmap_Item/AI-Assistant-via-ChatGPT-API-170.

Experts

From https://github.com/f/awesome-chatgpt-prompts.

Scratchpad

https://www.youtube.com/watch?v=czcSbG2H-wg

Chat with your documents (RAG)

You can connect each chat with a document repository: Witsy will first search for relevant documents in your local files and provide this info to the LLM. To do so:

Click on the database icon on the left of the prompt
Click Manage and then create a document repository
OpenAI Embedding require on API key, Ollama requires an embedding model
Add documents by clicking the + button on the right hand side of the window
Once your document repository is created, click on the database icon once more and select the document repository you want to use. The icon should turn blue

Transcription / Dictation (Speech-to-Text)

You can transcribe audio recorded on the microphone to text. Transcription can be done using a variety of state of the art speech to text models (which require API key) or using local Whisper model (requires download of large files).

Currently Witsy supports the following speech to text models:

GPT4o-Transcribe
Gladia
Speechmatics (Standards + Enhanced)
Groq Whisper V3
Fireworks.ai Realtime Transcription
fal.ai Wizper V3
fal.ai ElevenLabs
nVidia Microsoft Phi-4 Multimodal

Witsy supports quick shortcuts, so your transcript is always only one button press away.

Once the text is transcribed you can:

Copy it to your clipboard
Summarize it
Translate it to any language
Insert it in the application that was running before you activated the dictation

Anthropic Computer Use

https://www.youtube.com/watch?v=vixl7I07hBk

Setup

You can download a binary from from witsyai.com, from the releases page or build yourself:

npm install
npm start

Prerequisites

To use OpenAI, Anthropic, Google or Mistral AI models, you need to enter your API key:

To use Ollama models, you need to install Ollama and download some models.

To use text-to-speech, you need an

To use Internet search you need a Tavily API key.

TODO

[ ] Implement Soniox for STT
[ ] Workspaces / Projects (whatever the name is)
[ ] Proper database (SQLite3) storage (??)

WIP

DONE

[x] OpenAI GPT-5 support
[x] Agents (multi-step, scheduling...)
[x] Document Repository file change monitoring
[x] OpenAI API response (o3-pro)
[x] ChatGPT history import
[x] Onboarding experience
[x] Add, Edit & Delete System Prompts
[x] Backup/Restore of data and settings
[x] Transcribe Local Audio Files
[x] DeepResearch
[x] Local filesystem access plugin
[x] Close markdown when streaming
[x] Multiple attachments
[x] Custom OpenAI STT support
[x] AI Commands copy/insert/replace shortcuts
[x] Defaults at folder level
[x] Tool selection for chat
[x] Realtime STT with Speechmatics
[x] Meta/Llama AI support
[x] Realtime STT with Fireworks
[x] OpenAI image generation
[x] Azure AI support
[x] Brave Search plugin
[x] Allow user-input models for embeddings
[x] User defined parameters for custom engines
[x] Direct speech-to-text checbox
[x] Quick access buttons on home
[x] fal.ai support (speech-to-text, text-to-image and text-to-video)
[x] Debug console
[x] Design Studio
[x] i18n
[x] Mermaid diagram rendering
[x] Smithery.ai MCP integration
[x] Model Context Protocol
[x] Local Web Search
[x] Model defaults
[x] Speech-to-text language
[x] Model parameters (temperature...)
[x] Favorite models
[x] ElevenLabs Text-to-Speech
[x] Custom engines (OpenAI compatible)
[x] Long-term memory plugin
[x] OpenRouter support
[x] DeepSeek support
[x] Folder mode
[x] All instructions customization
[x] Fork chat (with optional LLM switch)
[x] Realtime chat
[x] Replicate video generation
[x] Together.ai compatibility
[x] Gemini 2.0 Flash support
[x] Groq LLama 3.3 support
[x] xAI Grok Vision Model support
[x] Ollama function-calling
[x] Replicate image generation
[x] AI Commands redesign
[x] Token usage report
[x] OpenAI o1 models support
[x] Groq vision support
[x] Image resize option
[x] Llama 3.2 vision support
[x] YouTube plugin
[x] RAG in Scratchpad
[x] Hugging face image generation
[x] Show prompt used for image generation
[x] Redesigned Prompt window
[x] Anthropic Computer Use
[x] Auto-update refactor (still not Windows)
[x] Dark mode
[x] Conversation mode
[x] Google function calling
[x] Anthropic function calling
[x] Scratchpad
[x] Dictation: OpenAI Whisper + Whisper WebGPU
[x] Auto-select expert based on foremost app (Mac only)
[x] Cerebras support
[x] Local files RAG
[x] Groq model update (8-Sep-2024)
[x] PDF Export of chats
[x] Prompts renamed to Experts. Now editable.
[x] Read aloud
[x] Import/Export commands
[x] Anthropic Sonnet 3.5
[x] Ollama base URL as settings
[x] OpenAI base URL as settings
[x] DALL-E as tool
[x] Google Gemini API
[x] Prompt anywhere
[x] Cancel commands
[x] GPT-4o support
[x] Different default engine/model for commands
[x] Text attachments (TXT, PDF, DOCX, PPTX, XLSX)
[x] MistralAI function calling
[x] Auto-update
[x] History date sections
[x] Multiple selection delete
[x] Search
[x] Groq API
[x] Custom prompts
[x] Sandbox & contextIsolation
[x] Application Menu
[x] Prompt history navigation
[x] Ollama model pull
[x] macOS notarization
[x] Fix when long text is highlighted
[x] Shortcuts for AI commands
[x] Shift to switch AI command behavior
[x] User feedback when running a tool
[x] Download internet content plugin
[x] Tavily Internet search plugin
[x] Python code execution plugin
[x] LLM Tools supprt (OpenAI only)
[x] Mistral AI API integration
[x] Latex rendering
[x] Anthropic API integration
[x] Image generation as b64_json
[x] Text-to-speech
[x] Log file (electron-log)
[x] Conversation language settings
[x] Paste image in prompt
[x] Run commands with default models
[x] Models refresh
[x] Edit commands
[x] Customized commands
[x] Conversation menu (info, save...)
[x] Conversation depth setting
[x] Save attachment on disk
[x] Keep running in system tray
[x] Nicer icon (still temporary)
[x] Rename conversation
[x] Copy/edit messages
[x] New chat window for AI command
[x] AI Commands with shortcut
[x] Auto-switch to vision model
[x] Run at login
[x] Shortcut editor
[x] Chat font size settings
[x] Image attachment for vision
[x] Stop response streaming
[x] Save/Restore window position
[x] Ollama support
[x] View image full screen
[x] Status/Tray bar icon + global shortcut to invoke
[x] Chat themes
[x] Default instructions in settings
[x] Save DALL-E images locally (and delete properly)
[x] OpenAI links in settings
[x] Copy code button
[x] Chat list ordering
[x] OpenAI model choice
[x] CSS variables

For Tasks:

Click tags to check more tools for each tasks

generate content boost productivity specialize bot transcribe audio chat with documents

For Jobs:

ai researcher content creator software developer data scientist digital marketer

Alternative AI tools for witsy

Similar Open Source Tools

witsy

github

: 1.5k

AIDE-Plus

AIDE-Plus is a comprehensive tool for Android app development, offering support for various Java syntax versions, Gradle and Maven build systems, ProGuard, AndroidX, CMake builds, APK/AAB generation, code coloring customization, data binding, and APK signing. It also provides features like AAPT2, D8, runtimeOnly, compileOnly, libgdxNatives, manifest merging, Shizuku installation support, and syntax auto-completion. The tool aims to streamline the development process and enhance the user experience by addressing common issues and providing advanced functionalities.

github

: 136

air780e-forwarder

This repository provides a tool for forwarding SMS and call notifications using various notification methods such as Telegram, PushDeer, Bark, DingTalk, Feishu, WeCom, Pushover, email, Gotify, Inotify, and SMTP protocol. It also allows controlling devices via SMS, scheduling base station positioning, querying data usage, reporting device status, power button operations, low power mode, message queue usage for sending notifications without freezing, automatic resend on notification failure, and support for master-slave mode for message forwarding.

github

: 262

LLMFarm

LLMFarm is an iOS and MacOS app designed to work with large language models (LLM). It allows users to load different LLMs with specific parameters, test the performance of various LLMs on iOS and macOS, and identify the most suitable model for their projects. The tool is based on ggml and llama.cpp by Georgi Gerganov and incorporates sources from rwkv.cpp by saharNooby, Mia by byroneverson, and LlamaChat by alexrozanski. LLMFarm features support for MacOS (13+) and iOS (16+), various inferences and sampling methods, Metal compatibility (not supported on Intel Mac), model setting templates, LoRA adapters support, LoRA finetune support, LoRA export as model support, and more. It also offers a range of inferences including LLaMA, GPTNeoX, Replit, GPT2, Starcoder, RWKV, Falcon, MPT, Bloom, and others. Additionally, it supports multimodal models like LLaVA, Obsidian, and MobileVLM. Users can customize inference options through JSON files and access supported models for download.

github

: 1.5k

handy-ollama

Handy-Ollama is a tutorial for deploying Ollama with hands-on practice, making the deployment of large language models accessible to everyone. The tutorial covers a wide range of content from basic to advanced usage, providing clear steps and practical tips for beginners and experienced developers to learn Ollama from scratch, deploy large models locally, and develop related applications. It aims to enable users to run large models on consumer-grade hardware, deploy models locally, and manage models securely and reliably.

github

: 910

RealScaler

RealScaler is a Windows app powered by RealESRGAN AI to enhance, upscale, and de-noise photos and videos. It provides an easy-to-use GUI for upscaling images and videos using multiple AI models. The tool supports automatic image tiling and merging to avoid GPU VRAM limitations, resizing images/videos before upscaling, interpolation between original and upscaled content, and compatibility with various image and video formats. RealScaler is written in Python and requires Windows 11/10, at least 8GB RAM, and a Directx12 compatible GPU with 4GB VRAM. Future versions aim to enhance performance, support more GPUs, offer a new GUI with Windows 11 style, include audio for upscaled videos, and provide features like metadata extraction and application from original to upscaled files.

github

: 249

dify-helm

Deploy langgenius/dify, an LLM based chat bot app on kubernetes with helm chart.

github

: 340

Eridanus

Eridanus is a powerful data visualization tool designed to help users create interactive and insightful visualizations from their datasets. With a user-friendly interface and a wide range of customization options, Eridanus makes it easy for users to explore and analyze their data in a meaningful way. Whether you are a data scientist, business analyst, or student, Eridanus provides the tools you need to communicate your findings effectively and make data-driven decisions.

github

: 147

feast

Feast is an open source feature store for machine learning, providing a fast path to manage infrastructure for productionizing analytic data. It allows ML platform teams to make features consistently available, avoid data leakage, and decouple ML from data infrastructure. Feast abstracts feature storage from retrieval, ensuring portability across different model training and serving scenarios.

github

: 6.3k

FluidFrames.RIFE

FluidFrames.RIFE is a Windows app powered by RIFE AI to create frame-generated and slowmotion videos. It is written in Python and utilizes external packages such as torch, onnxruntime-directml, customtkinter, OpenCV, moviepy, and Nuitka. The app features an elegant GUI, video frame generation at different speeds, video slow motion, video resizing, multiple GPU support, and compatibility with various video formats. Future versions aim to support different GPU types, enhance the GUI, include audio processing, optimize video processing speed, and introduce new features like saving AI-generated frames and supporting different RIFE AI models.

github

: 128

ChatGPT-On-CS

ChatGPT-On-CS is an intelligent chatbot tool based on large models, supporting various platforms like WeChat, Taobao, Bilibili, Douyin, Weibo, and more. It can handle text, voice, and image inputs, access external resources through plugins, and customize enterprise AI applications based on proprietary knowledge bases. Users can set custom replies, utilize ChatGPT interface for intelligent responses, send images and binary files, and create personalized chatbots using knowledge base files. The tool also features platform-specific plugin systems for accessing external resources and supports enterprise AI applications customization.

github

: 2.2k

ai-tag

AI tag generator that combines 40,000 tags from Bilibili UP main Twelve Today is also very cute with Chinese translations from Novelai, providing Chinese search and tag generation services. It offers a tag community for magicians to directly copy and generate spells. Always free, no ads, no commercial use. The project includes a pure tag parsing library, independent spell parsing library, tag data repository, and a new gallery page with waterfall flow for viewing community images.

github

: 120

midjourney-proxy

Midjourney-proxy is a proxy for the Discord channel of MidJourney, enabling API-based calls for AI drawing. It supports Imagine instructions, adding image base64 as a placeholder, Blend and Describe commands, real-time progress tracking, Chinese prompt translation, prompt sensitive word pre-detection, user-token connection to WSS, multi-account configuration, and more. For more advanced features, consider using midjourney-proxy-plus, which includes Shorten, focus shifting, image zooming, local redrawing, nearly all associated button actions, Remix mode, seed value retrieval, account pool persistence, dynamic maintenance, /info and /settings retrieval, account settings configuration, Niji bot robot, InsightFace face replacement robot, and an embedded management dashboard.

github

: 4.9k

pro-chat

ProChat is a components library focused on quickly building large language model chat interfaces. It empowers developers to create rich, dynamic, and intuitive chat interfaces with features like automatic chat caching, streamlined conversations, message editing tools, auto-rendered Markdown, and programmatic controls. The tool also includes design evolution plans such as customized dialogue rendering, enhanced request parameters, personalized error handling, expanded documentation, and atomic component design.

github

: 514

AI-Vtuber

AI-VTuber is a highly customizable AI VTuber project that integrates with Bilibili live streaming, uses Zhifu API as the language base model, and includes intent recognition, short-term and long-term memory, cognitive library building, song library creation, and integration with various voice conversion, voice synthesis, image generation, and digital human projects. It provides a user-friendly client for operations. The project supports virtual VTuber template construction, multi-person device template management, real-time switching of virtual VTuber templates, and offers various practical tools such as video/audio crawlers, voice recognition, voice separation, voice synthesis, voice conversion, AI drawing, and image background removal.

github

: 188

ap-plugin

AP-PLUGIN is an AI drawing plugin for the Yunzai series robot framework, allowing you to have a convenient AI drawing experience in the input box. It uses the open source Stable Diffusion web UI as the backend, deploys it for free, and generates a variety of images with richer functions.

github

: 103

For similar tasks

chatflow

Chatflow is a tool that provides a chat interface for users to interact with systems using natural language. The engine understands user intent and executes commands for tasks, allowing easy navigation of complex websites/products. This approach enhances user experience, reduces training costs, and boosts productivity.

github

: 124

AiR

AiR is an AI tool built entirely in Rust that delivers blazing speed and efficiency. It features accurate translation and seamless text rewriting to supercharge productivity. AiR is designed to assist non-native speakers by automatically fixing errors and polishing language to sound like a native speaker. The tool is under heavy development with more features on the horizon.

github

: 118

awesome-ai-newsletters

Awesome AI Newsletters is a curated list of AI-related newsletters that provide the latest news, trends, tools, and insights in the field of Artificial Intelligence. It includes a variety of newsletters covering general AI news, prompts for marketing and productivity, AI job opportunities, and newsletters tailored for professionals in the AI industry. Whether you are a beginner looking to stay updated on AI advancements or a professional seeking to enhance your knowledge and skills, this repository offers a collection of valuable resources to help you navigate the world of AI.

github

: 56

ollama-autocoder

Ollama Autocoder is a simple to use autocompletion engine that integrates with Ollama AI. It provides options for streaming functionality and requires specific settings for optimal performance. Users can easily generate text completions by pressing a key or using a command pallete. The tool is designed to work with Ollama API and a specified model, offering real-time generation of text suggestions.

github

: 92

witsy

github

: 1.5k

codegate

CodeGate is a local gateway that enhances the safety of AI coding assistants by ensuring AI-generated recommendations adhere to best practices, safeguarding code integrity, and protecting individual privacy. Developed by Stacklok, CodeGate allows users to confidently leverage AI in their development workflow without compromising security or productivity. It works seamlessly with coding assistants, providing real-time security analysis of AI suggestions. CodeGate is designed with privacy at its core, keeping all data on the user's machine and offering complete control over data.

github

: 602

cline-based-code-generator

HAI Code Generator is a cutting-edge tool designed to simplify and automate task execution while enhancing code generation workflows. Leveraging Specif AI, it streamlines processes like task execution, file identification, and code documentation through intelligent automation and AI-driven capabilities. Built on Cline's powerful foundation for AI-assisted development, HAI Code Generator boosts productivity and precision by automating task execution and integrating file management capabilities. It combines intelligent file indexing, context generation, and LLM-driven automation to minimize manual effort and ensure task accuracy. Perfect for developers and teams aiming to enhance their workflows.

github

: 62

anything-llm

AnythingLLM is a full-stack application that enables you to turn any document, resource, or piece of content into context that any LLM can use as references during chatting. This application allows you to pick and choose which LLM or Vector Database you want to use as well as supporting multi-user management and permissions.

github

: 49.2k

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 980

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.9k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 32.1k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675