
minuet-ai.nvim
💃 Dance with Intelligence in Your Code. Minuet offers code completion as-you-type from popular LLMs including OpenAI, Gemini, Claude, Ollama, Codestral, and more.
Stars: 198

Minuet AI is a Neovim plugin that integrates with nvim-cmp to provide AI-powered code completion using multiple AI providers such as OpenAI, Claude, Gemini, Codestral, and Huggingface. It offers customizable configuration options and streaming support for completion delivery. Users can manually invoke completion or use cost-effective models for auto-completion. The plugin requires API keys for supported AI providers and allows customization of system prompts. Minuet AI also supports changing providers, toggling auto-completion, and provides solutions for input delay issues. Integration with lazyvim is possible, and future plans include implementing RAG on the codebase and virtual text UI support.
README:
- Minuet AI
- Features
- Requirements
- Installation
- Configuration
- API Keys
- Prompt
- Providers
- Commands
- API
- FAQ
- Troubleshooting
- TODO
- Contributing
- Acknowledgement
Minuet AI: Dance with Intelligence in Your Code 💃.
Just as dancers move during a minuet, minuet-ai brings the grace and harmony of a minuet to your coding process.
- AI-powered code completion with dual modes:
  - Specialized prompts and various enhancements for chat-based LLMs on code completion tasks.
  - Fill-in-the-middle (FIM) completion for compatible models (DeepSeek, Codestral, and others).
- Support for multiple AI providers (OpenAI, Claude, Gemini, Codestral, Huggingface, and OpenAI-compatible services)
- Customizable configuration options
- Streaming support to enable completion delivery even with slower LLMs
- Support for the nvim-cmp, blink-cmp, and virtual text frontends
With nvim-cmp / blink-cmp frontend: (demo recording)
With virtual text frontend: (demo recording)
- Neovim 0.10+.
- plenary.nvim
- optional: nvim-cmp
- optional: blink.cmp
- An API key for at least one of the supported AI providers
Lazy
specs = {
    {
        'milanglacier/minuet-ai.nvim',
        config = function()
            require('minuet').setup {
                -- Your configuration options here
            }
        end,
    },
    { 'nvim-lua/plenary.nvim' },
    -- optional, if you are using virtual-text frontend, nvim-cmp is not required.
    { 'hrsh7th/nvim-cmp' },
    -- optional, if you are using virtual-text frontend, blink is not required.
    { 'Saghen/blink.cmp' },
}
Given the response speed and rate limits of LLM services, we recommend you either invoke minuet completion manually or use a cost-effective model like gemini-flash or codestral for auto-completion.
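For example, a minimal setup that leans on an inexpensive model for auto-completion might look like the following sketch (it assumes GEMINI_API_KEY is set; the Gemini provider defaults to gemini-1.5-flash-latest, see the provider section below):
require('minuet').setup {
    -- use Gemini (defaults to the flash model) for cheaper auto-completion
    provider = 'gemini',
}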
Setting up with virtual text:
require('minuet').setup {
    virtualtext = {
        auto_trigger_ft = {},
        keymap = {
            -- accept whole completion
            accept = '<A-A>',
            -- accept one line
            accept_line = '<A-a>',
            -- accept n lines (prompts for number)
            accept_n_lines = '<A-z>',
            -- Cycle to prev completion item, or manually invoke completion
            prev = '<A-[>',
            -- Cycle to next completion item, or manually invoke completion
            next = '<A-]>',
            dismiss = '<A-e>',
        },
    },
}
Setting up with nvim-cmp:
require('cmp').setup {
    sources = {
        -- Include minuet as a source to enable autocompletion
        { name = 'minuet' },
        -- and your other sources
    },
    performance = {
        -- It is recommended to increase the timeout duration due to
        -- the typically slower response speed of LLMs compared to
        -- other completion sources. This is not needed when you only
        -- need manual completion.
        fetching_timeout = 2000,
    },
}
-- If you wish to invoke completion manually, the following
-- configuration binds the `<A-y>` key to trigger minuet completion.
require('cmp').setup {
    mapping = {
        ["<A-y>"] = require('minuet').make_cmp_map(),
        -- and your other keymappings
    },
}
Setting up with blink-cmp:
require('blink-cmp').setup {
    keymap = {
        -- Manually invoke minuet completion.
        ['<A-y>'] = require('minuet').make_blink_map(),
    },
    sources = {
        -- Enable minuet for autocomplete
        default = { 'lsp', 'path', 'buffer', 'snippets', 'minuet' },
        -- For manual completion only, remove 'minuet' from default
        providers = {
            minuet = {
                name = 'minuet',
                module = 'minuet.blink',
                score_offset = 8, -- Gives minuet higher priority among suggestions
            },
        },
    },
    -- Recommended to avoid unnecessary request
    completion = { trigger = { prefetch_on_insert = false } },
}
LLM Provider Examples:
Fireworks (llama-3.3-70b):
require('minuet').setup {
    provider = 'openai_compatible',
    provider_options = {
        openai_compatible = {
            api_key = 'FIREWORKS_API_KEY',
            end_point = 'https://api.fireworks.ai/inference/v1/chat/completions',
            model = 'accounts/fireworks/models/llama-v3p3-70b-instruct',
            name = 'Fireworks',
            optional = {
                max_tokens = 256,
                top_p = 0.9,
            },
        },
    },
}
Deepseek:
-- you can use deepseek with either the openai_fim_compatible or the openai_compatible provider
require('minuet').setup {
    provider = 'openai_fim_compatible',
    provider_options = {
        openai_fim_compatible = {
            api_key = 'DEEPSEEK_API_KEY',
            name = 'deepseek',
            optional = {
                max_tokens = 256,
                top_p = 0.9,
            },
        },
    },
}
-- or
require('minuet').setup {
    provider = 'openai_compatible',
    provider_options = {
        openai_compatible = {
            end_point = 'https://api.deepseek.com/v1/chat/completions',
            api_key = 'DEEPSEEK_API_KEY',
            name = 'deepseek',
            optional = {
                max_tokens = 256,
                top_p = 0.9,
            },
        },
    },
}
Ollama (qwen-2.5-coder):
require('minuet').setup {
    provider = 'openai_fim_compatible',
    n_completions = 1, -- recommended for local models to save resources
    -- I recommend starting with a small context window and gradually
    -- increasing it based on your local computing power.
    context_window = 512,
    provider_options = {
        openai_fim_compatible = {
            -- an arbitrary placeholder; Ollama does not need a real key, but
            -- the environment variable must be non-null
            api_key = 'TERM',
            name = 'Ollama',
            end_point = 'http://localhost:11434/v1/completions',
            model = 'qwen2.5-coder:7b',
            optional = {
                max_tokens = 256,
                top_p = 0.9,
            },
        },
    },
}
Minuet AI comes with the following defaults:
default_config = {
    -- Enable or disable auto-completion. Note that you still need to add
    -- Minuet to your cmp/blink sources. This option controls whether cmp/blink
    -- will attempt to invoke minuet when minuet is included in cmp/blink
    -- sources. This setting has no effect on manual completion; Minuet will
    -- always be enabled when invoked manually. You can use the command
    -- `MinuetToggle` to toggle this option.
    cmp = {
        enable_auto_complete = true,
    },
    blink = {
        enable_auto_complete = true,
    },
    virtualtext = {
        -- Specify the filetypes to enable automatic virtual text completion,
        -- e.g., { 'python', 'lua' }. Note that you can still invoke manual
        -- completion even if the filetype is not on your auto_trigger_ft list.
        auto_trigger_ft = {},
        -- Specify file types where automatic virtual text completion should be
        -- disabled. This option is useful when auto-completion is enabled for
        -- all file types, i.e., when auto_trigger_ft = { '*' }
        auto_trigger_ignore_ft = {},
        keymap = {
            accept = nil,
            accept_line = nil,
            accept_n_lines = nil,
            -- Cycle to next completion item, or manually invoke completion
            next = nil,
            -- Cycle to prev completion item, or manually invoke completion
            prev = nil,
            dismiss = nil,
        },
    },
    provider = 'codestral',
    -- the maximum total characters of the context before and after the cursor
    -- 16000 characters typically equate to approximately 4,000 tokens for
    -- LLMs.
    context_window = 16000,
    -- when the total characters exceed the context window, the ratio of
    -- context before cursor and after cursor, the larger the ratio the more
    -- context before cursor will be used. This option should be between 0 and
    -- 1, context_ratio = 0.75 means the ratio will be 3:1.
    context_ratio = 0.75,
    throttle = 1000, -- only send the request every x milliseconds, use 0 to disable throttle.
    -- debounce the request in x milliseconds, set to 0 to disable debounce
    debounce = 400,
    -- Control notification display for request status
    -- Notification options:
    -- false: Disable all notifications (use boolean false, not string "false")
    -- "debug": Display all notifications (comprehensive debugging)
    -- "verbose": Display most notifications
    -- "warn": Display warnings and errors only
    -- "error": Display errors only
    notify = 'warn',
    -- The request timeout, measured in seconds. When streaming is enabled
    -- (stream = true), setting a shorter request_timeout allows for faster
    -- retrieval of completion items, albeit potentially incomplete.
    -- Conversely, with streaming disabled (stream = false), a timeout
    -- occurring before the LLM returns results will yield no completion items.
    request_timeout = 3,
    -- If completion item has multiple lines, create another completion item
    -- only containing its first line. This option only has impact for cmp and
    -- blink. For virtualtext, no single line entry will be added.
    add_single_line_entry = true,
    -- The number of completion items encoded as part of the prompt for the
    -- chat LLM. For FIM model, this is the number of requests to send. It's
    -- important to note that when 'add_single_line_entry' is set to true, the
    -- actual number of returned items may exceed this value. Additionally, the
    -- LLM cannot guarantee the exact number of completion items specified, as
    -- this parameter serves only as a prompt guideline.
    n_completions = 3,
    -- Defines the length of non-whitespace context after the cursor used to
    -- filter completion text. Set to 0 to disable filtering.
    --
    -- Example: With after_cursor_filter_length = 3 and context:
    --
    -- "def fib(n):\n|\n\nfib(5)" (where | represents cursor position),
    --
    -- if the completion text contains "fib", then "fib" and subsequent text
    -- will be removed. This setting filters repeated text generated by the
    -- LLM. A large value (e.g., 15) is recommended to avoid false positives.
    after_cursor_filter_length = 15,
    -- proxy port to use
    proxy = nil,
    provider_options = {
        -- see the documentation in each provider in the following part.
    },
    -- see the documentation in the `Prompt` section
    default_template = {
        template = '...',
        prompt = '...',
        guidelines = '...',
        n_completion_template = '...',
    },
    default_fim_template = {
        prompt = '...',
        suffix = '...',
    },
    default_few_shots = { '...' },
}
Minuet AI requires API keys to function. Set the following environment variables:
- OPENAI_API_KEY for OpenAI
- GEMINI_API_KEY for Gemini
- ANTHROPIC_API_KEY for Claude
- CODESTRAL_API_KEY for Codestral
- HF_API_KEY for Huggingface
- A custom environment variable for OpenAI-compatible services (as specified in your configuration)
Note: Provide the name of the environment variable to Minuet, not the actual value. For instance, pass OPENAI_API_KEY to Minuet, not the value itself (e.g., sk-xxxx).
If using Ollama, you need to assign an arbitrary, non-null environment variable as a placeholder for it to function.
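To make the distinction concrete, here is a minimal sketch using the openai_compatible provider (the endpoint and name are taken from the Groq example later in this README); the api_key field always names an environment variable:
require('minuet').setup {
    provider = 'openai_compatible',
    provider_options = {
        openai_compatible = {
            end_point = 'https://api.groq.com/openai/v1/chat/completions',
            name = 'Groq',
            api_key = 'GROQ_API_KEY', -- correct: the environment variable *name*
            -- api_key = 'gsk_xxxx',  -- wrong: never put the literal key value here
        },
    },
}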
See prompt for the default prompt used by minuet and instructions on customization.
Note that minuet employs two distinct prompt systems:
- A system designed for chat-based LLMs (OpenAI, OpenAI-Compatible, Claude, and Gemini)
- A separate system designed for Codestral and OpenAI-FIM-compatible models
The following is the default configuration for OpenAI:
provider_options = {
    openai = {
        model = 'gpt-4o-mini',
        system = "see [System Prompt] section for the default value",
        few_shots = "see [System Prompt] section for the default value",
        chat_input = "See [Prompt Section for default value]",
        stream = true,
        optional = {
            -- pass any additional parameters you want to send to OpenAI request,
            -- e.g.
            -- stop = { 'end' },
            -- max_tokens = 256,
            -- top_p = 0.9,
        },
    },
}
The following configuration is not the default, but is recommended to prevent request timeouts caused by the model generating too many tokens.
provider_options = {
    openai = {
        optional = {
            max_tokens = 256,
        },
    },
}
The following is the default configuration for Claude:
provider_options = {
    claude = {
        max_tokens = 512,
        model = 'claude-3-5-haiku-20241022',
        system = "see [System Prompt] section for the default value",
        few_shots = "see [System Prompt] section for the default value",
        chat_input = "See [Prompt Section for default value]",
        stream = true,
        optional = {
            -- pass any additional parameters you want to send to claude request,
            -- e.g.
            -- stop_sequences = nil,
        },
    },
}
Codestral is a text completion model, not a chat model, so the system prompt and few-shot examples do not apply. Note that you should use the CODESTRAL_API_KEY, not the MISTRAL_API_KEY, as they use different endpoints. To use the Mistral endpoint instead, simply modify the end_point and api_key parameters in the configuration.
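For example, a sketch of pointing the codestral provider at Mistral's endpoint might look like the following (the endpoint URL is an assumption based on Mistral's API layout, not a value from this README; check Mistral's documentation before relying on it):
require('minuet').setup {
    provider = 'codestral',
    provider_options = {
        codestral = {
            end_point = 'https://api.mistral.ai/v1/fim/completions', -- assumed Mistral FIM endpoint
            api_key = 'MISTRAL_API_KEY',
        },
    },
}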
The following is the default configuration for Codestral:
provider_options = {
    codestral = {
        model = 'codestral-latest',
        end_point = 'https://codestral.mistral.ai/v1/fim/completions',
        api_key = 'CODESTRAL_API_KEY',
        stream = true,
        template = {
            prompt = "See [Prompt Section for default value]",
            suffix = "See [Prompt Section for default value]",
        },
        optional = {
            stop = nil, -- the identifier to stop the completion generation
            max_tokens = nil,
        },
    },
}
The following configuration is not the default, but is recommended to prevent request timeouts caused by the model generating too many tokens.
provider_options = {
    codestral = {
        optional = {
            max_tokens = 256,
            stop = { '\n\n' },
        },
    },
}
The following config is the default for Gemini:
provider_options = {
    gemini = {
        model = 'gemini-1.5-flash-latest',
        system = "see [System Prompt] section for the default value",
        few_shots = "see [System Prompt] section for the default value",
        chat_input = "See [Prompt Section for default value]",
        stream = true,
        optional = {},
    },
}
The following configuration is not the default, but is recommended to prevent request timeouts caused by the model generating too many tokens. You can also adjust the safety settings following the example:
provider_options = {
    gemini = {
        optional = {
            generationConfig = {
                maxOutputTokens = 256,
            },
            safetySettings = {
                {
                    -- Other categories include:
                    -- HARM_CATEGORY_HATE_SPEECH,
                    -- HARM_CATEGORY_HARASSMENT,
                    -- HARM_CATEGORY_SEXUALLY_EXPLICIT
                    category = 'HARM_CATEGORY_DANGEROUS_CONTENT',
                    -- Other thresholds include BLOCK_NONE
                    threshold = 'BLOCK_ONLY_HIGH',
                },
            },
        },
    },
}
Gemini appears to perform better with an alternative input structure, unlike other chat-based LLMs. This observation is currently experimental and requires further validation. For details on the experimental prompt setup currently in use by the maintainer, please refer to the prompt documentation.
Use any provider compatible with OpenAI's chat completion API.
For example, you can set the end_point to http://localhost:11434/v1/chat/completions to use ollama.
Note that not all OpenAI-compatible services have streaming support; you should set stream = false to disable streaming in case your service does not support it.
The following config is the default (using Groq as an example):
provider_options = {
    openai_compatible = {
        model = 'llama-3.3-70b-versatile',
        system = "see [System Prompt] section for the default value",
        few_shots = "see [System Prompt] section for the default value",
        chat_input = "See [Prompt Section for default value]",
        end_point = 'https://api.groq.com/openai/v1/chat/completions',
        api_key = 'GROQ_API_KEY',
        name = 'Groq',
        stream = true,
        optional = {
            stop = nil,
            max_tokens = nil,
        },
    },
}
Use any provider compatible with OpenAI's completion API. This request uses the text completion API, not chat completion, so system prompts and few-shot examples are not applicable.
For example, you can set the end_point to http://localhost:11434/v1/completions to use ollama.
Refer to the Completions Legacy section of the OpenAI documentation for details.
Please note that not all OpenAI-compatible services support streaming. If your service does not support streaming, you should set stream = false to disable it.
Additionally, for Ollama users, it is essential to verify whether the model's template supports FIM completion. For example, qwen2.5-coder's template is supported. It may come as a surprise to some users, however, that deepseek-coder does not support the FIM template; you should use deepseek-coder-v2 instead.
provider_options = {
    openai_fim_compatible = {
        model = 'deepseek-chat',
        end_point = 'https://api.deepseek.com/beta/completions',
        api_key = 'DEEPSEEK_API_KEY',
        name = 'Deepseek',
        stream = true,
        template = {
            prompt = "See [Prompt Section for default value]",
            suffix = "See [Prompt Section for default value]",
        },
        optional = {
            stop = nil,
            max_tokens = nil,
        },
    },
}
The following configuration is not the default, but is recommended to prevent request timeouts caused by the model generating too many tokens.
provider_options = {
    openai_fim_compatible = {
        optional = {
            max_tokens = 256,
            stop = { '\n\n' },
        },
    },
}
Currently, only text-completion models on Huggingface are supported, so the system prompt and few-shot examples do not apply.
provider_options = {
    huggingface = {
        end_point = 'https://api-inference.huggingface.co/models/bigcode/starcoder2-3b',
        type = 'completion',
        strategies = {
            completion = {
                markers = {
                    prefix = '<fim_prefix>',
                    suffix = '<fim_suffix>',
                    middle = '<fim_middle>',
                },
                strategy = 'PSM', -- PSM, SPM or PM
            },
        },
        optional = {
            parameters = {
                -- The parameter specifications for different LLMs may vary.
                -- Ensure you specify the parameters after reading the API
                -- documentation.
                stop = nil,
                max_tokens = nil,
                do_sample = nil,
            },
        },
    },
}
The change_provider command allows you to change the provider after Minuet has been set up.
Example usage: Minuet change_provider claude
The change_model command allows you to change both the provider and the model in one command. The format is provider:model.
Example usage: Minuet change_model gemini:gemini-1.5-pro-latest
Note: For the openai_compatible and openai_fim_compatible providers, the model completions offered in the cmdline are determined by the name field in your configuration. For example, if you configured:
provider_options.openai_compatible.name = 'Fireworks'
then entering Minuet change_model openai_compatible: in the cmdline will show model completions specific to the Fireworks provider.
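If you switch providers or models often, you can bind these user commands to keymaps. A minimal sketch (the keys and the chosen provider/model strings are arbitrary examples, not defaults):
vim.keymap.set('n', '<leader>mp', '<cmd>Minuet change_provider gemini<cr>', { desc = 'Minuet: switch to Gemini' })
vim.keymap.set('n', '<leader>mm', '<cmd>Minuet change_model openai_compatible:llama-3.3-70b-versatile<cr>', { desc = 'Minuet: switch model' })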
Enable or disable autocompletion for nvim-cmp or blink.cmp. While Minuet must be added to your cmp/blink sources, this command only controls whether Minuet is triggered during autocompletion. The command does not affect manual completion behavior; Minuet remains active and available when manually invoked.
Example usage: Minuet blink toggle, Minuet blink enable, Minuet blink disable
Enable or disable the automatic display of virtual-text completion in the current buffer.
Example usage: Minuet virtualtext toggle, Minuet virtualtext enable, Minuet virtualtext disable.
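Although the auto_trigger_ft option already covers this natively, you can also drive the command from an autocmd; a sketch (the filetypes are arbitrary examples):
vim.api.nvim_create_autocmd('FileType', {
    pattern = { 'python', 'lua' },
    -- enable automatic virtual-text completion only for these filetypes
    command = 'Minuet virtualtext enable',
})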
minuet-ai.nvim offers the following functions to customize your key mappings:
{
    -- accept whole completion
    require('minuet.virtualtext').action.accept,
    -- accept by line
    require('minuet.virtualtext').action.accept_line,
    -- accept n lines (prompts for number)
    require('minuet.virtualtext').action.accept_n_lines,
    require('minuet.virtualtext').action.next,
    require('minuet.virtualtext').action.prev,
    require('minuet.virtualtext').action.dismiss,
    -- whether the virtual text is visible in current buffer
    require('minuet.virtualtext').action.is_visible,
}
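For instance, a sketch that binds these actions in insert mode yourself (the keys mirror the defaults shown earlier and are only an illustration):
local action = require('minuet.virtualtext').action
vim.keymap.set('i', '<A-A>', action.accept, { desc = 'Minuet: accept completion' })
vim.keymap.set('i', '<A-a>', action.accept_line, { desc = 'Minuet: accept one line' })
vim.keymap.set('i', '<A-]>', action.next, { desc = 'Minuet: next suggestion' })
vim.keymap.set('i', '<A-[>', action.prev, { desc = 'Minuet: previous suggestion' })
vim.keymap.set('i', '<A-e>', action.dismiss, { desc = 'Minuet: dismiss' })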
You can configure the icons of minuet by using the following snippet (referenced from cmp's wiki):
local cmp = require('cmp')
cmp.setup {
    formatting = {
        format = function(entry, vim_item)
            -- Kind icons: `kind_icons` is your own table mapping kind names to
            -- icons (see the nvim-cmp wiki).
            vim_item.kind = string.format('%s %s', kind_icons[vim_item.kind], vim_item.kind) -- This concatenates the icons with the name of the item kind
            -- Source
            vim_item.menu = ({
                minuet = "󱗻"
            })[entry.source.name]
            return vim_item
        end
    },
}
When using Minuet with auto-complete enabled, you may occasionally experience a noticeable delay when pressing <CR> to move to the next line. This occurs because Minuet triggers autocompletion at the start of a new line, while cmp blocks the <CR> key, awaiting Minuet's response.
To address this issue, consider the following solutions:
- Unbind the <CR> key from your cmp keymap.
- Utilize cmp's internal API to avoid blocking calls, though be aware that this API may change without prior notice.
Here's an example of the second approach using Lua:
local cmp = require 'cmp'
opts.mapping = {
    ['<CR>'] = cmp.mapping(function(fallback)
        -- use the internal non-blocking call to check if cmp is visible
        if cmp.core.view:visible() then
            cmp.confirm { select = true }
        else
            fallback()
        end
    end),
}
With nvim-cmp:
{
    'milanglacier/minuet-ai.nvim',
    config = function()
        require('minuet').setup {
            -- Your configuration options here
        }
    end,
},
{
    'nvim-cmp',
    optional = true,
    opts = function(_, opts)
        -- if you wish to use autocomplete
        table.insert(opts.sources, 1, {
            name = 'minuet',
            group_index = 1,
            priority = 100,
        })
        opts.performance = {
            -- It is recommended to increase the timeout duration due to
            -- the typically slower response speed of LLMs compared to
            -- other completion sources. This is not needed when you only
            -- need manual completion.
            fetching_timeout = 2000,
        }
        opts.mapping = vim.tbl_deep_extend('force', opts.mapping or {}, {
            -- if you wish to use manual complete
            ['<A-y>'] = require('minuet').make_cmp_map(),
        })
    end,
}
With blink-cmp:
-- set the following line in your config/options.lua
vim.g.lazyvim_blink_main = true
{
    'milanglacier/minuet-ai.nvim',
    config = function()
        require('minuet').setup {
            -- Your configuration options here
        }
    end,
},
{
    'saghen/blink.cmp',
    optional = true,
    opts = {
        keymap = {
            ['<A-y>'] = {
                function(cmp)
                    cmp.show { providers = { 'minuet' } }
                end,
            },
        },
        sources = {
            -- if you want to use auto-complete
            default = { 'minuet' },
            providers = {
                minuet = {
                    name = 'minuet',
                    module = 'minuet.blink',
                    score_offset = 100,
                },
            },
        },
    },
}
If your setup fails, the two most likely reasons are:
- You are setting the API key to a literal value instead of the environment variable name.
- You are using a model or a context window that is too large, causing completion requests to time out before returning any tokens. This is particularly common with local LLMs.
It is recommended to start with the following settings to get a better understanding of your provider's inference speed:
- Begin by testing with manual completions.
- Use a smaller context window (e.g., config.context_window = 768).
- Use a smaller model.
- Set a longer request timeout (e.g., config.request_timeout = 5).
To diagnose issues, set config.notify = 'debug' and examine the output.
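Putting these recommendations together, a conservative starting point might look like this sketch (the provider line is a placeholder; use whichever provider you are debugging):
require('minuet').setup {
    provider = 'openai_compatible', -- placeholder: whichever provider you are testing
    context_window = 768,           -- smaller context window
    request_timeout = 5,            -- longer timeout
    notify = 'debug',               -- verbose notifications for diagnosis
}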
- Implement RAG on the codebase and encode the codebase information into the request to the LLM.
Contributions are welcome! Please feel free to submit a Pull Request.
- cmp-ai: Reference for the integration with nvim-cmp.
- continue.dev: not a Neovim plugin, but I found a lot of LLM models there.
- copilot.lua: Reference for the virtual text frontend.
Similar Open Source Tools

minuet-ai.nvim
Minuet AI is a Neovim plugin that integrates with nvim-cmp to provide AI-powered code completion using multiple AI providers such as OpenAI, Claude, Gemini, Codestral, and Huggingface. It offers customizable configuration options and streaming support for completion delivery. Users can manually invoke completion or use cost-effective models for auto-completion. The plugin requires API keys for supported AI providers and allows customization of system prompts. Minuet AI also supports changing providers, toggling auto-completion, and provides solutions for input delay issues. Integration with lazyvim is possible, and future plans include implementing RAG on the codebase and virtual text UI support.

model.nvim
model.nvim is a tool designed for Neovim users who want to utilize AI models for completions or chat within their text editor. It allows users to build prompts programmatically with Lua, customize prompts, experiment with multiple providers, and use both hosted and local models. The tool supports features like provider agnosticism, programmatic prompts in Lua, async and multistep prompts, streaming completions, and chat functionality in 'mchat' filetype buffer. Users can customize prompts, manage responses, and context, and utilize various providers like OpenAI ChatGPT, Google PaLM, llama.cpp, ollama, and more. The tool also supports treesitter highlights and folds for chat buffers.

gen.nvim
gen.nvim is a tool that allows users to generate text using Language Models (LLMs) with customizable prompts. It requires Ollama with models like `llama3`, `mistral`, or `zephyr`, along with Curl for installation. Users can use the `Gen` command to generate text based on predefined or custom prompts. The tool provides key maps for easy invocation and allows for follow-up questions during conversations. Additionally, users can select a model from a list of installed models and customize prompts as needed.

neocodeium
NeoCodeium is a free AI completion plugin powered by Codeium, designed for Neovim users. It aims to provide a smoother experience by eliminating flickering suggestions and allowing for repeatable completions using the `.` key. The plugin offers performance improvements through cache techniques, displays suggestion count labels, and supports Lua scripting. Users can customize keymaps, manage suggestions, and interact with the AI chat feature. NeoCodeium enhances code completion in Neovim, making it a valuable tool for developers seeking efficient coding assistance.

pg_vectorize
pg_vectorize is a Postgres extension that automates text to embeddings transformation, enabling vector search and LLM applications with minimal function calls. It integrates with popular LLMs, provides workflows for vector search and RAG, and automates Postgres triggers for updating embeddings. The tool is part of the VectorDB Stack on Tembo Cloud, offering high-level APIs for easy initialization and search.

dexter
Dexter is a set of mature LLM tools used in production at Dexa, with a focus on real-world RAG (Retrieval Augmented Generation). It is a production-quality RAG that is extremely fast and minimal, and handles caching, throttling, and batching for ingesting large datasets. It also supports optional hybrid search with SPLADE embeddings, and is a minimal TS package with full typing that uses `fetch` everywhere and supports Node.js 18+, Deno, Cloudflare Workers, Vercel edge functions, etc. Dexter has full docs and includes examples for basic usage, caching, Redis caching, AI function, AI runner, and chatbot.

parrot.nvim
Parrot.nvim is a Neovim plugin that prioritizes a seamless out-of-the-box experience for text generation. It simplifies functionality and focuses solely on text generation, excluding integration of DALLE and Whisper. It supports persistent conversations as markdown files, custom hooks for inline text editing, multiple providers like Anthropic API, perplexity.ai API, OpenAI API, Mistral API, and local/offline serving via ollama. It allows custom agent definitions, flexible API credential support, and repository-specific instructions with a `.parrot.md` file. It does not have autocompletion or hidden requests in the background to analyze files.

pinecone-ts-client
The official Node.js client for Pinecone, written in TypeScript. This client library provides a high-level interface for interacting with the Pinecone vector database service. With this client, you can create and manage indexes, upsert and query vector data, and perform other operations related to vector search and retrieval. The client is designed to be easy to use and provides a consistent and idiomatic experience for Node.js developers. It supports all the features and functionality of the Pinecone API, making it a comprehensive solution for building vector-powered applications in Node.js.

llm.nvim
llm.nvim is a plugin for Neovim that enables code completion using LLM models. It supports 'ghost-text' code completion similar to Copilot and allows users to choose their model for code generation via HTTP requests. The plugin interfaces with multiple backends like Hugging Face, Ollama, Open AI, and TGI, providing flexibility in model selection and configuration. Users can customize the behavior of suggestions, tokenization, and model parameters to enhance their coding experience. llm.nvim also includes commands for toggling auto-suggestions and manually requesting suggestions, making it a versatile tool for developers using Neovim.

APIMyLlama
APIMyLlama is a server application that provides an interface to interact with the Ollama API, a powerful AI tool to run LLMs. It allows users to easily distribute API keys to create amazing things. The tool offers commands to generate, list, remove, add, change, activate, deactivate, and manage API keys, as well as functionalities to work with webhooks, set rate limits, and get detailed information about API keys. Users can install APIMyLlama packages with NPM, PIP, Jitpack Repo+Gradle or Maven, or from the Crates Repository. The tool supports Node.JS, Python, Java, and Rust for generating responses from the API. Additionally, it provides built-in health checking commands for monitoring API health status.

auto-playwright
Auto Playwright is a tool that allows users to run Playwright tests using AI. It eliminates the need for selectors by determining actions at runtime based on plain-text instructions. Users can automate complex scenarios, write tests concurrently with or before functionality development, and benefit from rapid test creation. The tool supports various Playwright actions and offers additional options for debugging and customization. It uses HTML sanitization to reduce costs and improve text quality when interacting with the OpenAI API.

minja
Minja is a minimalistic C++ Jinja templating engine designed specifically for integration with C++ LLM projects, such as llama.cpp or gemma.cpp. It is not a general-purpose tool but focuses on providing a limited set of filters, tests, and language features tailored for chat templates. The library is header-only, requires C++17, and depends only on nlohmann::json. Minja aims to keep the codebase small, easy to understand, and offers decent performance compared to Python. Users should be cautious when using Minja due to potential security risks, and it is not intended for producing HTML or JavaScript output.

deepgram-js-sdk
Deepgram JavaScript SDK. Power your apps with world-class speech and Language AI models.

phidata
Phidata is a framework for building AI Assistants with memory, knowledge, and tools. It enables LLMs to have long-term conversations by storing chat history in a database, provides them with business context by storing information in a vector database, and enables them to take actions like pulling data from an API, sending emails, or querying a database. Memory and knowledge make LLMs smarter, while tools make them autonomous.

simpleAI
SimpleAI is a self-hosted alternative to the not-so-open AI API, focused on replicating main endpoints for LLM such as text completion, chat, edits, and embeddings. It allows quick experimentation with different models, creating benchmarks, and handling specific use cases without relying on external services. Users can integrate and declare models through gRPC, query endpoints using Swagger UI or API, and resolve common issues like CORS with FastAPI middleware. The project is open for contributions and welcomes PRs, issues, documentation, and more.

LLM.swift
LLM.swift is a simple and readable library that allows you to interact with large language models locally with ease for macOS, iOS, watchOS, tvOS, and visionOS. It's a lightweight abstraction layer over `llama.cpp` package, so that it stays as performant as possible while is always up to date. Theoretically, any model that works on `llama.cpp` should work with this library as well. It's only a single file library, so you can copy, study and modify the code however you want.
For similar tasks

minuet-ai.nvim
Minuet AI is a Neovim plugin that integrates with nvim-cmp to provide AI-powered code completion using multiple AI providers such as OpenAI, Claude, Gemini, Codestral, and Huggingface. It offers customizable configuration options and streaming support for completion delivery. Users can manually invoke completion or use cost-effective models for auto-completion. The plugin requires API keys for supported AI providers and allows customization of system prompts. Minuet AI also supports changing providers, toggling auto-completion, and provides solutions for input delay issues. Integration with lazyvim is possible, and future plans include implementing RAG on the codebase and virtual text UI support.

awesome-code-ai
A curated list of AI coding tools, including code completion, refactoring, and assistants. This list includes both open-source and commercial tools, as well as tools that are still in development. Some of the most popular AI coding tools include GitHub Copilot, CodiumAI, Codeium, Tabnine, and Replit Ghostwriter.

companion-vscode
Quack Companion is a VSCode extension that provides smart linting, code chat, and coding guideline curation for developers. It aims to enhance the coding experience by offering a new tab with features like curating software insights with the team, code chat similar to ChatGPT, smart linting, and upcoming code completion. The extension focuses on creating a smooth contribution experience for developers by turning contribution guidelines into a live pair coding experience, helping developers find starter contribution opportunities, and ensuring alignment between contribution goals and project priorities. Quack collects limited telemetry data to improve its services and products for developers, with options for anonymization and disabling telemetry available to users.

CodeGeeX4
CodeGeeX4-ALL-9B is an open-source multilingual code generation model based on GLM-4-9B, offering enhanced code generation capabilities. It supports functions like code completion, code interpreter, web search, function call, and repository-level code Q&A. The model has competitive performance on benchmarks like BigCodeBench and NaturalCodeBench, outperforming larger models in terms of speed and performance.

probsem
ProbSem is a repository that provides a framework to leverage large language models (LLMs) for assigning context-conditional probability distributions over queried strings. It supports OpenAI engines and HuggingFace CausalLM models, and is flexible for research applications in linguistics, cognitive science, program synthesis, and NLP. Users can define prompts, contexts, and queries to derive probability distributions over possible completions, enabling tasks like cloze completion, multiple-choice QA, semantic parsing, and code completion. The repository offers CLI and API interfaces for evaluation, with options to customize models, normalize scores, and adjust temperature for probability distributions.

are-copilots-local-yet
Current trends and state of the art for using open & local LLM models as copilots to complete code, generate projects, act as shell assistants, automatically fix bugs, and more. This document is a curated list of local Copilots, shell assistants, and related projects, intended to be a resource for those interested in a survey of the existing tools and to help developers discover the state of the art for projects like these.

chatgpt
The ChatGPT R package provides a set of features to assist in R coding. It includes addins like Ask ChatGPT, Comment selected code, Complete selected code, Create unit tests, Create variable name, Document code, Explain selected code, Find issues in the selected code, Optimize selected code, and Refactor selected code. Users can interact with ChatGPT to get code suggestions, explanations, and optimizations. The package helps in improving coding efficiency and quality by providing AI-powered assistance within the RStudio environment.

vim-ollama
The 'vim-ollama' plugin for Vim adds Copilot-like code completion support using Ollama as a backend, enabling intelligent AI-based code completion and integrated chat support for code reviews. It does not rely on cloud services, preserving user privacy. The plugin communicates with Ollama via Python scripts for code completion and interactive chat, supporting Vim only. Users can configure LLM models for code completion tasks and interactive conversations, with detailed installation and usage instructions provided in the README.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.