
llm.nvim
Free large language model (LLM) support for Neovim: provides commands to interact with LLMs (such as ChatGPT, ChatGLM, Kimi, DeepSeek, OpenRouter, and local LLMs). Also supports GitHub Models.
Stars: 255

llm.nvim is a universal plugin for large language models (LLMs), designed to let users interact with LLMs from within Neovim. Users can configure various LLMs such as GPT, GLM, Kimi, and local models. The plugin provides tools for optimizing code, comparing code, translating text, and more. It also supports free models from Cloudflare, GitHub Models, SiliconFlow, and other platforms. Users can define custom tools, chat with an LLM, quickly translate text, and explain code snippets. The plugin offers a flexible window interface for easy interaction and customization.
README:
[!IMPORTANT] A free large language model (LLM) plugin that allows you to interact with LLMs in Neovim.
- Supports any LLM, such as GPT, GLM, Kimi, DeepSeek, or local LLMs (such as ollama).
- Allows you to define your own AI tools, and different tools can use different models.
- Most importantly, you can use free models provided by any platform (such as Cloudflare, GitHub Models, SiliconFlow, OpenRouter, or other platforms).
[!NOTE] The configurations of different LLMs (such as ollama, deepseek), UI configurations, and AI tools (including code completion) should be checked in the examples first. There you will find most of the information you want to know. Additionally, before using the plugin, you should ensure that your `LLM_KEY` is valid and that the environment variable is in effect.
- Screenshots
- Installation
- Configuration
- Default Keyboard Shortcuts
- TODO List
- Author's configuration
- Acknowledgments
- Q&A
  - The format of curl usage in Windows is different from Linux, and the default request format of llm.nvim may cause issues under Windows.
  - Switching between multiple LLMs and frequently changing the value of LLM_KEY is troublesome, and I don't want to expose my key in Neovim's configuration file.
  - Priority of different parse/streaming functions
  - How can the AI-generated git commit message feature be integrated with lazygit?
  - Virtual text
  - blink.cmp or nvim-cmp
Feature notes:
- One-time interactions, with no history retained.
- You can configure `inline_assistant` to decide whether to display diffs (default: shown by pressing `d`).
- Results can be displayed side by side or in the form of a diff.

Dependencies:
- curl
- Register on the official website and obtain your API key (Cloudflare additionally requires an account).
- Set the `LLM_KEY` environment variable (for Cloudflare, also set an additional `ACCOUNT` variable) in your `zshrc` or `bashrc`:
export LLM_KEY=<Your API_KEY>
export ACCOUNT=<Your ACCOUNT> # just for cloudflare
Platform | Link to obtain an API key | Note |
---|---|---|
Cloudflare | https://dash.cloudflare.com/ | You can see all of Cloudflare's models here; the ones marked as beta are free models. |
ChatGLM (智谱清言) | https://open.bigmodel.cn/ | |
Kimi (月之暗面) | Moonshot AI Open Platform | |
GitHub Models | GitHub Token | |
SiliconFlow (硅基流动) | siliconflow | You can see all of SiliconFlow's models here; select "Only Free" to see all free models. |
DeepSeek | https://platform.deepseek.com/api_keys | |
OpenRouter | https://openrouter.ai/ | |
ChatAnywhere | https://api.chatanywhere.org/v1/oauth/free/render | 200 free calls to GPT-4o-mini are available every day. |
For local LLMs, set `LLM_KEY` to `NONE` in your `zshrc` or `bashrc`, e.g. `export LLM_KEY=NONE`.
- lazy.nvim
{
  "Kurama622/llm.nvim",
  dependencies = { "nvim-lua/plenary.nvim", "MunifTanjim/nui.nvim" },
  cmd = { "LLMSessionToggle", "LLMSelectedTextHandler", "LLMAppHandler" },
  keys = {
    { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
  },
}
Some commands you should know about:
- `LLMSessionToggle`: open/hide the Chat UI.
- `LLMSelectedTextHandler`: handle the selected text; how it is processed depends on the prompt words you input.
- `LLMAppHandler`: call AI tools.
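For example, a few possible mappings for these commands (a sketch: the prompt text passed to `LLMSelectedTextHandler` and the tool name passed to `LLMAppHandler` are illustrative and must match your own configuration):

```lua
-- Illustrative mappings; adjust keys, prompts, and tool names to your setup.
vim.keymap.set("n", "<leader>ac", "<cmd>LLMSessionToggle<cr>", { desc = "llm: toggle chat UI" })
vim.keymap.set("v", "<leader>ae", "<cmd>LLMSelectedTextHandler Explain the following code<cr>", { desc = "llm: handle selection" })
vim.keymap.set("n", "<leader>at", "<cmd>LLMAppHandler Translate<cr>", { desc = "llm: run the Translate tool" })
```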
If the URL is not configured, the default is to use Cloudflare.
For more details or examples, please refer to Chat Configuration.
Click here to see the meanings of some configuration options

- `prompt`: Model prompt.
- `prefix`: Dialog role indicator.
- `style`: Style of the Chat UI (`float` means a floating window; other values are split windows).
- `url`: Model API URL.
- `model`: Model name.
- `api_type`: The parsing format of the model output: `openai`, `zhipu`, `ollama`, `workers-ai`. The `openai` format is compatible with most models, but `ChatGLM` can only be parsed using the `zhipu` format, and Cloudflare can only be parsed using the `workers-ai` format. If you use ollama to run the model, you can use `ollama`.
- `fetch_key`: If you need to use models from different platforms simultaneously, you can configure `fetch_key` to ensure that different models use different API keys. Usage: `fetch_key = function() return "<your api key>" end`
- `max_tokens`: Maximum output length of the model.
- `save_session`: Whether to save session history.
- `max_history`: Maximum number of saved sessions.
- `history_path`: Path for saving session history.
- `temperature`: The temperature of the model, controlling the randomness of the model's output.
- `top_p`: The top_p of the model, controlling the randomness of the model's output.
- `spinner`: The waiting animation shown while the model generates output (effective for non-streaming output).
- `display`
  - `diff`: Display style of diffs (effective when optimizing code and showing a diff; the style in the screenshot is mini_diff, which requires installation of mini.diff).
- `keys`: Shortcut key settings for different windows; default values can be found in Default Shortcuts.
  - Floating style
    - Input window
      - `Input:Cancel`: Cancel the dialog response.
      - `Input:Submit`: Submit your question.
      - `Input:Resend`: Re-respond to the dialog.
      - `Input:HistoryNext`: Select the next session history.
      - `Input:HistoryPrev`: Select the previous session history.
    - Chat UI
      - `Session:Toggle`: Open/hide the Chat UI.
      - `Session:Close`: Close the Chat UI.
  - Split style
    - Output window
      - `Output:Ask`: Open the input window.
      - `Output:Cancel`: Cancel the dialog response.
      - `Output:Resend`: Re-respond to the dialog.
      - `Session:History`: Open the session history.
    - Chat UI
      - `Session:Toggle`: Open/hide the Chat UI.
      - `Session:Close`: Close the Chat UI.
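Putting a few of the options above together, a minimal setup might look like this (a sketch with illustrative values; the GLM endpoint and model name come from the examples later in this README):

```lua
require("llm").setup({
  url = "https://open.bigmodel.cn/api/paas/v4/chat/completions",
  model = "glm-4-flash",
  api_type = "zhipu",
  max_tokens = 4096,
  temperature = 0.3,
  top_p = 0.7,
  style = "float",               -- "float" is a floating window; other values are split windows
  save_session = true,
  max_history = 15,
  history_path = "/tmp/llm-history",  -- illustrative path
  prompt = "You are a helpful assistant.",
})
```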
If you use a local LLM (one that is not running on ollama), you may need to define a `streaming_handler` (required) and a `parse_handler` (optional, used by only a few AI tools); for details, see Local LLM Configuration.
If you want to further configure the style of the conversation interface, you can configure `chat_ui_opts` and `popwin_opts` separately.
Click here to see how to configure the window style
Their configuration options are the same:
- `relative`:
  - `editor`: The floating window is positioned relative to the current editor window.
  - `cursor`: The floating window is positioned relative to the current cursor position.
  - `win`: The floating window is positioned relative to the current window.
- `position`: The position of the window.
- `size`: The size of the window.
- `enter`: Whether the window automatically gains focus.
- `focusable`: Whether the window can gain focus.
- `zindex`: The layer of the window.
- `border`
  - `style`: The style of the window border.
  - `text`: The text of the window border.
- `win_options`: The options of the window.
  - `winblend`: The transparency of the window.
  - `winhighlight`: The highlight groups of the window.
More information can be found in nui/popup.
For more details or examples, please refer to UI Configuration.
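As a rough illustration of the options listed above (a sketch only: the exact nesting of `chat_ui_opts`/`popwin_opts` may differ, so treat this as a starting point and check the UI Configuration examples; all values are illustrative):

```lua
require("llm").setup({
  chat_ui_opts = {
    relative = "editor",
    position = { row = "20%", col = "50%" },
    size = { width = "70%", height = "60%" },
    enter = true,
    focusable = true,
    zindex = 50,
    border = { style = "rounded", text = { top = " Chat " } },
    win_options = {
      winblend = 10,
      winhighlight = "Normal:Normal,FloatBorder:FloatBorder",
    },
  },
  popwin_opts = {
    relative = "cursor",
    size = { width = "60%", height = 15 },
    win_options = { winblend = 0 },
  },
})
```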
Currently, llm.nvim provides some templates for AI tools, making it convenient for everyone to customize their own AI tools.

All AI tools need to be defined in `app_handler`, presented as key-value pairs (the key is the tool name and the value is the tool's configuration).

For more details or examples, please refer to AI Tools Configuration.
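A minimal skeleton of this key-value structure (the tool name and options here mirror the `CodeExplain` example further down in this README):

```lua
local tools = require("llm.common.tools")

require("llm").setup({
  app_handler = {
    -- "CodeExplain" is the key (the tool name); the table is the tool's configuration.
    CodeExplain = {
      handler = tools.flexi_handler,
      prompt = "Explain the following code, please only return the explanation",
      opts = { enter_flexible_window = true },
    },
  },
})
```

The tool can then be invoked with `:LLMAppHandler CodeExplain`.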
Click here to see how to configure AI tools

For all AI tools, the configuration options are similar:

- `handler`: Which template to use
  - `side_by_side_handler`: Display results in two windows, side by side.
  - `action_handler`: Display results in the source file in the form of a diff.
    - `Y`/`y`: Accept the LLM's suggested code.
    - `N`/`n`: Reject the LLM's suggested code.
    - `<ESC>`: Exit directly.
    - `I`/`i`: Input additional optimization conditions.
    - `<C-r>`: Optimize again directly.
  - `qa_handler`: AI for single-round dialogue.
  - `flexi_handler`: Results are displayed in a flexible window (the window size is automatically calculated from the amount of output text).
  - You can also use your own custom function.
- `prompt`: Prompt words for the AI tool.
- `opts`
  - `spell`: Whether to enable spell checking.
  - `number`: Whether to display line numbers.
  - `wrap`: Whether to automatically wrap lines.
  - `linebreak`: Whether to allow line breaks in the middle of words.
  - `url`, `model`: The LLM used by this AI tool.
  - `api_type`: The parsing type of this AI tool's output.
  - `streaming_handler`: This AI tool uses a custom streaming parsing function.
  - `parse_handler`: This AI tool uses a custom parsing function.
  - `border`: Floating window border style.
  - `accept`
    - `mapping`: The key mapping for accepting the output.
      - `mode`: Vim mode (default: `n`).
      - `keys`: Your key mappings (default: `Y`/`y`).
    - `action`: The action executed when accepting the output (default: copy the output).
  - `reject`
    - `mapping`: The key mapping for rejecting the output.
      - `mode`: Vim mode (default: `n`).
      - `keys`: Your key mappings (default: `N`/`n`).
    - `action`: The action executed when rejecting the output (default: none, or close the window).
  - `close`
    - `mapping`: The key mapping for closing the AI tool.
      - `mode`: Vim mode (default: `n`).
      - `keys`: Your key mappings (default: `<ESC>`).
    - `action`: The action executed when closing the AI tool (default: reject all output and close the window).

Different templates also have some exclusive configuration options of their own.
- In the `opts` of `qa_handler`, you can also define:
  - `component_width`: The width of the component.
  - `component_height`: The height of the component.
  - `query`
    - `title`: The title of the component, displayed centered above the component.
    - `hl`: The highlight of the title.
  - `input_box_opts`: The window options for the input box (`size`, `win_options`).
  - `preview_box_opts`: The window options for the preview box (`size`, `win_options`).

- In the `opts` of `action_handler`, you can also define:
  - `language`: The language used for the output result (`English`/`Chinese`/`Japanese`, etc.).
  - `input`
    - `relative`: The relative position of the split window (`editor`/`win`).
    - `position`: The position of the split window (`top`/`left`/`right`/`bottom`).
    - `size`: The proportion of the split window (default: 25%).
    - `enter`: Whether to automatically enter the window.
  - `output`
    - `relative`: Same as `input`.
    - `position`: Same as `input`.
    - `size`: Same as `input`.
    - `enter`: Same as `input`.
- In the `opts` of `side_by_side_handler`, you can also define:
  - `left`: The left window.
    - `title`: The title of the window.
    - `focusable`: Whether the window can gain focus.
    - `border`
    - `win_options`
  - `right`: The right window.
    - `title`: The title of the window.
    - `focusable`: Whether the window can gain focus.
    - `border`
    - `win_options`

- In the `opts` of `flexi_handler`, you can also define:
  - `exit_on_move`: Whether to close the flexible window when the cursor moves.
  - `enter_flexible_window`: Whether to automatically enter the window when the flexible window pops up.
  - `apply_visual_selection`: Whether to append the selected text after the `prompt`.
Some of my AI tool configurations:
{
  "Kurama622/llm.nvim",
  dependencies = { "nvim-lua/plenary.nvim", "MunifTanjim/nui.nvim" },
  cmd = { "LLMSessionToggle", "LLMSelectedTextHandler", "LLMAppHandler" },
  config = function()
    local tools = require("llm.common.tools")
    require("llm").setup({
      app_handler = {
        OptimizeCode = {
          handler = tools.side_by_side_handler,
          -- opts = {
          --   streaming_handler = local_llm_streaming_handler,
          -- },
        },
        TestCode = {
          handler = tools.side_by_side_handler,
          prompt = [[ Write some test cases for the following code, only return the test cases.
          Give the code content directly, do not use code blocks or other tags to wrap it. ]],
          opts = {
            right = {
              title = " Test Cases ",
            },
          },
        },
        OptimCompare = {
          handler = tools.action_handler,
          opts = {
            fetch_key = function()
              return vim.env.GITHUB_TOKEN
            end,
            url = "https://models.inference.ai.azure.com/chat/completions",
            model = "gpt-4o",
            api_type = "openai",
          },
        },
        Translate = {
          handler = tools.qa_handler,
          opts = {
            fetch_key = function()
              return vim.env.GLM_KEY
            end,
            url = "https://open.bigmodel.cn/api/paas/v4/chat/completions",
            model = "glm-4-flash",
            api_type = "zhipu",
            component_width = "60%",
            component_height = "50%",
            query = {
              title = " Trans ",
              hl = { link = "Define" },
            },
            input_box_opts = {
              size = "15%",
              win_options = {
                winhighlight = "Normal:Normal,FloatBorder:FloatBorder",
              },
            },
            preview_box_opts = {
              size = "85%",
              win_options = {
                winhighlight = "Normal:Normal,FloatBorder:FloatBorder",
              },
            },
          },
        },
        -- check siliconflow's balance
        UserInfo = {
          handler = function()
            local key = os.getenv("LLM_KEY")
            local res = tools.curl_request_handler(
              "https://api.siliconflow.cn/v1/user/info",
              { "GET", "-H", string.format("'Authorization: Bearer %s'", key) }
            )
            if res ~= nil then
              print("balance: " .. res.data.balance)
            end
          end,
        },
        WordTranslate = {
          handler = tools.flexi_handler,
          prompt = "Translate the following text to Chinese, please only return the translation",
          opts = {
            fetch_key = function()
              return vim.env.GLM_KEY
            end,
            url = "https://open.bigmodel.cn/api/paas/v4/chat/completions",
            model = "glm-4-flash",
            api_type = "zhipu",
            args = [[return {url, "-N", "-X", "POST", "-H", "Content-Type: application/json", "-H", authorization, "-d", vim.fn.json_encode(body)}]],
            exit_on_move = true,
            enter_flexible_window = false,
          },
        },
        CodeExplain = {
          handler = tools.flexi_handler,
          prompt = "Explain the following code, please only return the explanation, and answer in Chinese",
          opts = {
            fetch_key = function()
              return vim.env.GLM_KEY
            end,
            url = "https://open.bigmodel.cn/api/paas/v4/chat/completions",
            model = "glm-4-flash",
            api_type = "zhipu",
            enter_flexible_window = true,
          },
        },
        CommitMsg = {
          handler = tools.flexi_handler,
          prompt = function()
            -- Source: https://andrewian.dev/blog/ai-git-commits
            return string.format([[You are an expert at following the Conventional Commit specification. Given the git diff listed below, please generate a commit message for me:
1. First line: conventional commit format (type: concise description) (remember to use semantic types like feat, fix, docs, style, refactor, perf, test, chore, etc.)
2. Optional bullet points if more context helps:
- Keep the second line blank
- Keep them short and direct
- Focus on what changed
- Always be terse
- Don't overly explain
- Drop any fluffy or formal language
Return ONLY the commit message - no introduction, no explanation, no quotes around it.
Examples:
feat: add user auth system
- Add JWT tokens for API auth
- Handle token refresh for long sessions
fix: resolve memory leak in worker pool
- Clean up idle connections
- Add timeout for stale workers
Simple change example:
fix: typo in README.md
Very important: Do not respond with any of the examples. Your message must be based off the diff that is about to be provided, with a little bit of styling informed by the recent commits you're about to see.
Based on this format, generate appropriate commit messages. Respond with message only. DO NOT format the message in Markdown code blocks, DO NOT use backticks:
```diff
%s
```
]],
              vim.fn.system("git diff --no-ext-diff --staged")
            )
          end,
          opts = {
            fetch_key = function()
              return vim.env.GLM_KEY
            end,
            url = "https://open.bigmodel.cn/api/paas/v4/chat/completions",
            model = "glm-4-flash",
            api_type = "zhipu",
            enter_flexible_window = true,
            apply_visual_selection = false,
          },
        },
      },
    })
  end,
  keys = {
    { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
    { "<leader>ts", mode = "x", "<cmd>LLMAppHandler WordTranslate<cr>" },
    { "<leader>ae", mode = "v", "<cmd>LLMAppHandler CodeExplain<cr>" },
    { "<leader>at", mode = "n", "<cmd>LLMAppHandler Translate<cr>" },
    { "<leader>tc", mode = "x", "<cmd>LLMAppHandler TestCode<cr>" },
    { "<leader>ao", mode = "x", "<cmd>LLMAppHandler OptimCompare<cr>" },
    { "<leader>au", mode = "n", "<cmd>LLMAppHandler UserInfo<cr>" },
    { "<leader>ag", mode = "n", "<cmd>LLMAppHandler CommitMsg<cr>" },
    -- { "<leader>ao", mode = "x", "<cmd>LLMAppHandler OptimizeCode<cr>" },
  },
},
Local LLMs require custom parsing functions: for streaming output, use a custom `streaming_handler`; for AI tools that return their output in one go, use a custom `parse_handler`.

Below is an example using `ollama` to run `llama3.2:1b`.
Expand the code.
local function local_llm_streaming_handler(chunk, ctx, F)
  if not chunk then
    return ctx.assistant_output
  end
  local tail = chunk:sub(-1, -1)
  if tail:sub(1, 1) ~= "}" then
    -- Incomplete JSON object: keep buffering until the closing brace arrives.
    ctx.line = ctx.line .. chunk
  else
    ctx.line = ctx.line .. chunk
    local status, data = pcall(vim.fn.json_decode, ctx.line)
    if not status or not data.message.content then
      return ctx.assistant_output
    end
    -- Append the decoded delta and write it to the output window.
    ctx.assistant_output = ctx.assistant_output .. data.message.content
    F.WriteContent(ctx.bufnr, ctx.winid, data.message.content)
    ctx.line = ""
  end
  return ctx.assistant_output
end

local function local_llm_parse_handler(chunk)
  local assistant_output = chunk.message.content
  return assistant_output
end

return {
  {
    "Kurama622/llm.nvim",
    dependencies = { "nvim-lua/plenary.nvim", "MunifTanjim/nui.nvim" },
    cmd = { "LLMSessionToggle", "LLMSelectedTextHandler" },
    config = function()
      local tools = require("llm.common.tools")
      require("llm").setup({
        url = "http://localhost:11434/api/chat", -- your url
        model = "llama3.2:1b",
        streaming_handler = local_llm_streaming_handler,
        app_handler = {
          WordTranslate = {
            handler = tools.flexi_handler,
            prompt = "Translate the following text to Chinese, please only return the translation",
            opts = {
              parse_handler = local_llm_parse_handler,
              exit_on_move = true,
              enter_flexible_window = false,
            },
          },
        },
      })
    end,
    keys = {
      { "<leader>ac", mode = "n", "<cmd>LLMSessionToggle<cr>" },
    },
  },
}
- floating window
window | key | mode | desc |
---|---|---|---|
Input | `ctrl+g` | `i` | Submit your question |
Input | `ctrl+c` | `i` | Cancel dialog response |
Input | `ctrl+r` | `i` | Re-respond to the dialog |
Input | `ctrl+j` | `i` | Select the next session history |
Input | `ctrl+k` | `i` | Select the previous session history |
Output+Input | `<leader>ac` | `n` | Toggle session |
Output+Input | `<esc>` | `n` | Close session |
You can use `<C-w><C-w>` to switch windows; if you find that inelegant, you can also set your own shortcut keys for window switching (this feature has no default shortcut key).
-- Switch from the output window to the input window.
["Focus:Input"] = { mode = "n", key = {"i", "<C-w>"} },
-- Switch from the input window to the output window.
["Focus:Output"] = { mode = { "n", "i" }, key = "<C-w>" },
- split window
window | key | mode | desc |
---|---|---|---|
Input | `<cr>` | `n` | Submit your question |
Output | `i` | `n` | Open the input box |
Output | `ctrl+c` | `n` | Cancel dialog response |
Output | `ctrl+r` | `n` | Re-respond to the dialog |
Output | `ctrl+h` | `n` | Open the history window |
Output+Input | `<leader>ac` | `n` | Toggle session |
Output+Input | `<esc>` | `n` | Close session |
History | `j` | `n` | Preview the next session history |
History | `k` | `n` | Preview the previous session history |
History | `<cr>` | `n` | Enter the selected session |
History | `<esc>` | `n` | Close the history window |
We would like to express our heartfelt gratitude to the contributors of the following open-source projects, whose code has provided invaluable inspiration and reference for the development of llm.nvim:
- olimorris/codecompanion.nvim: Diff style and prompt.
- SmiteshP/nvim-navbuddy: UI.
- milanglacier/minuet-ai.nvim: Code completions.
The format of curl usage in Windows is different from Linux, and the default request format of llm.nvim may cause issues under Windows.
Use a custom request format
- Basic chat and some AI tools (using streaming output) with a customized request format: define the `args` parameter at the same level as `prompt`.

  --[[ custom request args ]]
  args = [[return {url, "-N", "-X", "POST", "-H", "Content-Type: application/json", "-H", authorization, "-d", vim.fn.json_encode(body)}]],

- AI tools (using non-streaming output) with a custom request format: define `args` in `opts`.

  WordTranslate = {
    handler = tools.flexi_handler,
    prompt = "Translate the following text to Chinese, please only return the translation",
    opts = {
      fetch_key = function()
        return vim.env.GLM_KEY
      end,
      url = "https://open.bigmodel.cn/api/paas/v4/chat/completions",
      model = "glm-4-flash",
      api_type = "zhipu",
      args = [[return {url, "-N", "-X", "POST", "-H", "Content-Type: application/json", "-H", authorization, "-d", vim.fn.json_encode(body)}]],
      exit_on_move = true,
      enter_flexible_window = false,
    },
  },
[!NOTE] You need to modify the args according to your actual situation.
Switching between multiple LLMs and frequently changing the value of LLM_KEY is troublesome, and I don't want to expose my key in Neovim's configuration file.
- Create a `.env` file specifically to store your various keys. Note: do not upload this file to GitHub.
export GITHUB_TOKEN=xxxxxxx
export DEEPSEEK_TOKEN=xxxxxxx
export SILICONFLOW_TOKEN=xxxxxxx
- Load the `.env` file in your `zshrc` or `bashrc`:

  source ~/.config/zsh/.env
  # Default to using the LLM provided by GitHub Models.
  export LLM_KEY=$GITHUB_TOKEN
- Finally, switch keys via `fetch_key`:

  fetch_key = function()
    return vim.env.DEEPSEEK_TOKEN
  end,
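For example, a sketch in which the main chat uses GitHub Models while a single tool uses DeepSeek (the GitHub Models URL and model are taken from the example earlier in this README; the DeepSeek endpoint and model name are assumptions, so check the platform's documentation):

```lua
local tools = require("llm.common.tools")

require("llm").setup({
  -- main chat: GitHub Models
  url = "https://models.inference.ai.azure.com/chat/completions",
  model = "gpt-4o",
  api_type = "openai",
  fetch_key = function()
    return vim.env.GITHUB_TOKEN
  end,
  app_handler = {
    CodeExplain = {
      handler = tools.flexi_handler,
      prompt = "Explain the following code",
      opts = {
        -- this tool uses a different platform and a different key
        url = "https://api.deepseek.com/chat/completions", -- assumed endpoint
        model = "deepseek-chat",                           -- assumed model name
        api_type = "openai",
        fetch_key = function()
          return vim.env.DEEPSEEK_TOKEN
        end,
      },
    },
  },
})
```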
Priority of different parse/streaming functions

The AI tool configuration's `streaming_handler` or `parse_handler` > the AI tool configuration's `api_type` > the main configuration's `streaming_handler` or `parse_handler` > the main configuration's `api_type`.
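In other words, a tool-level handler wins over everything else. A sketch of how this plays out (the tool name and handler here are hypothetical placeholders):

```lua
local tools = require("llm.common.tools")
-- Hypothetical stand-in for a real streaming parser.
local my_streaming_handler = function(chunk, ctx, F)
  return ctx.assistant_output
end

require("llm").setup({
  api_type = "zhipu",                       -- lowest priority: main api_type
  streaming_handler = my_streaming_handler, -- overrides the main api_type
  app_handler = {
    SomeTool = {                            -- hypothetical tool name
      handler = tools.qa_handler,
      opts = {
        api_type = "openai",                -- overrides both main-level settings for this tool
        -- a tool-level streaming_handler/parse_handler here would take precedence over everything above
      },
    },
  },
})
```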
How can the AI-generated git commit message feature be integrated with lazygit

{
"kdheepak/lazygit.nvim",
lazy = true,
cmd = {
"LazyGit",
"LazyGitConfig",
"LazyGitCurrentFile",
"LazyGitFilter",
"LazyGitFilterCurrentFile",
},
-- optional for floating window border decoration
dependencies = {
"nvim-lua/plenary.nvim",
},
config = function()
vim.keymap.set("t", "<C-c>", function()
vim.api.nvim_win_close(vim.api.nvim_get_current_win(), true)
vim.api.nvim_command("LLMAppHandler CommitMsg")
end, { desc = "AI Commit Msg" })
end,
}
Alternative AI tools for llm.nvim
Similar Open Source Tools


avante.nvim
avante.nvim is a Neovim plugin that emulates the behavior of the Cursor AI IDE, providing AI-driven code suggestions and enabling users to apply recommendations to their source files effortlessly. It offers AI-powered code assistance and one-click application of suggested changes, streamlining the editing process and saving time. The plugin is still in early development, with functionalities like setting API keys, querying AI about code, reviewing suggestions, and applying changes. Key bindings are available for various actions, and the roadmap includes enhancing AI interactions, stability improvements, and introducing new features for coding tasks.

ax
Ax is a Typescript library that allows users to build intelligent agents inspired by agentic workflows and the Stanford DSP paper. It seamlessly integrates with multiple Large Language Models (LLMs) and VectorDBs to create RAG pipelines or collaborative agents capable of solving complex problems. The library offers advanced features such as streaming validation, multi-modal DSP, and automatic prompt tuning using optimizers. Users can easily convert documents of any format to text, perform smart chunking, embedding, and querying, and ensure output validation while streaming. Ax is production-ready, written in Typescript, and has zero dependencies.

python-tgpt
Python-tgpt is a Python package that enables seamless interaction with over 45 free LLM providers without requiring an API key. It also provides image generation capabilities. The name _python-tgpt_ draws inspiration from its parent project tgpt, which operates on Golang. Through this Python adaptation, users can effortlessly engage with a number of free LLMs available, fostering a smoother AI interaction experience.

receipt-scanner
The receipt-scanner repository is an AI-Powered Receipt and Invoice Scanner for Laravel that allows users to easily extract structured receipt data from images, PDFs, and emails within their Laravel application using OpenAI. It provides a light wrapper around OpenAI Chat and Completion endpoints, supports various input formats, and integrates with Textract for OCR functionality. Users can install the package via composer, publish configuration files, and use it to extract data from plain text, PDFs, images, Word documents, and web content. The scanned receipt data is parsed into a DTO structure with main classes like Receipt, Merchant, and LineItem.

mistral-inference
Mistral Inference repository contains minimal code to run 7B, 8x7B, and 8x22B models. It provides model download links, installation instructions, and usage guidelines for running models via CLI or Python. The repository also includes information on guardrailing, model platforms, deployment, and references. Users can interact with models through commands like mistral-demo, mistral-chat, and mistral-common. Mistral AI models support function calling and chat interactions for tasks like testing models, chatting with models, and using Codestral as a coding assistant. The repository offers detailed documentation and links to blogs for further information.

langchainrb
Langchain.rb is a Ruby library that makes it easy to build LLM-powered applications. It provides a unified interface to a variety of LLMs, vector search databases, and other tools, making it easy to build and deploy RAG (Retrieval Augmented Generation) systems and assistants. Langchain.rb is open source and available under the MIT License.

nano-graphrag
nano-GraphRAG is a simple, easy-to-hack implementation of GraphRAG that provides a smaller, faster, and cleaner version of the official implementation. It is about 800 lines of code, small yet scalable, asynchronous, and fully typed. The tool supports incremental insert, async methods, and various parameters for customization. Users can replace storage components and LLM functions as needed. It also allows for embedding function replacement and comes with pre-defined prompts for entity extraction and community reports. However, some features like covariates and global search implementation differ from the original GraphRAG. Future versions aim to address issues related to data source ID, community description truncation, and add new components.

magentic
Easily integrate Large Language Models into your Python code. Simply use the `@prompt` and `@chatprompt` decorators to create functions that return structured output from the LLM. Mix LLM queries and function calling with regular Python code to create complex logic.

Webscout
WebScout is a versatile tool that allows users to search for anything using Google, DuckDuckGo, and phind.com. It contains AI models, can transcribe YouTube videos, generate temporary email and phone numbers, has TTS support, webai (terminal GPT and open interpreter), and offline LLMs. It also supports features like weather forecasting, YT video downloading, temp mail and number generation, text-to-speech, advanced web searches, and more.

ai00_server
AI00 RWKV Server is an inference API server for the RWKV language model based upon the web-rwkv inference engine. It supports VULKAN parallel and concurrent batched inference and can run on all GPUs that support VULKAN. No need for Nvidia cards!!! AMD cards and even integrated graphics can be accelerated!!! No need for bulky pytorch, CUDA and other runtime environments, it's compact and ready to use out of the box! Compatible with OpenAI's ChatGPT API interface. 100% open source and commercially usable, under the MIT license. If you are looking for a fast, efficient, and easy-to-use LLM API server, then AI00 RWKV Server is your best choice. It can be used for various tasks, including chatbots, text generation, translation, and Q&A.

monacopilot
Monacopilot is a powerful and customizable AI auto-completion plugin for the Monaco Editor. It supports multiple AI providers such as Anthropic, OpenAI, Groq, and Google, providing real-time code completions with an efficient caching system. The plugin offers context-aware suggestions, customizable completion behavior, and framework agnostic features. Users can also customize the model support and trigger completions manually. Monacopilot is designed to enhance coding productivity by providing accurate and contextually appropriate completions in daily spoken language.

deepgram-js-sdk
Deepgram JavaScript SDK. Power your apps with world-class speech and Language AI models.

nuxt-llms
Nuxt LLMs automatically generates llms.txt markdown documentation for Nuxt applications. It provides runtime hooks to collect data from various sources and generate structured documentation. The tool allows customization of sections directly from nuxt.config.ts and integrates with Nuxt modules via the runtime hooks system. It generates two documentation formats: llms.txt for concise structured documentation and llms_full.txt for detailed documentation. Users can extend documentation using hooks to add sections, links, and metadata. The tool is suitable for developers looking to automate documentation generation for their Nuxt applications.

client-python
The Mistral Python Client is a tool inspired by cohere-python that allows users to interact with the Mistral AI API. It provides functionalities to access and utilize the AI capabilities offered by Mistral. Users can easily install the client using pip and manage dependencies using poetry. The client includes examples demonstrating how to use the API for various tasks, such as chat interactions. To get started, users need to obtain a Mistral API Key and set it as an environment variable. Overall, the Mistral Python Client simplifies the integration of Mistral AI services into Python applications.
For similar tasks


gpt_academic
GPT Academic is a powerful tool that leverages the capabilities of large language models (LLMs) to enhance academic research and writing. It provides a user-friendly interface that allows researchers, students, and professionals to interact with LLMs and utilize their abilities for various academic tasks. With GPT Academic, users can access a wide range of features and functionalities, including: * **Summarization and Paraphrasing:** GPT Academic can summarize complex texts, articles, and research papers into concise and informative summaries. It can also paraphrase text to improve clarity and readability. * **Question Answering:** Users can ask GPT Academic questions related to their research or studies, and the tool will provide comprehensive and well-informed answers based on its knowledge and understanding of the relevant literature. * **Code Generation and Explanation:** GPT Academic can generate code snippets and provide explanations for complex coding concepts. It can also help debug code and suggest improvements. * **Translation:** GPT Academic supports translation of text between multiple languages, making it a valuable tool for researchers working with international collaborations or accessing resources in different languages. * **Citation and Reference Management:** GPT Academic can help users manage their citations and references by automatically generating citations in various formats and providing suggestions for relevant references based on the user's research topic. * **Collaboration and Note-Taking:** GPT Academic allows users to collaborate on projects and take notes within the tool. They can share their work with others and access a shared workspace for real-time collaboration. * **Customizable Interface:** GPT Academic offers a customizable interface that allows users to tailor the tool to their specific needs and preferences. They can choose from a variety of themes, adjust the layout, and add or remove features to create a personalized workspace. Overall, GPT Academic is a versatile and powerful tool that can significantly enhance the productivity and efficiency of academic research and writing. It empowers users to leverage the capabilities of LLMs and unlock new possibilities for academic exploration and knowledge creation.

shell-ask
Shell Ask is a command-line tool that enables users to interact with various language models through a simple interface. It supports multiple LLMs such as OpenAI, Anthropic, Ollama, and Google Gemini. Users can ask questions, provide context through command output, select models interactively, and define reusable AI commands. The tool allows piping the output of other programs for enhanced functionality. With AI command presets and configuration options, Shell Ask provides a versatile and efficient way to leverage language models for various tasks.

DevoxxGenieIDEAPlugin
Devoxx Genie is a Java-based IntelliJ IDEA plugin that integrates with local and cloud-based LLM providers to aid in reviewing, testing, and explaining project code. It supports features like code highlighting, chat conversations, and adding files/code snippets to context. Users can modify REST endpoints and LLM parameters in settings, including support for cloud-based LLMs. The plugin requires IntelliJ version 2023.3.4 and JDK 17. Building and publishing the plugin is done using Gradle tasks. Users can select an LLM provider, choose code, and use commands like review, explain, or generate unit tests for code analysis.

neural
Neural is a Vim and Neovim plugin that integrates various machine learning tools to assist users in writing code, generating text, and explaining code or paragraphs. It supports multiple machine learning models, focuses on privacy, and is compatible with Vim 8.0+ and Neovim 0.8+. Users can easily configure Neural to interact with third-party machine learning tools, such as OpenAI, to enhance code generation and completion. The plugin also provides commands like `:NeuralExplain` to explain code or text and `:NeuralStop` to stop Neural from working. Neural is maintained by the Dense Analysis team and comes with a disclaimer about sending input data to third-party servers for machine learning queries.

fittencode.nvim
Fitten Code AI Programming Assistant for Neovim provides fast completion using AI, asynchronous I/O, and support for various actions like document code, edit code, explain code, find bugs, generate unit test, implement features, optimize code, refactor code, start chat, and more. It offers features like accepting suggestions with Tab, accepting line with Ctrl + Down, accepting word with Ctrl + Right, undoing accepted text, automatic scrolling, and multiple HTTP/REST backends. It can run as a coc.nvim source or nvim-cmp source.

chatgpt
The ChatGPT R package provides a set of features to assist in R coding. It includes addins like Ask ChatGPT, Comment selected code, Complete selected code, Create unit tests, Create variable name, Document code, Explain selected code, Find issues in the selected code, Optimize selected code, and Refactor selected code. Users can interact with ChatGPT to get code suggestions, explanations, and optimizations. The package helps in improving coding efficiency and quality by providing AI-powered assistance within the RStudio environment.

scalene
Scalene is a high-performance CPU, GPU, and memory profiler for Python that provides detailed information and runs faster than many other profilers. It incorporates AI-powered proposed optimizations, allowing users to generate optimization suggestions by clicking on specific lines or regions of code. Scalene separates time spent in Python from native code, highlights hotspots, and identifies memory usage per line. It supports GPU profiling on NVIDIA-based systems and detects memory leaks. Users can generate reduced profiles, profile specific functions using decorators, and suspend/resume profiling for background processes. Scalene is available as a pip or conda package and works on various platforms. It offers features like profiling at the line level, memory trends, copy volume reporting, and leak detection.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.