VectorCode

A code repository indexing tool to supercharge your LLM experience.

Stars: 665

Visit

VectorCode is a code repository indexing tool that helps users write better prompts for coding LLMs by providing information about the code repository being worked on. It includes a neovim plugin and supports multiple embedding engines. The tool enhances completion results by providing project context and improves understanding of close-source or cutting edge projects.

README:

VectorCode

VectorCode is a code repository indexing tool. It helps you build better prompt for your coding LLMs by indexing and providing information about the code repository you're working on. This repository also contains the corresponding neovim plugin that provides a set of APIs for you to build or enhance AI plugins, and integrations for some of the popular plugins.

[!NOTE] This project is in beta quality and is undergoing rapid iterations. I know there are plenty of rooms for improvements, and any help is welcomed.

Why VectorCode?
Documentation
- About Versioning
TODOs
Credit
Star History

Why VectorCode?

LLMs usually have very limited understanding about close-source projects, projects that are not well-known, and cutting edge developments that have not made it into releases. Their capabilities on these projects are quite limited. With VectorCode, you can easily (and programmatically) inject task-relevant context from the project into the prompt. This significantly improves the quality of the model output and reduce hallucination.

Documentation

[!NOTE] The documentation on the main branch reflects the code on the latest commit. To check for the documentation for the version you're using, you can check out the corresponding tags.

For the setup and usage of the command-line tool, see the CLI documentation;
For neovim users, after you've gone through the CLI documentation, please refer to the neovim plugin documentation (and optionally the lua API reference) for further instructions.
Additional resources:
- the wiki for extra tricks and tips that will help you get the most out of VectorCode;
- the discussions where you can ask general questions and share your cool usages about VectorCode.
- If you're feeling adanvturous, feel free to check out the pull requests for WIP features.

If you're trying to contribute to this project, take a look at the contribution guide, which contains information about some basic guidelines that you should follow and tips that you may find helpful.

About Versioning

This project follows an adapted semantic versioning:

Until 1.0.0 is released, the major version number stays 0 which indicates that this project is still in early stage, and features/interfaces may change from time to time;
The minor version number indicates breaking changes. When I decide to remove a feature/config option, the actual removal will happen when I bump the minor version number. Therefore, if you want to avoid breaking a working setup, you may choose to use a version constraint like "vectorcode<0.7.0";
The patch version number indicates non-breaking changes. This can include new features and bug fixes. When I decide to deprecate things, I will make a new release with bumped patch version. Until the minor version number is bumped, the deprecated feature will still work but you'll see a warning. It's recommended to update your setup to adapt the new features.

TODOs

[x] query by ~~file path~~ excluded paths;
[x] chunking support;
- [x] add metadata for files;
- [x] chunk-size configuration;
- [x] smarter chunking (semantics/syntax based), implemented with py-tree-sitter and tree-sitter-language-pack;
- [x] configurable document selection from query results.
[x] ~~NeoVim Lua API with cache to skip the retrieval when a project has not been indexed~~ Returns empty array instead;
[x] job pool for async caching;
[x] persistent-client;
[ ] proper remote Chromadb support (with authentication, etc.);
[x] respect .gitignore;
[x] implement some sort of project-root anchors (such as .git or a custom .vectorcode.json) that enhances automatic project-root detection. Implemented project-level .vectorcode/ and .git as root anchor
[x] ability to view and delete files in a collection;
[x] joint search (kinda, using codecompanion.nvim/MCP);
[x] Nix support (unofficial packages here);
[ ] Query rewriting (#124).

Credit

@milanglacier (and minuet-ai.nvim) for the support when this project was still in early stage;
@olimorris for the help (personally and from codecompanion.nvim) when this project made initial attempts at tool-calling;
@ravitemer for the help to interface VectorCode with MCP;
The nix community (especially @sarahec and @GaetanLepage) for maintaining the nix packages.

Star History

For Tasks:

Click tags to check more tools for each tasks

index code repositories enhance completion results provide project context improve llm understanding support multiple embedding engines

For Jobs:

software developer machine learning engineer data scientist ai researcher web developer

Alternative AI tools for VectorCode

Similar Open Source Tools

VectorCode

github

: 665

coderunner

Coderunner is a versatile tool designed for running code snippets in various programming languages. It provides an interactive environment for testing and debugging code without the need for a full-fledged IDE. With support for multiple languages and quick execution times, Coderunner is ideal for beginners learning to code, experienced developers prototyping algorithms, educators creating coding exercises, interview candidates practicing coding challenges, and professionals testing small code snippets.

github

: 512

nvim-aider

Nvim-aider is a plugin for Neovim that provides additional functionality and key mappings to enhance the user's editing experience. It offers features such as code navigation, quick access to commonly used commands, and improved text manipulation tools. With Nvim-aider, users can streamline their workflow and increase productivity while working with Neovim.

github

: 86

ollama4j

Ollama4j is a Java library that serves as a wrapper or binding for the Ollama server. It allows users to communicate with the Ollama server and manage models for various deployment scenarios. The library provides APIs for interacting with Ollama, generating fake data, testing UI interactions, translating messages, and building web UIs. Users can easily integrate Ollama4j into their Java projects to leverage the functionalities offered by the Ollama server.

github

: 438

langtest

Langtest is a tool designed for testing and analyzing programming languages. It provides a platform for users to write code snippets in various languages and run them to see the output. The tool supports multiple programming languages and offers features like syntax highlighting, code execution, and result comparison. Users can use Langtest to quickly test code snippets, compare language syntax, and evaluate language performance. It is a useful tool for students, developers, and language enthusiasts to experiment with different programming languages in a convenient and efficient manner.

github

: 539

promptl

Promptl is a versatile command-line tool designed to streamline the process of creating and managing prompts for user input in various programming projects. It offers a simple and efficient way to prompt users for information, validate their input, and handle different scenarios based on their responses. With Promptl, developers can easily integrate interactive prompts into their scripts, applications, and automation workflows, enhancing user experience and improving overall usability. The tool provides a range of customization options and features, making it suitable for a wide range of use cases across different programming languages and environments.

github

: 71

SpecForge

SpecForge is a powerful tool for generating API specifications from code. It helps developers to easily create and maintain accurate API documentation by extracting information directly from the codebase. With SpecForge, users can streamline the process of documenting APIs, ensuring consistency and reducing manual effort. The tool supports various programming languages and frameworks, making it versatile and adaptable to different development environments. By automating the generation of API specifications, SpecForge enhances collaboration between developers and stakeholders, improving overall project efficiency and quality.

github

: 407

onlook

Onlook is a web scraping tool that allows users to extract data from websites easily and efficiently. It provides a user-friendly interface for creating web scraping scripts and supports various data formats for exporting the extracted data. With Onlook, users can automate the process of collecting information from multiple websites, saving time and effort. The tool is designed to be flexible and customizable, making it suitable for a wide range of web scraping tasks.

github

: 22.4k

verl-tool

The verl-tool is a versatile command-line utility designed to streamline various tasks related to version control and code management. It provides a simple yet powerful interface for managing branches, merging changes, resolving conflicts, and more. With verl-tool, users can easily track changes, collaborate with team members, and ensure code quality throughout the development process. Whether you are a beginner or an experienced developer, verl-tool offers a seamless experience for version control operations.

github

: 383

AI-Codereview-Gitlab

AI-Codereview-Gitlab is an automated code review tool based on large models, designed to help development teams conduct intelligent code reviews quickly during code merging or submission. It supports multiple large models including DeepSeek, ZhipuAI, OpenAI, and Ollama. The tool can automatically push review results to DingTalk, WeChat Work, and Feishu, generate daily reports based on GitLab commit records, and provide a visual dashboard to display code review records. The tool works by triggering webhook events on GitLab when users submit code, calling third-party large models to review the code, and recording the review results in corresponding Merge Requests or Commit Notes.

github

: 767

nanocoder

Nanocoder is a versatile code editor designed for beginners and experienced programmers alike. It provides a user-friendly interface with features such as syntax highlighting, code completion, and error checking. With Nanocoder, you can easily write and debug code in various programming languages, making it an ideal tool for learning, practicing, and developing software projects. Whether you are a student, hobbyist, or professional developer, Nanocoder offers a seamless coding experience to boost your productivity and creativity.

github

: 473

GraphLLM

GraphLLM is a graph-based framework designed to process data using LLMs. It offers a set of tools including a web scraper, PDF parser, YouTube subtitles downloader, Python sandbox, and TTS engine. The framework provides a GUI for building and debugging graphs with advanced features like loops, conditionals, parallel execution, streaming of results, hierarchical graphs, external tool integration, and dynamic scheduling. GraphLLM is a low-level framework that gives users full control over the raw prompt and output of models, with a steeper learning curve. It is tested with llama70b and qwen 32b, under heavy development with breaking changes expected.

github

: 209

evalica

Evalica is a powerful tool for evaluating code quality and performance in software projects. It provides detailed insights and metrics to help developers identify areas for improvement and optimize their code. With support for multiple programming languages and frameworks, Evalica offers a comprehensive solution for code analysis and optimization. Whether you are a beginner looking to learn best practices or an experienced developer aiming to enhance your code quality, Evalica is the perfect tool for you.

github

: 57

vivaria

Vivaria is a web application tool designed for running evaluations and conducting agent elicitation research. Users can interact with Vivaria using a web UI and a command-line interface. It allows users to start task environments based on METR Task Standard definitions, run AI agents, perform agent elicitation research, view API requests and responses, add tags and comments to runs, store results in a PostgreSQL database, sync data to Airtable, test prompts against LLMs, and authenticate using Auth0.

github

: 110

mcp-use

MCP-Use is a Python library for analyzing and processing text data using Markov Chains. It provides functionalities for generating text based on input data, calculating transition probabilities, and simulating text sequences. The library is designed to be user-friendly and efficient, making it suitable for natural language processing tasks.

github

: 7.5k

PythonAiRoad

PythonAiRoad is a repository containing classic original articles source code from the 'Algorithm Gourmet House'. It is a platform for sharing algorithms and code related to artificial intelligence. Users are encouraged to contact the author for further discussions or collaborations. The repository serves as a valuable resource for those interested in AI algorithms and implementations.

github

: 382

For similar tasks

VectorCode

github

: 665

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 980

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.9k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 32.1k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675