dir-assistant

Chat with your current directory's files using a local or API LLM.

dir-assistant lets you chat with the files in your current directory using a local or API LLM. It runs local models on CPU (OpenBLAS), CUDA, ROCm, Metal, Vulkan, and SYCL, and connects to all major LLM APIs through LiteLLM. Both local and API models are configurable, and default local models can be downloaded with a single command.

Summary

dir-assistant is a CLI Python application, available through pip, that recursively indexes all text files in the current working directory so you can chat with them using a local or API LLM. "Chatting" here means that file contents are automatically included in the prompts sent to the LLM, with the most contextually relevant files included first. dir-assistant is designed primarily for use as a coding aid and automation tool.

Features

Includes an interactive chat mode and a non-interactive single-prompt mode.
When enabled, automatically applies file updates and commits them to git.
Local platform support for CPU (OpenBLAS), CUDA, ROCm, Metal, Vulkan, and SYCL.
API support for all major LLM APIs. More info in the LiteLLM Docs.
Uses a unique method called CGRAG (Contextually Guided Retrieval-Augmented Generation) to find the most important files to include when submitting your prompt to an LLM. You can read this blog post for more information about how it works.

Table of Contents

New Features
    Notable Upstream News
Quickstart
General Usage Tips
    Optimized Settings for Coding Assistance
Install
Embedding Model Configuration
Optional: Select A Hardware Platform
API Configuration
    Connecting to a Custom API Server
Local LLM Model Download
Running
Upgrading
Additional Help
Contributors
Acknowledgements
Limitations
Todos
Additional Credits

New Features

llama-cpp-python is now an optional rather than a required dependency, installable with pip install dir-assistant[recommended].
Official Windows support. Note: The Python installer from python.org is recommended for Windows.
Custom API server connections using the new LiteLLM completion settings config section. This enables you to use your own GPU rig with dir-assistant. See Connecting to a Custom API Server.

Notable Upstream News

This section is dedicated to changes in libraries which can impact users of dir-assistant.

llama-cpp-python

KV cache quants now available for most models. This enables reduced memory consumption per context token.
Improved flash attention implementation for ROCm. This drastically reduces VRAM usage for large contexts on AMD cards.

When enabled, these changes allow a 32B model with 128k context to run comfortably on GPUs with at least 20GB of VRAM.

Quickstart

This section contains recipes for running dir-assistant in a basic capacity so you can get started quickly.

Quickstart Chat with API Model

To get started using an API model, you can use Google Gemini 2.0 Flash, which is currently free. To begin, you need to sign up for Google AI Studio and create an API key. After you create your API key, enter the following commands:

pip install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant

For Windows

Note: The Python.org installer is recommended for Windows. The Windows Store installer does not add dir-assistant to your PATH so you will need to call it with python -m dir_assistant if you decide to go that route.

pip install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant

For Ubuntu 24.04

On Ubuntu 24.04, pip no longer installs packages system-wide by default, so use pipx instead.

pipx install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant

Quickstart Chat with Local Default Model

To get started locally, you can download a default LLM model. With the default configuration, this model requires 3GB of memory on most hardware. You can adjust the configuration to fit higher or lower memory requirements. To run via CPU:

pip install dir-assistant[recommended]
dir-assistant models download-embed
dir-assistant models download-llm
cd directory/to/chat/with
dir-assistant

To run with hardware acceleration, use the platform subcommand:

...
dir-assistant platform cuda
cd directory/to/chat/with
dir-assistant

See which platforms are supported using -h:

dir-assistant platform -h

For Windows

Using dir-assistant directly with local LLMs is not recommended on Windows. llama-cpp-python requires a C compiler for installation via pip, and setting one up on Windows is not as simple as it is on other platforms. Instead, use another LLM server such as LMStudio and configure dir-assistant to use it as a custom API server. To do this, install dir-assistant without the recommended dependencies:

pip install dir-assistant

Then configure dir-assistant to connect to your custom LLM API server as described in Connecting to a Custom API Server.

For instructions on setting up LMStudio to host an API, follow their guide:

https://lmstudio.ai/docs/app/api

For Ubuntu 24.04

On Ubuntu 24.04, pip no longer installs packages system-wide by default, so use pipx instead and pass --pipx to the platform subcommand.

pipx install dir-assistant[recommended]
...
dir-assistant platform cuda --pipx

Quickstart Non-interactive Prompt with API Model

The non-interactive mode of dir-assistant allows you to create scripts which analyze your files without user interaction.

pip install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant -s "Describe the files in this directory"

For Ubuntu 24.04

On Ubuntu 24.04, pip no longer installs packages system-wide by default, so use pipx instead.

pipx install dir-assistant
dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx
cd directory/to/chat/with
dir-assistant -s "Describe the files in this directory"
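
As a minimal sketch of scripting the non-interactive mode, the Python helper below wraps the -s, -i, and -d arguments documented in this README. The function names build_command and run_prompt are hypothetical and not part of dir-assistant, and the sketch assumes the final answer is written to standard output.

```python
import shlex
import subprocess


def build_command(prompt, ignores=(), dirs=()):
    """Build the argument list for a single-prompt dir-assistant run."""
    cmd = ["dir-assistant", "-s", prompt]
    if ignores:
        cmd += ["-i", *ignores]  # space-separated filepaths to ignore
    if dirs:
        cmd += ["-d", *dirs]  # additional directories to work on
    return cmd


def run_prompt(prompt, cwd=".", **kwargs):
    """Run dir-assistant in `cwd` and return whatever it printed to stdout.

    Assumption: the final answer is written to standard output.
    """
    result = subprocess.run(
        build_command(prompt, **kwargs),
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only print the command here, so the sketch also works on machines
    # where dir-assistant is not installed.
    print(shlex.join(build_command("Describe the files in this directory")))
```

Keeping command construction separate from execution makes the wrapper easy to check without calling an LLM.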

Install

Install with pip:

pip install dir-assistant

You can also install llama-cpp-python as an optional dependency to enable dir-assistant to directly run local LLMs:

pip install dir-assistant[recommended]

Note: llama-cpp-python is not updated often, so it may not run the latest models or have the latest features of Llama.cpp. You may get better results by running a separate local LLM server and connecting it to dir-assistant using the custom API server feature.

The default configuration for dir-assistant is API-mode. If you download an LLM model with download-llm, local-mode will be set automatically. To switch modes manually, change the ACTIVE_MODEL_IS_LOCAL setting (true for local-mode, false for API-mode).

For Ubuntu 24.04

On Ubuntu 24.04, pip no longer installs packages system-wide by default, so use pipx instead.

pipx install dir-assistant

General Usage Tips

Dir-assistant is a powerful tool with many configuration options. This section provides some general tips for using dir-assistant to achieve the best results.

Optimized Settings for Coding Assistance

Thousands of models can be used with dir-assistant. As of writing, the best quality for complex coding tasks on large codebases has been achieved with voyage-code-3 and gemini-2.0-flash-thinking-exp. To use these models, open the config file with dir-assistant config open and adapt this optimized configuration to your needs:

Note: Don't forget to add your own API keys! Get them via Google AI Studio and Voyage AI.

[DIR_ASSISTANT]
SYSTEM_INSTRUCTIONS = "You are a helpful AI assistant tasked with assisting my coding. "
GLOBAL_IGNORES = [ ".gitignore", ".d", ".obj", ".sql", "js/vendors", ".tnn", ".env", "node_modules", ".min.js", ".min.css", "htmlcov", ".coveragerc", ".pytest_cache", ".egg-info", ".git/", ".vscode/", "node_modules/", "build/", ".idea/", "__pycache__", ]
CONTEXT_FILE_RATIO = 0.9
ACTIVE_MODEL_IS_LOCAL = false
ACTIVE_EMBED_IS_LOCAL = false
USE_CGRAG = true
PRINT_CGRAG = false
OUTPUT_ACCEPTANCE_RETRIES = 2
COMMIT_TO_GIT = true
VERBOSE = false
NO_COLOR = false
LITELLM_EMBED_REQUEST_DELAY = 0
LITELLM_MODEL_USES_SYSTEM_MESSAGE = true
LITELLM_PASS_THROUGH_CONTEXT_SIZE = false
LITELLM_CONTEXT_SIZE = 200000
LITELLM_EMBED_CONTEXT_SIZE = 4000
MODELS_PATH = "~/.local/share/dir-assistant/models/"
LLM_MODEL = "agentica-org_DeepScaleR-1.5B-Preview-Q4_K_M.gguf"
EMBED_MODEL = "nomic-embed-text-v1.5.Q4_K_M.gguf"

[DIR_ASSISTANT.LITELLM_API_KEYS]
GEMINI_API_KEY = "yourkeyhere"
VOYAGE_API_KEY = "yourkeyhere"

[DIR_ASSISTANT.LITELLM_COMPLETION_OPTIONS]
model = "gemini/gemini-2.0-flash-thinking-exp"
timeout = 600

[DIR_ASSISTANT.LITELLM_EMBED_COMPLETION_OPTIONS]
model = "voyage/voyage-code-3"
timeout = 600

[DIR_ASSISTANT.LLAMA_CPP_COMPLETION_OPTIONS]
frequency_penalty = 1.1
presence_penalty = 1.0

[DIR_ASSISTANT.LLAMA_CPP_OPTIONS]
n_ctx = 10000
verbose = false
n_gpu_layers = -1
rope_scaling_type = 2
rope_freq_scale = 0.75

[DIR_ASSISTANT.LLAMA_CPP_EMBED_OPTIONS]
n_ctx = 4000
n_batch = 512
verbose = false
rope_scaling_type = 2
rope_freq_scale = 0.75
n_gpu_layers = -1

Embedding Model Configuration

You must use an embedding model regardless of whether you are running an LLM via local or API mode, but you can also choose whether the embedding model is local or API using the ACTIVE_EMBED_IS_LOCAL setting. Generally local embedding will be faster, but API will be higher quality. To start, it is recommended to use a local model. You can download a good default embedding model with:

dir-assistant models download-embed

If you would like to use another embedding model, open the models directory with:

dir-assistant models

Note: The embedding model will be hardware accelerated after using the platform subcommand. To disable hardware acceleration, change n_gpu_layers = -1 to n_gpu_layers = 0 in the config.

Optional: Select A Hardware Platform

By default dir-assistant is installed with CPU-only compute support. It will work properly without this step, but if you would like to hardware accelerate dir-assistant, use the command below to compile llama-cpp-python with your hardware's support.

dir-assistant platform cuda

Available options: cpu, cuda, rocm, metal, vulkan, sycl

Note: After you select a platform, both the embedding model and the local LLM model run with hardware acceleration. To disable hardware acceleration, change n_gpu_layers = -1 to n_gpu_layers = 0 in the config.

For Ubuntu 24.04

On Ubuntu 24.04, where dir-assistant is installed with pipx rather than pip, add the --pipx flag:

dir-assistant platform cuda --pipx

For Platform Install Issues

System dependencies may be required for the platform command and are outside the scope of these instructions.

If you have any issues building llama-cpp-python, the project's install instructions may offer more info: https://github.com/abetlen/llama-cpp-python

API Configuration

If you wish to use an API LLM, you will need to configure it. To choose which LLM API dir-assistant uses, edit the model option in the LITELLM_COMPLETION_OPTIONS section and the appropriate API key in your configuration. To open your configuration file, enter:

dir-assistant config open

With the file open, change the following settings:

[DIR_ASSISTANT]
LITELLM_CONTEXT_SIZE = 200000

[DIR_ASSISTANT.LITELLM_API_KEYS]
GEMINI_API_KEY = "xxxxxxxxxxxxxxxxxxx"

[DIR_ASSISTANT.LITELLM_COMPLETION_OPTIONS]
model = "gemini/gemini-2.0-flash"

LiteLLM supports all major LLM APIs, including APIs hosted locally. View the available options in the LiteLLM providers list.

There is a convenience subcommand for modifying and adding API keys:

dir-assistant setkey GEMINI_API_KEY xxxxxYOURAPIKEYHERExxxxx

However, in most cases you will need to modify other options when changing APIs.

Connecting to a Custom API Server

dir-assistant can connect to a custom API server, such as your own ollama, llama.cpp, LMStudio, vLLM, or other OpenAI-compatible API server. To configure this, open the config with dir-assistant config open and make the following changes (LMStudio's base_url is shown as the example):

[DIR_ASSISTANT]
ACTIVE_MODEL_IS_LOCAL = false

[DIR_ASSISTANT.LITELLM_COMPLETION_OPTIONS]
model = "openai/mistral-small-24b-instruct-2501"
base_url = "http://localhost:1234/v1"

Local LLM Model Download

If you want to use a local LLM directly within dir-assistant using llama-cpp-python, you can download a low-requirements default model with:

dir-assistant models download-llm

Note: The local LLM model will be hardware accelerated after using the platform subcommand. To disable hardware acceleration, change n_gpu_layers = -1 to n_gpu_layers = 0 in the config.

Configuring A Custom Local Model

If you would like to use a custom local LLM model, download a GGUF model and place it in your models directory. Huggingface has numerous GGUF models to choose from. The models directory can be opened in a file browser using this command:

dir-assistant models

After putting your GGUF file in the models directory, you must configure dir-assistant to use it:

dir-assistant config open

Edit the following setting:

[DIR_ASSISTANT]
LLM_MODEL = "Mistral-Nemo-Instruct-2407.Q6_K.gguf"

Llama.cpp Options

Llama.cpp provides a large number of options to customize how your local model is run. Most of these options are exposed via llama-cpp-python. You can configure them with the [DIR_ASSISTANT.LLAMA_CPP_OPTIONS], [DIR_ASSISTANT.LLAMA_CPP_EMBED_OPTIONS], and [DIR_ASSISTANT.LLAMA_CPP_COMPLETION_OPTIONS] sections in the config file.

The options available for llama-cpp-python are documented in the Llama constructor documentation.

What the options do is also documented in the llama.cpp CLI documentation.

The most important llama-cpp-python options are related to tuning the LLM to your system's VRAM:

Setting n_ctx lower reduces the amount of VRAM required to run, but also reduces the amount of file text that can be included when running a prompt.
CONTEXT_FILE_RATIO sets the proportion of file text to prompt history in what is sent to the LLM. Higher ratios mean more file text and less prompt history. More file text generally improves comprehension.
If your LLM n_ctx times CONTEXT_FILE_RATIO is smaller than your embed n_ctx, file text chunks can be larger than the part of the LLM context reserved for file text, and such chunks will not be included. To ensure all files can be included, keep your embed n_ctx smaller than your LLM n_ctx times CONTEXT_FILE_RATIO.
A larger embed n_ctx chunks your files into larger pieces, which allows LLMs to understand them more easily.
n_batch must be smaller than the n_ctx of a model, but within that limit a higher value will probably improve performance.
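
The relationship between the LLM n_ctx, CONTEXT_FILE_RATIO, and the embed n_ctx can be checked with a little arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the file-text budget is simply n_ctx multiplied by CONTEXT_FILE_RATIO, as the tips above describe.

```python
def file_text_budget(llm_n_ctx, context_file_ratio):
    """Tokens of the LLM context reserved for file text (assumed: n_ctx * ratio)."""
    return int(llm_n_ctx * context_file_ratio)


def embed_chunks_fit(llm_n_ctx, context_file_ratio, embed_n_ctx):
    """True when the embed n_ctx is smaller than the file-text budget."""
    return embed_n_ctx < file_text_budget(llm_n_ctx, context_file_ratio)


if __name__ == "__main__":
    # Values from the optimized configuration in this README:
    # LLAMA_CPP_OPTIONS n_ctx = 10000, CONTEXT_FILE_RATIO = 0.9,
    # LLAMA_CPP_EMBED_OPTIONS n_ctx = 4000.
    print(file_text_budget(10000, 0.9))
    print(embed_chunks_fit(10000, 0.9, 4000))
```

With those values the budget is 9000 tokens, so 4000-token chunks fit. Lowering the LLM n_ctx to 4096 would shrink the budget to 3686 tokens, and full-size chunks would no longer fit.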

For other tips about tuning Llama.cpp, explore its documentation and search the web.

Running

dir-assistant

Running dir-assistant will scan all files recursively in your current directory. The most relevant files will automatically be sent to the LLM when you enter a prompt.
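
To give a rough idea of what "most relevant" can mean in embedding-based retrieval, here is a toy Python sketch that ranks file chunks by cosine similarity to a prompt. It is not dir-assistant's implementation and does not reproduce CGRAG. The vectors are hand-made stand-ins for embedding model output, and all names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(prompt_vec, chunk_vecs):
    """Return chunk names sorted from most to least similar to the prompt."""
    return sorted(
        chunk_vecs,
        key=lambda name: cosine_similarity(prompt_vec, chunk_vecs[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hand-made 2-D vectors; real embeddings have many more dimensions.
    chunks = {"a.py": [0.9, 0.1], "b.md": [0.0, 1.0], "c.txt": [0.5, 0.5]}
    print(rank_chunks([1.0, 0.0], chunks))
```

In a pipeline of this kind, the highest-ranked chunks would be added to the prompt first until the context budget is used up.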

dir-assistant is shorthand for dir-assistant start. All arguments below are applicable for both.

Options for Running

The following arguments are available while running dir-assistant:

-i --ignore: A list of space-separated filepaths to ignore
-d --dirs: A list of space-separated directories to work on (your current directory will always be used)
-s --single-prompt: Run a single prompt and output the final answer
-v --verbose: Show debug information during execution

Example usage:

# Run a single prompt and exit
dir-assistant -s "What does this codebase do?"

# Show debug information
dir-assistant -v

# Ignore specific files and add additional directories
dir-assistant -i ".log" ".tmp" -d "../other-project"

Automated file update and git commit

The COMMIT_TO_GIT feature allows dir-assistant to make changes directly to your files and commit the changes to git during the chat. By default, this feature is disabled, but after enabling it, the assistant will suggest file changes and ask whether to apply the changes. If confirmed, it stages the changes and creates a git commit with the prompt message as the commit message.

To enable the COMMIT_TO_GIT feature, update the configuration:

dir-assistant config open

Change or add the following setting:

[DIR_ASSISTANT]
...
COMMIT_TO_GIT = true

Once enabled, the assistant will handle the Git commit process as part of its workflow. To undo a commit, type undo in the prompt.

Additional directories

You can include files from outside your current directory in your dir-assistant session:

dir-assistant -d /path/to/dir1 ../dir2

Ignoring files

You can ignore files when starting up so they will not be included in the assistant's context:

dir-assistant -i file.txt file2.txt

There is also a global ignore list in the config file. To configure it, first open the config file:

dir-assistant config open

Then edit the setting:

[DIR_ASSISTANT]
...
GLOBAL_IGNORES = [
    ...
    "file.txt"
]

Overriding Configurations with Environment Variables

Any configuration setting can be overridden using environment variables. The environment variable name is the configuration key prefixed with DIR_ASSISTANT__, with nested section and option names joined by double underscores:

# Override the local LLM model file
export DIR_ASSISTANT__LLM_MODEL="mistral-7b-instruct.Q4_K_M.gguf"

# Enable git commits
export DIR_ASSISTANT__COMMIT_TO_GIT=true

# Change context ratio
export DIR_ASSISTANT__CONTEXT_FILE_RATIO=0.7

# Change llama.cpp embedding options
export DIR_ASSISTANT__LLAMA_CPP_EMBED_OPTIONS__n_ctx=2048

# Example setting multiple env vars inline with the command
DIR_ASSISTANT__COMMIT_TO_GIT=true DIR_ASSISTANT__CONTEXT_FILE_RATIO=0.7 dir-assistant

This allows multiple config profiles for your custom use cases.

# Run with different models
DIR_ASSISTANT__LLM_MODEL="model1.gguf" dir-assistant -s "What does this codebase do?"
DIR_ASSISTANT__LLM_MODEL="model2.gguf" dir-assistant -s "What does this codebase do?"

# Test with different context ratios
DIR_ASSISTANT__CONTEXT_FILE_RATIO=0.8 dir-assistant
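
The naming scheme in these examples (a DIR_ASSISTANT__ prefix, with double underscores separating nested sections) can be illustrated with a short Python sketch. This is not dir-assistant's own code: apply_env_overrides and parse_value are hypothetical helpers, and the type conversion rules are an assumption.

```python
import copy


def parse_value(raw):
    """Best-effort conversion of an environment string to bool, int, or float.

    Assumption: dir-assistant's real parsing rules may differ.
    """
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def apply_env_overrides(config, environ, prefix="DIR_ASSISTANT__"):
    """Return a copy of `config` with prefixed environment variables applied."""
    merged = copy.deepcopy(config)
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable such as PATH
        path = name[len(prefix):].split("__")
        node = merged
        for key in path[:-1]:
            node = node.setdefault(key, {})  # descend into nested sections
        node[path[-1]] = parse_value(raw)
    return merged


if __name__ == "__main__":
    base = {"COMMIT_TO_GIT": False, "LLAMA_CPP_EMBED_OPTIONS": {"n_ctx": 4000}}
    env = {
        "DIR_ASSISTANT__COMMIT_TO_GIT": "true",
        "DIR_ASSISTANT__LLAMA_CPP_EMBED_OPTIONS__n_ctx": "2048",
    }
    print(apply_env_overrides(base, env))
```

Passing the environment as a plain dictionary (for example os.environ) keeps the sketch easy to test and leaves the original configuration untouched.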

Upgrading

Some version upgrades may be incompatible with the existing embedding index cache. Use this command to delete the index cache so it can be regenerated:

dir-assistant clear

Additional Help

Use the -h argument with any command or subcommand to view more information. If your problem is beyond the scope of the help text, please open a GitHub issue.

Contributors

We appreciate contributions from the community! For a list of contributors and how you can contribute, please see CONTRIBUTORS.md.

Acknowledgements

Local LLMs are run via the fantastic llama-cpp-python package.
API LLMs are run using the also fantastic LiteLLM package.

Limitations

Dir-assistant only detects and reads text files at this time.

Todos

~~API LLMs~~
~~RAG~~
~~File caching (improve startup time)~~
~~CGRAG (Contextually-Guided Retrieval-Augmented Generation)~~
~~Multi-line input~~
~~File watching (automatically reindex changed files)~~
~~Single-step pip install~~
~~Model download~~
~~Commit to git~~
~~API Embedding models~~
~~Immediate mode for better compatibility with custom script automations~~
~~Support for custom APIs~~
Web search
Daemon mode for API-based use

Additional Credits

Special thanks to Blazed.deals for sponsoring this project.

For Tasks:

Click tags to check more tools for each tasks

organize files retrieve information search documents generate text configure models

For Jobs:

data scientist software engineer research scientist ai engineer data analyst

Alternative AI tools for dir-assistant

Similar Open Source Tools

dir-assistant

github

: 324

desktop

ComfyUI Desktop is a packaged desktop application that allows users to easily use ComfyUI with bundled features like ComfyUI source code, ComfyUI-Manager, and uv. It automatically installs necessary Python dependencies and updates with stable releases. The app comes with Electron, Chromium binaries, and node modules. Users can store ComfyUI files in a specified location and manage model paths. The tool requires Python 3.12+ and Visual Studio with Desktop C++ workload for Windows. It uses nvm to manage node versions and yarn as the package manager. Users can install ComfyUI and dependencies using comfy-cli, download uv, and build/launch the code. Troubleshooting steps include rebuilding modules and installing missing libraries. The tool supports debugging in VSCode and provides utility scripts for cleanup. Crash reports can be sent to help debug issues, but no personal data is included.

github

: 1.3k

mlx-lm

MLX LM is a Python package designed for generating text and fine-tuning large language models on Apple silicon using MLX. It offers integration with the Hugging Face Hub for easy access to thousands of LLMs, support for quantizing and uploading models to the Hub, low-rank and full model fine-tuning capabilities, and distributed inference and fine-tuning with `mx.distributed`. Users can interact with the package through command line options or the Python API, enabling tasks such as text generation, chatting with language models, model conversion, streaming generation, and sampling. MLX LM supports various Hugging Face models and provides tools for efficient scaling to long prompts and generations, including a rotating key-value cache and prompt caching. It requires macOS 15.0 or higher for optimal performance.

github

: 339

llm-ollama

LLM-ollama is a plugin that provides access to models running on an Ollama server. It allows users to query the Ollama server for a list of models, register them with LLM, and use them for prompting, chatting, and embedding. The plugin supports image attachments, embeddings, JSON schemas, async models, model aliases, and model options. Users can interact with Ollama models through the plugin in a seamless and efficient manner.

github

: 247

opencommit

OpenCommit is a tool that auto-generates meaningful commits using AI, allowing users to quickly create commit messages for their staged changes. It provides a CLI interface for easy usage and supports customization of commit descriptions, emojis, and AI models. Users can configure local and global settings, switch between different AI providers, and set up Git hooks for integration with IDE Source Control. Additionally, OpenCommit can be used as a GitHub Action to automatically improve commit messages on push events, ensuring all commits are meaningful and not generic. Payments for OpenAI API requests are handled by the user, with the tool storing API keys locally.

github

: 5.9k

reader

Reader is a tool that converts any URL to an LLM-friendly input with a simple prefix `https://r.jina.ai/`. It improves the output for your agent and RAG systems at no cost. Reader supports image reading, captioning all images at the specified URL and adding `Image [idx]: [caption]` as an alt tag. This enables downstream LLMs to interact with the images in reasoning, summarizing, etc. Reader offers a streaming mode, useful when the standard mode provides an incomplete result. In streaming mode, Reader waits a bit longer until the page is fully rendered, providing more complete information. Reader also supports a JSON mode, which contains three fields: `url`, `title`, and `content`. Reader is backed by Jina AI and licensed under Apache-2.0.

github

: 8.5k

vector-inference

This repository provides an easy-to-use solution for running inference servers on Slurm-managed computing clusters using vLLM. All scripts in this repository run natively on the Vector Institute cluster environment. Users can deploy models as Slurm jobs, check server status and performance metrics, and shut down models. The repository also supports launching custom models with specific configurations. Additionally, users can send inference requests and set up an SSH tunnel to run inference from a local device.

github

: 53

gpt-cli

gpt-cli is a command-line interface tool for interacting with various chat language models like ChatGPT, Claude, and others. It supports model customization, usage tracking, keyboard shortcuts, multi-line input, markdown support, predefined messages, and multiple assistants. Users can easily switch between different assistants, define custom assistants, and configure model parameters and API keys in a YAML file for easy customization and management.

github

: 580

repo-to-text

The `repo-to-text` tool converts a directory's structure and contents into a single text file. It generates a formatted text representation that includes the directory tree and file contents, making it easy to share code with LLMs for development and debugging. Users can customize the tool's behavior with various options and settings, including output directory specification, debug logging, and file inclusion/exclusion rules. The tool supports Docker usage for containerized environments and provides detailed instructions for installation, usage, settings configuration, and contribution guidelines. It is a versatile tool for converting repository contents into text format for easy sharing and documentation.

github

: 122

telemetry-airflow

This repository codifies the Airflow cluster that is deployed at workflow.telemetry.mozilla.org (behind SSO) and commonly referred to as "WTMO" or simply "Airflow". Some links relevant to users and developers of WTMO: * The `dags` directory in this repository contains some custom DAG definitions * Many of the DAGs registered with WTMO don't live in this repository, but are instead generated from ETL task definitions in bigquery-etl * The Data SRE team maintains a WTMO Developer Guide (behind SSO)

github

: 185

ai-starter-kit

SambaNova AI Starter Kits is a collection of open-source examples and guides designed to facilitate the deployment of AI-driven use cases for developers and enterprises. The kits cover various categories such as Data Ingestion & Preparation, Model Development & Optimization, Intelligent Information Retrieval, and Advanced AI Capabilities. Users can obtain a free API key using SambaNova Cloud or deploy models using SambaStudio. Most examples are written in Python but can be applied to any programming language. The kits provide resources for tasks like text extraction, fine-tuning embeddings, prompt engineering, question-answering, image search, post-call analysis, and more.

github

: 215

fabric

Fabric is an open-source framework for augmenting humans using AI. It provides a structured approach to breaking down problems into individual components and applying AI to them one at a time. Fabric includes a collection of pre-defined Patterns (prompts) that can be used for a variety of tasks, such as extracting the most interesting parts of YouTube videos and podcasts, writing essays, summarizing academic papers, creating AI art prompts, and more. Users can also create their own custom Patterns. Fabric is designed to be easy to use, with a command-line interface and a variety of helper apps. It is also extensible, allowing users to integrate it with their own AI applications and infrastructure.

github

: 30.3k

Oxen

Oxen is a data version control library, written in Rust. It's designed to be fast, reliable, and easy to use. Oxen can be used in a variety of ways, from a simple command line tool to a remote server to sync to, to integrations into other ecosystems such as python.

github

: 219

sage

Sage is a tool that allows users to chat with any codebase, providing a chat interface for code understanding and integration. It simplifies the process of learning how a codebase works by offering heavily documented answers sourced directly from the code. Users can set up Sage locally or on the cloud with minimal effort. The tool is designed to be easily customizable, allowing users to swap components of the pipeline and improve the algorithms powering code understanding and generation.

github

: 705

code2prompt

code2prompt is a command-line tool that converts your codebase into a single LLM prompt with a source tree, prompt templating, and token counting. It automates generating LLM prompts from codebases of any size, customizing prompt generation with Handlebars templates, respecting .gitignore, filtering and excluding files using glob patterns, displaying token count, including Git diff output, copying prompt to clipboard, saving prompt to an output file, excluding files and folders, adding line numbers to source code blocks, and more. It helps streamline the process of creating LLM prompts for code analysis, generation, and other tasks.

github

: 5.1k

vectara-answer

Vectara Answer is a sample app for Vectara-powered Summarized Semantic Search (or question-answering) with advanced configuration options. For examples of what you can build with Vectara Answer, check out Ask News, LegalAid, or any of the other demo applications.

github

: 249

For similar tasks

elia

Elia is a powerful terminal user interface designed for interacting with large language models. It allows users to chat with models like Claude 3, ChatGPT, Llama 3, Phi 3, Mistral, and Gemma. Conversations are stored locally in a SQLite database, ensuring privacy. Users can run local models through 'ollama' without data leaving their machine. Elia offers easy installation with pipx and supports various environment variables for different models. It provides a quick start to launch chats and manage local models. Configuration options are available to customize default models, system prompts, and add new models. Users can import conversations from ChatGPT and wipe the database when needed. Elia aims to enhance user experience in interacting with language models through a user-friendly interface.

github

: 1.8k

chatgpt-web-sea

ChatGPT Web Sea is an open-source project based on ChatGPT-web for secondary development. It supports all models that comply with the OpenAI interface standard, allows for model selection, configuration, and extension, and is compatible with OneAPI. The tool includes a Chinese ChatGPT tuning guide, supports file uploads, and provides model configuration options. Users can interact with the tool through a web interface, configure models, and perform tasks such as model selection, API key management, and chat interface setup. The project also offers Docker deployment options and instructions for manual packaging.

github

: 52

dir-assistant

github

: 324

kubeai

KubeAI is a highly scalable AI platform that runs on Kubernetes, serving as a drop-in replacement for OpenAI with API compatibility. It can operate OSS model servers like vLLM and Ollama, with zero dependencies and additional OSS addons included. Users can configure models via Kubernetes Custom Resources and interact with models through a chat UI. KubeAI supports serving various models like Llama v3.1, Gemma2, and Qwen2, and has plans for model caching, LoRA finetuning, and image generation.

github

: 873

renumics-rag

Renumics RAG is a retrieval-augmented generation assistant demo that utilizes LangChain and Streamlit. It provides a tool for indexing documents and answering questions based on the indexed data. Users can explore and visualize RAG data, configure OpenAI and Hugging Face models, and interactively explore questions and document snippets. The tool supports GPU and CPU setups, offers a command-line interface for retrieving and answering questions, and includes a web application for easy access. It also allows users to customize retrieval settings, embeddings models, and database creation. Renumics RAG is designed to enhance the question-answering process by leveraging indexed documents and providing detailed answers with sources.

github

: 155

llm-term

LLM-Term is a Rust-based CLI tool that generates and executes terminal commands using OpenAI's language models or local Ollama models. It offers configurable model and token limits, works on both PowerShell and Unix-like shells, and provides a seamless user experience for generating commands based on prompts. Users can easily set up the tool, customize configurations, and leverage different models for command generation.

Stars: 72

client

Gemini PHP is a PHP API client for interacting with the Gemini AI API. It allows users to generate content, chat, count tokens, configure models, embed resources, list models, get model information, troubleshoot timeouts, and test API responses. The client supports various features such as text-only input, text-and-image input, multi-turn conversations, streaming content generation, token counting, model configuration, and embedding techniques. Users can interact with Gemini's API to perform tasks related to natural language generation and text analysis.

Stars: 198

chats

Sdcb Chats is a flexible frontend for large language models that supports multiple functions and platforms, whether you want to manage several model interfaces or need a simple deployment process. It supports dynamic management of multiple large language model interfaces, integrates visual models to enhance user interaction, provides fine-grained user permission settings for security, and tracks and manages user account balances in real time. Models can be added, deleted, and configured easily, and user chat requests are forwarded transparently based on the OpenAI protocol. It supports multiple databases, including SQLite, SQL Server, and PostgreSQL; is compatible with various file services, such as local files, AWS S3, Minio, Aliyun OSS, and Azure Blob Storage; and supports multiple login methods, including Keycloak SSO and phone SMS verification.

Stars: 265

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

Stars: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

Stars: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

Stars: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

Stars: 620

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

Stars: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise-level infrastructure that can power any LLM production use case. Here are some use cases for BricksLLM:

* Set LLM usage limits for users on different pricing tiers
* Track LLM usage on a per-user and per-organization basis
* Block or redact requests containing PII
* Improve LLM reliability with failovers, retries and caching
* Distribute API keys with rate limits and cost limits for internal development/production use cases
* Distribute API keys with rate limits and cost limits for students

Stars: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

Stars: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

Stars: 2.2k
