
brokk
Brokk brings code intelligence to AI
Stars: 64

Brokk is a code assistant designed to understand code semantically, allowing LLMs to work effectively on large codebases. It offers features like agentic search, summarizing related classes, parsing stack traces, adding source for usages, and autonomously fixing errors. Users can interact with Brokk through different panels and commands, enabling them to manipulate context, ask questions, search the codebase, run shell commands, and more. Brokk helps with tasks like debugging regressions, exploring a codebase, AI-powered refactoring, and working with dependencies. It is particularly useful for making complex, multi-file edits with o1pro.
README:
Brokk (the Norse god of the forge) is the first code assistant that understands code semantically, not just as chunks of text. Brokk is designed to allow LLMs to work effectively on large codebases that cannot be jammed entirely into working context.
Run using jbang (recommended):
- Install jbang
  - Linux / Mac: `curl -Ls https://sh.jbang.dev | bash -s - app setup`
  - Windows (PowerShell): `iex "& { $(iwr https://ps.jbang.dev) } app setup"`
  - Others: see https://www.jbang.dev/download/
- Run: `jbang run brokk@jbellis/brokk`
You can also download the JAR from Releases and run it manually.
- Go to File -> Edit Secret Keys to configure your preferred LLM.
- Go to File -> Open Project to open your project.
Brokk will attempt to infer a build command and style guide for your project. You can edit these in `.brokk/project.properties` and `.brokk/style.md`, respectively.
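For example, a minimal `.brokk/project.properties` might look like the sketch below. The `buildCommand` key is the one Brokk reads; the value shown is only an illustration for a Gradle project, so substitute whatever builds or lints your code.

```properties
# .brokk/project.properties
# Shell command Brokk runs to build/lint the project (example value only).
buildCommand=./gradlew compileJava
```

`.brokk/style.md` holds the inferred style guide as plain Markdown that you can edit freely.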
There is a Brokk Discord for questions and suggestions.
- Ridiculously good agentic search / code retrieval. Better than Claude Code, better than Sourcegraph, better than Augment Code. Here is Brokk's explanation of "how does bm25 search work?" in the DataStax Cassandra repo (a brand-new feature, not in anyone's training set), starting cold with no context, compared to Claude Code's (probably the second-best code RAG out there).
- Automatically determine the most-related classes to your working context and summarize them
- Parse a stacktrace and add source for all the methods to your context
- Add source for all the usages of a class, field, or method to your context
- Parse "anonymous" context pieces from external commands
- Build/lint your project and ask the LLM to fix errors autonomously
These allow some simple but powerful patterns:
- "Here is the diff for commit X, which introduced a regression. Here is the stacktrace of the error and the full source of the methods involved. Find the bug."
- "Here are the usages of Foo.bar. Is parameter zep always loaded from cache?"
When you start Brokk, you’ll see five main areas:
- Output: Displays the LLM or shell command output.
- History: A chronological list of your actions. Can undo changes to context as well as to your code.
- Command Input: Code, Ask, Search, and Run in Shell specify how your input is interpreted. Stop cancels the in-progress action.
- Context: Lists active code/text fragments in your current context, specifying whether they're read-only or editable. Manipulate them via the right-click menu or the top-level menu.
- Git: The Log tab allows viewing diffs or adding them to context; the Commit tab allows committing or stashing your changes.
As you add context, Brokk will automatically include summaries of the most closely-related classes as determined by a graph analysis of your codebase. This helps the LLM avoid hallucinations when reasoning about your code. This is the "[Auto]" row that you see in the screenshot.
- Code: Tells the LLM you want code generation or modification.
- Ask: Ask a question referencing the current context.
- Search: Invokes a specialized agent that looks through your codebase for answers NOT in your current context.
- Run in Shell: Executes any shell command, streaming the output into the Output Panel.
- Stop: Cancels the currently running LLM or shell command.
- Edit, Read: Decide which files the LLM can modify (editable) or just look at (read-only).
- Summarize: Summarizes the specified classes (declarations and signatures, but no method bodies).
- Drop: Removes snippets you no longer want in context.
- Copy, Paste: Copy snippets to your clipboard or paste external text into Brokk’s context.
- Stacktraces get special treatment; they will be augmented with the source of referenced methods.
- URLs also get special treatment; their text will be retrieved and ingested.
- Symbol Usage: Pick a symbol (class, field, or method) and automatically gather all references into a snippet.
- Call Graph To / Call Graph From: Expands the call graph to or from the given function to the specified depth.
You can double-click on any context to preview it.
- Add relevant code or text to your context (choose Edit for modifiable files, Read for reference-only).
- Type instructions in the command box; use Code, Ask, Search, or Run in Shell as needed.
- Capture or incorporate external context using Run combined with "Capture Text" or "Edit Files."
- Use the History Panel to keep track, undo, or redo changes. Forget to commit and the LLM scribbled all over your code in the next request? No problem, Undo includes filesystem changes too.
Here are a few scenarios illustrating how Brokk helps with real-world tasks.
- Run `git bisect` to identify the commit that caused a regression (a minimal shell sketch follows this list).
- Load the commit and the files changed by that commit as editable context: run `git show [revision]`, then Capture Text and Edit References. (You can also select the new context fragment in the context table and click Edit Files from there; Edit References is a shortcut.)
- Paste the stacktrace corresponding to the regression with ctrl-V or the Paste button.
- Tell the LLM: "This stacktrace is caused by a change in the attached diff. Look at the changes to see what could cause the problem, and fix it."
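As referenced above, here is a minimal shell sketch of the bisect step. The good revision and the test script are placeholders, not part of Brokk; use whatever reproduces your regression, and run the final `git show` from Run in Shell so you can Capture Text on its output.

```bash
# Placeholder example: bisect between the last known-good tag and HEAD.
git bisect start HEAD v1.0.0          # v1.0.0 stands in for your last good revision
git bisect run ./reproduce-bug.sh     # any script that exits non-zero when the bug is present
git bisect reset

# Then dump the offending commit so Brokk can capture it as context:
git show <bad-revision>
```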
- You want to know how that BM25 search feature your colleague wrote works. Type "how does bm25 search work?" into the Instructions area and click Search.
- The Search output is automatically captured as context; if you want to make changes, select it and click Edit Files.
- Invoke Symbol Usage on Project::getAnalyzerWrapper, and click Edit Files on the resulting usage context. This will make all files editable that include calls to that method.
- Add Project itself as editable. Brokk automatically includes a summary of AnalyzerWrapper in the auto-context.
- Type your instructions into the instructions area and click Code:
Replace Project.getAnalyzerWrapper with getAnalyzer() and getAnalyzerNonBlocking() that encapsulate aw.get and aw.getNonBlocking; update the callers appropriately.
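As a rough sketch of what that request is asking for: the shapes below are assumptions for illustration, not Brokk's actual `Project` or `AnalyzerWrapper` source.

```java
// Illustrative only; the real classes have more members and different types.
class Analyzer { /* analysis results */ }

class AnalyzerWrapper {
    Analyzer get() { /* blocks until analysis is ready */ return new Analyzer(); }
    Analyzer getNonBlocking() { /* returns immediately, possibly null */ return null; }
}

class Project {
    private final AnalyzerWrapper aw = new AnalyzerWrapper();

    // Before: callers reached through the wrapper themselves.
    // public AnalyzerWrapper getAnalyzerWrapper() { return aw; }

    // After: the wrapper is encapsulated behind two methods, and every caller
    // is updated to use whichever blocking behavior it needs.
    public Analyzer getAnalyzer() {
        return aw.get();
    }

    public Analyzer getAnalyzerNonBlocking() {
        return aw.getNonBlocking();
    }
}
```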
Often you find yourself working with poorly documented dependencies that your LLM doesn't know enough about to use without hallucinating. Brokk can help!
Check out the source code and open it as a Brokk project. Then click on Summarize Files and use ** globbing to select everything. (Usually you will want to target e.g. src/main and not src/ to leave out test code.) Brokk will summarize all the classes; now you can double-click on the context to make sure it's what you wanted, then copy it and either paste it directly as context into Brokk as a one-off, or save it as a file for re-use. In this example, I did this twice in the Gumtree library: once for core/ and again for client/.
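For instance, glob patterns along these lines would restrict the summaries to each module's main sources (the exact directory layout is an assumption about the dependency's tree, not something Brokk prescribes):

```
core/src/main/**
client/src/main/**
```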
If you have a more targeted idea of what you need, you can also pick just those classes and dial up the AutoContext size to get the surrounding infrastructure. Here I've left the Gumtree client summary and let AutoContext=20 do its thing. This is 5x smaller than summarizing all of core.
Brokk is particularly useful when making complex, multi-file edits with o1pro.
After setting up your session, use Copy to pull all the content, including Brokk's prompts, into your clipboard. Paste it into o1pro and add your request in the section at the bottom. Then paste o1pro's response back into Brokk and have it apply the edits with the Code action.
We are currently focused on making Brokk's Java support the best in the world. Other languages will follow.
- "Stop" button does not work reliably during search. This is caused by https://github.com/langchain4j/langchain4j/issues/2658
- Joern (the code intelligence engine) needs to run delombok before it can analyze anything. Delombok is extremely slow for anything but trivial projects, making Brokk a poor fit for large Lombok-using codebases.
- Brokk doesn't offer automatic running of tests (too much variance in what you might want it to do). Instead, Brokk allows you to run arbitrary shell commands and import those as context with "Capture Text" or "Edit Files." You can easily run your tests this way and have Brokk work on the results. If you really want Brokk to always run a test suite after making edits, you can change `buildCommand` in `.brokk/project.properties` accordingly.
- There is some overlap between Symbol Usage and Call Graph to Function; besides the former being just a single level deep in the call graph, Symbol Usage includes the entire source of each calling method, while Call Graph to Function only includes one line per call.
Brokk uses sbt (Scala Build Tool) since it has a Scala component. To build Brokk:
- Install sbt (e.g. with sdkman)
- Run the sbt repl: `sbt`
- In the sbt repl, run individual commands: `run`, `clean`, `test`, `assembly`, etc.
(You can run a single command without the repl with e.g. `sbt run`, but sbt has a very high startup overhead, so using the repl is recommended.)
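A typical session might look like the following sketch; the `sbt:brokk>` prompt is indicative and will vary with your sbt version.

```
$ sbt                  # start the repl once and reuse it
sbt:brokk> run         # launch Brokk from source
sbt:brokk> test        # run the test suite
sbt:brokk> assembly    # build a runnable fat JAR
```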
Similar Open Source Tools

ollama-autocoder
Ollama Autocoder is a simple to use autocompletion engine that integrates with Ollama AI. It provides options for streaming functionality and requires specific settings for optimal performance. Users can easily generate text completions by pressing a key or using a command palette. The tool is designed to work with the Ollama API and a specified model, offering real-time generation of text suggestions.

airbroke
Airbroke is an open-source error catcher tool designed for modern web applications. It provides a PostgreSQL-based backend with an Airbrake-compatible HTTP collector endpoint and a React-based frontend for error management. The tool focuses on simplicity, maintaining a small database footprint even under heavy data ingestion. Users can ask AI about issues, replay HTTP exceptions, and save/manage bookmarks for important occurrences. Airbroke supports multiple OAuth providers for secure user authentication and offers occurrence charts for better insights into error occurrences. The tool can be deployed in various ways, including building from source, using Docker images, deploying on Vercel, Render.com, Kubernetes with Helm, or Docker Compose. It requires Node.js, PostgreSQL, and specific system resources for deployment.

llm.c
LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of cPython. For example, training GPT-2 (CPU, fp32) is ~1,000 lines of clean code in a single file. It compiles and runs instantly, and exactly matches the PyTorch reference implementation. I chose GPT-2 as the first working example because it is the grand-daddy of LLMs, the first time the modern stack was put together.

aici
The Artificial Intelligence Controller Interface (AICI) lets you build Controllers that constrain and direct output of a Large Language Model (LLM) in real time. Controllers are flexible programs capable of implementing constrained decoding, dynamic editing of prompts and generated text, and coordinating execution across multiple, parallel generations. Controllers incorporate custom logic during the token-by-token decoding and maintain state during an LLM request. This allows diverse Controller strategies, from programmatic or query-based decoding to multi-agent conversations to execute efficiently in tight integration with the LLM itself.

lovelaice
Lovelaice is an AI-powered assistant for your terminal and editor. It can run bash commands, search the Internet, answer general and technical questions, complete text files, chat casually, execute code in various languages, and more. Lovelaice is configurable with API keys and LLM models, and can be used for a wide range of tasks requiring bash commands or coding assistance. It is designed to be versatile, interactive, and helpful for daily tasks and projects.

LLM_Web_search
LLM_Web_search project gives local LLMs the ability to search the web by outputting a specific command. It uses regular expressions to extract search queries from model output and then utilizes duckduckgo-search to search the web. LangChain's Contextual compression and Okapi BM25 or SPLADE are used to extract relevant parts of web pages in search results. The extracted results are appended to the model's output.

llama-on-lambda
This project provides a proof of concept for deploying a scalable, serverless LLM Generative AI inference engine on AWS Lambda. It leverages the llama.cpp project to enable the usage of more accessible CPU and RAM configurations instead of limited and expensive GPU capabilities. By deploying a container with the llama.cpp converted models onto AWS Lambda, this project offers the advantages of scale, minimizing cost, and maximizing compute availability. The project includes AWS CDK code to create and deploy a Lambda function leveraging your model of choice, with a FastAPI frontend accessible from a Lambda URL. It is important to note that you will need ggml quantized versions of your model and model sizes under 6GB, as your inference RAM requirements cannot exceed 9GB or your Lambda function will fail.

chronon
Chronon is a platform that simplifies and improves ML workflows by providing a central place to define features, ensuring point-in-time correctness for backfills, simplifying orchestration for batch and streaming pipelines, offering easy endpoints for feature fetching, and guaranteeing and measuring consistency. It offers benefits over other approaches by enabling the use of a broad set of data for training, handling large aggregations and other computationally intensive transformations, and abstracting away the infrastructure complexity of data plumbing.

FigStep
FigStep is a black-box jailbreaking algorithm against large vision-language models (VLMs). It feeds harmful instructions through the image channel and uses benign text prompts to induce VLMs to output contents that violate common AI safety policies. The tool highlights the vulnerability of VLMs to jailbreaking attacks, emphasizing the need for safety alignments between visual and textual modalities.

atomic_agents
Atomic Agents is a modular and extensible framework designed for creating powerful applications. It follows the principles of Atomic Design, emphasizing small and single-purpose components. Leveraging Pydantic for data validation and serialization, the framework offers a set of tools and agents that can be combined to build AI applications. It depends on the Instructor package and supports various APIs like OpenAI, Cohere, Anthropic, and Gemini. Atomic Agents is suitable for developers looking to create AI agents with a focus on modularity and flexibility.

langgraph-studio
LangGraph Studio is a specialized agent IDE that enables visualization, interaction, and debugging of complex agentic applications. It offers visual graphs and state editing to better understand agent workflows and iterate faster. Users can collaborate with teammates using LangSmith to debug failure modes. The tool integrates with LangSmith and requires Docker installed. Users can create and edit threads, configure graph runs, add interrupts, and support human-in-the-loop workflows. LangGraph Studio allows interactive modification of project config and graph code, with live sync to the interactive graph for easier iteration on long-running agents.

llama3-tokenizer-js
JavaScript tokenizer for LLaMA 3 designed for client-side use in the browser and Node, with TypeScript support. It accurately calculates token count, has 0 dependencies, optimized running time, and somewhat optimized bundle size. Compatible with most LLaMA 3 models. Can encode and decode text, but training is not supported. Pollutes global namespace with `llama3Tokenizer` in the browser. Mostly compatible with LLaMA 3 models released by Facebook in April 2024. Can be adapted for incompatible models by passing custom vocab and merge data. Handles special tokens and fine tunes. Developed by belladore.ai with contributions from xenova, blaze2004, imoneoi, and ConProgramming.

ezkl
EZKL is a library and command-line tool for doing inference for deep learning models and other computational graphs in a zk-snark (ZKML). It enables the following workflow: 1. Define a computational graph, for instance a neural network (but really any arbitrary set of operations), as you would normally in pytorch or tensorflow. 2. Export the final graph of operations as an .onnx file and some sample inputs to a .json file. 3. Point ezkl to the .onnx and .json files to generate a ZK-SNARK circuit with which you can prove statements such as: > "I ran this publicly available neural network on some private data and it produced this output" > "I ran my private neural network on some public data and it produced this output" > "I correctly ran this publicly available neural network on some public data and it produced this output" In the backend we use the collaboratively-developed Halo2 as a proof system. The generated proofs can then be verified with much less computational resources, including on-chain (with the Ethereum Virtual Machine), in a browser, or on a device.

AirSane
AirSane is a SANE frontend and scanner server that supports Apple's AirScan protocol. It automatically detects scanners and publishes them through mDNS. Acquired images can be transferred in JPEG, PNG, and PDF/raster format. The tool is intended to be used with AirScan/eSCL clients such as Apple's Image Capture, sane-airscan on Linux, and the eSCL client built into Windows 10 and 11. It provides a simple web interface and encodes images on-the-fly to keep memory/storage demands low, making it suitable for devices like Raspberry Pi. Authentication and secure communication are supported in conjunction with a proxy server like nginx. AirSane has been reverse-engineered from Apple's AirScanScanner client communication protocol and offers a range of installation and configuration options for different operating systems.
For similar tasks

chat-with-code
Chat-with-code is a codebase chatbot that enables users to interact with their codebase using the OpenAI Language Model. It provides a user-friendly chat interface where users can ask questions and interact with their code. The tool clones, chunks, and embeds the codebase, allowing for natural language interactions. It is designed to assist users in exploring and understanding their codebase more intuitively.

Devon
Devon is an open-source pair programmer tool designed to facilitate collaborative coding sessions. It provides features such as multi-file editing, codebase exploration, test writing, bug fixing, and architecture exploration. The tool supports Anthropic, OpenAI, and Groq APIs, with plans to add more models in the future. Devon is community-driven, with ongoing development goals including multi-model support, plugin system for tool builders, self-hostable Electron app, and setting SOTA on SWE-bench Lite. Users can contribute to the project by developing core functionality, conducting research on agent performance, providing feedback, and testing the tool.

sage
Sage is a tool that allows users to chat with any codebase, providing a chat interface for code understanding and integration. It simplifies the process of learning how a codebase works by offering heavily documented answers sourced directly from the code. Users can set up Sage locally or on the cloud with minimal effort. The tool is designed to be easily customizable, allowing users to swap components of the pipeline and improve the algorithms powering code understanding and generation.

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

sourcegraph
Sourcegraph is a code search and navigation tool that helps developers read, write, and fix code in large, complex codebases. It provides features such as code search across all repositories and branches, code intelligence for navigation and refactoring, and the ability to fix and refactor code across multiple repositories at once.

continue
Continue is an open-source autopilot for VS Code and JetBrains that allows you to code with any LLM. With Continue, you can ask coding questions, edit code in natural language, generate files from scratch, and more. Continue is easy to use and can help you save time and improve your coding skills.

cody
Cody is a free, open-source AI coding assistant that can write and fix code, provide AI-generated autocomplete, and answer your coding questions. Cody fetches relevant code context from across your entire codebase to write better code that uses more of your codebase's APIs, impls, and idioms, with less hallucination.
For similar jobs

promptflow
**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

deepeval
DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

MegaDetector
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our overview of the field, affectionately titled "Everything I know about machine learning and camera traps".

leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

llava-docker
This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

carrot
The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.

AI-YinMei
AI-YinMei is an AI virtual anchor (Vtuber) development tool (NVIDIA GPU version). It supports fastgpt knowledge-base chat with a complete LLM stack of [fastgpt] + [one-api] + [Xinference]; replying to bilibili live-stream danmaku and greeting viewers who enter the broadcast; speech synthesis via Microsoft edge-tts, Bert-VITS2, and GPT-SoVITS; expression control through VTube Studio; streaming stable-diffusion-webui image output to an OBS live room; NSFW filtering of generated images (public-NSFW-y-distinguish); web search and image search via duckduckgo (requires proxy access) and Baidu image search (no proxy needed); an AI reply chat box [html plug-in]; AI singing (Auto-Convert-Music) and a playlist [html plug-in]; dancing, expression video playback, head-patting and gift-smashing reactions, automatically starting a dance when singing, and cyclic swaying motions during chat and song; multi-scene switching, background-music switching, and automatic day/night scene changes; and open-ended singing and painting, with the AI judging the content on its own.