
serena
a powerful coding agent with semantic retrieval and editing capabilities (MCP server)

- Serena is a powerful, fully-featured coding agent that works directly on your codebase.
- Serena integrates with existing LLMs, providing them with essential semantic code retrieval and editing tools!
- Serena is free to use. No API keys or subscriptions required!
Q: Can I have a state-of-the-art coding agent without paying (enormous) API costs or constantly purchasing tokens?
A: Yes, you can! By integrating Serena with your favourite (even free) LLM and thereby enabling it to perform coding tasks directly on your codebase.
Here you see Serena implementing a small feature for itself (a better log GUI) with Claude Desktop. Note how it smartly finds and edits the right symbols.
https://github.com/user-attachments/assets/6eaa9aa1-610d-4723-a2d6-bf1e487ba753
Serena can be integrated with an LLM in several ways:
- by using the model context protocol (MCP). Serena provides an MCP server which integrates with Claude (and soon also ChatGPT).
- by using Agno, the model-agnostic agent framework. Serena's Agno-based agent allows you to turn virtually any LLM into a coding agent, whether it's provided by Google, OpenAI or DeepSeek (with a paid API key) or a free model provided by Ollama, Together or Anyscale.
- by incorporating Serena's tools into an agent framework of your choice. Serena's tool implementation is decoupled from the framework-specific code and can thus easily be adapted to any agent framework.
Serena's semantic code analysis capabilities build on language servers using the widely implemented language server protocol (LSP). The LSP provides a set of versatile code querying and editing functionalities based on symbolic understanding of the code. Equipped with these capabilities, Serena discovers and edits code just like a seasoned developer making use of an IDE's capabilities would. Serena can efficiently find the right context and do the right thing even in very large and complex projects! So not only is it free and open-source, it frequently achieves better results than existing solutions that charge a premium.
Language servers provide support for a wide range of programming languages. With Serena, we provide
- direct, out-of-the-box support for:
- Python
- Java (Note: startup is slow, initial startup especially so)
- TypeScript
- indirect support (may require some code changes/manual installation) for:
- Ruby (untested)
- Go (untested)
- C# (untested)
Further languages can easily be supported by providing a shallow adapter for a new language server implementation.
- Is It Really Free to Use?
- What Can I Use Serena For?
- Quick Start
- Serena's Tools and Configuration
- Comparison with Other Coding Agents
- Limitations of MCP Servers
- Onboarding and Memories
- Recommendations on Using Serena
- Acknowledgements
- Customizing Serena
- Full List of Tools
Yes! Even the free tier of Anthropic's Claude has support for MCP Servers, so you can use Serena with Claude for free.
Presumably, the same will soon be possible with ChatGPT Desktop once support for MCP servers is added.
Through Agno, you furthermore have the option to use Serena with a free/open-weights model.
Serena is Oraios AI's contribution to the developer community.
We use it ourselves on a regular basis.
We got tired of having to pay multiple IDE-based subscriptions (such as Windsurf or Cursor) that forced us to keep purchasing tokens on top of the chat subscription costs we already had. The substantial API costs incurred by tools like Claude Code, Cline, Aider and other API-based tools are similarly unattractive. We thus built Serena with the prospect of being able to cancel most other subscriptions.
You can use Serena for any coding tasks: analyzing, planning, editing and so on. Serena can read, write and execute code, read logs and the terminal output. "Vibe coding" is possible, and if you want to almost feel like "the code no longer exists", you may find Serena even more adequate for vibing than an agent inside an IDE (since you will have a separate GUI that really lets you forget).
- Install uv (instructions here).
- Clone the repository to /path/to/serena.
- Create a configuration file for your project, say myproject.yml, based on the template in myproject.demo.yml.
- Configure the MCP server in your client.
  For Claude Desktop, go to File / Settings / Developer / MCP Servers / Edit Config, which will let you open the JSON file claude_desktop_config.json. Add the following (with adjusted paths) to enable Serena:

  ```json
  {
      "mcpServers": {
          "serena": {
              "command": "/abs/path/to/uv",
              "args": ["run", "--directory", "/abs/path/to/serena", "serena-mcp-server", "/abs/path/to/myproject.yml"]
          }
      }
  }
  ```

  When using paths containing backslashes on Windows, be sure to escape them correctly (`\\`), as in the example below.
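  For example, on Windows the same entry might look as follows; the paths are hypothetical placeholders, shown only to illustrate the escaping:

  ```json
  {
      "mcpServers": {
          "serena": {
              "command": "C:\\Users\\you\\.local\\bin\\uv.exe",
              "args": ["run", "--directory", "C:\\path\\to\\serena", "serena-mcp-server", "C:\\path\\to\\myproject.yml"]
          }
      }
  }
  ```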
That's it! Save the config and then restart Claude Desktop.
After restarting, you should see Serena's tools in your chat interface (notice the small hammer icon).
Note that Serena is always configured for a single project. To use it for another project, you will have to write a new configuration file, point the MCP server config in your client to it, and restart the client.
For more information on MCP servers with Claude Desktop, see the official quick start guide.
Agno is a model-agnostic agent framework that allows you to use Serena with a large number of underlying LLMs.
While Agno is not yet entirely stable, we chose it because it comes with its own open-source UI, making it easy to use the agent directly through a chat interface.
Here's how it works (see also Agno's documentation):
- Download the agent-ui code with npx:

  ```shell
  npx create-agent-ui@latest
  ```

  or, alternatively, clone it manually:

  ```shell
  git clone https://github.com/agno-agi/agent-ui.git
  cd agent-ui
  pnpm install
  pnpm dev
  ```

- Install serena with the optional requirements:

  ```shell
  # You can also select only agno,google or agno,anthropic instead of all-extras
  uv pip install --all-extras -e .
  ```

- Copy .env.example to .env and fill in the API keys for the provider(s) you intend to use.
- Start the agno agent app with

  ```shell
  uv run python scripts/agno_agent.py
  ```

  By default, the script uses Claude as the model, but you can choose any model supported by Agno (which is essentially any existing model).
- In a new terminal, start the agno UI with

  ```shell
  cd agent-ui
  pnpm dev
  ```

  Connect the UI to the agent you started above and start chatting. You will have the same tools as in the MCP server version (based on your project's .yml).
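For reference, a script along these lines might look roughly like the sketch below. This is only a sketch: the Agno imports reflect our understanding of Agno's API, and the way Serena's tools are obtained (the SerenaAgent constructor and its tools attribute) is an assumption, so consult scripts/agno_agent.py for the real implementation.

```python
# Hypothetical sketch of an Agno-based Serena agent. SerenaAgent's
# constructor and `tools` attribute are assumptions -- see
# scripts/agno_agent.py for the real implementation.
from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.playground import Playground, serve_playground_app

from serena.agent import SerenaAgent

serena = SerenaAgent("/abs/path/to/myproject.yml")  # assumed constructor

agent = Agent(
    name="Serena",
    model=Claude(id="claude-3-7-sonnet-latest"),  # any Agno-supported model works
    tools=serena.tools,  # assumed attribute exposing Serena's tools
    markdown=True,
)

app = Playground(agents=[agent]).get_app()

if __name__ == "__main__":
    # Serve the playground app so the agent-ui can connect to it.
    serve_playground_app("agno_agent:app", reload=True)
```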
Serena combines tools for semantic code retrieval with editing capabilities and shell execution. Find the complete list of tools below.
The use of all tools is generally recommended, as this allows Serena to provide the most value: Only by executing shell commands (in particular, tests) can Serena identify and correct mistakes autonomously.
However, it should be noted that the execute_shell_command tool allows for arbitrary code execution.
When using Serena as an MCP server, clients will typically ask the user for permission before executing a tool, so as long as the user inspects execution parameters beforehand, this should not be a problem.
However, if you have concerns, you can choose to disable certain commands in your project's .yml configuration file.
If you only want to use Serena purely for analyzing code and suggesting implementations without modifying the codebase, you can consider disabling the editing tools in the configuration (see the sketch after this list), i.e.:
- create_text_file
- insert_after_symbol
- insert_at_line
- insert_before_symbol
- replace_symbol_body
- delete_lines
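For illustration, such a configuration might look roughly like the following; the key name excluded_tools is an assumption on our part, so check myproject.demo.yml for the option names that actually exist:

```yaml
# Hypothetical excerpt of myproject.yml; the key name is an assumption,
# see myproject.demo.yml for the actual configuration options.
excluded_tools:
  - create_text_file
  - insert_after_symbol
  - insert_at_line
  - insert_before_symbol
  - replace_symbol_body
  - delete_lines
```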
In general, be sure to back up your work and use a version control system in order to avoid losing any work.
To our knowledge, Serena is the first fully-featured coding agent where the entire functionality is available through an MCP server, thus not requiring API keys or subscriptions.
The most prominent subscription-based coding agents are part of IDEs like Windsurf, Cursor and VSCode. Serena's functionality is similar to Cursor's Agent, Windsurf's Cascade or VSCode's upcoming agent mode.
Serena has the advantage of not requiring a subscription. A potential disadvantage is that it is not directly integrated into an IDE, so the inspection of newly written code is not as seamless.
More technical differences are:
- Serena is not bound to a specific IDE. Serena's MCP server can be used with any MCP client (including some IDEs), and the Agno-based agent provides additional ways of applying its functionality.
- Serena is not bound to a specific large language model or API.
- Serena navigates and edits code using a language server, so it has a symbolic understanding of the code. IDE-based tools often use a RAG-based or purely text-based approach, which is often less powerful, especially for large codebases.
- Serena is open-source and has a small codebase, so it can be easily extended and modified.
An alternative to subscription-based agents are API-based agents like Claude Code, Cline, Aider, Roo Code and others, where the usage costs map directly to the API costs of the underlying LLM. Some of them (like Cline) can even be included in IDEs as an extension. They are often very powerful, and their main downside is the (potentially very high) API costs.
Serena itself can be used as an API-based agent (see the section on Agno above). We have not yet written a CLI tool or a dedicated IDE extension for Serena (and there is probably no need for the latter, as Serena can already be used with any IDE that supports MCP servers). If there is demand for a Serena CLI tool like Claude Code, we will consider writing one.
The main difference between Serena and other API-based agents is that Serena can also be used as an MCP server, thus not requiring an API key and bypassing the API costs. This is a unique feature of Serena.
There are other MCP servers designed for coding, like DesktopCommander and codemcp. However, to the best of our knowledge, none of them provide semantic code retrieval and editing tools; they rely purely on text-based analysis. It is the integration of language servers and the MCP that makes Serena unique and so powerful for challenging coding tasks, especially in the context of larger codebases.
The support for MCP servers in Claude Desktop and the various MCP server SDKs are relatively new developments, and we found them to be somewhat unstable. Sometimes, Claude Desktop will crash on a tool execution (with an asyncio error or something else of this kind). On the one hand, it can display error messages that are of no consequence, and on the other, it can fail to show error messages when things fail irrecoverably. Yet we expect these stability issues to improve over time.
The working configuration of an MCP server may vary from platform to platform and from client to client. We recommend always using absolute paths, as relative paths may be sources of errors. The language server runs in a separate sub-process and is called with asyncio; sometimes Claude Desktop lets it crash. If you have Serena's log window enabled and it disappears, you'll know what happened.
For now, you may have to restart Claude Desktop multiple times, may have to manually clean up lingering processes, and may experience freezes in conversations. Just try again in the latter case. Feel free to open issues if you encounter setup problems that you cannot solve.
To help with troubleshooting, we have written a small GUI utility for logging. We recommend that you enable it through the project configuration (myproject.yml) if you encounter problems. For Claude Desktop, there are also the MCP logs that can help identify issues.
By default, Serena will perform an onboarding process when it is started for the first time for a project. The goal of the process is for Serena to get familiar with the project and to store memories, which it can then draw upon in future interactions.
Memories are files stored in .serena/memories/ in the project directory, which the agent can choose to read. Feel free to read and adjust them as needed; you can also add new ones manually. Every file in the .serena/memories/ directory is a memory file.
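Since memories are plain files, you can also inspect or edit them outside of Serena; below is a minimal sketch that assumes only what is stated above (the project path is a placeholder):

```python
# Minimal sketch: list and print Serena's memories for a project.
# Relies only on the fact that every file in .serena/memories/ is a
# memory file. The project path is a placeholder to adjust.
from pathlib import Path

project_dir = Path("/path/to/your/project")
memories_dir = project_dir / ".serena" / "memories"

for memory_file in sorted(memories_dir.iterdir()):
    if memory_file.is_file():
        print(f"--- {memory_file.name} ---")
        print(memory_file.read_text(encoding="utf-8"))
```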
We found the memories to significantly improve the user experience with Serena. Serena itself is instructed to create new memories whenever appropriate.
We will continue to collect best practices as the Serena community grows. Below is a short overview of things that we learned when using Serena internally.
Most of these recommendations are true for any coding agent, including all agents mentioned above.
To our surprise, Serena seemed to work better with the non-thinking version of Claude 3.7 than with its thinking version (we haven't yet made extensive comparisons to Gemini). The thinking version took longer, had more difficulties using the tools, and would often just write code without reading enough context.
In our initial experiments, Gemini seemed to work very well. Unfortunately, Gemini does not support the MCP (yet?), so the only way to use it is through an API key. On the bright side, Gemini is comparatively cheap and can handle huge context lengths.
In the very first interaction, Serena is instructed to perform an onboarding and write the first memory files. Sometimes (depending on the LLM), the files are not written to disk. In that case, just ask Serena to write the memories.
In this phase Serena will usually read and write quite a lot of text and thereby fill up the context. We recommend that you switch to another conversation once the onboarding is performed in order to not run out of tokens. The onboarding will only be performed once, unless you explicitly trigger it.
After the onboarding, we recommend that you have a quick look at the memories and, if necessary, edit them or add additional ones.
It is best to start a code generation task from a clean git state. Not only will this make it easier for you to inspect the changes, but the model itself will also have a chance of seeing what it has changed by calling git diff, and can thereby correct itself or continue working in a follow-up conversation if needed.
Set git config core.autocrlf to true on Windows. With git config core.autocrlf set to false on Windows, you may end up with huge diffs due only to line endings. It is generally a good idea to enable this git setting on Windows:

```shell
git config --global core.autocrlf true
```
In our experience, LLMs are really bad at counting, i.e., they have problems inserting blocks of code in the right place. Most editing operations can be performed at the symbolic level, which allows this problem to be overcome. However, line-level insertions are sometimes still useful.
Serena is instructed to double-check the line numbers and any code blocks that it will edit, but you may find it useful to explicitly tell it how to edit code if you run into problems.
For long and complicated tasks, or tasks where Serena has read a lot of content, you may come close to the limit of context tokens. In that case, it is often a good idea to continue in a new conversation. Serena has a dedicated tool to create a summary of the current state of the progress and all information relevant for continuing it. You can request that this summary be created and written to a memory. Then, in a new conversation, you can just ask Serena to read the memory and continue with the task. In our experience, this worked really well. On the upside, since there is no summarization involved within a single session, Serena does not usually get lost (unlike some other agents that summarize under the hood), and it is also instructed to occasionally check whether it's on the right track.
Moreover, Serena is instructed to be frugal with context (e.g., to not read bodies of code symbols unnecessarily), but we found that Claude is not always very good at being frugal (Gemini seemed better at it). You can explicitly instruct it to not read symbol bodies if you know that it's not needed.
Claude Desktop will ask you before executing a tool. For most tools, you can safely click "Allow for this Chat", especially if all your files are under version control. One exception is the execute_shell_command tool: we recommend inspecting each call to it individually rather than enabling it for the whole chat.
Serena uses the code structure for finding, reading and editing code. This means that it will work well with well-structured code but may fail with fully unstructured code (like a God class with enormous, non-modular functions). Type annotations also help a lot here. The better your code, the better Serena will work. So we generally recommend writing well-structured, modular and typed code; it will not only help you but also help your AI ;).
Serena cannot debug (to our knowledge, no coding assistant can at the moment). This means that to improve results within an agent loop, Serena needs to acquire information by executing tests, running scripts, performing linting and so on. It is often very helpful to include many log messages with explicit information and to have meaningful tests. The latter in particular often help the agent to self-correct.
We generally recommend starting an editing task from a state where all linting checks and tests pass.
We found that it is often a good idea to spend some time conceptualizing and planning a task before actually implementing it, especially for non-trivial tasks. This helps both in achieving better results and in increasing the feeling of control and staying in the loop. You can make a detailed plan in one session, where Serena may read a lot of your code to build up the context, and then continue with the implementation in another session (potentially after creating suitable memories).
We built Serena on top of multiple existing open-source technologies, the most important ones being:
- multilspy. A beautifully designed wrapper around language servers following the LSP. It was not easily extendable with the symbolic logic that Serena required, so instead of incorporating it as a dependency, we copied the source code and adapted it to our needs.
- Python MCP SDK
- Agno and the associated agent-ui, which we use to allow Serena to work with any model, beyond the ones supporting the MCP.
- All the language servers that we use through multilspy.
Without these projects, Serena would not have been possible (or would have been significantly more difficult to build).
It is very easy to extend Serena with your own ideas. Just implement a new tool by subclassing serena.agent.Tool. By default, the SerenaAgent will immediately have access to it. We look forward to seeing what the community will come up with! For details on contributing, see here.
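As a rough illustration, a custom tool might look like the sketch below; only the base class serena.agent.Tool is taken from the above, while the apply method, its signature and the project_root attribute are assumptions (check the Tool base class for the actual interface):

```python
# Hypothetical sketch of a custom tool. Only the base class
# serena.agent.Tool is documented above; the `apply` method, its
# signature and the `project_root` attribute are assumptions.
from serena.agent import Tool


class CountTodosTool(Tool):
    """Counts TODO markers in a file within the project directory."""

    def apply(self, relative_path: str) -> str:
        # Assumes the base class exposes the project root as a Path.
        path = self.project_root / relative_path
        count = path.read_text(encoding="utf-8").count("TODO")
        return f"{relative_path} contains {count} TODO marker(s)."
```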
Here is the full list of Serena's default tools with a short description (the output of uv run serena-list-tools):
- check_onboarding_performed: Checks whether the onboarding was already performed.
- create_text_file: Creates/overwrites a file in the project directory.
- delete_lines: Deletes a range of lines within a file.
- delete_memory: Deletes a memory from Serena's project-specific memory store.
- execute_shell_command: Executes a shell command.
- find_referencing_symbols: Finds symbols that reference the symbol at the given location (optionally filtered by type).
- find_symbol: Performs a global (or local) search for symbols with/containing a given name/substring (optionally filtered by type).
- get_dir_overview: Gets an overview of the top-level symbols defined in all files within a given directory.
- get_document_overview: Gets an overview of the top-level symbols defined in a given file.
- insert_after_symbol: Inserts content after the end of the definition of a given symbol.
- insert_at_line: Inserts content at a given line in a file.
- insert_before_symbol: Inserts content before the beginning of the definition of a given symbol.
- list_dir: Lists files and directories in the given directory (optionally with recursion).
- list_memories: Lists memories in Serena's project-specific memory store.
- onboarding: Performs onboarding (identifying the project structure and essential tasks, e.g. for testing or building).
- prepare_for_new_conversation: Provides instructions for preparing for a new conversation (in order to continue with the necessary context).
- read_file: Reads a file within the project directory.
- read_memory: Reads the memory with the given name from Serena's project-specific memory store.
- replace_symbol_body: Replaces the full definition of a symbol.
- search_in_all_code: Performs a search for a pattern in all code files (and only in code files) in the project.
- summarize_changes: Provides instructions for summarizing the changes made to the codebase.
- think_about_collected_information: Thinking tool for pondering the completeness of collected information.
- think_about_task_adherence: Thinking tool for determining whether the agent is still on track with the current task.
- think_about_whether_you_are_done: Thinking tool for determining whether the task is truly completed.
- write_memory: Writes a named memory (for future reference) to Serena's project-specific memory store.
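Since Serena runs as a standard MCP server, you can also list these tools programmatically with the Python MCP SDK mentioned in the acknowledgements; below is a minimal sketch, with all paths being placeholders to adjust:

```python
# Minimal sketch: connect to the Serena MCP server over stdio and list
# its tools using the official Python MCP SDK. All paths are
# placeholders to adjust.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(
    command="/abs/path/to/uv",
    args=[
        "run", "--directory", "/abs/path/to/serena",
        "serena-mcp-server", "/abs/path/to/myproject.yml",
    ],
)


async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            for tool in result.tools:
                print(f"{tool.name}: {tool.description}")


asyncio.run(main())
```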