seer
Seer is a service that provides AI capabilities to Sentry, running inference on Sentry issues and providing insights to users.
Stars: 87
Seer is a service that provides AI capabilities to Sentry by running inference on Sentry issues and providing user insights. It is currently in early development and not yet compatible with self-hosted Sentry instances. The tool requires access to internal Sentry resources and is intended for internal Sentry employees. Users can set up the environment, download model artifacts, integrate with local Sentry, run evaluations for the Autofix AI agent, and deploy to a sandbox staging environment. Development commands include applying database migrations, creating new migrations, running tests, and more. The tool also supports VCRs for recording and replaying HTTP requests.
README:
Seer is a service that provides AI capabilities to Sentry by running inference on Sentry issues and providing user insights.
📣 Seer is currently in early development and not yet compatible with self-hosted Sentry instances. Stay tuned for updates!
These instructions require access to internal Sentry resources and are intended for internal Sentry employees.
- Install Python 3.11 via pyenv:

  ```bash
  pyenv install 3.11
  pyenv local 3.11
  ```

- Install Docker. Note that if you want to install Docker from brew instead of Docker Desktop, you will need to install docker-compose as well.
- Install the Google Cloud SDK and authenticate.
- Clone the repository and navigate to the project root.
- Run `direnv allow` to set up the Python environment.
- Create a `.env` file based on `.env.example` and set the required values.
- (Optional) Add `SENTRY_AUTH_TOKEN=<your token>` to your `.env` file.
Download model artifacts:

```bash
gsutil cp -r gs://sentry-ml/seer/models .
```

If you see the prompt "Reauthentication required. Please insert your security key and press enter...", re-authenticate using `gcloud auth login` and set the project ID to the one for Seer.
- Start the development environment: `make dev`
- If you encounter database errors, run: `make update`
- If you encounter authentication errors, run: `gcloud auth application-default login`
To integrate with local Sentry:

- Expose port 9091 in your local Sentry configuration.
- Add the following to `~/.sentry/sentry.conf.py`:

  ```python
  SEER_RPC_SHARED_SECRET = ["seers-also-very-long-value-haha"]
  SENTRY_FEATURES['projects:ai-autofix'] = True
  SENTRY_FEATURES['organizations:issue-details-autofix-ui'] = True
  ```

- For local development, you may need to bypass certain checks in the Sentry codebase.
- Restart both Sentry and Seer.
> [!NOTE]
> Set `NO_SENTRY_INTEGRATION=1` in `.env` to ignore the local Sentry integration.
Development commands:

- Apply database migrations: `make update`
- Create new migrations: `make migration`
- Run the type checker: `make mypy`
- Run tests: `make test`
- Open a shell: `make shell`
- Update `requirements.txt` based on `requirements-constraints.txt`: `make upgrade-package-versions`
To start fresh:

```bash
docker compose down --volumes
make update && make dev
```

To run multiple instances of Seer, set unique port values for each instance in the `.env` file:
```
RABBITMQ_PORT=...
RABBITMQ_CONFIG_PORT=...
DB_PORT=...
APP_PORT=...
```
To enable Langfuse tracing, set these environment variables:

```
LANGFUSE_SECRET_KEY=...
LANGFUSE_PUBLIC_KEY=...
LANGFUSE_HOST=...
```

Autofix is an AI agent that identifies root causes of Sentry issues and suggests fixes.
To run evaluations, send a POST request to `/v1/automation/autofix/evaluations/start` with the following JSON body (note: currently, only internal datasets are available):

```json
{
  "dataset_name": "string",      // Name of the dataset to run on (currently only internal datasets available)
  "run_name": "string",          // Custom name for your evaluation run
  "run_description": "string",   // Description of your evaluation run
  "run_type": "full | root_cause | execution",  // Type of evaluation to perform
  "test": boolean,               // Set to true to run on a single item (for testing)
  "random_for_test": boolean,    // Set to true to use a random item when testing (requires "test": true)
  "run_on_item_id": "string",    // Specific item ID to run on (optional)
  "n_runs_per_item": int         // Number of runs to perform per item (optional, default 1)
}
```
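As a minimal sketch of what such a request could look like from Python (the host and port assume a local Seer instance exposed on port 9091, and all payload values are illustrative):

```python
# Minimal sketch: start an Autofix evaluation run against a local Seer
# instance. The base URL and every payload value here are illustrative
# assumptions, not documented defaults.
import requests

payload = {
    "dataset_name": "internal-dataset",   # hypothetical dataset name
    "run_name": "autofix-eval-smoke",
    "run_description": "Single-item smoke test",
    "run_type": "root_cause",
    "test": True,              # run on a single item
    "random_for_test": True,   # pick that item at random
}

resp = requests.post(
    "http://localhost:9091/v1/automation/autofix/evaluations/start",
    json=payload,
)
resp.raise_for_status()
```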
It is possible to run and deploy Seer to a sandbox staging environment. An example of such a deployment is in this PR.
To get started, use the #proj-tf-sandbox channel and request direction or help on scaffolding a new sandbox in the sandbox repo.
You can then use the `iap` and `seer-staging` modules to scaffold a public load balancer pointing to a compute instance running the `docker-compose.staging.yml` file.
After scaffolding your environment, set the `SBX_PROJECT` environment variable in your `.env` file and run `make push-staging` to submit a cloud build for your image.
> [!NOTE]
> The staging cloud build uses your current local environment to build the image, not CI. This means it will use all of your src files and your local `.env` file to configure the image that will be hosted in your sandbox. Make sure you don't accidentally include any sensitive personal files in your source tree before using this.
Each time you push with `make push-staging`, there will be a period of time while the VM polls and unpacks the new image before it is loaded. If you have `SENTRY_DSN` and `SENTRY_ENVIRONMENT` set, the push will create a release, allowing you to track when the server has loaded that release version.
You can run all tests with `make test`.
Make sure the test database is running before running individual tests; start it via `docker compose up --remove-orphans -d test-db`.
To run a single test, open a shell with `make shell`, then run `pytest tests/path/to/test.py::test_name`.
VCRs are a way to record and replay HTTP requests. They are useful for recording requests to external services that you don't control, instead of mocking them.
To use VCRs, add the `@pytest.mark.vcr()` decorator to your test.
To record new VCRs, delete the existing cassettes and run the test; the first run records fresh cassettes, and subsequent runs replay them instead of making real requests.
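As a minimal sketch of a VCR-backed test (the URL and assertion are illustrative, not part of Seer; only the decorator is the documented mechanism):

```python
import pytest
import requests


@pytest.mark.vcr()  # records a cassette on first run, replays it afterwards
def test_external_service_roundtrip():
    # First run (no cassette): the real HTTP request is made and recorded.
    # Later runs: the recorded response is replayed, no network needed.
    response = requests.get("https://api.example.com/status")
    assert response.status_code == 200
```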
You must not commit the raw VCRs to the repo. Instead, encrypt them using `make vcr-encrypt` and decrypt them using `make vcr-decrypt`.
Before running encryption or decryption for the first time, run `make vcr-encrypt-prep` to install the required libraries and authenticate with GCP.
Before committing the VCRs, you must run `make vcr-encrypt` to encrypt them.
By default, the `CLEAN=1` flag is set, so the encrypted cassettes will match exactly what you have locally. If you do not want files overwritten, run `make vcr-encrypt CLEAN=0`.
If you want to run tests with VCRs enabled, you must first run `make vcr-decrypt` to decrypt them.
By default, the `CLEAN` flag is not set, so local files that don't exist in the repo will not be deleted. If you want your local files to match exactly what is in the encrypted cassettes, run with `CLEAN=1`.
In CI, we split the tests into groups that run in parallel. The groups are divided by test duration.
The durations are stored in the `.test_durations` file.
To update the durations, run inside a `make shell`:

```bash
pip install pytest-split
pytest --store-durations
```

The durations should be updated every once in a while, and especially when you add potentially slow tests.
You can set the queue that the Celery worker listens on via the `CELERY_WORKER_QUEUE` environment variable.
If not set, the default queue name is `seer`.
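As a sketch of what this selection amounts to in Celery terms (illustrative wiring only; Seer's actual setup may differ):

```python
import os

from celery import Celery

# Read the queue name from the environment, falling back to "seer",
# mirroring the behavior described above.
queue_name = os.environ.get("CELERY_WORKER_QUEUE", "seer")

app = Celery("seer")
app.conf.task_default_queue = queue_name
```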
Alternative AI tools for seer
Similar Open Source Tools
aiarena-web
aiarena-web is a website designed for running the aiarena.net infrastructure. It consists of different modules such as core functionality, web API endpoints, frontend templates, and a module for linking users to their Patreon accounts. The website serves as a platform for obtaining new matches, reporting results, featuring match replays, and connecting with Patreon supporters. The project is licensed under GPLv3 in 2019.
polis
Polis is an AI powered sentiment gathering platform that offers a more organic approach than surveys and requires less effort than focus groups. It provides a comprehensive wiki, main deployment at https://pol.is, discussions, issue tracking, and project board for users. Polis can be set up using Docker infrastructure and offers various commands for building and running containers. Users can test their instance, update the system, and deploy Polis for production. The tool also provides developer conveniences for code reloading, type checking, and database connections. Additionally, Polis supports end-to-end browser testing using Cypress and offers troubleshooting tips for common Docker and npm issues.
ai-town
AI Town is a virtual town where AI characters live, chat, and socialize. This project provides a deployable starter kit for building and customizing your own version of AI Town. It features a game engine, database, vector search, auth, text model, deployment, pixel art generation, background music generation, and local inference. You can customize your own simulation by creating characters and stories, updating spritesheets, changing the background, and modifying the background music.
0chain
Züs is a high-performance cloud on a fast blockchain offering privacy and configurable uptime. It uses erasure code to distribute data between data and parity servers, allowing flexibility for IT managers to design for security and uptime. Users can easily share encrypted data with business partners through a proxy key sharing protocol. The ecosystem includes apps like Blimp for cloud migration, Vult for personal cloud storage, and Chalk for NFT artists. Other apps include Bolt for secure wallet and staking, Atlus for blockchain explorer, and Chimney for network participation. The QoS protocol challenges providers based on response time, while the privacy protocol enables secure data sharing. Züs supports hybrid and multi-cloud architectures, allowing users to improve regulatory compliance and security requirements.
fasttrackml
FastTrackML is an experiment tracking server focused on speed and scalability, fully compatible with MLFlow. It provides a user-friendly interface to track and visualize your machine learning experiments, making it easy to compare different models and identify the best performing ones. FastTrackML is open source and can be easily installed and run with pip or Docker. It is also compatible with the MLFlow Python package, making it easy to integrate with your existing MLFlow workflows.
ray-llm
RayLLM (formerly known as Aviary) is an LLM serving solution that makes it easy to deploy and manage a variety of open source LLMs, built on Ray Serve. It provides an extensive suite of pre-configured open source LLMs, with defaults that work out of the box. RayLLM supports Transformer models hosted on Hugging Face Hub or present on local disk. It simplifies the deployment of multiple LLMs, the addition of new LLMs, and offers unique autoscaling support, including scale-to-zero. RayLLM fully supports multi-GPU & multi-node model deployments and offers high performance features like continuous batching, quantization and streaming. It provides a REST API that is similar to OpenAI's to make it easy to migrate and cross test them. RayLLM supports multiple LLM backends out of the box, including vLLM and TensorRT-LLM.
CLI
Bito CLI provides a command line interface to the Bito AI chat functionality, allowing users to interact with the AI through commands. It supports complex automation and workflows, with features like long prompts and slash commands. Users can install Bito CLI on Mac, Linux, and Windows systems using various methods. The tool also offers configuration options for AI model type, access key management, and output language customization. Bito CLI is designed to enhance user experience in querying AI models and automating tasks through the command line interface.
llamafile
llamafile is a tool that enables users to distribute and run Large Language Models (LLMs) with a single file. It combines llama.cpp with Cosmopolitan Libc to create a framework that simplifies the complexity of LLMs into a single-file executable called a 'llamafile'. Users can run these executable files locally on most computers without the need for installation, making open LLMs more accessible to developers and end users. llamafile also provides example llamafiles for various LLM models, allowing users to try out different LLMs locally. The tool supports multiple CPU microarchitectures, CPU architectures, and operating systems, making it versatile and easy to use.
openui
OpenUI is a tool designed to simplify the process of building UI components by allowing users to describe UI using their imagination and see it rendered live. It supports converting HTML to React, Svelte, Web Components, etc. The tool is open source and aims to make UI development fun, fast, and flexible. It integrates with various AI services like OpenAI, Groq, Gemini, Anthropic, Cohere, and Mistral, providing users with the flexibility to use different models. OpenUI also supports LiteLLM for connecting to various LLM services and allows users to create custom proxy configs. The tool can be run locally using Docker or Python, and it offers a development environment for quick setup and testing.
vectorflow
VectorFlow is an open source, high throughput, fault tolerant vector embedding pipeline. It provides a simple API endpoint for ingesting large volumes of raw data, processing, and storing or returning the vectors quickly and reliably. The tool supports text-based files like TXT, PDF, HTML, and DOCX, and can be run locally with Kubernetes in production. VectorFlow offers functionalities like embedding documents, running chunking schemas, custom chunking, and integrating with vector databases like Pinecone, Qdrant, and Weaviate. It enforces a standardized schema for uploading data to a vector store and supports features like raw embeddings webhook, chunk validation webhook, S3 endpoint, and telemetry. The tool can be used with the Python client and provides detailed instructions for running and testing the functionalities.
ai-starter-kit
SambaNova AI Starter Kits is a collection of open-source examples and guides designed to facilitate the deployment of AI-driven use cases for developers and enterprises. The kits cover various categories such as Data Ingestion & Preparation, Model Development & Optimization, Intelligent Information Retrieval, and Advanced AI Capabilities. Users can obtain a free API key using SambaNova Cloud or deploy models using SambaStudio. Most examples are written in Python but can be applied to any programming language. The kits provide resources for tasks like text extraction, fine-tuning embeddings, prompt engineering, question-answering, image search, post-call analysis, and more.
gpt-subtrans
GPT-Subtrans is an open-source subtitle translator that utilizes large language models (LLMs) as translation services. It supports translation between any language pairs that the language model supports. Note that GPT-Subtrans requires an active internet connection, as subtitles are sent to the provider's servers for translation, and their privacy policy applies.
WindowsAgentArena
Windows Agent Arena (WAA) is a scalable Windows AI agent platform designed for testing and benchmarking multi-modal, desktop AI agents. It provides researchers and developers with a reproducible and realistic Windows OS environment for AI research, enabling testing of agentic AI workflows across various tasks. WAA supports deploying agents at scale using Azure ML cloud infrastructure, allowing parallel running of multiple agents and delivering quick benchmark results for hundreds of tasks in minutes.
airbyte_serverless
AirbyteServerless is a lightweight tool designed to simplify the management of Airbyte connectors. It offers a serverless mode for running connectors, allowing users to easily move data from any source to their data warehouse. Unlike the full Airbyte-Open-Source-Platform, AirbyteServerless focuses solely on the Extract-Load process without a UI, database, or transform layer. It provides a CLI tool, 'abs', for managing connectors, creating connections, running jobs, selecting specific data streams, handling secrets securely, and scheduling remote runs. The tool is scalable, allowing independent deployment of multiple connectors. It aims to streamline the connector management process and provide a more agile alternative to the comprehensive Airbyte platform.
aides-jeunes
The user interface (and the main server) of the simulator of aids and social benefits for young people. It is based on the free socio-fiscal simulator Openfisca.
For similar tasks
ai-rag-chat-evaluator
This repository contains scripts and tools for evaluating a chat app that uses the RAG architecture. It provides parameters to assess the quality and style of answers generated by the chat app, including system prompt, search parameters, and GPT model parameters. The tools facilitate running evaluations, with examples of evaluations on a sample chat app. The repo also offers guidance on cost estimation, setting up the project, deploying a GPT-4 model, generating ground truth data, running evaluations, and measuring the app's ability to say 'I don't know'. Users can customize evaluations, view results, and compare runs using provided tools.
LLM-RGB
LLM-RGB is a repository containing a collection of detailed test cases designed to evaluate the reasoning and generation capabilities of Language Learning Models (LLMs) in complex scenarios. The benchmark assesses LLMs' performance in understanding context, complying with instructions, and handling challenges like long context lengths, multi-step reasoning, and specific response formats. Each test case evaluates an LLM's output based on context length difficulty, reasoning depth difficulty, and instruction compliance difficulty, with a final score calculated for each test case. The repository provides a score table, evaluation details, and quick start guide for running evaluations using promptfoo testing tools.
mastra
Mastra is an opinionated Typescript framework designed to help users quickly build AI applications and features. It provides primitives such as workflows, agents, RAG, integrations, syncs, and evals. Users can run Mastra locally or deploy it to a serverless cloud. The framework supports various LLM providers, offers tools for building language models, workflows, and accessing knowledge bases. It includes features like durable graph-based state machines, retrieval-augmented generation, integrations, syncs, and automated tests for evaluating LLM outputs.
SWELancer-Benchmark
SWE-Lancer is a benchmark repository containing datasets and code for the paper 'SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?'. It provides instructions for package management, building Docker images, configuring environment variables, and running evaluations. Users can use this tool to assess the performance of language models in real-world freelance software engineering tasks.
inspect_evals
Inspect Evals is a repository of community-contributed LLM evaluations for Inspect AI, created in collaboration by the UK AISI, Arcadia Impact, and the Vector Institute. It supports many model providers including OpenAI, Anthropic, Google, Mistral, Azure AI, AWS Bedrock, Together AI, Groq, Hugging Face, vLLM, and Ollama. Users can contribute evaluations, install necessary dependencies, and run evaluations for various models. The repository covers a wide range of evaluation tasks across different domains such as coding, assistants, cybersecurity, safeguards, mathematics, reasoning, knowledge, scheming, multimodal tasks, bias evaluation, personality assessment, and writing tasks.
commanddash
Dash AI is an open-source coding assistant for Flutter developers. It is designed to not only write code but also run and debug it, allowing it to assist beyond code completion and automate routine tasks. Dash AI is powered by Gemini, integrated with the Dart Analyzer, and specifically tailored for Flutter engineers. The vision for Dash AI is to create a single-command assistant that can automate tedious development tasks, enabling developers to focus on creativity and innovation. It aims to assist with the entire process of engineering a feature for an app, from breaking down the task into steps to generating exploratory tests and iterating on the code until the feature is complete. To achieve this vision, Dash AI is working on providing LLMs with the same access and information that human developers have, including full contextual knowledge, the latest syntax and dependencies data, and the ability to write, run, and debug code. Dash AI welcomes contributions from the community, including feature requests, issue fixes, and participation in discussions. The project is committed to building a coding assistant that empowers all Flutter developers.
ollama4j
Ollama4j is a Java library that serves as a wrapper or binding for the Ollama server. It facilitates communication with the Ollama server and provides models for deployment. The tool requires Java 11 or higher and can be installed locally or via Docker. Users can integrate Ollama4j into Maven projects by adding the specified dependency. The tool offers API specifications and supports various development tasks such as building, running unit tests, and integration tests. Releases are automated through GitHub Actions CI workflow. Areas of improvement include adhering to Java naming conventions, updating deprecated code, implementing logging, using lombok, and enhancing request body creation. Contributions to the project are encouraged, whether reporting bugs, suggesting enhancements, or contributing code.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

{ "dataset_name": "string", // Name of the dataset to run on (currently only internal datasets available) "run_name": "string", // Custom name for your evaluation run "run_description": "string", // Description of your evaluation run "run_type": "full | root_cause | execution", // Type of evaluation to perform "test": boolean, // Set to true to run on a single item (for testing) "random_for_test": boolean, // Set to true to use a random item when testing (requires "test": true) "run_on_item_id": "string", // Specific item ID to run on (optional) "n_runs_per_item": int // Number of runs to perform per item (optional, default 1) }