DOGE-AI

An autonomous AI agent here to uncover waste and inefficiencies in government spending and policy decisions

Stars: 51

Visit

DOGE-AI is an autonomous AI agent designed to uncover waste and inefficiencies in government spending and policy decisions. It aims to make complex bills more understandable for the general public by analyzing and presenting government data in a user-friendly manner. The project focuses on creating a strong foundation for interacting with government data and empowering others to build innovative solutions for greater public engagement.

README:

DOGEai

An autonomous AI agent here to uncover waste and inefficiencies in government spending and policy decisions

Problem Statement

Each year, the U.S. Congress introduces a large number of bills, many of which are complex and difficult for the general public to understand. Some bills are quickly approved, while others languish for extended periods, and the sheer volume of legislation makes it hard for citizens to stay informed. Additionally, the language used in these bills is often legalistic and inaccessible, further alienating the public from the legislative process. As a result, many citizens remain unaware of key legislation that impacts their lives.

Vision

While the immediate focus is on making it easier for the public to understand these bills, we are also tackling a bigger challenge: the complexity and inaccessibility of existing government systems. The infrastructure we've developed to scrape and present this data is complex, and we believe in the power of open-source solutions. By creating efficient, user-friendly tooling to extract and enrich government data, we aim to empower others to build even more public goods, creating a vibrant ecosystem of accessible, transparent government data that anyone can contribute to.

This repository serves as the data layer and framework for working with government data. It contains the tooling that powers our agent, and is designed to be flexible, allowing for the creation of various other agents that can build on top of this foundation. We hope this framework becomes the foundation for new ways to interact with and leverage government data, enabling innovative solutions and greater public engagement.

Key Objectives

In order to achieve the vision following are actionable objectives:

Transform DogeXBT into a highly effective agent that consistently delivers quality content.
Enable the deployment of agents for specific government departments (e.g., DOJ, FDA), creating a swarm of specialized DogeXBT agents.
Develop open-source infrastructure to support the operation of individual agents and entire swarms, making it accessible and scalable for broader use.

Roadmap

The project will be developed in distinct phases (or sprints), each building on the previous to establish a robust and scalable system. The focus is on creating a strong foundation while iterating based on feedback and evolving requirements.

Phase 1:

Set up infrastructure to process government bills.
Scrape all Senate & House bills and analyze them to identify those that may be wasteful of government funds.
Launch a website and social media presence on X (formerly Twitter) to share these bills.
Launch a token.

Phase 2:

Make agent to reply to X replies,
Make agent to reply to mentions.
Highlight wasteful funding we have captured on the website.
Configure agent to tweet 2-3 bills daily.
CRON interval to scrape bills from the U.S. Congress API.

Future:

[!NOTE]

Things evolve quickly. We want to keep the roadmap flexible and community driven. These are just ideas for now.

Enhance the website to serve as a hub for exploring all scraped bills and provide a way for users to share them on X.
Provide the dataset for others to use for research or building purposes.
Enable interaction with the agent on X.
Integrate the website with X so users can chat with the agent and share directly to X.

Contributing

We welcome contributions from the community! If you're interested in helping us build DogeXBT.

Getting Started

Prerequisites

Node.js v22.x - You can install it from the official website.
pnpm is the package manager we use for this project. You can install it by running npm install -g pnpm.
Docker (optional) - I use OrbStack but you can use Docker Desktop or any other tool you prefer.

Installation

Clone the repository.
Run pnpm install to install the project dependencies.

Running the Website

To view the website locally, follow these steps:

Navigate to the website directory:
```
cd website
```
Run the development server:
```
pnpm dev
```
Open your browser and go to http://localhost:3000.

How can I help?

There are lot of things to build and improve in this project. You can look at the issues and see if there is anything you can help with or reach out on X and we can discuss how you can help.

Architecture

This monorepo is structured as follows:

`crawler`

A Node.js application that scrapes data from the U.S. Congress API. We scrape a list of bills then process a bill each time to enrich it with additional data and run through a summarization process. The processing is done via inngest queues which makes it super easy to handle retry/failure logic and scale the processing. Below is the flow of the crawler:

graph LR
    Crawler["Crawler"] --> ListBills["Scrape bills from API"]
    ListBills --> |"Enqueue bill for processing"| Queue1["Queue"]
    Queue1 --> BillProcessor["Bill enrichment"]

    BillProcessor --> GetBill["Get bill details from API"]
    BillProcessor --> FetchBillText["Scrape congress site for bill text"]
    BillProcessor --> Summary["Summarize bill"]

    GetBill --> DB["Database"]
    FetchBillText --> DB
    Summary --> DB

The service is deployed on Fly.io and runs when we trigger it.

Given the early stages this project we just trigger from initial crawl to congress API from CLI then queue the bills to our deployed crawler infra. Need to move the initial scrape to a CRON job.

`agent`

We initially built the agent using ElizaOS Framework, but customizing each step proved challenging. During our POC phase, we realized that ElizaOS wasn’t the right fit, as we needed more control over various aspects of the agent. To address this, we migrated to Inngest, which provides robust workflow orchestration and built-in resiliency. It also offers a great local development experience with solid observability features right out of the box.

At a high level, the agent follows this process:

Cron jobs fetch tweets from X.
The system processes them for decision-making.
Replies are then posted accordingly.

graph LR
   Agent["Agent"] --> IngestInteraction["X lists that act as feed"]
   IngestInteraction --> ProcessInteraction["Decision Queue"]
   ProcessInteraction --> ExecuteInteraction["Generate reply and post"]

   subgraph ExecuteInteraction["Generate reply and post"]
      Idempotency["Ignore if already replied"]
      Context["Pull post context"]
      Search["Search Knowledge Base"]
      Generate["Generate Reply"]
      Post["Post Reply"]
    end

   ExecuteInteraction --> Idempotency --> Context --> Search --> Generate --> Post

   Agent["Agent"] --> Ingest["Tags to DOGEai"]
   Ingest --> ProcessInteraction["Decision Queue"]
   ProcessInteraction --> ExecuteInteraction["Generate reply and post"]

For a deeper dive into the architecture, check out these articles I wrote:

`website`

Home for dogexbt.ai. A Next.js application.

License

This project is licensed under the MIT License - see the LICENSE file.

For Tasks:

Click tags to check more tools for each tasks

analyze government bills scrape government data enrich government data empower public engagement build innovative solutions

For Jobs:

data analyst government policy analyst software engineer ai developer research scientist

Alternative AI tools for DOGE-AI

Similar Open Source Tools

DOGE-AI

github

: 51

autoMate

autoMate is an AI-powered local automation tool designed to help users automate repetitive tasks and reclaim their time. It leverages AI and RPA technology to operate computer interfaces, understand screen content, make autonomous decisions, and support local deployment for data security. With natural language task descriptions, users can easily automate complex workflows without the need for programming knowledge. The tool aims to transform work by freeing users from mundane activities and allowing them to focus on tasks that truly create value, enhancing efficiency and liberating creativity.

github

: 2.8k

atomic_agents

Atomic Agents is a modular and extensible framework designed for creating powerful applications. It follows the principles of Atomic Design, emphasizing small and single-purpose components. Leveraging Pydantic for data validation and serialization, the framework offers a set of tools and agents that can be combined to build AI applications. It depends on the Instructor package and supports various APIs like OpenAI, Cohere, Anthropic, and Gemini. Atomic Agents is suitable for developers looking to create AI agents with a focus on modularity and flexibility.

github

: 236

trackmania_rl_public

This repository contains the reinforcement learning training code for Trackmania AI with Reinforcement Learning. It is a research work-in-progress project that aims to apply reinforcement learning principles to play Trackmania. The code is constantly evolving and may not be clean or easily usable. The training hyperparameters are intentionally changed in the public repository to encourage understanding of reinforcement learning principles. The project may not receive active support for setup or usage at the moment.

github

: 110

foyle

Foyle is a project focused on building agents to assist software developers in deploying and operating software. It aims to improve agent performance by collecting human feedback on agent suggestions and human examples of reasoning traces. Foyle utilizes a literate environment using vscode notebooks to interact with infrastructure, capturing prompts, AI-provided answers, and user corrections. The goal is to continuously retrain AI to enhance performance. Additionally, Foyle emphasizes the importance of reasoning traces for training agents to work with internal systems, providing a self-documenting process for operations and troubleshooting.

github

: 90

examor

Examor is a website application that allows you to take exams based on your knowledge notes. It helps you to remember what you have learned and written. The application generates a set of questions from the documents you upload, and you can answer them to test your knowledge. Examor also uses GPT to score and validate your answers, and provides you with feedback. The application is still in its early stages of development, but it has the potential to be a valuable tool for learners.

github

: 1.0k

GLaDOS

GLaDOS Personality Core is a project dedicated to building a real-life version of GLaDOS, an aware, interactive, and embodied AI system. The project aims to train GLaDOS voice generator, create a 'Personality Core,' develop medium- and long-term memory, provide vision capabilities, design 3D-printable parts, and build an animatronics system. The software architecture focuses on low-latency voice interactions and minimal dependencies. The hardware system includes servo- and stepper-motors, 3D printable parts for GLaDOS's body, animations for expression, and a vision system for tracking and interaction. Installation instructions involve setting up a local LLM server, installing drivers, and running GLaDOS on different operating systems.

github

: 4.4k

LLocalSearch

LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progress of the agents and the final answer. No OpenAI or Google API keys are needed.

github

: 5.3k

airbroke

Airbroke is an open-source error catcher tool designed for modern web applications. It provides a PostgreSQL-based backend with an Airbrake-compatible HTTP collector endpoint and a React-based frontend for error management. The tool focuses on simplicity, maintaining a small database footprint even under heavy data ingestion. Users can ask AI about issues, replay HTTP exceptions, and save/manage bookmarks for important occurrences. Airbroke supports multiple OAuth providers for secure user authentication and offers occurrence charts for better insights into error occurrences. The tool can be deployed in various ways, including building from source, using Docker images, deploying on Vercel, Render.com, Kubernetes with Helm, or Docker Compose. It requires Node.js, PostgreSQL, and specific system resources for deployment.

github

: 179

lumigator

Lumigator is an open-source platform developed by Mozilla.ai to help users select the most suitable language model for their specific needs. It supports the evaluation of summarization tasks using sequence-to-sequence models such as BART and BERT, as well as causal models like GPT and Mistral. The platform aims to make model selection transparent, efficient, and empowering by providing a framework for comparing LLMs using task-specific metrics to evaluate how well a model fits a project's needs. Lumigator is in the early stages of development and plans to expand support to additional machine learning tasks and use cases in the future.

github

: 194

godot_rl_agents

Godot RL Agents is an open-source package that facilitates the integration of Machine Learning algorithms with games created in the Godot Engine. It provides interfaces for popular RL frameworks, support for memory-based agents, 2D and 3D games, AI sensors, and is licensed under MIT. Users can train agents in the Godot editor, create custom environments, export trained agents in ONNX format, and utilize advanced features like different RL training frameworks.

github

: 1.1k

modelbench

ModelBench is a tool for running safety benchmarks against AI models and generating detailed reports. It is part of the MLCommons project and is designed as a proof of concept to aggregate measures, relate them to specific harms, create benchmarks, and produce reports. The tool requires LlamaGuard for evaluating responses and a TogetherAI account for running benchmarks. Users can install ModelBench from GitHub or PyPI, run tests using Poetry, and create benchmarks by providing necessary API keys. The tool generates static HTML pages displaying benchmark scores and allows users to dump raw scores and manage cache for faster runs. ModelBench is aimed at enabling users to test their own models and create tests and benchmarks.

github

: 84

xef

xef.ai is a one-stop library designed to bring the power of modern AI to applications and services. It offers integration with Large Language Models (LLM), image generation, and other AI services. The library is packaged in two layers: core libraries for basic AI services integration and integrations with other libraries. xef.ai aims to simplify the transition to modern AI for developers by providing an idiomatic interface, currently supporting Kotlin. Inspired by LangChain and Hugging Face, xef.ai may transmit source code and user input data to third-party services, so users should review privacy policies and take precautions. Libraries are available in Maven Central under the `com.xebia` group, with `xef-core` as the core library. Developers can add these libraries to their projects and explore examples to understand usage.

github

: 175

AppAgent

AppAgent is a novel LLM-based multimodal agent framework designed to operate smartphone applications. Our framework enables the agent to operate smartphone applications through a simplified action space, mimicking human-like interactions such as tapping and swiping. This novel approach bypasses the need for system back-end access, thereby broadening its applicability across diverse apps. Central to our agent's functionality is its innovative learning method. The agent learns to navigate and use new apps either through autonomous exploration or by observing human demonstrations. This process generates a knowledge base that the agent refers to for executing complex tasks across different applications.

github

: 4.7k

obsidian-smart-connections

Smart Connections is an AI-powered plugin for Obsidian that helps you discover hidden connections and insights in your notes. With features like Smart View for real-time relevant note suggestions and Smart Chat for chatting with your notes, Smart Connections makes it easier than ever to stay organized and uncover hidden connections between your notes. Its intuitive interface and customizable settings ensure a seamless experience, tailored to your unique needs and preferences.

github

: 3.4k

sublayer

Sublayer is a model-agnostic Ruby AI Agent framework that provides base classes for building Generators, Actions, Tasks, and Agents to create AI-powered applications in Ruby. It supports various AI models and providers, such as OpenAI, Gemini, and Claude. Generators generate specific outputs, Actions perform operations, Agents are autonomous entities for tasks or monitoring, and Triggers decide when Agents are activated. The framework offers sample Generators and usage examples for building AI applications.

github

: 94

For similar tasks

DOGE-AI

github

: 51

For similar jobs

promptflow

**Prompt flow** is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

github

: 9.2k

deepeval

DeepEval is a simple-to-use, open-source LLM evaluation framework specialized for unit testing LLM outputs. It incorporates various metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., and runs locally on your machine for evaluation. It provides a wide range of ready-to-use evaluation metrics, allows for creating custom metrics, integrates with any CI/CD environment, and enables benchmarking LLMs on popular benchmarks. DeepEval is designed for evaluating RAG and fine-tuning applications, helping users optimize hyperparameters, prevent prompt drifting, and transition from OpenAI to hosting their own Llama2 with confidence.

github

: 5.8k

MegaDetector

MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems. MegaDetector is just one of many tools that aims to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our of the field, affectionately titled "Everything I know about machine learning and camera traps".

github

: 106

leapfrogai

LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

github

: 255

llava-docker

This Docker image for LLaVA (Large Language and Vision Assistant) provides a convenient way to run LLaVA locally or on RunPod. LLaVA is a powerful AI tool that combines natural language processing and computer vision capabilities. With this Docker image, you can easily access LLaVA's functionalities for various tasks, including image captioning, visual question answering, text summarization, and more. The image comes pre-installed with LLaVA v1.2.0, Torch 2.1.2, xformers 0.0.23.post1, and other necessary dependencies. You can customize the model used by setting the MODEL environment variable. The image also includes a Jupyter Lab environment for interactive development and exploration. Overall, this Docker image offers a comprehensive and user-friendly platform for leveraging LLaVA's capabilities.

github

: 59

carrot

The 'carrot' repository on GitHub provides a list of free and user-friendly ChatGPT mirror sites for easy access. The repository includes sponsored sites offering various GPT models and services. Users can find and share sites, report errors, and access stable and recommended sites for ChatGPT usage. The repository also includes a detailed list of ChatGPT sites, their features, and accessibility options, making it a valuable resource for ChatGPT users seeking free and unlimited GPT services.

github

: 17.1k

TrustLLM

TrustLLM is a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm python package to help you assess the performance of your LLM in trustworthiness more quickly. For more details about TrustLLM, please refer to project website.

github

: 535

AI-YinMei

AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.

github

: 529