TaxHacker

Automatic AI analyzer for receipts, invoices, transactions, etc

Stars: 230

Visit

TaxHacker is a self-hosted accountant app designed for freelancers and small businesses to automate expense and income tracking using the power of GenAI. It can analyze uploaded photos, receipts, or PDFs to extract important data like name, total amount, date, merchant, and VAT, saving them as structured transactions. The tool supports automatic currency conversion, filters, multiple projects, import-export functionalities, custom categories, and allows users to create custom fields for extraction. TaxHacker simplifies reporting and tax filing by organizing and storing data efficiently.

README:

TaxHacker

I'm a small self-hosted accountant app that can help you deal with invoices, receipts and taxes with power of GenAI.

Share TaxHacker

👋🏻 Getting Started

TaxHacker is a self-hosted accounting app for freelancers and small businesses who want to save time and automate expences and income tracking with power of GenAI. It can recognise uploaded photos, receipts or PDFs and extract important data (e.g. name, total amount, date, merchant, VAT) and save it as structured transactions to a table. You can also create your own custom fields to extract with your LLM prompts.

It supports automatic currency conversion on a day of transaction. Even for crypto!

Built-in system of filters, support for multiple projects, import-export of transactions for a certain time (along with attached files) and custom categories, allows you to simplify reporting and tax filing.

[!IMPORTANT]

This project is still at a very early stage. Use it at your own risk! Star Us to receive notifications about new bugfixes and features from GitHub ⭐️

✨ Features

`1` Upload photos or documents to analyze with LLM

https://github.com/user-attachments/assets/3326d0e3-0bf6-4c39-9e00-4bf0983d9b7a

🎥 Watch the video

Take a photo on upload or a PDF and TaxHacker will automatically recognise, categorise and store transaction information.

Upload multiple documents and store in "unsorted" until you get the time to sort them out by hand or with an AI
Use LLM to extract key information like date, amount, and vendor
Automatically categorize transactions based on its content
Store everything in a structured format for easy filtering and retrieval
Organize your documents by a tax season

TaxHacker recognizes a wide variety of documents including store receipts, restaurant bills, invoices, bank checks, letters, even handwritten receipts.

`2` Multi-currency support with automatic conversion (even for crypto)

TaxHacker automatically converts foreign currencies and even knows the historical exchange rates on the invoice date.

Automatically detect currency in your documents
Convert it to your base currency
Historical exchange rate lookup for past transactions
Support for over 170 world currencies and 14 popular cryptocurrencies (BTC, ETC, LTC, DOT, etc)!

`3` Customize any LLM prompt

You can customize LLM Prompts for built-in fields, categories, and projects, as well as modify global templates in the application settings. This allows to customize the quality of recognizing specific things to your specific use-cases.

General prompt template is configurable is settings
Create custom extraction rules for your specific needs
Adjust field extraction priorities and naming conventions
Fine-tune the AI for your industry-specific documents

The whole extraction process is under your contoll all the time!

`4` Create custom fields, projects, categories

Adapt TaxHacker to your specific tracking needs. You can create new fields, projects or categories to extract additional information from documents. For example, if you need to save emails, addresses, and any custom information into separate fields, you can do it. Custom fields will be available when exporting too.

Create unlimited custom fields for transaction tracking
Automatically extract custom field data using AI
Include custom fields in exports and reports
Create new categories or projects to organise your transactions and filter by them

`5` Flexible data filtering and export

Once all documents have been uploaded and analyzed, you can view, filter and export your transaction history.

Filter transactions by time, category, and other features
Use full-text search by recognized document content
Export filtered transactions to CSV with attached documents
Upload your entire income and expense history at the end of the year for your tax advisor to analyze

`6` Local data storage and self-hosting

🛳 Deploying or Self-hosting

TaxHacker can be self-hosted on your own infrastructure for complete control over your data and application environment. We provide a Docker image and Docker Compose files that makes setting up TaxHacker simple:

curl -O https://raw.githubusercontent.com/vas3k/TaxHacker/main/docker-compose.yml

docker compose up

The Docker Compose setup includes:

TaxHacker application container
PostgreSQL 17 database container
Automatic database migrations
Volume mounts for persistent data storage

New docker image is automatically built and published on every new release. You can use specific version tags (e.g. v1.0.0) or latest for the most recent version.

For more advanced setups, you can adapt Docker Compose configuration to your own needs. The default configuration uses the pre-built image from GHCR, but you can still build locally using the provided Dockerfile if needed.

For example:

services:
  app:
    image: ghcr.io/vas3k/taxhacker:latest
    ports:
      - "7331:7331"
    environment:
      - SELF_HOSTED_MODE=true
      - UPLOAD_PATH=/app/data/uploads
      - DATABASE_URL=postgresql://postgres:postgres@localhost:5432/taxhacker
    volumes:
      - ./data:/app/data
    restart: unless-stopped

Environment Variables

Configure TaxHacker to suit your needs with these environment variables:

Variable	Required	Description	Example
`PORT`	No	Port to run the app on	`7331` (default)
`UPLOAD_PATH`	Yes	Local directory for uploading files	`./data/uploads`
`DATABASE_URL`	Yes	PostgreSQL connection string	`postgresql://user@localhost:5432/taxhacker`
`SELF_HOSTED_MODE`	No	Set it to "true" if you're self-hosting the app: it enables auto-login, custom API keys, and more	`false`
`DISABLE_SIGNUP`	No	Disable new user registration on your instance	`false`
`OPENAI_API_KEY`	No	OpenAI API key for AI features. In self-hosted mode you can set it up in settings too.	`sk-...`
`RESEND_API_KEY`	No	Resend API key for email notifications	`re_...`
`RESEND_FROM_EMAIL`	No	Email address to send from	`TaxHacker <[email protected]>`

⌨️ Local Development

We use:

Next.js version 15+ or later
Prisma for database models and migrations
PostgreSQL as a database (PostgreSQL 17+ recommended)
Ghostscript and graphicsmagick libs for PDF files (can be installed on macOS via brew install gs graphicsmagick)

Set up a local development environment with these steps:

# Clone the repository
git clone https://github.com/vas3k/TaxHacker.git
cd TaxHacker

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env

# Edit .env with your configuration
# Make sure to set DATABASE_URL to your PostgreSQL connection string
# Example: postgresql://user@localhost:5432/taxhacker

# Initialize the database
npx prisma generate && npx prisma migrate dev

# Start the development server
npm run dev

Visit http://localhost:7331 to see your local instance of TaxHacker.

For a production build, instead of npm run dev use the following commands:

# Build the application
npm run build

# Start the production server
npm run start

🤝 Contributing

Contributions to TaxHacker are welcome and appreciated! Here's how you can help:

Bug Reports: File detailed issues when you encounter problems
Feature Requests: Share your ideas for new features
Code Contributions: Submit pull requests to improve the application
Documentation: Help improve documentation

All work is done on GitHub through issues and pull requests.

❤️ Donate

If TaxHacker has helped you - help us in return! You donations will support maintainance and development. If you find this project valuable for your personal or business use, consider making a donation.

📄 License

TaxHacker is licensed under the MIT License - see the LICENSE file for details.

For Tasks:

Click tags to check more tools for each tasks

manage receipts track expenses automate income tracking analyze invoices simplify tax filing

For Jobs:

accountant freelancer small business owner tax advisor financial analyst

Alternative AI tools for TaxHacker

Similar Open Source Tools

TaxHacker

github

: 230

NekoImageGallery

NekoImageGallery is an online AI image search engine that utilizes the Clip model and Qdrant vector database. It supports keyword search and similar image search. The tool generates 768-dimensional vectors for each image using the Clip model, supports OCR text search using PaddleOCR, and efficiently searches vectors using the Qdrant vector database. Users can deploy the tool locally or via Docker, with options for metadata storage using Qdrant database or local file storage. The tool provides API documentation through FastAPI's built-in Swagger UI and can be used for tasks like image search, text extraction, and vector search.

github

: 97

atropos

Atropos is a robust and scalable framework for Reinforcement Learning Environments with Large Language Models (LLMs). It provides a flexible platform to accelerate LLM-based RL research across diverse interactive settings. Atropos supports multi-turn and asynchronous RL interactions, integrates with various inference APIs, offers a standardized training interface for experimenting with different RL algorithms, and allows for easy scalability by launching more environment instances. The framework manages diverse environment types concurrently for heterogeneous, multi-modal training.

github

: 700

helix-db

HelixDB is a database designed specifically for AI applications, providing a single platform to manage all components needed for AI applications. It supports graph + vector data model and also KV, documents, and relational data. Key features include built-in tools for MCP, embeddings, knowledge graphs, RAG, security, logical isolation, and ultra-low latency. Users can interact with HelixDB using the Helix CLI tool and SDKs in TypeScript and Python. The roadmap includes features like organizational auth, server code improvements, 3rd party integrations, educational content, and binary quantisation for better performance. Long term projects involve developing in-house tools for knowledge graph ingestion, graph-vector storage engine, and network protocol & serdes libraries.

github

: 2.7k

swirl-search

Swirl is an open-source software that allows users to simultaneously search multiple content sources and receive AI-ranked results. It connects to various data sources, including databases, public data services, and enterprise sources, and utilizes AI and LLMs to generate insights and answers based on the user's data. Swirl is easy to use, requiring only the download of a YML file, starting in Docker, and searching with Swirl. Users can add credentials to preloaded SearchProviders to access more sources. Swirl also offers integration with ChatGPT as a configured AI model. It adapts and distributes user queries to anything with a search API, re-ranking the unified results using Large Language Models without extracting or indexing anything. Swirl includes five Google Programmable Search Engines (PSEs) to get users up and running quickly. Key features of Swirl include Microsoft 365 integration, SearchProvider configurations, query adaptation, synchronous or asynchronous search federation, optional subscribe feature, pipelining of Processor stages, results stored in SQLite3 or PostgreSQL, built-in Query Transformation support, matching on word stems and handling of stopwords, duplicate detection, re-ranking of unified results using Cosine Vector Similarity, result mixers, page through all results requested, sample data sets, optional spell correction, optional search/result expiration service, easily extensible Connector and Mixer objects, and a welcoming community for collaboration and support.

github

: 2.7k

inspector-laravel

Inspector is a code execution monitoring tool specifically designed for Laravel applications. It provides simple and efficient monitoring capabilities to track and analyze the performance of your Laravel code. With Inspector, you can easily monitor web requests, test the functionality of your application, and explore data through a user-friendly dashboard. The tool requires PHP version 7.2.0 or higher and Laravel version 5.5 or above. By configuring the ingestion key and attaching the middleware, users can seamlessly integrate Inspector into their Laravel projects. The official documentation provides detailed instructions on installation, configuration, and usage of Inspector. Contributions to the tool are welcome, and users are encouraged to follow the Contribution Guidelines to participate in the development of Inspector.

github

: 197

job-llm

ResumeFlow is an automated system utilizing Large Language Models (LLMs) to streamline the job application process. It aims to reduce human effort in various steps of job hunting by integrating LLM technology. Users can access ResumeFlow as a web tool, install it as a Python package, or download the source code. The project focuses on leveraging LLMs to automate tasks such as resume generation and refinement, making job applications smoother and more efficient.

github

: 70

RainbowGPT

RainbowGPT is a versatile tool that offers a range of functionalities, including Stock Analysis for financial decision-making, MySQL Management for database navigation, and integration of AI technologies like GPT-4 and ChatGlm3. It provides a user-friendly interface suitable for all skill levels, ensuring seamless information flow and continuous expansion of emerging technologies. The tool enhances adaptability, creativity, and insight, making it a valuable asset for various projects and tasks.

github

: 86

botpress

Botpress is a platform for building next-generation chatbots and assistants powered by OpenAI. It provides a range of tools and integrations to help developers quickly and easily create and deploy chatbots for various use cases.

github

: 14.2k

KlicStudio

Klic Studio is a versatile audio and video localization and enhancement solution developed by Krillin AI. This minimalist yet powerful tool integrates video translation, dubbing, and voice cloning, supporting both landscape and portrait formats. With an end-to-end workflow, users can transform raw materials into beautifully ready-to-use cross-platform content with just a few clicks. The tool offers features like video acquisition, accurate speech recognition, intelligent segmentation, terminology replacement, professional translation, voice cloning, video composition, and cross-platform support. It also supports various speech recognition services, large language models, and TTS text-to-speech services. Users can easily deploy the tool using Docker and configure it for different tasks like subtitle translation, large model translation, and optional voice services.

github

: 8.4k

KrillinAI

KrillinAI is a video subtitle translation and dubbing tool based on AI large models, featuring speech recognition, intelligent sentence segmentation, professional translation, and one-click deployment of the entire process. It provides a one-stop workflow from video downloading to the final product, empowering cross-language cultural communication with AI. The tool supports multiple languages for input and translation, integrates features like automatic dependency installation, video downloading from platforms like YouTube and Bilibili, high-speed subtitle recognition, intelligent subtitle segmentation and alignment, custom vocabulary replacement, professional-level translation engine, and diverse external service selection for speech and large model services.

github

: 8.5k

ai-prompts

Instructa AI Prompts is an open-source repository dedicated to collecting and sharing AI prompts, best practices, and curated rules for developers. The goal is to help users quickly set up and refine their workflow with ready-to-use prompts. Users can dynamically include prompts in AI-assisted coding tools like Cursor, GitHub Copilot, Zed, Windsurf, and Cline to adhere to project-specific coding standards, best practices, and automation workflows.

github

: 217

superduper

superduper.io is a Python framework that integrates AI models, APIs, and vector search engines directly with existing databases. It allows hosting of models, streaming inference, and scalable model training/fine-tuning. Key features include integration of AI with data infrastructure, inference via change-data-capture, scalable model training, model chaining, simple Python interface, Python-first approach, working with difficult data types, feature storing, and vector search capabilities. The tool enables users to turn their existing databases into centralized repositories for managing AI model inputs and outputs, as well as conducting vector searches without the need for specialized databases.

github

: 5.0k

agentok

Agentok Studio is a tool built upon AG2, a powerful agent framework from Microsoft, offering intuitive visual tools to streamline the creation and management of complex agent-based workflows. It simplifies the process for creators and developers by generating native Python code with minimal dependencies, enabling users to create self-contained code that can be executed anywhere. The tool is currently under development and not recommended for production use, but contributions are welcome from the community to enhance its capabilities and functionalities.

github

: 242

agents

Polymarket Agents is a developer framework and set of utilities for building AI agents to trade autonomously on Polymarket. It integrates with Polymarket API, provides AI agent utilities for prediction markets, supports local and remote RAG, sources data from various services, and offers comprehensive LLM tools for prompt engineering. The architecture features modular components like APIs and scripts for managing local environments, server set-up, and CLI for end-user commands.

github

: 60

ProX

ProX is a lm-based data refinement framework that automates the process of cleaning and improving data used in pre-training large language models. It offers better performance, domain flexibility, efficiency, and cost-effectiveness compared to traditional methods. The framework has been shown to improve model performance by over 2% and boost accuracy by up to 20% in tasks like math. ProX is designed to refine data at scale without the need for manual adjustments, making it a valuable tool for data preprocessing in natural language processing tasks.

github

: 164

For similar tasks

TaxHacker

github

: 230

LangGraph-Expense-Tracker

LangGraph Expense tracker is a small project that explores the possibilities of LangGraph. It allows users to send pictures of invoices, which are then structured and categorized into expenses and stored in a database. The project includes functionalities for invoice extraction, database setup, and API configuration. It consists of various modules for categorizing expenses, creating database tables, and running the API. The database schema includes tables for categories, payment methods, and expenses, each with specific columns to track transaction details. The API documentation is available for reference, and the project utilizes LangChain for processing expense data.

github

: 82

travel-planner-ai

Travel Planner AI is a Software as a Service (SaaS) product that simplifies travel planning by generating comprehensive itineraries based on user preferences. It leverages cutting-edge technologies to provide tailored schedules, optimal timing suggestions, food recommendations, prime experiences, expense tracking, and collaboration features. The tool aims to be the ultimate travel companion for users looking to plan seamless and smart travel adventures.

github

: 84

gemini-android

Gemini-Android is a mobile application that allows users to track their expenses and manage their finances on the go. The app provides a user-friendly interface for adding and categorizing expenses, setting budgets, and generating reports to help users make informed financial decisions. With Gemini-Android, users can easily monitor their spending habits, identify areas for saving, and stay on top of their financial goals.

github

: 366

wealth-tracker

Wealth Tracker is a personal finance management tool designed to help users track their income, expenses, and investments in one place. With intuitive features and customizable categories, users can easily monitor their financial health and make informed decisions. The tool provides detailed reports and visualizations to analyze spending patterns and set financial goals. Whether you are budgeting, saving for a big purchase, or planning for retirement, Wealth Tracker offers a comprehensive solution to manage your money effectively.

github

: 711

aider-desk

AiderDesk is a desktop application that enhances coding workflow by leveraging AI capabilities. It offers an intuitive GUI, project management, IDE integration, MCP support, settings management, cost tracking, structured messages, visual file management, model switching, code diff viewer, one-click reverts, and easy sharing. Users can install it by downloading the latest release and running the executable. AiderDesk also supports Python version detection and auto update disabling. It includes features like multiple project management, context file management, model switching, chat mode selection, question answering, cost tracking, MCP server integration, and MCP support for external tools and context. Development setup involves cloning the repository, installing dependencies, running in development mode, and building executables for different platforms. Contributions from the community are welcome following specific guidelines.

github

: 769

zero-finance

Zero Finance is a bank account that automates your finances, allowing you to easily create invoices, get paid directly to your personal IBAN, use a debit card worldwide with 0% conversion fees, optimize yield by automatically allocating idle funds to highest-yielding opportunities, and automate finances with a complete accounting system including expense tracking and tax optimization. The tool also syncs with various data sources to help you stay on track of your financial tasks by surfacing critical information, auto-categorizing based on AI-rules, auto-scheduling vendor payments from invoices via AI-rules, and allowing export to CSV. The project is structured as a monorepo containing multiple packages for the bank web app and a smart contract for securely automating savings.

github

: 127

pennywiseai-tracker

PennyWise AI Tracker is a free and open-source expense tracker that uses on-device AI to turn bank SMS into a clean and searchable money timeline. It offers smart SMS parsing, clear insights, subscription tracking, on-device AI assistant, auto-categorization, data export, and supports major Indian banks. All processing happens on the user's device for privacy. The tool is designed for Android users in India who want automatic expense tracking from bank SMS, with clean categories, subscription detection, and clear insights.

github

: 141

For similar jobs

SheetCopilot

SheetCopilot is an assistant agent that manipulates spreadsheets by following user commands. It leverages Large Language Models (LLMs) to interact with spreadsheets like a human expert, enabling non-expert users to complete tasks on complex software such as Google Sheets and Excel via a language interface. The tool observes spreadsheet states, polishes generated solutions based on external action documents and error feedback, and aims to improve success rate and efficiency. SheetCopilot offers a dataset with diverse task categories and operations, supporting operations like entry & manipulation, management, formatting, charts, and pivot tables. Users can interact with SheetCopilot in Excel or Google Sheets, executing tasks like calculating revenue, creating pivot tables, and plotting charts. The tool's evaluation includes performance comparisons with leading LLMs and VBA-based methods on specific datasets, showcasing its capabilities in controlling various aspects of a spreadsheet.

github

: 82

LangGraph-Expense-Tracker

github

: 82

receipt-scanner

The receipt-scanner repository is an AI-Powered Receipt and Invoice Scanner for Laravel that allows users to easily extract structured receipt data from images, PDFs, and emails within their Laravel application using OpenAI. It provides a light wrapper around OpenAI Chat and Completion endpoints, supports various input formats, and integrates with Textract for OCR functionality. Users can install the package via composer, publish configuration files, and use it to extract data from plain text, PDFs, images, Word documents, and web content. The scanned receipt data is parsed into a DTO structure with main classes like Receipt, Merchant, and LineItem.

github

: 95

actual-ai

Actual AI is a project designed to categorize uncategorized transactions for Actual Budget using OpenAI or OpenAI specification compatible API. It sends requests to the OpenAI API to classify transactions based on their description, amount, and notes. Transactions that cannot be classified are marked as 'not guessed' in notes. The tool allows users to sync accounts before classification and classify transactions on a cron schedule. Guessed transactions are marked in notes for easy review.

github

: 139

TaxHacker

README:

TaxHacker

👋🏻 Getting Started

✨ Features

1 Upload photos or documents to analyze with LLM

2 Multi-currency support with automatic conversion (even for crypto)

3 Customize any LLM prompt

4 Create custom fields, projects, categories

5 Flexible data filtering and export

6 Local data storage and self-hosting

🛳 Deploying or Self-hosting

Environment Variables

⌨️ Local Development

🤝 Contributing

❤️ Donate

📄 License

For Tasks:

For Jobs:

Alternative AI tools for TaxHacker

Similar Open Source Tools

TaxHacker

NekoImageGallery

atropos

helix-db

swirl-search

inspector-laravel

job-llm

RainbowGPT

botpress

KlicStudio

KrillinAI

ai-prompts

superduper

agentok

agents

ProX

For similar tasks

TaxHacker

LangGraph-Expense-Tracker

travel-planner-ai

gemini-android

wealth-tracker

aider-desk

zero-finance

pennywiseai-tracker

For similar jobs

SheetCopilot

LangGraph-Expense-Tracker

receipt-scanner

actual-ai

gemini-android

wealth-tracker

TaxHacker

zero-finance

`1` Upload photos or documents to analyze with LLM

`2` Multi-currency support with automatic conversion (even for crypto)

`3` Customize any LLM prompt

`4` Create custom fields, projects, categories

`5` Flexible data filtering and export

`6` Local data storage and self-hosting