json_repair

A Python module to repair invalid JSON from LLMs.

Stars: 1653


This simple package can be used to fix an invalid JSON string. To see all the cases in which this package works, check out the unit tests.

Inspired by https://github.com/josdejong/jsonrepair

Motivation

Some LLMs are a bit iffy when it comes to returning well-formed JSON data: sometimes they skip a parenthesis and sometimes they add some words to it, because that's what an LLM does. Luckily, the mistakes LLMs make are simple enough to be fixed without destroying the content.

I searched for a lightweight Python package that could reliably fix this problem but couldn't find any, so I wrote one.

How to use

    from json_repair import repair_json

    good_json_string = repair_json(bad_json_string)
    # If the string was super broken this will return an empty string

You can use this library to completely replace `json.loads()`:

    import json_repair

    decoded_object = json_repair.loads(json_string)

or just

    import json_repair

    decoded_object = json_repair.repair_json(json_string, return_objects=True)

Read json from a file or file descriptor

JSON repair also provides a drop-in replacement for `json.load()`:

    import json_repair

    try:
        file_descriptor = open(fname, 'rb')
    except OSError:
        ...

    with file_descriptor:
        decoded_object = json_repair.load(file_descriptor)

and another method to read from a file:

    import json_repair

    try:
        decoded_object = json_repair.from_file(json_file)
    except OSError:
        ...
    except IOError:
        ...

Keep in mind that the library will not catch any IO-related exceptions; those need to be handled by you.

Performance considerations

If you find this library too slow because it is using `json.loads()`, you can skip that step by passing `skip_json_loads=True` to `repair_json`. Like:

    from json_repair import repair_json

    good_json_string = repair_json(bad_json_string, skip_json_loads=True)

I chose not to use any fast JSON library to avoid external dependencies, so that anybody can use it regardless of their stack.
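To make the `skip_json_loads` trade-off concrete, here is a minimal sketch of a "try the standard parser first, repair only on failure" flow. It uses only the standard library. `toy_repair` and `loads_with_fallback` are hypothetical names invented for this illustration, and the repair step (appending missing closing quotes and brackets) is a deliberately tiny stand-in, not the library's actual logic.

```python
import json


def toy_repair(text: str) -> str:
    """Hypothetical stand-in for a repair step: close unterminated strings and brackets."""
    closers = {"{": "}", "[": "]"}
    stack = []          # closing characters we still owe, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif ch in "}]" and stack and stack[-1] == ch:
            stack.pop()
    suffix = '"' if in_string else ""
    return text + suffix + "".join(reversed(stack))


def loads_with_fallback(text: str, skip_json_loads: bool = False):
    """Parse text, attempting the fast strict parser first unless told to skip it."""
    if not skip_json_loads:
        try:
            # Fast path: already-valid JSON never reaches the repair step.
            return json.loads(text)
        except json.JSONDecodeError:
            pass
    # Slow path: repair, then parse the repaired string.
    return json.loads(toy_repair(text))


print(loads_with_fallback('{"a": [1, 2'))
```

Under this sketch, skipping the first attempt only saves time when the input is known to be invalid; for valid input the strict parse is the cheaper route.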
Some rules of thumb to use:

- Setting `return_objects=True` will always be faster because the parser already returns an object and doesn't have to serialize that object to JSON
- `skip_json_loads` is faster only if you know 100% that the string is not valid JSON
- If you are having issues with escaping, pass the string as a **raw** string, like: `r"string with escaping\""`

Adding to requirements

Please pin this library only on the major version!

We use TDD and strict semantic versioning; there will be frequent updates and no breaking changes in minor and patch versions. To ensure that you pin only the major version of this library in your `requirements.txt`, specify the package name followed by the major version and a wildcard for minor and patch versions. For example:

    json_repair==0.*

In this example, any version that starts with `0.` will be acceptable, allowing for updates on minor and patch versions.

How it works

This module will parse the JSON file following the BNF definition:

    <json> ::= <primitive> | <container>

    <primitive> ::= <number> | <string> | <boolean>
    ; Where:
    ; <number> is a valid real number expressed in one of a number of given formats
    ; <string> is a string of valid characters enclosed in quotes
    ; <boolean> is one of the literal strings 'true', 'false', or 'null' (unquoted)

    <container> ::=
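As an illustration of parsing along a grammar like the one above, here is a small recursive-descent sketch that tolerates missing closing brackets. It is not the library's implementation. `TinyParser` and `tiny_parse` are hypothetical names, the array and object rules assume standard JSON syntax (the container productions are cut off above), and string escapes are intentionally not handled.

```python
class TinyParser:
    """Toy recursive-descent parser that tolerates missing closing brackets."""

    def __init__(self, text: str):
        self.text = text
        self.pos = 0

    def peek(self) -> str:
        # Returns "" at end of input, which is falsy.
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def skip_ws(self) -> None:
        while self.peek() and self.peek() in " \t\r\n":
            self.pos += 1

    def parse_json(self):
        # json: a primitive or a container
        self.skip_ws()
        ch = self.peek()
        if ch == "{":
            return self.parse_object()
        if ch == "[":
            return self.parse_array()
        return self.parse_primitive()

    def parse_primitive(self):
        # primitive: a string, a literal (true/false/null), or a number
        if self.peek() == '"':
            return self.parse_string()
        start = self.pos
        while self.peek() and self.peek() not in ",]} \t\r\n":
            self.pos += 1
        token = self.text[start:self.pos]
        literals = {"true": True, "false": False, "null": None}
        if token in literals:
            return literals[token]
        try:
            return int(token)
        except ValueError:
            return float(token)  # raises ValueError for anything unrecognized

    def parse_string(self) -> str:
        self.pos += 1  # skip the opening quote
        start = self.pos
        while self.peek() and self.peek() != '"':
            self.pos += 1
        value = self.text[start:self.pos]
        self.pos += 1  # skip the closing quote (or step past the end if missing)
        return value

    def parse_array(self) -> list:
        self.pos += 1  # skip '['
        items = []
        while True:
            self.skip_ws()
            ch = self.peek()
            if ch == "" or ch == "]":  # end of input counts as an implicit ']'
                self.pos += 1
                return items
            if ch == ",":
                self.pos += 1
                continue
            items.append(self.parse_json())

    def parse_object(self) -> dict:
        self.pos += 1  # skip '{'
        obj = {}
        while True:
            self.skip_ws()
            ch = self.peek()
            if ch == "" or ch == "}":  # end of input counts as an implicit '}'
                self.pos += 1
                return obj
            if ch == ",":
                self.pos += 1
                continue
            if ch != '"':
                raise ValueError(f"expected a quoted key at position {self.pos}")
            key = self.parse_string()
            self.skip_ws()
            if self.peek() == ":":
                self.pos += 1
            obj[key] = self.parse_json()


def tiny_parse(text: str):
    return TinyParser(text).parse_json()


print(tiny_parse('{"name": "Ada", "tags": ["x", "y"'))
```

The design point is that each grammar rule maps to one method, so a recovery decision (here, treating end of input as a closing bracket) can be made exactly where the rule fails.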

