mimir

mimir

Python package for measuring memorization in LLMs.

Stars: 106

Visit
 screenshot

MIMIR is a Python package designed for measuring memorization in Large Language Models (LLMs). It provides functionalities for conducting experiments related to membership inference attacks on LLMs. The package includes implementations of various attacks such as Likelihood, Reference-based, Zlib Entropy, Neighborhood, Min-K% Prob, Min-K%++, Gradient Norm, and allows users to extend it by adding their own datasets and attacks.

README:

MIMIR

MIMIR logo

MIMIR - Python package for measuring memorization in LLMs.

Documentation is available here.

Tests Documentation

Instructions

First install the python dependencies

pip install -r requirements.txt

Then, install our package

pip install -e .

To use, run the scripts in scripts/bash

Note: Intermediate results are saved in tmp_results/ and tmp_results_cross/ for bash scripts. If your experiment completes successfully, the results will be moved into the results/ and results_cross/ directory.

Setting environment variables

You can either provide the following environment variables, or pass them via your config/CLI:

MIMIR_CACHE_PATH: Path to cache directory
MIMIR_DATA_SOURCE: Path to data directory

Using cached data

The data we used for our experiments is available on Hugging Face Datasets. You can either choose to either load the data directly from Hugging Face with the load_from_hf flag in the config (preferred), or download the cache_100_200_.... folders into your MIMIR_CACHE_PATH directory.

MIA experiments how to run

python run.py --config configs/mi.json

Attacks

We include and implement the following attacks, as described in our paper.

  • Likelihood (loss). Works by simply using the likelihood of the target datapoint as score.
  • Reference-based (ref). Normalizes likelihood score with score obtained from a reference model.
  • Zlib Entropy (zlib). Uses the zlib compression size of a sample to approximate local difficulty of sample.
  • Neighborhood (ne). Generates neighbors using auxiliary model and measures change in likelihood.
  • Min-K% Prob (min_k). Uses k% of tokens with minimum likelihood for score computation.
  • Min-K%++ (min_k++). Uses k% of tokens with minimum normalized likelihood for score computation.
  • Gradient Norm (gradnorm). Uses gradient norm of the target datapoint as score.
  • ReCaLL(recall). Operates by comparing the unconditional and conditional log-likelihoods.

Adding your own dataset

To extend the package for your own dataset, you can directly load your data inside load_cached() in data_utils.py, or add an additional if-else within load() in data_utils.py if it cannot be loaded from memory (or some source) easily. We will probably add a more general way to do this in the future.

Adding your own attack

To add an attack, create a file for your attack (e.g. attacks/my_attack.py) and implement the interface described in attacks/all_attacks.py. Then, add a name for your attack to the dictionary in attacks/utils.py.

If you would like to submit your attack to the repository, please open a pull request describing your attack and the paper it is based on.

Citation

If you use MIMIR in your research, please cite our paper:

@inproceedings{duan2024membership,
      title={Do Membership Inference Attacks Work on Large Language Models?}, 
      author={Michael Duan and Anshuman Suri and Niloofar Mireshghallah and Sewon Min and Weijia Shi and Luke Zettlemoyer and Yulia Tsvetkov and Yejin Choi and David Evans and Hannaneh Hajishirzi},
      year={2024},
      booktitle={Conference on Language Modeling (COLM)},
}

For Tasks:

Click tags to check more tools for each tasks

For Jobs:

Alternative AI tools for mimir

Similar Open Source Tools

For similar tasks

For similar jobs