Best AI tools for Quantize KV Cache
0 - AI tool Sites
No tools available
1 - Open Source AI Tools

kvpress
This repository implements multiple key-value (KV) cache pruning methods and benchmarks on top of the transformers library, with the aim of simplifying the development of new methods for researchers and developers working on long-context language models. It provides a set of 'presses' that compress the cache during the pre-filling phase; each press has a compression ratio attribute. The repository includes various training-free presses and special presses, and it supports KV cache quantization. Users can contribute new presses and evaluate the performance of different presses on long-context datasets.
github: 600
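To make the two ideas in the description concrete, here is a minimal pure-Python sketch of pruning a cache by a compression ratio and then quantizing what remains. It is an illustration only, not kvpress's actual API: the function names `prune_cache`, `quantize_int8`, and `dequantize_int8`, the per-position `scores` input, and the symmetric int8 scheme are all assumptions chosen for the example.

```python
from typing import List, Tuple

Vector = List[float]


def prune_cache(
    keys: List[Vector],
    values: List[Vector],
    scores: List[float],
    compression_ratio: float,
) -> Tuple[List[Vector], List[Vector]]:
    """Keep the highest-scoring (1 - compression_ratio) fraction of positions.

    `scores` is a hypothetical per-position importance score; how it is
    computed is exactly what distinguishes one pruning method from another.
    """
    if not 0.0 <= compression_ratio < 1.0:
        raise ValueError("compression_ratio must be in [0, 1)")
    n_keep = int(len(keys) * (1.0 - compression_ratio))
    # Pick the n_keep best positions, then restore their original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [keys[i] for i in kept], [values[i] for i in kept]


def quantize_int8(vec: Vector) -> Tuple[List[int], float]:
    """Symmetric per-vector int8 quantization: returns (codes, scale)."""
    max_abs = max((abs(x) for x in vec), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [round(x / scale) for x in vec], scale


def dequantize_int8(codes: List[int], scale: float) -> Vector:
    """Approximate reconstruction of the original vector."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 0.5]]
    values = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
    scores = [0.1, 0.9, 0.4, 0.7]  # made-up importance scores

    kept_keys, kept_values = prune_cache(keys, values, scores, compression_ratio=0.5)
    quantized = [quantize_int8(k) for k in kept_keys]
    print(len(kept_keys), quantized[0][0])
```

Pruning reduces the number of cached positions, while quantization reduces the bytes stored per position, so the two savings multiply when both are applied.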
0 - OpenAI Gpts
No tools available