llm_processes

Code for LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language

This repository contains the code to reproduce the experiments carried out in LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language.

The code has been authored by: John Bronskill, James Requeima, and Dami Choi.

Dependencies

This code requires the following:

Python 3.9 or greater
PyTorch 2.3.0 or greater
transformers 4.41.0 or greater
accelerate 0.30.1 or greater
jsonargparse 4.28.0 or greater
matplotlib 3.9.0 or greater
optuna 3.6.1 or greater (only needed if you intend to run the black-box optimization experiments)
gpytorch 1.14 or greater (only if you intend to run the Gaussian Process code)
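
A quick way to compare an environment against these minimums is sketched below. It uses only the standard library's importlib.metadata. The names MINIMUMS, parse_version, meets_minimum and check are illustrative and not part of this repository, the distribution names are assumed to match the PyPI names (PyTorch is published as torch), and the version parser is deliberately naive: it compares only the leading numeric components.

```python
import re
import sys
from importlib import metadata

# Minimum versions copied from the list above. PyTorch is distributed as "torch".
MINIMUMS = {
    "torch": "2.3.0",
    "transformers": "4.41.0",
    "accelerate": "0.30.1",
    "jsonargparse": "4.28.0",
    "matplotlib": "3.9.0",
    "optuna": "3.6.1",    # optional: black-box optimization experiments
    "gpytorch": "1.14",   # optional: Gaussian Process code
}

def parse_version(text):
    """Naive parser: keep leading numeric components, e.g. '2.3.0+cu121' -> (2, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()

def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(a) >= pad(b)

def check(minimums=MINIMUMS):
    """Return {package: status string} for each requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = f"{installed} ({'ok' if ok else 'needs >= ' + minimum})"
    return report

if __name__ == "__main__":
    python_ok = sys.version_info >= (3, 9)
    print("python", sys.version.split()[0], "ok" if python_ok else "needs >= 3.9")
    for name, status in check().items():
        print(name, status)
```

The two optional packages will be reported as "not installed" if you do not plan to run the black-box optimization or Gaussian Process code, which is expected.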

LLM Support and GPU Requirements

We support a variety of LLMs through the Hugging Face transformers APIs. The code currently supports the following LLMs:

LLM Type	URL	GPU Memory Required (GB)
phi-3-mini-128k-instruct	https://huggingface.co/microsoft/Phi-3-mini-128k-instruct	8
llama-2-7B	https://huggingface.co/meta-llama/Llama-2-7b	24
llama-2-70B	https://huggingface.co/meta-llama/Llama-2-70b	160
llama-3-8B	https://huggingface.co/meta-llama/Meta-Llama-3-8B	24
llama-3-70B	https://huggingface.co/meta-llama/Meta-Llama-3-70B	160
mixtral-8x7B	https://huggingface.co/mistralai/Mixtral-8x7B-v0.1	160
mixtral-8x7B-instruct	https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1	160

Adding a new LLM that supports the Hugging Face APIs is not difficult: just modify hf_api.py.
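
As a small illustration of how to read the table, the sketch below lists the LLM types whose stated requirement fits a given memory budget. The dictionary is copied from the table; the helper name models_that_fit is hypothetical, and the sketch assumes the table's figure is the total GPU memory needed, which the table does not state explicitly.

```python
# LLM type -> GPU memory required (GB), copied from the table above.
GPU_MEMORY_GB = {
    "phi-3-mini-128k-instruct": 8,
    "llama-2-7B": 24,
    "llama-2-70B": 160,
    "llama-3-8B": 24,
    "llama-3-70B": 160,
    "mixtral-8x7B": 160,
    "mixtral-8x7B-instruct": 160,
}

def models_that_fit(available_gb):
    """LLM types whose listed requirement does not exceed the available GPU memory."""
    return [name for name, need in GPU_MEMORY_GB.items() if need <= available_gb]

print(models_that_fit(24))
# → ['phi-3-mini-128k-instruct', 'llama-2-7B', 'llama-3-8B']
```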

Installation

Clone or download this repository.
Run pip install . to install the llm_processes package and all dependencies.

Running the code

Installing the llm_processes package will automatically install the llm_process command. You can view its arguments by running llm_process --help.

Use the command as: llm_process --llm_type <LLM Type> [additional options]

Common options:

--experiment_name <value> sets the name used for output and plot files. Default is test.

--output_dir <directory where output files are written>, default is ./output.

--plot_dir <directory where output plot files are written>, default is ./plots.

--num_samples <number of samples to take at each target location>, default is 50.

--autoregressive <True/False>: if True, run A-LLMP; if False, run I-LLMP. Default is False.

--batch_size <value> controls how many samples for each target point are processed at once. A higher value will result in faster execution, but will consume more GPU memory. Lower this number if you get out of memory errors. Default is 5.
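
When launching many runs from a script, it can help to assemble the command line programmatically. The sketch below builds an argument list from the common options listed above, passing each at its documented default. The helper build_command is hypothetical and not part of the package, and llama-3-8B is just one of the supported LLM types.

```python
import shlex

def build_command(llm_type, **options):
    """Build an argv list for the llm_process command from keyword options."""
    argv = ["llm_process", "--llm_type", llm_type]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    return argv

# Every option below is set to the default documented above.
argv = build_command(
    "llama-3-8B",
    experiment_name="test",
    output_dir="./output",
    plot_dir="./plots",
    num_samples=50,
    autoregressive=False,
    batch_size=5,
)
print(shlex.join(argv))
```

Once the package is installed, the list can be handed to subprocess.run(argv, check=True) to start the run.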

Reproducing the Experiments

Prompt Engineering

The additional options to pass to llm_process for the prompt engineering experiments are:

Data: --data_path <choose a file from the data/functions directory>. In the experiments we used sigmoid_10_seed_*.pkl, square_20_seed_*.pkl, and linear_cos_75_seed_*.pkl, where you would substitute a seed number for the *.

Prompt Format: --x_prefix <value>, --y_prefix <value>, and --break_str <value>

Prompt Order: --prompt_ordering <sequential/random/distance>

Prompt y-Scaling: --y_min <value> and --y_max <value>

Top-p and Temperature: --top_p <value> and --temperature <value>

Autoregressive: --autoregressive True
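
As one possible way to organize such runs, the sketch below enumerates llm_process commands over the three data file patterns and the three prompt orderings. The seed numbers 0 and 1 are assumptions for illustration (check the data/functions directory for the seeds that actually exist), and the helpers data_path and ordering_sweep are hypothetical. Each run gets its own --experiment_name so that output files do not overwrite each other.

```python
from itertools import product
from pathlib import PurePosixPath

# File name patterns quoted in the Data option above; "*" stands for a seed number.
PATTERNS = ["sigmoid_10_seed_*.pkl", "square_20_seed_*.pkl", "linear_cos_75_seed_*.pkl"]
ORDERINGS = ["sequential", "random", "distance"]

def data_path(pattern, seed, data_dir="data/functions"):
    """Substitute a seed number for the '*' in a file name pattern."""
    return f"{data_dir}/{pattern.replace('*', str(seed))}"

def ordering_sweep(llm_type, seeds):
    """Yield one llm_process argv list per (data file, seed, ordering) combination."""
    for pattern, seed, ordering in product(PATTERNS, seeds, ORDERINGS):
        path = data_path(pattern, seed)
        stem = PurePosixPath(path).stem  # e.g. "sigmoid_10_seed_0"
        yield [
            "llm_process", "--llm_type", llm_type,
            "--data_path", path,
            "--prompt_ordering", ordering,
            "--experiment_name", f"{stem}_{ordering}",
        ]

# Seeds 0 and 1 are placeholders, not values taken from the repository.
commands = list(ordering_sweep("llama-2-7B", seeds=[0, 1]))
print(len(commands))  # 3 patterns x 2 seeds x 3 orderings = 18
```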

1D Synthetic Data

From the root directory of the repo, run: python ./experiments/run_functions_exp.py --llm_type <LLM Type> --function <beat/exp/gaussian_wave/linear/linear_cos/log/sigmoid/sinc/sine/square/x_times_sine/xsin>

Compare to LLMTime

From the root directory of the repo, run: python ./experiments/run_compare_exp.py --llm_type <LLM Type>

Fashion MNIST

From the root directory of the repo, run: python ./experiments/run_fashion_mnist_exp.py --llm_type <LLM Type>

Black-box Optimization

From the root directory of the repo, run: python ./experiments/run_black_box_opt_exp.py --llm_type <LLM Type> --experiment_name_prefix <see table> --function <see table> --max_generated_length <see table> --num_cold_start_points <see table>

function	experiment_name_prefix	max_generated_length	num_cold_start_points
Sinusoidal	Sinusoidal	7	7
Gramacy	Gramacy	8	12
Branin	Branin	7	12
Bohachevsky	Bohachevsky	11	12
Goldstein	Goldstein	12	12
Hartmann3	Hartmann3	7	15
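
To avoid copying values from the table by hand, the rows can be encoded once and turned into command lines. The sketch below does this; the settings are copied from the table, the helper black_box_command is hypothetical, and llama-3-8B is just one of the supported LLM types.

```python
import shlex

# function -> (experiment_name_prefix, max_generated_length, num_cold_start_points),
# copied from the table above.
BLACK_BOX_SETTINGS = {
    "Sinusoidal":  ("Sinusoidal", 7, 7),
    "Gramacy":     ("Gramacy", 8, 12),
    "Branin":      ("Branin", 7, 12),
    "Bohachevsky": ("Bohachevsky", 11, 12),
    "Goldstein":   ("Goldstein", 12, 12),
    "Hartmann3":   ("Hartmann3", 7, 15),
}

def black_box_command(llm_type, function):
    """Build the argv list for one black-box optimization run."""
    prefix, max_len, cold_start = BLACK_BOX_SETTINGS[function]
    return [
        "python", "./experiments/run_black_box_opt_exp.py",
        "--llm_type", llm_type,
        "--experiment_name_prefix", prefix,
        "--function", function,
        "--max_generated_length", str(max_len),
        "--num_cold_start_points", str(cold_start),
    ]

# Print one command per row of the table; run them from the root of the repo.
for function in BLACK_BOX_SETTINGS:
    print(shlex.join(black_box_command("llama-3-8B", function)))
```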

Simultaneous Temperature, Rainfall, and Wind Speed Regression

From the root directory of the repo, run: python run_llm_process.py --llm_type <LLM Type> --experiment_name weather_3 --data_path ./data/weather/weather_3.pkl --autoregressive True --num_decimal_places_y 1 --max_generated_length 20

In-context Learning Using Related Data Examples

From the root directory of the repo, run: python ./experiments/run_in_context.py --llm_type <LLM Type>

Conditioning LLMPs on Textual Information

Scenario-conditional Predictions

From the root directory of the repo, run: llm_process --llm_type <LLM Type> --data_path ./data/scenario/scenario_data_2_points.pkl --prefix <prompt to try> --autoregressive True --plot_trajectories 5 --forecast True

Labelling Features Using Text

From the root directory of the repo, run: python ./experiments/run_housing_exp.py --llm_type <LLM Type>

Attributions

Contact

To ask questions or report issues, please open an issue on the issues tracker.

Citation

If you use this code, please cite our paper:

@inproceedings{requeima2024llm,
 author = {Requeima, James and Bronskill, John and Choi, Dami and Turner, Richard E and Duvenaud, David},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
 pages = {109609--109671},
 publisher = {Curran Associates, Inc.},
 title = {LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language},
 url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/c5ec22711f3a4a2f4a0a8ffd92167190-Paper-Conference.pdf},
 volume = {37},
 year = {2024}
}
