crystal-text-llm

crystal-text-llm

Large language models to generate stable crystals.

Stars: 54

Visit
 screenshot

This repository contains the code for the paper Fine-Tuned Language Models Generate Stable Inorganic Materials as Text. It demonstrates how finetuned LLMs can be used to generate stable materials, match or exceed the performance of domain specific models, mutate existing materials, and sample crystal structures conditioned on text descriptions. The method is distinct from CrystaLLM, which trains language models from scratch on CIF-formatted crystals.

README:

Fine-Tuned Language Models Generate Stable Inorganic Materials as Text

This repository contains the code for the paper Fine-Tuned Language Models Generate Stable Inorganic Materials as Text by Nate Gruver, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C. Lawrence Zitnick, and Zachary Ward Ulissi (ICLR 2024).

Image

We show that finetuned LLMs can be used to generate stable materials using string encodings. These finetuned LLMs can match or exceed the performance of a domain specific diffusion model (CDVAE). LLMs can also be used to mutate existing materials or to sample crystal structures conditioned on text descriptions.


⚠️ Our method resembles but is not the same as CrystaLLM (https://arxiv.org/abs/2307.04340), which explores training language models from scratch on CIF-formatted crystals. You can find the code associated with that project at the following link: https://github.com/lantunes/CrystaLLM. ⚠️

🛠 Installation

Run the following command to install all dependencies.

source install.sh

After installation, activate the environment with

conda activate crystal-llm

If you prefer not using conda, you can also install the dependencies listed in install.sh manually.

🚀 Training and Sampling Models

Run training with

python llama_finetune.py --run-name 7b-test-run --model 7b

and sample from a PEFT model with

python llama_sample.py --model_name 7b --model_path=exp/7b-test-run/checkpoint-500 --out_path=llm_samples.csv

License

The majority of crystall-llm is licensed under CC-BY-NC, however portions of the project are available under separate license terms: https://github.com/materialsproject/pymatgen is licensed under the MIT license, https://github.com/huggingface/transformers is licensed under Apache 2.0, and https://gitlab.com/ase/ase/-/ is licensed under GNU Lesser General License

Citation

Please cite our work as:

@inproceedings{gruver2023llmtime,
    title={Fine-Tuned Language Models Generate Stable Inorganic Materials as Text},
    author={Nate Gruver, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C. Lawrence Zitnick, and Zachary Ward Ulissi},
    booktitle={International Conference on Learning Representations 2024},
    year={2024}
}

For Tasks:

Click tags to check more tools for each tasks

For Jobs:

Alternative AI tools for crystal-text-llm

Similar Open Source Tools

For similar tasks

For similar jobs