amazon-sagemaker-llm-fine-tuning-remote-decorator

Interactive fine-tuning of Foundation Models with Amazon SageMaker Training using @remote decorator

Important: These notebook examples showcase an interactive experience with SageMaker AI capabilities and the @remote decorator for fine-tuning small and large language models with different distribution techniques, such as FSDP and DDP.

In this example we go through the steps required to interactively fine-tune foundation models on Amazon SageMaker AI by using the @remote decorator to execute training jobs.

You can run this repository from Amazon SageMaker Studio or from your local IDE.

For additional information, take a look at the AWS blog post Fine-tune Falcon 7B and other LLMs on Amazon SageMaker with @remote decorator.
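
Conceptually, any Python function decorated with @remote is executed as a SageMaker Training job when it is called. A minimal sketch (not taken from the notebooks; instance type and model ID are placeholder choices):

```python
# Minimal sketch of the @remote decorator; settings here are illustrative.
from sagemaker.remote_function import remote

@remote(instance_type="ml.g5.12xlarge", volume_size=100)
def fine_tune(model_id: str, epochs: int = 1) -> str:
    # This body runs inside the training container; the notebooks load the
    # model and run the PEFT fine-tuning loop at this point.
    print(f"Fine-tuning {model_id} for {epochs} epoch(s)")
    return f"finished {model_id}"

# Calling the function submits a Training job and waits for its result.
print(fine_tune("tiiuae/falcon-7b", epochs=1))
```

Infrastructure settings that are not passed to the decorator (image URI, role, networking) can also be read from config.yaml.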

Prerequisites

The notebooks are currently using the latest PyTorch Training Container available for the region us-east-1. If you are running the notebooks in a different region, make sure to update the ImageUri in the file config.yaml.

⚠️ Make sure the Python version in your local environment matches the one used in the training container: Python 3.11.
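
A quick way to check the local interpreter:

```python
import sys
print(sys.version)  # should report 3.11.x to match the training container
```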

If you want to operate in a different AWS region:

  1. Navigate to [Available Deep Learning Containers Images](https://github.com/aws/deep-learning-containers/blob/master/available_images.md)
  2. Select the right container image for model training in your selected region
  3. Update ImageUri in the file config.yaml
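
For reference, the image is read from the RemoteFunction section of the config.yaml consumed by the SageMaker Python SDK. A hedged sketch of that section (all values are placeholders, not the repository's defaults):

```yaml
SchemaVersion: '1.0'
SageMaker:
  PythonSDK:
    Modules:
      RemoteFunction:
        Dependencies: ./requirements.txt
        ImageUri: <dlc-account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
        InstanceType: ml.g5.12xlarge
        RoleArn: <your-sagemaker-execution-role-arn>
```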

Notebooks

  1. [Supervised - QLoRA] Falcon-7B
  2. [Supervised - QLoRA, FSDP] Llama-13B
  3. [Self-supervised - QLoRA, FSDP] Llama-13B
  4. [Self-supervised - QLoRA] Mistral-7B
  5. [Supervised - QLoRA, FSDP] Mixtral-8x7B
  6. [Supervised - QLoRA, DDP] Code-Llama 13B
  7. [Supervised - QLoRA, DDP] Llama-3 8B
  8. [Supervised - QLoRA, DDP] Llama-3.1 8B
  9. [Supervised - QLoRA, DDP] Arcee AI Llama-3.1 Supernova Lite
  10. [Supervised - QLoRA] Llama-3.2 1B
  11. [Supervised - QLoRA] Llama-3.2 3B
  12. [Supervised - QLoRA, FSDP] Codestral-22B
  13. [Supervised - LoRA] TinyLlama 1.1B
  14. [Supervised - LoRA] Arcee Lite 1.5B
  15. [Supervised - LoRA] SmolLM2-1.7B-Instruct
  16. [Supervised - QLoRA, FSDP] Qwen 2.5 7B
  17. [Supervised - QLoRA] Falcon3 3B
  18. [Supervised - QLoRA, FSDP] Falcon3 7B
  19. [Supervised - QLoRA, FSDP] Llama-3.1 70B
  20. [Self-supervised - DoRA, FSDP] Mistral-7B v0.3
  21. [Supervised - QLoRA, FSDP] Llama-3.3 70B
  22. [Supervised - QLoRA, FSDP] OpenCoder-8B-Instruct
  23. [Supervised - QLoRA, FSDP] DeepSeek-R1-Distill-Qwen-32B
  24. [Supervised - QLoRA, FSDP] DeepSeek-R1-Distill-Llama-70B
  25. [Supervised - QLoRA, FSDP] DeepSeek-R1-Distill-Llama-8B
  26. [Supervised - QLoRA, DDP] DeepSeek-R1-Distill-Qwen-1.5B
  27. [Supervised - QLoRA, FSDP] DeepSeek-R1-Distill-Qwen-7B
  28. [Supervised - QLoRA, FSDP] Mistral-Small-24B-Instruct-2501
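
Each notebook above pairs a parameter-efficient method (LoRA, QLoRA, DoRA) with a distribution strategy (FSDP or DDP). As a rough, hypothetical sketch of what a QLoRA setup looks like with transformers, peft, and bitsandbytes (model ID and hyperparameters are illustrative, not the notebook defaults):

```python
# Hypothetical QLoRA setup: 4-bit quantized base model plus LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # small model chosen for illustration
    quantization_config=bnb_config,
    device_map="auto",
)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Wrapping this kind of setup in a @remote-decorated function is what turns it into a SageMaker Training job.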

Troubleshooting

If the training job fails while deserializing the remote function, the job logs show an error similar to:

Traceback (most recent call last):
  ..., in deserialize
    return cloudpickle.loads(bytes_to_deserialize)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't get attribute '_function_setstate' on <module 'cloudpickle.cloudpickle' from '/opt/conda/lib/python3.11/site-packages/cloudpickle/cloudpickle.py'>

Solution

Align your local cloudpickle version with the one in the container by including the following in your requirements.txt:

cloudpickle==x.x.x

where x.x.x is the cloudpickle version used in the training container.
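
The requirements.txt is shipped to the training job through the remote-function dependencies setting (the Dependencies key in config.yaml or the decorator argument). A small hypothetical sketch to compare both versions:

```python
# Hypothetical check: compare the local cloudpickle version with the one
# installed in the training container.
import cloudpickle
from sagemaker.remote_function import remote

print("local cloudpickle:", cloudpickle.__version__)

@remote(dependencies="./requirements.txt")
def remote_cloudpickle_version() -> str:
    import cloudpickle
    return cloudpickle.__version__

print("remote cloudpickle:", remote_cloudpickle_version())
```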

