Fine-tuning LLMs

Lecture slides: LLM-Course Lecture 4

Lab Exercise

Run the supervised_finetuning.ipynb notebook in Google Colab.
Change the base model (search Hugging Face for small models with fewer than 7B parameters).
Change the dataset used in fine-tuning.
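
When picking a base model, a quick back-of-the-envelope check can help decide whether it is likely to fit on a Colab GPU. The sketch below is only a rough estimate under stated assumptions: it counts weight memory alone (no activations, gradients, or optimizer state), and the helper name `weight_memory_gb` and the byte sizes in `BYTES_PER_PARAM` are illustrative choices, not something taken from the notebook.

```python
# Rough estimate of the memory needed just to hold the model weights.
# Assumption: activations, gradients, and optimizer state are ignored,
# so the real footprint during fine-tuning will be noticeably larger.

# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Return the approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Compare a few hypothetical model sizes at two precisions.
    for size in (1e9, 3e9, 7e9):
        for dtype in ("fp16", "int4"):
            gb = weight_memory_gb(size, dtype)
            print(f"{size / 1e9:.0f}B params in {dtype}: about {gb:.1f} GB of weights")
```

Comparing the result against the memory of the GPU assigned to your Colab session gives a first indication of whether a candidate model is realistic, or whether quantization or a smaller model is needed.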
Bonus challenge:
- Change the fine-tuning method from supervised fine-tuning to Direct Preference Optimization (DPO).
- Update the code accordingly; see the Hugging Face DPO Trainer Documentation.
- Select an appropriate DPO dataset by searching Hugging Face Datasets.
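
To make the difference from supervised fine-tuning concrete, here is a minimal sketch of what a preference example and the DPO loss for a single example look like. It is plain Python for illustration, not the trainer's implementation: the function name `dpo_loss`, the example text, the log-probability values, and `beta=0.1` are all assumptions chosen for the demonstration. DPO datasets typically provide a prompt together with a preferred ("chosen") and a dispreferred ("rejected") response; check the dataset card and the trainer documentation for the exact column names expected.

```python
import math

# A hypothetical preference record: one prompt with a preferred and a
# dispreferred response (illustrative text, not from a real dataset).
example = {
    "prompt": "Explain overfitting in one sentence.",
    "chosen": "Overfitting is when a model memorizes training data and generalizes poorly.",
    "rejected": "Overfitting is good.",
}


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one example, given summed log-probabilities of each response.

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # Numerically stable form of -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Made-up log-probabilities: the policy favors the chosen response
    # more strongly than the frozen reference model does.
    better = dpo_loss(-10.0, -30.0, -15.0, -20.0)
    # The policy is identical to the reference, so the margin is zero.
    neutral = dpo_loss(-15.0, -20.0, -15.0, -20.0)
    print(f"policy prefers chosen: {better:.4f}")
    print(f"policy equals reference: {neutral:.4f}")
```

The loss falls as the policy raises the probability of the chosen response relative to the rejected one, measured against the reference model. This is why DPO needs paired preference data rather than the single target responses used in supervised fine-tuning.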