https://learn.deeplearning.ai/reinforcement-learning-from-human-feedback
A conceptual and hands-on introduction to tuning and evaluating large language models (LLMs) using Reinforcement Learning from Human Feedback.
- Get a conceptual understanding of Reinforcement Learning from Human Feedback (RLHF) and of the datasets this technique requires
- Fine-tune the Llama 2 model using RLHF with the open-source Google Cloud Pipeline Components Library
- Evaluate the tuned model's performance against the base model