
Reinforcement Learning From Human Feedback

https://learn.deeplearning.ai/reinforcement-learning-from-human-feedback

A conceptual and hands-on introduction to tuning and evaluating large language models (LLMs) using Reinforcement Learning from Human Feedback.

  • Get a conceptual understanding of Reinforcement Learning from Human Feedback (RLHF), as well as the datasets needed for this technique
  • Fine-tune the Llama 2 model using RLHF with the open-source Google Cloud Pipeline Components Library (see the sketch after this list)
  • Evaluate the performance of the tuned model against the base model
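
The course drives tuning through a prebuilt Vertex AI RLHF pipeline rather than a hand-written training loop. The sketch below shows roughly how such a pipeline is compiled and submitted; the project ID, bucket paths, and parameter values are illustrative placeholders, and the exact parameter names should be checked against the current google-cloud-pipeline-components documentation.

```python
# Minimal sketch: compile and submit the prebuilt RLHF pipeline with the
# Google Cloud Pipeline Components library and the Vertex AI SDK.
# All dataset paths, the project/region, and the parameter values below are
# illustrative placeholders, not values taken from the course materials.
from google_cloud_pipeline_components.preview.llm import rlhf_pipeline
from google.cloud import aiplatform
from kfp import compiler

# Compile the prebuilt RLHF pipeline definition to a local YAML spec.
compiler.Compiler().compile(
    pipeline_func=rlhf_pipeline,
    package_path="rlhf_pipeline.yaml",
)

# Submit the compiled spec as a Vertex AI pipeline job.
aiplatform.init(project="my-project", location="us-central1")  # placeholder project/region
job = aiplatform.PipelineJob(
    display_name="rlhf-llama-2-tuning",
    template_path="rlhf_pipeline.yaml",
    pipeline_root="gs://my-bucket/pipeline-root",  # placeholder GCS bucket
    parameter_values={
        # Illustrative inputs: a preference dataset for reward-model training,
        # a prompt dataset for the reinforcement-learning stage, and the base
        # model reference to tune. Names assumed from the preview API; verify
        # against the library's documentation.
        "preference_dataset": "gs://my-bucket/data/preference.jsonl",
        "prompt_dataset": "gs://my-bucket/data/prompts.jsonl",
        "large_model_reference": "llama-2-7b",
        "reward_model_train_steps": 1000,
        "reinforcement_learning_train_steps": 1000,
        "kl_coeff": 0.1,
    },
)
job.run()
```

Running the job trains a reward model on the preference data and then applies reinforcement learning to the base model on the prompt data, after which the tuned model's outputs can be compared against the base model's.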