Illustrating Reinforcement Learning from Human Feedback (RLHF)
Language models have shown impressive capabilities in the past few years, generating diverse and compelling text from human input prompts. However, what makes a "good" text is inherently hard to define, as it is subjective and context-dependent. There are many applications with different requirements: story writing, where you want creativity; informative text, which should be truthful; or code snippets, which should be executable.