Yesterday @mattshumer released mattshumer/Reflection-Lla... | Yesterday @mattshumer released mattshumer/Reflection-Lla...
Yesterday @mattshumer released
mattshumer/Reflection-Llama-3.1-70B
, an impressive model that achieved incredible results in benchmarks like MMLU. The model was fine-tuned using Reflection-Tuning and the dataset used wasn't released, but I created a small recipe with distilabel that allows generating a dataset with a similar output format:

1. We use MagPie ๐Ÿฆ in combination with
meta-llama/Meta-Llama-3.1-70B-Instruct
to generate reasoning instructions.
2. We generate a response again using
meta-llama/Meta-Llama-3.1-70B-Instruct
, but we steer the LLM to generate an specific output format using a custom system prompt. In the system prompt, we instruct the LLM that it will have first to think ๐Ÿ’ญ and have reflections that will help resolving ambiguities. After that, we instruct the LLM to generate an output based on the previous thinking

In this dataset
gabrielmbmb/distilabel-reflection-tuning
you can found 5 rows that I generated with this recipe. You can also found the code of the pipeline in the file called reflection.py.