Yesterday @mattshumer released mattshumer/Reflection-Lla...

Yesterday @mattshumer released
mattshumer/Reflection-Llama-3.1-70B
, an impressive model that achieved incredible results in benchmarks like MMLU. The model was fine-tuned using Reflection-Tuning and the dataset used wasn't released, but I created a small recipe with distilabel that allows generating a dataset with a similar output format:

1. We use MagPie 🐦 in combination with
meta-llama/Meta-Llama-3.1-70B-Instruct
to generate reasoning instructions.
2. We generate a response again using
meta-llama/Meta-Llama-3.1-70B-Instruct
, but we steer the LLM to generate an specific output format using a custom system prompt. In the system prompt, we instruct the LLM that it will have first to think 💭 and have reflections that will help resolving ambiguities. After that, we instruct the LLM to generate an output based on the previous thinking

In this dataset
gabrielmbmb/distilabel-reflection-tuning
you can found 5 rows that I generated with this recipe. You can also found the code of the pipeline in the file called reflection.py.