A 7B LLM, Code Execution, and synthetic data are all you need!

NuminaMath 7B is a task-specific LLM that can solve complex math problems better than most high school students! It uses tool-integrated reasoning to solve problems by applying Chain-of-Thought reasoning and Python REPLs in an agentic flow with self-healing.

NuminaMath 7B TIR solves math problems by:
1. Generating Chain-of-Thought (CoT) reasoning on how to approach the problem.
2. Translating the CoT into Python code.
3. Executing the Python code in a REPL.
4. If execution fails, self-healing: it repeats steps 1-3, using the failed output as extra context.
5. If execution succeeds, generating a nice response with the result.
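
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the NuminaMath implementation: `generate` is a hypothetical callable standing in for the model, `max_rounds` is an assumed retry budget, and the `output` block appended to the transcript is an assumed convention for feeding results back. The `stub_model` exists only so the sketch runs without a real LLM, and `exec` is used for brevity where a real system would sandbox the code.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built at runtime so this example does not nest code fences
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def run_python(code):
    """Run code in a fresh namespace; return (ok, stdout or error message)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {"__name__": "__tir__"})  # NOT sandboxed: sketch only
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, buffer.getvalue().strip()


def solve(problem, generate, max_rounds=4):
    """Steps 1-5: reason, write code, execute, retry on failure, answer."""
    transcript = problem
    for _ in range(max_rounds):
        completion = generate(transcript)      # steps 1-2: CoT plus code
        transcript += "\n" + completion
        match = CODE_BLOCK.search(completion)
        if match is None:
            return None                        # model produced no code
        ok, output = run_python(match.group(1))  # step 3: execute
        # Feed the result (or the error) back so the next round can self-heal.
        transcript += "\n" + FENCE + "output\n" + output + "\n" + FENCE
        if ok:
            return output                      # step 5: success
    return None                                # retry budget exhausted


def stub_model(transcript):
    """Hypothetical stand-in for the LLM: fails once, then fixes itself."""
    if "ZeroDivisionError" in transcript:
        return "Avoid the division.\n" + FENCE + "python\nprint(6 * 7)\n" + FENCE
    return "Try dividing.\n" + FENCE + "python\nprint(1 / 0)\n" + FENCE


if __name__ == "__main__":
    print(solve("What is 6 times 7?", stub_model))  # prints 42
```

The first round raises `ZeroDivisionError`, the error text is appended to the transcript, and the second round produces code that runs cleanly.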

Model TL;DR:
> Fine-tuned from deepseek-math-7b-base
> Won the first progress prize in the AI Math Olympiad (AIMO)
> Built a large synthetic dataset following the ToRA paper
> Trained in two stages using Supervised Fine-Tuning on the Hugging Face cluster
> Utilizes tool-integrated reasoning with Python REPL
> Available on @huggingface