GRPO

January 22, 2025, updated November 19, 2025

[2402.03300] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

[x.com](https://x.com/Hesamation/status/1883992881914077493)