Test Time Compute, LLM Reasoning, Inference Time Scaling
- Scaling Test Time Compute: How o3-Style Reasoning Works (+ Open Source Implementation) - YouTube
- o3 (Part 1): Generating data from multiple sampling for self-improvement + Path Ahead - YouTube
- o3 (Part 2) - Tradeoffs of Heuristics, Tree Search, External Memory, In-built Bias - YouTube
Papers
- [2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
- [2408.03314] Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
- [2410.10630] Thinking LLMs: General Instruction Following with Thought Generation
- [2411.19865] Reverse Thinking Makes LLMs Stronger Reasoners
- [2411.04282] Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
- [2409.15254] Archon: An Architecture Search Framework for Inference-Time Techniques
- [2409.12917] Training Language Models to Self-Correct via Reinforcement Learning
- [2412.18925] HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
- [2412.14135] Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
- [2412.21187] Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
- [2501.02497] Test-time Computing: from System-1 Thinking to System-2 Thinking
- [2501.04682] Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
- [2501.04519] rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
- [2501.09891] Evolving Deeper LLM Thinking
- [2501.19393] s1: Simple test-time scaling
- [2502.01839] Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification
Image Generation
Open Source
- 🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power! | DeepSeek API Docs
- Scaling test-time compute - a Hugging Face Space by HuggingFaceH4
- GitHub - NovaSky-AI/SkyThought: Sky-T1: Train your own O1 preview model within $450
- GitHub - Jiayi-Pan/TinyZero: Clean, accessible reproduction of DeepSeek R1-Zero
- GitHub - volcengine/verl: veRL: Volcano Engine Reinforcement Learning for LLM
- GitHub - huggingface/open-r1: Fully open reproduction of DeepSeek-R1
Reasoning Distillation
Related
Reasoning Datasets
- ServiceNow-AI/R1-Distill-SFT · Datasets at Hugging Face
- open-thoughts/OpenThoughts-114k · Datasets at Hugging Face
- cognitivecomputations/dolphin-r1 · Datasets at Hugging Face
Research
STaR
We propose a technique to iteratively leverage a small number of rationale examples and a large dataset without rationales, to bootstrap the ability to perform successively more complex reasoning. This technique, the “Self-Taught Reasoner” (STaR), relies on a simple loop: generate rationales to answer many questions, prompted with a few rationale examples; if the generated answers are wrong, try again to generate a rationale given the correct answer; fine-tune on all the rationales that ultimately yielded correct answers; repeat. We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30× larger state-of-the-art language model on CommonsenseQA. Thus, STaR lets a model improve itself by learning from its own generated reasoning.
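
The loop described above maps onto a short program. Below is a minimal sketch in Python, assuming hypothetical `sample_rationale` and `finetune` callables that stand in for an actual LLM sampling and fine-tuning stack; the toy driver at the bottom exists only to make the control flow runnable, not to reproduce the paper's results.

```python
import random
from typing import Callable, Optional

# Hypothetical type aliases for readability; not part of the STaR paper.
Question = str
Answer = str
Rationale = tuple[str, Answer]  # (chain of thought, final answer)


def star(
    sample_rationale: Callable[[Question, Optional[Answer]], Rationale],
    finetune: Callable[[list[tuple[Question, str]]], None],
    dataset: list[tuple[Question, Answer]],
    num_iterations: int = 3,
) -> None:
    """Outer STaR loop: generate -> filter by correctness -> fine-tune -> repeat."""
    for _ in range(num_iterations):
        kept: list[tuple[Question, str]] = []
        for question, gold in dataset:
            # 1. Try to solve with few-shot rationale prompting (no hint).
            thought, answer = sample_rationale(question, None)
            if answer != gold:
                # 2. "Rationalization": retry with the correct answer as a hint.
                #    The hint is visible at generation time but is not kept in
                #    the fine-tuning example.
                thought, answer = sample_rationale(question, gold)
            if answer == gold:
                # 3. Keep only rationales whose final answer is correct.
                kept.append((question, thought))
        # 4. Fine-tune on the kept rationales (the paper restarts each
        #    iteration from the original base model, not the previous one).
        finetune(kept)


if __name__ == "__main__":
    # Toy driver: a fake "model" that succeeds half the time unhinted and
    # always succeeds when handed the answer as a hint.
    data = [("2+3", "5"), ("4*4", "16"), ("7-2", "5")]

    def toy_sampler(q: Question, hint: Optional[Answer]) -> Rationale:
        solved = hint is not None or random.random() < 0.5
        ans = str(eval(q)) if solved else "?"  # eval is safe on this toy data
        return (f"compute {q} step by step", ans)

    star(toy_sampler, lambda ex: print(f"fine-tune on {len(ex)} rationales"), data)
```

The filtering step is what makes the loop self-improving: only rationales that reach the correct answer enter the fine-tuning set, so the model is never trained on reasoning it got wrong.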
