This research paper argues against the "reasoning/thinking" hype: intermediate tokens often lack substance, despite appearances.
Summary
- Anthropomorphizing intermediate tokens as "reasoning/thinking traces" is misleading, lacks evidence, and leads to questionable research directions.
- Intermediate tokens often lack semantics or causal connection to model outputs, undermining claims of capturing "human-like" reasoning.
- Growth in the length of intermediate token sequences during training does not necessarily reflect greater "reasoning effort"; it can arise from simplistic reward structures.
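
To make the last point concrete, here is a minimal sketch of one way a simple reward structure could favor longer outputs. It is an illustration under stated assumptions, not the paper's own analysis: it assumes a binary correctness reward, a mean-reward baseline over a group of sampled responses, and a loss that averages each response's advantage over its tokens. The function name `per_token_weights` and all numbers are hypothetical.

```python
from typing import List


def per_token_weights(rewards: List[float], lengths: List[int]) -> List[float]:
    """Per-token learning signal when a sequence-level advantage is
    averaged over the tokens of each response (assumed setup)."""
    baseline = sum(rewards) / len(rewards)        # group mean reward
    advantages = [r - baseline for r in rewards]  # one advantage per response
    # Length normalization: each token receives advantage / length.
    return [a / n for a, n in zip(advantages, lengths)]


# Hypothetical group: one correct response, three incorrect ones of growing length.
rewards = [1.0, 0.0, 0.0, 0.0]
lengths = [200, 100, 400, 1600]
weights = per_token_weights(rewards, lengths)

for r, n, w in zip(rewards, lengths, weights):
    print(f"reward={r:.0f} length={n:5d} per-token weight={w:+.6f}")
```

Under these assumptions, every incorrect response gets the same sequence-level penalty, but the longest one is penalized least per token. A model can therefore lower its per-token penalty on failures simply by producing more tokens, with no change in how well it solves the problem.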