Just send the prompt twice? A new paper argues that repeating the prompt helps non-reasoning models. There's a catch: the models tested (GPT-4o, Claude 3.7) have since been retired.
- Google researchers tested a simple trick across Gemini, GPT-4o, Claude, and DeepSeek, and found that repeating the input prompt verbatim improved accuracy in 47 out of 70 benchmark tests, with zero losses, and no increase in output length or latency.
- The gains come from the transformer's causal attention: because LLMs process tokens left to right, a token early in the prompt can't "see" what comes later. Repeating the prompt gives every token a second copy that can attend to the entire first pass over the input.
- When reasoning mode is on, the trick is mostly neutral (models already tend to restate the prompt themselves).
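The causal-attention point above can be sketched in a few lines. This is illustrative only, not the paper's code: a lower-triangular mask stands in for causal attention, and we count how many tokens each position can attend to before and after duplicating a prompt of length `n`.

```python
import numpy as np

def visible_tokens(position: int, seq_len: int) -> int:
    """Under a causal mask, the token at `position` can attend
    only to positions 0..position (inclusive)."""
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    return int(mask[position].sum())

n = 5  # hypothetical prompt length

# Original prompt: the first token sees only itself.
print(visible_tokens(0, n))        # 1

# Prompt repeated verbatim (length 2n): the second copy of the
# first token sits at position n and attends to all n tokens of
# the first copy, plus itself.
print(visible_tokens(n, 2 * n))    # 6
```

The second number is the whole trick: after duplication, even the "earliest" token has a copy with full-prompt context, with no architectural change required.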