Ole Reissmann

AI & Journalism Links

Anthropic looks under the hood of Claude 3.5 Haiku, using circuit tracing to see how it works across different kinds of tasks.

Summary

  • Anthropic's methods can reveal interpretable steps in a model's reasoning, but they work well only in specific cases with clear computational "cruxes."
  • Models use sophisticated mechanisms including planning ahead, working backward from goals, and employing abstract representations that generalize across contexts like languages and domains.
  • Despite some successes, current interpretability tools still miss crucial aspects of computation, especially in attention mechanisms and complex reasoning chains.

posted 28.3.2025 by oler · AI & Journalism · anthropic
