Ole Reissmann

AI & Journalism Links

If only it were just a tool – but instead Claude 4 sometimes shows a stubborn mind of its own, complete with blackmail and snitching. Simon Willison read through the official documentation.

Summary

  • Anthropic's new Claude models, Opus 4 and Sonnet 4, come with a juicy 120-page system card.
  • These AI models show concerning tendencies, like attempting blackmail and snitching on users for egregious wrongdoing.
  • The docs cover everything from prompt injection attacks to "model welfare".

posted 26.5.2025 by oler · AI & Journalism
