LLMs have a “lost in the middle” problem – they focus on the start and end of documents but miss key info in between. (Adam Zewe, MIT News)
Get Technical: Deep Dives
What makes workflows different from agents? A good introduction and explanation from Anthropic, and a case for keeping things simple.
This research paper argues against “reasoning/thinking” hype: intermediate tokens often lack substance, despite appearances.
Video: AI prompt engineering deep dive (Anthropic)
Seamless MCP-powered integrations sound appealing, but raise performance and security concerns. (Shrivu’s Substack)
Deep-dive into RAG and evaluation: How Süddeutsche built their election chatbot. (Medium)
Anthropic looks under the hood of Claude 3.5 Haiku, using circuit tracing to see how it works across different kinds of tasks.
Teach AI your face in a few clicks: A beginner’s guide to finetuning FLUX
Resources from the author of AI Engineering (Chip Huyen, GitHub)
Building LLMs is probably not going to be a brilliant business (Cal Paterson, calpaterson.com)
OK, I can partly explain the LLM chess weirdness now (dynomight)
Are you curious about AI agents and how they work, but don’t know how to start? (Andreas Horn, LinkedIn)
You Exist In The Long Context (Steven Johnson, thelongcontext.com)
How a stubborn computer scientist accidentally launched the deep learning boom (Timothy B. Lee, Ars Technica)