A visit to the nonprofit that powers most of today’s AI training: Common Crawl ingests the web, journalism and all, unapologetically. The article's treatment of fair use and of other archives falls a bit short, but it’s a good read. (Alex Reisner, The Atlantic)