Hot off the presses! Had to get this one out today because the ruling is such a big deal.
Hope you enjoy, and back to our regular programming soon…
I've been following AI copyright cases closely, and Judge William Alsup just handed down what I believe will be one of THE landmark AI decisions we'll see this decade. The federal court's split ruling in Bartz v. Anthropic does something remarkable: it validates AI training as fair use while simultaneously condemning the piracy that often enables it. This isn't just a win or loss for Anthropic—it's a blueprint for how courts will likely approach the dozens of AI copyright cases working their way through the system.
The Solomon's Choice of AI Copyright
I find Alsup's decision fascinating because it splits the baby with surgical precision. Yes, training Claude on millions of books constitutes fair use. No, downloading those same books from pirate sites doesn't get a free pass. The distinction matters because it fundamentally reshapes how AI companies must think about data acquisition.
The judge's reasoning on fair use particularly struck me. He describes AI training as "quintessentially transformative," comparing it to how human writers learn from reading. "Everyone reads texts, too, then writes new texts," Alsup writes. "To make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable."
This analogy—AI as reader learning to write—provides the conceptual foundation that I think AI companies have been desperately seeking. It's not about copying; it's about learning patterns, understanding language, and creating something fundamentally new.
The Million-Dollar Pivot
Here's where I think Anthropic's story gets really interesting. After building their initial models on pirated content from Books3, Library Genesis, and other dubious sources, the company made a dramatic shift in 2024. They hired Tom Turvey, the former head of Google's book-scanning project, with a mandate to obtain "all the books in the world" through legitimate means.
Anthropic then spent millions of dollars purchasing physical books, many of them second-hand, which they proceeded to slice from their bindings and scan into digital format. The physical books were destroyed in the process, but the court ruled that the resulting digital copies were legitimate fair use. This expensive pivot from piracy to purchase reveals something I've been saying for a while: AI companies can afford to do this right. They're choosing not to.
The court explicitly noted this financial capability, observing that Anthropic's later purchases of books they'd previously pirated "will not absolve it of liability for the theft but it may affect the extent of statutory damages." Translation: we see you had the money all along.
What This Means for Authors
I think this ruling offers a glimmer of hope for authors who've watched AI companies feast on their work without compensation. While the fair use ruling means authors can't stop AI training entirely, the court's condemnation of piracy and validation of legitimate book purchases creates real market incentives.
Consider what Anthropic's spending reveals: they paid millions for books, often buying used copies at market rates. That money flows back into the book ecosystem, to retailers and distributors, ultimately supporting the market for authors' works. If every AI company followed this model instead of scraping pirate sites, we'd see a substantial new revenue stream for the publishing industry.
Moreover, the ruling's emphasis on transformation rather than reproduction protects authors' core market. The court stressed that it matters whether AI systems "directly compete with the originals." Since Claude doesn't spit out verbatim passages from novels, the technology complements rather than replaces human authorship.
I see this as establishing a sustainable equilibrium: AI companies must pay for access to training materials, supporting the creative economy, while authors benefit from AI tools that help readers discover and engage with human-written works.
The Domino Effect
This ruling's impact extends far beyond Anthropic's legal troubles. I'm watching several major cases that will likely cite this precedent:
The OpenAI Cases: Multiple lawsuits against OpenAI, including from the Authors Guild and various publishers, hinge on similar fair use arguments. Alsup's framework—distinguishing between training use and acquisition methods—gives OpenAI a potential path to victory, assuming they can demonstrate legitimate data sourcing.
Kadrey v. Meta: The lawsuit against Meta for training LLaMA on Books3 (the same dataset Anthropic used) now faces an interesting precedent. Meta might win on fair use for training but could still face liability if they retained pirated materials in a permanent library.
The Stability AI Litigation: Visual AI companies face additional complexities, but I think Alsup's "transformative use" reasoning could extend to image generation models that learn artistic styles without reproducing specific works.
The New Compliance Playbook
From my reading of Alsup's ruling, he's effectively created a compliance roadmap for AI companies:
Training on copyrighted works? Probably fine, as long as your model doesn't reproduce those works verbatim. (The sketch after this list shows what checking for that might look like.)
Building a permanent library of pirated content? Definitely not fine, even if you only use it for training.
Want to avoid liability? Buy the books. Or license them. Or use legitimately free sources. But stop pretending piracy is a necessary evil.
Already have pirated content? Delete it after training. The court's distinction between temporary training copies and permanent library storage offers a potential safe harbor.
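The first item in that playbook, that the model must not reproduce works verbatim, implies a concrete engineering task: auditing model outputs for memorized training text. Here's a minimal sketch of what such a check might look like. To be clear, everything in it is my own illustration, not anything from the ruling or from Anthropic's actual tooling; the whitespace tokenizer and the 50-token span threshold are arbitrary assumptions.

```python
# Illustrative memorization check: does a model output contain any long
# verbatim span from a protected source text? The 50-token threshold and
# whitespace tokenization are assumptions for this sketch, not legal
# standards from the ruling.

def contains_verbatim_span(output: str, source: str, n: int = 50) -> bool:
    """Return True if `output` shares any contiguous n-token span with `source`."""
    out_tokens = output.split()
    src_tokens = source.split()
    if len(out_tokens) < n or len(src_tokens) < n:
        return False
    # Index every n-gram in the source, then scan the output for a hit.
    source_ngrams = {tuple(src_tokens[i:i + n])
                     for i in range(len(src_tokens) - n + 1)}
    return any(tuple(out_tokens[i:i + n]) in source_ngrams
               for i in range(len(out_tokens) - n + 1))
```

A production audit would hash n-grams across millions of source texts and sample outputs at scale, but the principle is the same: learning patterns is defensible, regurgitating passages is not.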
The Billion-Dollar Question
With damages still to be determined, I calculate Anthropic faces potential liability in the billions. Statutory damages start at $750 per infringed work and can reach $150,000 per work for willful infringement, and we're talking about millions of books; the back-of-envelope sketch below shows how fast that compounds. This creates a powerful deterrent effect: train responsibly or face existential financial risk.
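To make the scale concrete, here's a rough back-of-envelope sketch. The per-work dollar figures are the actual statutory bounds under 17 U.S.C. § 504(c); the work counts are hypothetical assumptions of mine, since how many registered works qualify is precisely what the damages phase will determine.

```python
# Back-of-envelope statutory damages. The dollar figures are the real
# bounds from 17 U.S.C. § 504(c); the work counts are hypothetical
# assumptions, not findings from the case.
STATUTORY_MIN = 750        # floor per infringed work
WILLFUL_MAX = 150_000      # ceiling per work for willful infringement

for works in (100_000, 1_000_000, 7_000_000):   # assumed qualifying works
    low = works * STATUTORY_MIN
    high = works * WILLFUL_MAX
    print(f"{works:>9,} works: ${low:>15,} (floor) to ${high:>19,} (willful max)")
```

Even at the $750 floor, a million qualifying works means $750 million in exposure, and a willfulness finding pushes the theoretical ceiling past a trillion dollars at the high end. No court would award anywhere near that, but the spread itself is the deterrent.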
The message I'm taking from this ruling is clear: the transformative nature of your technology doesn't give you a free pass to transform other people's property into your training data through illegal means. Fair use protects the learning, not the theft.
As more courts adopt Alsup's framework, I expect we'll see a fundamental shift in how AI companies approach data acquisition. The days of "download first, ask permission never" are ending. The future belongs to companies willing to do what Anthropic eventually did: open their wallets and do things the right way.
Even if it means destroying a few million books in the process—at least authors got paid for them.
The Bigger Picture
I should note that while Alsup's ruling gives us crucial clarity, we're still in the early innings of this game. This is just one federal district court's take, and I've learned to be cautious about declaring victory based on a single decision. We've got dozens of similar cases working through different courts, each with their own judges who might see things differently.
I'm particularly aware that the Northern District of California doesn't speak for the entire country. We're already seeing circuit splits on related AI issues—the Ninth Circuit requires "actual knowledge" for contributory infringement while the Second Circuit only needs "reason to know." That's a big difference when we're talking about platforms hosting AI tools.
What's more, this ruling really only addresses text-based AI training. I'm left wondering how courts will handle visual AI systems, code generation, or the myriad other AI applications we're seeing. Fair use is notoriously fact-specific, and what works for Claude might not work for DALL-E or Copilot.
I expect we'll see Anthropic appeal to the Ninth Circuit, which could modify or even reverse parts of Alsup's reasoning. And honestly? I think we'll eventually need either the Supreme Court to step in and create nationwide clarity, or Congress to pass actual AI legislation. Neither seems likely in the near term, which means we're in for years of case-by-case battles as the law slowly catches up to the technology.
Still, Alsup's decision represents real progress. It's the first federal court to tackle these questions head-on, and that matters—even if it's just the opening chapter of a much longer story.