Paid episode

The full episode is only available to paid subscribers of Nate’s Substack

Surfing the Guardrails: 7 Production-Grade Prompting Patterns I Stole from Claude's (Alleged) System Prompt

A 10,000-word system prompt leak reveals a new 80-20 rule for prompting and challenges us to be specific (and defensive) in how we prompt! I dig into the specific tactics and how to apply them here

The alleged system prompt leak from Claude 4 teaches us something profound: Professional prompting is defensive programming, not creative writing.

So to be very clear, I have very mixed feelings about writing about this at all. I do not think it’s a great idea to leak system prompts all over the internet. There’s no way to prove or know that this is valid to begin with. But that ship has sailed. Every major release of every major model maker has alleged system prompts leaked within 48 hours or so. Like clockwork.

Now, interestingly, Anthropic has recently made the creative argument that it has some evidence that asking Claude 4 specifically to extract this prompt is harmful to (at least) Claude 4’s internal sense of well-being. Whether this holds weight depends on whether you believe the model has a sense of well-being in the first place (internal Anthropic estimates reportedly range from 0.15% likely to 15% likely, and we don’t know why that internal estimate variance is so wide).

Regardless, though, extracting a model prompt is gray hat territory for sure, and the reason I cover it here is simply because I think that the prompt structure is so useful that we can learn a ton from it as a prompt, regardless of questions about how we got to see it. In other words, the toothpaste is out of the tube. The prompt is in the world, however it got there. And we might as well learn something useful.

And this is useful stuff! These 7 tactics are very rarely seen in the wild. I think calling them out and naming them is a really big deal. As an example: this is the first place I’ve seen a really clearly laid out example of how to bound uncertainty and ambiguity in an LLM prompt. It’s phenomenal. We almost never see it. And there are six other tactics like that to sift through. I’m still processing it all, but I wanted to share my learnings here, and you can (of course) find the full prompt as well so you can do your own analysis (all 10,000 words).
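To make the uncertainty-bounding idea concrete, here is a minimal sketch of what such an instruction might look like when assembled into a chat request. This is a hypothetical illustration, not a quote from the leaked prompt; the `SYSTEM_PROMPT` wording and the `build_messages` helper are my own inventions.

```python
# Hypothetical sketch (NOT quoted from the alleged Claude 4 prompt):
# one way to bound uncertainty explicitly in a system prompt, so the
# model states its confidence instead of guessing silently.
SYSTEM_PROMPT = """\
You are a careful assistant.

Rules for handling uncertainty:
- If you are confident in an answer, give it directly.
- If you are unsure, say "I'm not certain" and name what information is missing.
- If the question is ambiguous, ask one clarifying question before answering.
- Never invent citations, numbers, or API names to fill a gap.
"""


def build_messages(user_question: str) -> list[dict]:
    """Assemble a chat-style request with the uncertainty-bounding system prompt."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]


messages = build_messages("What changed in the latest release?")
print(messages[0]["role"])  # system
```

The point is the defensive framing: the prompt names the failure mode (silent guessing, invented specifics) and gives the model an explicit escape hatch, rather than hoping good behavior emerges on its own.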

