
You're using the wrong kind of agent. Here's the one question that tells you which one you actually need + 3 diagnostic prompts

Dark factories aren't coding tools. Auto research isn't a dark factory. If those distinctions don't mean anything to you yet, don't worry — we'll get there.

There are four kinds of agents. Most people think there’s one.

I hear this confusion everywhere now. Senior engineers evaluating tools. PMs writing roadmaps that say “integrate agentic capabilities.” ICs on Twitter arguing about whether Cursor or CrewAI is “better” — as if that comparison even makes sense. Founders pitching “agent-powered” products without being able to explain what kind of agent they mean. The word “agent” has become the “cloud” of 2026. Everyone uses it. Almost nobody means the same thing.

I’ve watched someone try to use a dark factory to write a novel. I’ve watched someone reach for CrewAI to solve a coding problem that needed a single agent and a good test suite. I’ve watched someone point an auto research loop at writing new software from scratch — which, if you understand what auto research actually does, is like pointing a profiler at an empty file. There’s nothing to optimize.

Every one of those people thought they were using “agents.” They were. They just picked the wrong subspecies for the problem, and the reason they couldn’t tell the difference is that every landing page, every pitch deck, every hot take uses the same word to describe four architectures that have about as much in common as a forklift and a bicycle.

Analysts project the AI agent market will grow from roughly $8 billion in 2025 to over $50 billion by 2030 (MarketsandMarkets, BCC Research). The money is real. The confusion about where to point it is the expensive part.

Here’s what’s inside:

  • The four architectures. Coding harnesses, dark factories, auto research, and orchestration frameworks — what each one actually does, who’s using them in production, and what breaks when you pick the wrong one.

  • The Karpathy-Lütke-StrongDM-DocuSign map. How the people getting this right choose different tools for different problems — and why none of them confuse one architecture for another.

  • The one-question test. One diagnostic question that cuts through the ambiguity — works whether you’re an IC choosing tools or a CTO setting strategy.

  • The operating principles. Decomposition, specification-as-code, metric-plus-guardrail, and handoff contracts — one governing principle per architecture that determines whether it works or wastes your quarter.

  • The prompts. Three diagnostic prompts that classify your problem into the right architecture, pressure-test whether you’re actually ready for it, and catch mismatches before they cost you a quarter.

The taxonomy matters because the fix for each problem lives somewhere other than where most people are looking. Let me show you the map.


Continue reading this post for free, courtesy of Nate.