On February 5th, Anthropic dropped Claude Opus 4.6, and within the hour, OpenAI responded with GPT-5.3-Codex. The developer community spent the next three weeks running benchmarks, writing comparisons, and arguing about which model won.
They compared the wrong thing.
The model is the brain. The harness is everything else: where the AI actually runs, what it remembers between sessions, which tools it can touch, how it manages multiple tasks at once, and how deep your team’s dependency grows every week you build around it. Claude Code and Codex are making genuinely different bets about all of that. One works in your environment with full access to your machine, building up memory of your project over time. The other works in a sealed room with a copy of your code, thinks privately, and slides finished results under the door. Both bets are working. Neither is converging toward the other. And every week your team spends inside one harness is another week of compounding dependency that gets more expensive to undo.
One developer built six layers of workflow automation over a few months, each layer depending on the previous one. He couldn’t have started with the final version. It only works because it accumulated. If he switched harnesses tomorrow, every layer resets to zero. Now multiply that by every engineer on your team. That’s the lock-in nobody is pricing into their decisions, and it’s the most expensive blind spot in software tooling right now.
Here’s what’s inside:
The number that makes the thesis concrete. The same model scored 78% in one harness and 42% in another. Your evaluation process wouldn’t catch the difference.
Five architectural decisions locking your team in right now. Whether you chose them deliberately or not, each one compounds separately. All five compound together.
The $2 billion company spending 100% of its revenue on API costs. What Cursor’s trajectory reveals about the economics everyone is ignoring.
A harness audit that scores your lock-in across five dimensions and routes your actual work to the right tool. The prompt asks what you’ve built, maps where you’re exposed, and tells you what to do about it this week.
An executive brief generator that translates your audit into engineering-weeks and dollars. For getting leadership aligned on the harness decision as a strategic commitment, not a tool purchase.
The prompts in the kit are where this stops being an argument and becomes a decision you can actually make. Let’s get into it.
Listen to this episode with a 7-day free trial
Subscribe to Nate’s Substack to listen to this post and get 7 days of free access to the full post archives.













