AI Lab Trust Report 2025: Ranking OpenAI, Google, Anthropic, Meta, xAI

The 2025 scorecard rates each lab on transparency, alignment, and delivered performance—helping you separate genuine breakthroughs from smoke-and-mirrors marketing

There’s something uniquely problematic about buying intelligence.

When you purchase software or hire consultants, you test capabilities, review track records, and assess performance directly. Transactions build trust through clarity: everyone knows precisely what’s on offer.

But AI models flip this paradigm entirely. You’re acquiring a promise—an opaque bundle of potential wrapped in PR narratives, ambitious demos, and benchmark scores. The intelligence inside is hidden, making transparency not just scarce but often intentionally elusive.

Recent weeks have brought this issue into sharp focus.

First, developers erupted over Cursor’s opaque pricing shift (3x in one month!)

Later, developers questioned Anthropic’s Claude Code stability, desperate for transparency about sudden performance changes.

Then on Saturday came OpenAI’s announcement that it had achieved gold-medal performance at the International Math Olympiad. Tech circles were genuinely astonished. Overnight, AI had seemingly matched the world’s brightest mathematicians; odds on betting platforms briefly soared to around 70 percent before crashing as the scope of the claim became known.

The excitement dimmed swiftly when mathematicians like Terence Tao asked a simple yet devastating question: Without knowing how the results were achieved, how can we properly evaluate their significance?

The Math Olympiad organizers revealed another uncomfortable truth: OpenAI had disregarded a request to delay the announcement out of respect for the kids, a request Google had honored.

There is a crisis of trust in intelligence, because intelligence is not something we can easily see, touch, or measure. (No, test scores don’t count.) In the absence of clarity, we need to develop our own intuition about where and how we can trust model makers (or not).

So OpenAI wasn’t building trust on that one.

But the Math Olympiad moment also shone the spotlight on another critical trust divide: tech largely celebrated the outcome uncritically, while it was up to mathematicians to raise questions.

Both had their points:

  1. Tech was right that if OpenAI has developed a new reinforcement learning technique that enables extended focus and strong logical performance without tools, that’s a big deal!

  2. Mathematicians were right to question OpenAI’s methodology and ask whether the achievement was as universally applicable as OpenAI hinted (and many in tech claimed).

Neither was wrong; both were evaluating different criteria, and that’s a massive trust breaker for AI. The problem is that tech can’t build trust if it can’t communicate clearly what’s useful.

The gap between promises and reality means those of us doing work with AI have to develop our own heuristics or rules of thumb to figure out who or what to trust.

In this post, I reveal my personal model-maker scorecard: how I decide who to trust, and why.

The article that follows is a full breakdown of my 2025 trust scorecard for each model lab. If we’re buying intelligence from them, I figure we should ask honestly what we can trust (or not) about each of them.

Grading the model makers matters because the problem isn’t isolated—it’s consistent across AI labs like Meta, Anthropic, Google, OpenAI, and xAI. Each lab carries a distinct DNA for managing the gap between vision and delivery. Recognizing these patterns isn’t merely insightful; it’s essential for anyone betting their future on AI (and increasingly that’s a lot of us).

Ultimately, trust isn’t about believing promises; it’s about decoding patterns based on the intelligence actually delivered. Let’s dive in and see a scorecard for all five model makers…
