One Sharp Thought: Voice Mode & Latent Space
What if we never really knew the AI because we didn't know how to talk to it?
OpenAI has been rolling out Voice Mode for ChatGPT Plus users, and it's making waves in AI interaction. After the initial hype months ago (including the talk-show circuit), we finally get to talk to our machines like people.
I actually think that's good. Voice Mode changes things. Talking to an AI feels different from typing, not because the AI is different but because I am. I feel more relaxed and natural. I notice myself asking different kinds of questions: I ask about vacations. I say please and thank you more. I ask for brainstorming more.
And the AI keeps up. Voice Mode is only available with GPT-4o for now, but I honestly felt like I was talking to a slightly clangy person. There was a very slight chromium artificiality to how polished the answers were, but not an uncanny-valley, "what is going on" vibe.
I think the pauses work in my brain's favor here. So do the haptics. I get a few taps while it "thinks," so I pay attention, and the pause feels more natural in conversation than it does when typing. It makes the exchange feel more human.
And this got me thinking about latent space. I'm used to wondering whether I'm phrasing and framing questions to leverage an AI's latent space effectively. I'm not used to asking whether my mental model of personhood is getting in the way of exploring LLM latent space and getting full value from these new machine comrades.
What are you learning?
Today’s Sharp Thought: Voice interactions with AI change our subconscious mental model of AI. And that matters. For design, sure. For product, sure. For workflows, sure. But I think Voice Mode is bigger. It’s going to shape the way we think about intelligence, because we’ve never had something this smart to talk to besides ourselves.