Move Fast & Unbreak Things

We shipped a new buyer portal in two weeks. Start to finish, blank slate to a thing that buyers can actually use. The team was three engineers and a designer, but most of the implementation was driven by agents.

I want to write about what surprised me, which wasn't the velocity. Velocity is the easy story and frankly it's been done to death. The interesting bit is what shorter iteration cycles do to the calculus behind "move fast and break things", which I'd never really interrogated until I found myself living inside it.

The original credo, briefly

Facebook's old motto wasn't really a licence to be careless. It was a bet: if we ship more and learn more, the value of that velocity will outpace the cost of the things we break along the way. Shipping was expensive (relative to today), so each release cycle had to count, and the breakage you accepted was the breakage that came as a necessary side-effect of emphasising immediate value over complete certainty.

The credo always struck me as slightly defensive, actually. It was less "go ahead and break things" and more "please understand that breakages are going to happen if we prioritise speed, and that the alternative is worse". An organisational permission slip rather than an engineering principle.

What happens when the cycle compresses

Here's the thing. With agent-led implementation, our cycle from "noticing a flow feels wrong" to "the flow no longer feels wrong" went from days to about forty minutes on average. Sometimes ten.

That sounds like a velocity story. It isn't, or rather it isn't only that. The more interesting consequence is that the entire economic argument for "move fast and break things" stops applying the way it used to.

The old equation, roughly: velocity gain outweighs breakage cost. You tolerate breakage because the fix loop is slow and the velocity matters more.

The new one: velocity gain still outweighs breakage cost, but breakage cost itself now sits close to zero. You don't tolerate breakage; you fix it, immediately. The credo isn't about accepting breakage as the price of speed any more. The speed is now so high that you can fix the breakage inside the same sprint, the same afternoon, sometimes the same hour.

So "move fast and break things" becomes a half-statement. The back half ("and break things") was load-bearing in 2010 because you couldn't really have one without the other. Now you can. You can move fast and fix things, in the same breath, before anyone notices. It's a genuinely different equation rather than an incremental improvement on the old one.
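
To put rough numbers on the two equations, here is a back-of-envelope sketch in Python. It is an illustration, not a measurement. It assumes breakage cost scales linearly with how long a breakage stays unfixed, reads "days" as two days, and uses an arbitrary cost unit. The function name and all three assumptions are mine, not something we tracked.

```python
def breakage_cost(breakages: int, hours_to_fix: float,
                  cost_per_broken_hour: float = 1.0) -> float:
    """Hypothetical model: cost grows linearly with time spent broken."""
    return breakages * hours_to_fix * cost_per_broken_hour


# Three breakages, as in this project.
# ASSUMPTION: "days" is taken to mean two days (48 hours) per fix loop.
old_cost = breakage_cost(breakages=3, hours_to_fix=48.0)

# The forty-minute round trip described above, expressed in hours.
new_cost = breakage_cost(breakages=3, hours_to_fix=40 / 60)

print(f"old loop: {old_cost:.1f} cost units")
print(f"new loop: {new_cost:.1f} cost units")
print(f"ratio:    {old_cost / new_cost:.0f}x")
```

Under those assumptions the same three breakages cost roughly seventy times less. The exact ratio matters less than the shape: the breakage count stays the same while the term it is multiplied by collapses.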

What we actually broke

We broke a few things: an empty-state handling edge case, a timezone bug, and a data boundary issue which the QA gate caught before any customer saw it. Each of those would have been a major story in a slower-cycle world. Instead they were each a forty-minute round trip. What changed isn't the rate of breakage. Agents write more code, more tests, and more documentation than any human developer could in an equivalent amount of time, so they make more mistakes and catch more mistakes. What changed is the cost of each breakage.

The part that's actually challenging

The part that poses an actual risk, and I'll be honest about it, is that the things you break get harder to notice in proportion to how fast you fix them. There's no window for someone else to spot the bug. It lives in a branch for ten minutes and then it's gone, and if you didn't write it down anywhere you've also gone past your chance to learn from it.

So the discipline shift, I think, is the move from "review every change carefully because changes are rare" to "write down every interesting failure because they are otherwise lost". The unit of attention shifts from the change to the lesson.

A few things that worked for us:

Pick a scope where breakage is recoverable, and keep the rapid loop on prospective new functionality only. An internal-facing buyer portal v1 with a small pilot cohort behind a QA gate, yes. A live payments flow touching customer money, or anything safety-critical, no, or at least not yet, and not the same way.
Have one human accountable for the shape of the product, not the implementation. The agents will gladly implement five different versions of a screen, but they will not tell you which one is right.
Write down the failure modes as you find them, even the small ones. Especially the small ones. They're the ones that would have been a code review comment in the old world, and now nobody will ever see them unless you say so.
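
Writing failures down can be as lightweight as an append-only log that lives next to the code. Below is a minimal sketch in Python of what that might look like. It is not the tooling we used. The names `FailureNote`, `record_failure` and `load_failures`, the field set, and the one-JSON-object-per-line file layout are all hypothetical choices made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class FailureNote:
    """One interesting failure, captured before the fix erases it."""
    summary: str         # what broke, in one sentence
    caught_by: str       # e.g. "QA gate", "engineer", "agent test run"
    lesson: str          # what to do differently next time
    minutes_to_fix: int  # round trip from noticing to fixed
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def record_failure(note: FailureNote, log_path: Path) -> None:
    """Append the note as one JSON line so the log stays easy to grep and diff."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(note)) + "\n")


def load_failures(log_path: Path) -> list[FailureNote]:
    """Read every note back in the order it was written."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [FailureNote(**json.loads(line)) for line in f if line.strip()]
```

A call such as `record_failure(FailureNote("Data boundary issue", "QA gate", "<what you learned>", 40), Path("failures.jsonl"))` takes seconds, which is about the budget a ten-minute fix leaves you. Reading the file back at a retro with `load_failures` gives the lessons the visibility that code review comments used to have.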

It's a new world out there, and everything about the way we do business needs to change, all the way down to the cheesy truisms we throw around in retros.
