Retry logic looks simple, but in real systems it’s one of the fastest ways to accidentally take everything down. I’ve seen small issues turn into full outages, not because something failed, but because ...