Chaos Engineering Saved Your Netflix

Photo illustration of the Netflix logo with Netflix displayed on a mobile phone.

To hear Greg Orzell tell it, the original Chaos Monkey tool was simple: It randomly picked a virtual machine hosted somewhere on Netflix’s cloud and sent it a “Terminate” command. Unplugged it. Then the Netflix team would have to figure out what to do.
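
The behavior Orzell describes can be sketched in a few lines of Python. This is only an illustration of the idea, not Netflix's actual code: the function name `unleash_monkey`, the instance IDs, and the `terminate` callback are all hypothetical, and the callback merely records the victim instead of calling any real cloud API.

```python
import random
from typing import Callable, Optional, Sequence


def unleash_monkey(
    instance_ids: Sequence[str],
    terminate: Callable[[str], None],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick one instance uniformly at random and send it a terminate command."""
    if not instance_ids:
        return None  # nothing is running, so there is nothing to terminate
    rng = rng or random.Random()
    victim = rng.choice(instance_ids)
    terminate(victim)  # hypothetical stand-in for a provider's terminate call
    return victim


# Simulated fleet (made-up IDs); "terminating" just appends to a list.
fleet = ["vm-a", "vm-b", "vm-c"]
terminated = []
victim = unleash_monkey(fleet, terminated.append, random.Random(0))
survivors = [vm for vm in fleet if vm not in terminated]
```

Passing the terminate action in as a callback keeps the random selection separate from whatever actually shuts a machine down, so the sketch can be run safely against a list of made-up names.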

That was a decade ago, when Netflix moved its systems to the cloud and subsequently steered around a major U.S. East Coast service outage caused by its new partner, Amazon Web Services (AWS).

Orzell is currently a principal software engineer at GitHub and lives in Mainz, Germany. As he recently recalled the early days of Chaos Monkey, Germany was bracing for another long round of COVID-related lockdowns and deathly fear. Chaos itself raged outside.

