During peak hours, Netflix video streams make up more than one third of internet traffic. Netflix must stream uninterrupted in the face of widespread network issues, bad code deploys, AWS service outages, and much more. Failovers make this possible.
Failover is the process of transferring all of our traffic from one AWS region to another. While most of Netflix runs on Java, failovers are powered entirely by Python. Python's versatility and rich ecosystem mean we can use it for everything from predicting our traffic patterns to orchestrating traffic movement, all while dealing with the eventual consistency of AWS.
Today, we can shift all of our more than 100 million users in under seven minutes. A lot of engineering work went into making this possible. The issues we faced and the solutions we created apply broadly to availability strategies in the cloud or the datacenter.