Cloud computing has drawn some negative press lately, after one of Amazon Web Services’ East Coast availability zones failed late Friday night, taking down Netflix, Instagram, and a host of other web services. The story is familiar to those of us who have studied large-scale crises: cascading catastrophic failures exposed unknown bugs and hidden dependencies, in this case in the ELB and EBS control plane. By Monday morning, a host of articles were questioning the wisdom of using such clouds for critical applications.
Here at Meraki, while we weren’t affected by the AWS outage, we take the reliability of our cloud infrastructure very seriously. We have built a secure, highly available architecture for our Cloud Controller and placed it in geographically dispersed datacenters. We also never rely on a single provider (e.g., Amazon), so we retain redundancy in the face of systemic outages. Customer data is mirrored across multiple sites with automatic failover, so should a catastrophic outage strike one of our datacenters, that data will remain available at other sites. And of course, Meraki APs, switches, and security appliances will continue to serve clients even if our Cloud Controller is temporarily unavailable.
If you’re interested in the specific details of the outage, Amazon has published a great post-mortem here.