Amazon and Microsoft Cloud Outages in Europe

Apparently both Amazon’s Elastic Compute Cloud (EC2) and Microsoft’s Business Productivity Online Suite (BPOS) house their cloud services near Dublin, Ireland.  I can personally attest that Dublin is a great place to be, especially if you don’t mind a lot of drizzle and rain.  If you are a business, you like Ireland because of the tax breaks, educated workforce, a cool climate to avoid excessive air conditioning of data centers, and good Internet connections to the mainland.

With the rain, at times, comes lightning, and apparently a lightning strike took out a nearby power substation, knocking both Amazon’s and Microsoft’s cloud facilities offline.  It appears that neither Amazon nor Microsoft had backup cloud facilities elsewhere in Europe.  Thus their cloud customers, whose data must remain physically in Europe by law, were knocked out.

Amazon reported, “We understand at this point that a lightning strike hit a transformer from a utility provider to one of our Availability Zones in Dublin, sparking an explosion and fire. Normally, upon dropping the utility power provided by the transformer, electrical load would be seamlessly picked up by backup generators. The transient electric deviation caused by the explosion was large enough that it propagated to a portion of the phase control system that synchronizes the backup generator plant, disabling some of them. Power sources must be phase-synchronized before they can be brought online to load. Bringing these generators online required manual synchronization.”  According to Amazon, EC2 instances were being brought back within three hours of the lightning strike, and within about 12 hours 60 percent of instances had been restored.  However, “Due to the scale of the power disruption, a large number of EBS servers lost power and require manual operations before volumes can be restored,” said Amazon. “Restoring these volumes requires that we make an extra copy of all data, which has consumed most spare capacity and slowed our recovery process.”  That restoration took several days to complete.

Microsoft’s BPOS was also knocked offline for several hours by the lightning strike.  According to Microsoft, European BPOS services were restored within four hours.  In a statement, Microsoft said “a widespread power outage in Dublin caused connectivity issues.”

Well, in Amazon’s case, not every EC2 customer was knocked out.  Amazon had three “Availability Zones” (AZs) within the same facility.  An AZ is essentially an independent copy of the cloud infrastructure, and a customer can design their cloud application to fail over from one AZ to another.  Amazon must have designed their AZs in Dublin reasonably well: only one of the Dublin AZs was knocked out by the storm, and Netflix reported that they were able to execute a failover to the others.
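To make the failover idea concrete, here is a minimal Python sketch of how an application might pick a surviving AZ.  It is only an illustration under my own assumptions: the `Zone` class, the `pick_zone` function, and the zone names are hypothetical and are not part of any Amazon API, and the health check is a stand-in for whatever monitoring a real deployment would use.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Zone:
    name: str  # hypothetical label, not a real AWS identifier


def pick_zone(zones: Sequence[Zone],
              is_healthy: Callable[[Zone], bool]) -> Optional[Zone]:
    """Return the first zone, in preference order, that passes the health check."""
    for zone in zones:
        if is_healthy(zone):
            return zone
    return None  # every zone is down, so there is nowhere to fail over to


# Illustrative scenario: the first zone has lost power, the other two are up.
zones = [Zone("dublin-az-1"), Zone("dublin-az-2"), Zone("dublin-az-3")]
down = {"dublin-az-1"}

active = pick_zone(zones, lambda z: z.name not in down)
print(active.name if active else "no healthy zone")
```

Note the last line of `pick_zone`: if every zone in the list shares one location and that location goes dark, the function has nothing left to return, no matter how carefully the application was written.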

OK, so kudos to Amazon for having backup AZs there in Dublin that were resilient to minor storm damage.  (Amazon probably should have had better line conditioning so that the surge didn’t propagate to the backup generator controls.)  Small raspberries to Amazon for not explaining to their customers exactly how AZs worked in Dublin so that more customers could have taken advantage of them for “storm protection.”  On the other hand, big raspberries to Amazon for not having a geographically separate site, preferably far from Dublin.  Only with a second location would Amazon’s cloud (EC2) be protected from a devastating lightning strike, a major earthquake, or another disaster that would knock out the entire location.  I’ll note that a customer who takes advantage of, say, two AZs in Dublin could easily move their backup AZ to another location whenever Amazon gets its head out of a dark place.  (To be fair, Amazon probably needs more European customers to make a second Ireland site cost effective.)

These outages weren’t the first for either Amazon or Microsoft.  It is clear that cloud services in general are still maturing.


