Lessons From The World's Most Expensive Data Outages

The bigger they are, the harder they fail. The world’s biggest companies have an unreal amount of money invested in their data networks, but that doesn’t always protect them from failure.

When these outages happen, the world’s biggest brands face a massive hit to their reputation and trustworthiness in the eyes of the public. On top of this, the lost business and the cost of repairing the actual problem can run into the tens of millions of dollars.

Today we will explore 3 of the most public and expensive data outages in recent years, along with what other companies can learn from them.

British Airways – May 2017

You don’t mess with people and their long weekends. British Airways did just that when it suffered a data outage at one of the most inopportune times possible.

The outage forced the cancellation of more than 400 flights and stranded 75,000 furious passengers in a single day. It’s estimated to have cost 100 million euros ($112 million USD) and tarnished BA’s reputation as news reports showed masses of stranded travelers expressing their disappointment and rage.

How did this happen? One engineer disconnected and reconnected a power supply the wrong way and this triggered a power surge that brought down their entire infrastructure.

Lesson: Catastrophe doesn’t wait until you’re ready and can hit you when you’re most vulnerable. Human beings are still responsible for 70% of data center failures, which places the highest importance on training and leadership.

If you’re looking to prevent these types of mistakes, consider contacting a company such as Upstack.com for more information on choosing the right data center.

Amazon – February 2017 & 2018

The world’s biggest cloud provider was brought to its knees by one employee, who was ironically trying to debug their billing system.

The engineer mistyped a command, which ended up turning the lights out on countless online stores and e-commerce websites for several hours. This outage is estimated to have cost well over $150 million.

Then, in 2018, Amazon experienced another outage that lasted only 13 minutes. However, given the sheer number of customers relying on Amazon to sell their goods and services, it has been estimated that:

It cost them over $200,000 for every minute of the outage
The 13 minutes of downtime likely cost them more than $2.6 million in total
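
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name downtime_cost is made up for this illustration, and the only inputs are the figures quoted above: a floor of $200,000 per minute and 13 minutes of downtime.

```python
def downtime_cost(cost_per_minute: int, minutes: int) -> int:
    """Return the estimated revenue lost during an outage, in dollars."""
    return cost_per_minute * minutes


# $200,000 per minute is the lower bound quoted above, so the result is a floor.
amazon_2018_loss = downtime_cost(200_000, 13)
print(f"${amazon_2018_loss:,}")  # prints $2,600,000
```

Plugging your own revenue per minute into the same calculation is a quick way to see what even a short outage would mean for your business.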

Lesson: Multi-cloud and colocation strategies can mitigate the damage when a vendor has problems.

Microsoft Azure – September 2018

A devastating lightning strike near one of Azure’s San Antonio-based data centers caused a voltage surge resulting in damage to their hardware, network devices, and storage units.

Some of their customers were back online in a day, but it took as long as 3 full days for some customers to achieve “full mitigation.”

This outage took place only months after a top US economist stated that the country’s reliance on cloud-based servers meant that a 3-day mass outage could result in as much as $15 billion of damage to the economy.

Lesson: Bad weather of some sort happens in every part of the country, and relying on a single provider is too risky when millions of dollars are at stake.

Protect your business. Remember that downtime is inevitable and preparedness is key. Put the proper solutions and systems in place to keep interruptions, disruption, and financial damage to a minimum.