09 August 2016

Southwest Outage Blamed on Single Router Failure

Screenshot of Computerworld story on Southwest Airlines Outage in July 2016
Computerworld Story on July 2016 Outage at Southwest Airlines

Single point of failure. You know the concept. Those single points are easier to see in the rearview mirror. Can a single router cause an entire data center to go down?

In a Dallas Morning News story, CEO Gary Kelly said it could -- and did. There was a "backup system" in place to address a single router failure. 

But, Kelly insisted, because of the router's unusual "partial failure," the backup procedures weren't triggered, and the problem became a massive one.

No, this isn't a very good technical explanation.
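
Still, the general idea of a "partial failure" slipping past a backup trigger is easy to picture. The sketch below is purely illustrative and assumes nothing about Southwest's actual equipment or procedures: the `Router` class, its two fields, and both check functions are hypothetical names invented here. It shows how a failover rule that only asks "is the device alive?" can miss a router that still answers probes but has stopped passing traffic.

```python
from dataclasses import dataclass


@dataclass
class Router:
    """Hypothetical router state; not a model of any real device."""
    responds_to_probe: bool  # answers a basic liveness probe
    forwards_traffic: bool   # actually passes packets end to end


def liveness_only_failover(router: Router) -> bool:
    """Return True if failover should trigger, judging by liveness alone."""
    return not router.responds_to_probe


def end_to_end_failover(router: Router) -> bool:
    """Return True if failover should trigger, judging by liveness and forwarding."""
    return not (router.responds_to_probe and router.forwards_traffic)


# Three assumed scenarios: healthy, completely dead, and partially failed.
healthy = Router(responds_to_probe=True, forwards_traffic=True)
dead = Router(responds_to_probe=False, forwards_traffic=False)
partial = Router(responds_to_probe=True, forwards_traffic=False)

for name, router in [("healthy", healthy), ("dead", dead), ("partial", partial)]:
    print(name, liveness_only_failover(router), end_to_end_failover(router))
```

In this toy model, the liveness-only rule triggers for the dead router but not for the partially failed one, while the end-to-end rule catches both. Whether anything like this happened at Southwest is not something the public statements establish.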

Kelly took pains to insist that the legacy technology in their data center wasn't involved in the failure. Nor was the cause a hack of some sort (probably true, though at the time of this report it might be premature to completely rule that out).

Whether they're justified in saying so or not, Southwest Airlines unions want a change of leadership, arguing that the CEO and others are delaying technology improvements in order to maintain profits for investors. (True, Southwest employees hold plenty of Southwest stock, too.) What do the IT folks at Southwest say? I didn't see any comment on that.

CNBC listed the publicly reported airline technology failures since 2015.
