How a software bug slammed Facebook users

Todd DeCapua, Technology leader, speaker & author, CSC

This article is part of an ongoing series of Performance Retrospectives that assess real-world application performance issues in the news, analyze what might have happened, and offer up best practices that just might help you avoid similar problems.

On April 30, Facebook faced a firestorm of criticism when millions of users' posts suddenly disappeared from the social network.

What happened

A software bug deleted or hid millions of Facebook posts that included links. Users were told that the links were "violating the security policy." Another Facebook message stated, "We believe the link you are trying to visit is malicious. For your safety we have blocked it."

Why it happened

A spokesperson from Facebook said that "An error in our system that helps block bad links on Facebook incorrectly marked some URLs as malicious or inappropriate."

A story in Naked Security described how the software bug hid posts, blocked links, and triggered security warnings. Ordinary functionality can set off a production incident: here, the routine act of posting a link produced a "link blocked" alert. The faulty system also acted on existing posts, deleting or hiding those it wrongly judged to be risky. The root cause, as reported in the Facebook developer forums, appears to have been an image-scraping system that automatically pulls pictures from shared links.

The business impact

The event generated a tremendous amount of negative publicity. Unable to vent on Facebook, millions of affected users expressed their outrage on Twitter. What's the value of a lost user for Facebook? One simple way to make the calculation is to divide the company's market capitalization by the number of active users. One of the first to use that formula, Forbes contributor George Anders, came up with a value of $98 per active Facebook user in 2013 ($117 billion divided by 1.19 billion active users). By that measure, a loss of just 0.5 percent of active users (about 6 million people) would amount to hundreds of millions of dollars in lost value.
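The per-user arithmetic can be checked with a short sketch. The function names are illustrative, and the only inputs are the 2013 figures quoted above; this is a back-of-the-envelope estimate, not a valuation model.

```python
# Figures quoted in the article (2013, per George Anders in Forbes).
MARKET_CAP_USD = 117e9   # market capitalization in dollars
ACTIVE_USERS = 1.19e9    # active users


def value_per_user(market_cap: float, active_users: float) -> float:
    """Market capitalization divided evenly across active users."""
    return market_cap / active_users


def lost_value(market_cap: float, active_users: float, lost_fraction: float) -> float:
    """Estimated value lost if the given fraction of active users leaves."""
    users_lost = active_users * lost_fraction
    return users_lost * value_per_user(market_cap, active_users)


if __name__ == "__main__":
    per_user = value_per_user(MARKET_CAP_USD, ACTIVE_USERS)
    loss = lost_value(MARKET_CAP_USD, ACTIVE_USERS, 0.005)
    print(f"Value per active user: ${per_user:,.0f}")            # about $98
    print(f"Value lost at 0.5% churn: ${loss / 1e6:,.0f} million")  # about $585 million
```

Because the per-user value is a simple average, the estimated loss scales linearly with the share of users lost: 0.5 percent of users corresponds to 0.5 percent of market capitalization.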

Fortunately, Facebook quickly fixed the problem and offered an apology to users "for the inconvenience this has caused." The furor died down, but user perceptions were no doubt affected by all the negative publicity.

Takeaways: How to slay similar software bugs

As reported by the BBC, the incident followed a software update by Facebook "that seems to have caused the glitch." A careful regression test procedure in pre-production, one that reruns existing checks against every update, might have caught the bug before release. Given the potential impact on end users and the brand, the practice is worth the extra cost.
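A regression check for a link filter might look something like the sketch below. Everything in it is hypothetical: the `is_blocked` rule, the blocklist, and the sample URLs are stand-ins chosen for illustration and say nothing about how Facebook's system actually works. The idea is simply to keep a corpus of links with known verdicts and to rerun it before each release.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; a real filter would rely on far richer signals.
BLOCKED_HOSTS = {"malware.example", "phishing.example"}


def is_blocked(url: str) -> bool:
    """Return True if the URL's host is on the (hypothetical) blocklist."""
    host = (urlparse(url).hostname or "").lower()
    return host in BLOCKED_HOSTS


# Regression corpus: links whose correct verdict was verified in the past.
KNOWN_GOOD = [
    "https://www.example.com/news/story",
    "https://example.org/photos/cat.jpg",
]
KNOWN_BAD = [
    "https://malware.example/payload",
    "http://phishing.example/login",
]


def run_regression() -> list[str]:
    """Return descriptions of every verdict that changed; empty means pass."""
    failures = []
    for url in KNOWN_GOOD:
        if is_blocked(url):
            failures.append(f"false positive: {url}")
    for url in KNOWN_BAD:
        if not is_blocked(url):
            failures.append(f"false negative: {url}")
    return failures


if __name__ == "__main__":
    problems = run_regression()
    if problems:
        raise SystemExit("Do not release:\n" + "\n".join(problems))
    print("Regression corpus passed")
```

An update that started flagging legitimate links would show up as "false positive" entries and stop the release, which is the class of failure described in this incident.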

Like those of many businesses, Facebook's systems are complex, and software updates can have a dramatic impact on both the service and the users who depend on it. Public scrutiny, magnified by social media, can be intense when a software bug occurs. Facebook is free to users (it derives its revenue from advertising), and most would call it a non-critical service. But Facebook's passionate users express their frustrations quickly through Facebook and other social media outlets when things go awry. Will your users be any less passionate when a bug inconveniences them? By following modern quality testing practices, you'll be far less likely to find out.
