Tuesday, January 18, 2022
Facebook, WhatsApp, and Instagram went down globally on October 4, 2021, at around 21:00 hrs (IST) and remained down for almost six hours. According to media reports, this appears to be the most significant downtime in Facebook’s corporate history. The outage wiped roughly 5% off Facebook’s share price and, by media estimates, about US$6 billion off Mark Zuckerberg’s net worth.

An official statement from a Facebook representative attributed the outage to configuration changes on the backbone routers, which made Facebook’s digital infrastructure inaccessible both to its users in the external world and to its own employees. Even the employees responsible for managing the data centers reportedly could not get inside them.

During this massive outage, Facebook’s digital infrastructure could not even answer queries for its own domain name, www.facebook.com, and for some time the domain was reportedly listed for sale.

Facebook cited a configuration error on its BGP routers as the main reason for the outage; because of it, the DNS servers could no longer be reached.

For everyone’s easy understanding: BGP stands for Border Gateway Protocol. It works much like a post office, ensuring that packets (contents) are delivered along the shortest and most accessible route. The BGP configuration is responsible for providing the path and route for transmitting information.
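
The post office analogy can be sketched in a few lines of Python. This is only a toy model: it assumes the route with the fewest autonomous system (AS) hops wins, whereas real BGP routers weigh several other attributes before path length. The Route class, the best_route function, the prefix, and the AS numbers are all hypothetical and taken from ranges reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # destination network, e.g. "203.0.113.0/24"
    as_path: tuple   # autonomous systems the announcement passed through


def best_route(routes):
    """Pick the route with the fewest AS hops; None if nothing is announced."""
    if not routes:
        return None
    return min(routes, key=lambda r: len(r.as_path))


# Two hypothetical announcements for the same documentation prefix.
candidates = [
    Route("203.0.113.0/24", (64500, 64501, 64502)),
    Route("203.0.113.0/24", (64510, 64502)),
]

print(best_route(candidates).as_path)  # (64510, 64502)
print(best_route([]))                  # None: no announcement, no path
```

The second call is the interesting one for this case study: once every announcement is withdrawn, there is simply no route left to choose.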

Similarly, DNS stands for Domain Name System and can be compared to the address book on a phone or mobile device. Just as phone numbers are mapped to names, DNS maps the domain name of a digital entity to its IP address, and an entity may have multiple DNS addresses, e.g. one of the ranges of Facebook’s DNS consists of –

The problem was that the BGP configuration issue left no way to reach the DNS servers, which meant that names could not be translated to IP addresses.
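
A minimal sketch of this dependency, under stated assumptions: the names, addresses, and prefixes below are hypothetical documentation values (not Facebook’s real ones), and reachability is reduced to "some announced prefix covers the name server’s address". The functions is_reachable and resolve are illustrative helpers, not part of any real DNS library.

```python
import ipaddress

# Hypothetical name-to-address records held by the authoritative name server.
DNS_RECORDS = {"www.example.com": "198.51.100.10"}
# Hypothetical address of that name server.
NAME_SERVER_IP = ipaddress.ip_address("192.0.2.53")


def is_reachable(ip, announced_prefixes):
    """An address is reachable only if some announced prefix covers it."""
    return any(ip in ipaddress.ip_network(p) for p in announced_prefixes)


def resolve(name, announced_prefixes):
    """Return the IP for a name, or None if the name server is unreachable."""
    if not is_reachable(NAME_SERVER_IP, announced_prefixes):
        return None  # the records still exist, but nobody can ask for them
    return DNS_RECORDS.get(name)


before = ["192.0.2.0/24", "198.51.100.0/24"]  # routes announced as usual
after = []                                    # routes withdrawn

print(resolve("www.example.com", before))  # 198.51.100.10
print(resolve("www.example.com", after))   # None
```

Note that the DNS records themselves never change in this sketch; only the routes do, which is enough to make the name unresolvable.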

We need to wait for the official disclosure about the following: What led to this BGP configuration error? Who made it? Was it a genuine oversight or malicious intent by a third party?

However, going by industry standards, every company’s operations depend on the fundamental principle of People, Process, and Technology (PPT), and here we have a classic example of a failure of People and Process for sure! Examples:

Single Point of Failure: Enough has been said about Disaster Recovery (DR) sites with n+1 redundancy, wherein an alternate site should pick up automatically if the main data center goes down. If this can happen to a company like Facebook, then all the tall claims are mere eyewash and marketing jargon meant to fool people.

Process: Human error can never be ruled out, and that is why processes are defined to address it. As cited, it was a configuration upload error; why were the maker-checker and Change Management processes not followed?
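
As an illustration of the maker-checker idea, here is a minimal sketch in Python. It assumes a simple rule: a change may be applied only after someone other than its author has approved it. The ChangeRequest class and its methods are hypothetical and do not represent Facebook’s tooling or any real change management product.

```python
class ChangeRequest:
    """A configuration change that needs a second person's approval."""

    def __init__(self, author, description):
        self.author = author
        self.description = description
        self.approved_by = None

    def approve(self, reviewer):
        # The maker must never be their own checker.
        if reviewer == self.author:
            raise PermissionError("the maker cannot also be the checker")
        self.approved_by = reviewer

    def apply(self):
        # Refuse to push anything that has not been reviewed.
        if self.approved_by is None:
            raise RuntimeError("change has not been reviewed")
        return f"applied: {self.description}"


change = ChangeRequest("maker", "update backbone router configuration")
change.approve("checker")
print(change.apply())  # applied: update backbone router configuration
```

A check like this does not prevent every mistake, but it guarantees that no single person can push a configuration change unreviewed.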

Internet’s Original Concept: DARPA’s concept of the Internet was a large-scale distributed network that was supposed to survive a nuclear attack on any part of it. In reality, however, it is evident that at its core sits a centralized infrastructure of routing devices and centralized Internet services. The protocols it uses are the ones drafted when we connected to mainframe computers from dumb terminals. Overall, a single glitch in its core infrastructure can bring the whole thing crashing to the floor. And if you can’t get connected to the network, you will often struggle to fix it. It is a bit like trying to fix your car when you have locked yourself out and don’t have the key to get in.

Interesting Coincidence: It is a mere coincidence that, less than a day before this massive outage, the USA’s CBS News channel telecast an interview with a former Facebook employee, Ms. Frances Haugen, who identified herself as a whistleblower. The interview, aired on the program 60 Minutes, contained quite a few incriminating revelations about how Facebook works as a company from a business perspective. A great hue and cry followed the interview, especially among those who care about privacy, ethics, etc. Ms. Haugen highlighted nine points that are horrifying to any normal, sensible human being, which are as follows:

  1. Facebook’s algorithm intentionally shows users things to make them angry
  2. Facebook is worse than most other social media companies
  3. Facebook dissolved its Civic Integrity unit after the 2020 election and before the Jan. 6 Capitol insurrection
  4. Political parties in Europe ran negative ads because it was the only way to reach people on Facebook
  5. Facebook only identifies a tiny fraction of hate and misinformation on the platform
  6. Instagram is making kids miserable
  7. Employees at Facebook aren’t necessarily evil; they just have perverse incentives
  8. Haugen even has empathy for Zuck for some stupid reason
  9. Haugen believes she’s covered by whistle-blower laws, but we’ll see

Case Study by: RED Team of Armantec, led by Shamsher Bahadur – Cyber Security Practice Head.

This article has been submitted by Armantec Systems Pvt Ltd (www.armantecsystems.com), a Noida-based Threat Intelligence & RED Teaming consulting firm with a prime focus on custom ransomware attack solutions for Critical Information Infrastructures (CIIs).

