On Friday, July 19th, 2024, CrowdStrike, a leading provider of Endpoint Detection and Response (EDR) services, caused the single largest computer outage in history. The outage is estimated to have cost Fortune 500 businesses $5.4 billion (Parametrix Report)! Truthfully, this type of event has happened many times before, though never with such global impact. In 2006, Microsoft caused a BSOD with a patch update. In 2010, a McAfee virus signature update did the same. This blog breaks down what happened, what it means, and what organizations should take away from the event. History repeats itself for those who forget, so let's look at this latest incident to understand it better.
On July 19th, 2024, systems running Windows 7 and above with CrowdStrike's Falcon sensor received a faulty channel file, Channel File 291. The file triggered an out-of-bounds memory read in the sensor's kernel-mode driver, causing kernel instability and a Blue Screen of Death (BSOD) loop. Although the file was distributed for only about an hour, an estimated 8.5 million Windows devices crashed worldwide, disrupting air travel, hospitals, manufacturing, and many other enterprises. The scale led some to ask whether a nation-state attack was in progress; fortunately, the root cause, as we'll see, was simply an update gone wrong. Worse, the crash persisted across reboots, so each affected machine had to be repaired by hand.
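To appreciate why recovery took days, consider what the widely published manual fix involved: booting each machine into Safe Mode or the Windows Recovery Environment and deleting the faulty channel file before the sensor could load it again. Here is a minimal Python sketch of that cleanup step. In reality it was done from a recovery command prompt rather than Python; the directory and file pattern below follow CrowdStrike's published remediation guidance.

```python
# Sketch of the manual remediation: remove the faulty Channel File 291
# so the Falcon sensor cannot load it again on the next boot.
import os
from pathlib import Path

# Default CrowdStrike driver directory on a Windows system.
CROWDSTRIKE_DIR = (
    Path(os.environ.get("WINDIR", r"C:\Windows"))
    / "System32" / "drivers" / "CrowdStrike"
)

def remove_faulty_channel_file(directory: Path) -> list[Path]:
    """Delete any Channel File 291 variants and report what was removed."""
    removed = []
    for channel_file in directory.glob("C-00000291*.sys"):
        channel_file.unlink()
        removed.append(channel_file)
    return removed

if __name__ == "__main__":
    deleted = remove_faulty_channel_file(CROWDSTRIKE_DIR)
    for path in deleted:
        print(f"Removed {path}")
    print(f"{len(deleted)} faulty channel file(s) removed; reboot normally.")
```

Simple as the fix was, it could not be pushed remotely to a machine stuck in a boot loop, which is exactly why the outage was so labor-intensive to unwind.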
As for that root cause, CrowdStrike confirmed the outage was not a cyberattack but a faulty content update that was inadequately tested: according to the company's root-cause analysis, the channel file supplied fewer input fields than the sensor code expected, and the mismatch slipped past CrowdStrike's content validator. In an ironic twist, CrowdStrike's CEO was a senior executive at McAfee during its 2010 BSOD event, yet a similar mistake still happened on his watch.
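For a concrete sense of that failure mode, here is a minimal sketch of the kind of field-count check involved. The structures and names are hypothetical illustrations, not CrowdStrike's actual code; the 21-versus-20 field counts come from the company's published root-cause analysis.

```python
# Sketch: validate that shipped content matches what the sensor expects,
# at build time, instead of letting kernel code index past an array.
from dataclasses import dataclass

@dataclass
class TemplateType:
    """Describes how many input fields the sensor code will index."""
    name: str
    expected_fields: int

@dataclass
class ChannelContent:
    """The inputs actually shipped in a channel file."""
    fields: list[str]

def validate(template: TemplateType, content: ChannelContent) -> None:
    # Reject mismatched content before deployment, rather than letting
    # the kernel-mode sensor perform an out-of-bounds read at run time.
    if len(content.fields) != template.expected_fields:
        raise ValueError(
            f"{template.name}: expected {template.expected_fields} "
            f"fields, got {len(content.fields)}"
        )

template = TemplateType("ipc-template", expected_fields=21)
bad_content = ChannelContent(fields=["pattern"] * 20)  # one field short

try:
    validate(template, bad_content)
except ValueError as err:
    print(f"Blocked before deployment: {err}")
```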
In light of this global BSOD, CrowdStrike committed to changes in its business practices to avoid a similar mistake in the future: enhanced testing of content updates (including fuzzing and fault injection), staggered canary-style rollouts, greater customer control over when content updates are applied, and independent third-party code reviews. These are all sound measures to adopt in the face of such an incident, but will they be enough for CrowdStrike to recover? Much depends on the long-term fallout of so widespread an outage. Here are some possible outcomes for CrowdStrike.
A global outage at a cybersecurity firm like CrowdStrike carries several significant implications: reputational damage and potential customer churn, legal and financial liability, heightened regulatory scrutiny, and renewed industry debate over how much access security vendors should have to the Windows kernel.
In light of this incident and the similar incidents outlined above, organizations must plan for future BSOD events. Practical steps that any organization can take to partially mitigate these events, and to recover from them more quickly, include staging or delaying non-critical updates, maintaining a multi-layered security strategy rather than depending on any single agent, rehearsing incident management and BCDR plans, and documenting low-level recovery procedures (such as Safe Mode remediation) before they are needed. The staging idea is sketched below.
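As one example of staging, here is a minimal sketch of an update-delay gate, assuming a hypothetical internal inventory of pending vendor updates (nothing here is a CrowdStrike or Windows API). The idea is to hold each update in a small canary ring for a 12-to-24-hour soak period, so a faulty release surfaces on a handful of machines rather than the whole fleet.

```python
# Sketch: gate fleet-wide rollout behind a soak period and a canary ring.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SOAK_PERIOD = timedelta(hours=24)  # tune anywhere in the 12-24 hour range

@dataclass
class PendingUpdate:
    name: str
    published_at: datetime  # when the vendor released it
    canary_healthy: bool    # canary ring survived so far (no crashes or boot loops)

def ready_for_fleet(update: PendingUpdate, now: datetime | None = None) -> bool:
    """Ship fleet-wide only after the soak period has passed and the
    canary machines still report healthy."""
    now = now or datetime.now(timezone.utc)
    soaked = now - update.published_at >= SOAK_PERIOD
    return soaked and update.canary_healthy

# A release published two hours ago is held back, even though the
# canaries look fine so far.
fresh = PendingUpdate(
    name="sensor-content-update",
    published_at=datetime.now(timezone.utc) - timedelta(hours=2),
    canary_healthy=True,
)
print(ready_for_fleet(fresh))  # False: still inside the soak window
```

One caveat: this only helps for updates an organization can actually hold back. Part of what made July 19th so painful was that the faulty channel file deployed outside customers' normal sensor-update controls.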
The CrowdStrike global outage is a potent reminder of how much damage a single mainstream product can do when it fails at scale. Such incidents underscore the importance of a multi-layered security strategy, and of practicing incident management and BCDR plans ahead of time. Given that this BSOD threat will always exist, slightly delaying updates by 12 to 24 hours, as sketched above, partially mitigates the risk.
In a world where cyber threats are constantly evolving, staying proactive and alert is the best defense. Understanding this BSOD risk and putting the protective strategies above into practice will help your organization remain secure and resilient in the face of unexpected challenges.