Technology Analysis

CrowdStrike causes global I.T. meltdown after Microsoft update

Multiple blue screens of death, caused by an update pushed by CrowdStrike, on airport luggage conveyor belts at LaGuardia Airport, New York City (Image by Smishra | Wikimedia Commons)

The recent CrowdStrike outage brought systems crashing down worldwide and has reminded us that a more resilient global digital infrastructure is imperative, writes Paul Budde.

THE OPTUS OUTAGE from last year was immediately on my mind when, on Friday afternoon, a similar event swept across the world. In this case, too, a software update caused the problem, this time from global security software provider CrowdStrike.

The culprit appears to be an update to the CrowdStrike Falcon platform, a security monitoring tool widely deployed by businesses and organisations on desktop computers and notebooks running Microsoft Windows.

The extent of the disruption

This unprecedented software failure has led to a major I.T. outage, affecting a vast number of organisations worldwide. In Australia, high-profile entities such as the Commonwealth Bank, Telstra, the ABC and others have been significantly impacted.

The outage has left customers unable to use EFTPOS to pay for goods and services in many businesses. Although Telstra has given assurances that the Triple Zero Emergency Call service remains operational, the broader impact on essential services is profound. Major airlines, banks, shops and various other businesses have had to suspend their operations, leaving thousands of people stranded at airports and potentially disrupting bus and train services.

The cause of the chaos

The issue traces back to a problematic software update from CrowdStrike, which caused Windows computers to crash and display the notorious “blue screen of death”. This error screen appears when the operating system encounters a critical failure from which it cannot safely recover.

There is no evidence to suggest a cybersecurity incident behind the outage. However, the ramifications of the faulty update are severe, highlighting vulnerabilities in the resilience of I.T. systems dependent on a single provider's software.

Efforts to mitigate the impact

CrowdStrike has been quick to address the situation. A company representative said that same afternoon, Australian time, that ‘the bleeding has been stopped’, indicating that computers not yet affected are unlikely to encounter the issue. However, the process of rectifying affected computers is labour-intensive. CrowdStrike has advised that affected machines must be booted into safe mode and a specific file deleted manually, a process that could take considerable time given the scale of the problem.
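The manual fix described above amounts to finding and deleting one faulty file on each affected machine. As a rough sketch only, the steps might look like the following in Python. The directory and file pattern are assumptions taken from CrowdStrike's public guidance at the time of the outage and should be checked against current vendor advice. The function name and the dry-run switch are illustrative and not part of any official tool.

```python
from pathlib import Path

# Assumption: the pattern and directory follow CrowdStrike's public guidance
# from July 2024. Verify against current vendor advice before relying on them.
FAULTY_PATTERN = "C-00000291*.sys"
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")


def remove_faulty_channel_files(
    directory: Path = DEFAULT_DIR, dry_run: bool = True
) -> list[Path]:
    """Return the matching files; delete them only when dry_run is False."""
    if not directory.is_dir():
        return []
    matches = sorted(directory.glob(FAULTY_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Dry run by default: list what would be removed without touching anything.
    for path in remove_faulty_channel_files(dry_run=True):
        print(f"would delete: {path}")
```

Defaulting to a dry run is a deliberate choice: on a machine that has only just been coaxed into safe mode, listing the candidates before deleting anything is the cautious order of operations.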

Notifications from CrowdStrike are being sent out to customers and updates are being posted on support pages, although these are accessible only with a login. This restricted access may further complicate the recovery process for many organisations.

Analyses and implications

My analysis of the event has much in common with my reflections on the Optus outage last year. The incident underscores the critical issue of resilience in I.T. infrastructure, particularly in systems that lack diversity.

The reliance on a single software stack – in this case, Microsoft’s Windows operating system coupled with CrowdStrike’s Falcon platform – has highlighted the risks of a monoculture approach. A single point of failure cascaded into significant operational disruptions, showing the need for more robust and diverse I.T. strategies.

Operationally, this event points to a possible bypass of continuous integration testing in the update process, which should serve as a cautionary tale for software providers about the importance of thorough testing. Tactically, the necessity for physical access to fix the issue – especially in cases where cloud-based systems require manual intervention – reveals a glaring vulnerability in current I.T. practices.

We saw the same thing in the Optus outage, where modems had to be manually reset. Some of the outcomes of the government review after the Optus event may also be relevant to the CrowdStrike event.

Strategically, the incident has broader implications for the global economy. The extensive downtime and the overtime required to rectify the issues could impact productivity and economic output. It also raises questions about liability and the potential for consequential loss claims, as businesses grapple with the financial fallout of the disruption.

Looking ahead

To prevent future failures, the I.T. industry must prioritise resilience and diversity. This means not only diversifying software and hardware providers but also implementing more robust testing and update procedures to ensure that a single point of failure does not bring down entire systems.
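One common form of a more robust update procedure is a staged rollout, in which an update reaches a small share of machines first and spreads further only if those machines stay healthy. The sketch below illustrates that general idea in Python. It is not a description of CrowdStrike's or any other vendor's process, and every name in it (staged_rollout, apply_update, the stage fractions) is a hypothetical chosen for illustration.

```python
from typing import Callable, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.1, 0.5, 1.0),
    max_failure_rate: float = 0.0,
) -> tuple[list[str], bool]:
    """Update hosts in widening stages; stop after the first unhealthy stage.

    apply_update is a hypothetical callback that installs the update on one
    host and returns True if the host is still healthy afterwards.
    Returns (hosts that received the update, whether the rollout completed).
    """
    updated: list[str] = []
    done = 0
    for fraction in stage_fractions:
        # Take at least one new host per stage, and never exceed the fleet.
        target = min(len(hosts), max(done + 1, int(len(hosts) * fraction)))
        batch = hosts[done:target]
        if not batch:
            continue
        failures = 0
        for host in batch:
            updated.append(host)
            if not apply_update(host):
                failures += 1
        done = target
        if failures / len(batch) > max_failure_rate:
            return updated, False  # halt: remaining hosts are left untouched
    return updated, done == len(hosts)
```

With a fleet of 100 machines and an update that crashes every host, this sketch would stop after the first machine rather than taking down all 100, which is the point of limiting the blast radius of a single bad update.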

As businesses and governments work to recover from this significant outage, the incident serves as a stark reminder of the vulnerabilities inherent in modern I.T. infrastructure and the urgent need for more resilient and diversified systems. The lessons learned from this event will be crucial in shaping the future of I.T. resilience and security.

While this was, in the end, a human error, it also shows what we can expect to happen more often if we continue down the path of cyberwarfare. You don’t need bombs anymore to bring a country to its knees.

Paul Budde is an Independent Australia columnist and managing director of Paul Budde Consulting, an independent telecommunications research and consultancy organisation. You can follow Paul on Twitter @PaulBudde.
