Global IT Outage Calls for Stronger Digital Resilience

July 19th, 2024 Hannah Tichansky Reading Time: 4 minutes

If you woke up today with a delayed flight or issues with your Microsoft programs, you’re definitely not alone. Earlier today, July 19th, major cybersecurity company CrowdStrike experienced total service disruptions due to an update, in what is being called “the largest IT outage in history.”

In a statement on X, CrowdStrike CEO George Kurtz clarified that this was not a cyber breach incident and stated, “CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts.”

The effects of this outage are still being analyzed and are expected to be far-reaching and long-lasting. With upcoming regulations such as the Digital Operational Resilience Act (DORA), there is a growing need for internal third-party risk management (TPRM) improvements and formal IT guidance.

What Is Affected in the IT Outages?

Wondering why so many agencies and areas are affected? Well, it’s becoming clear that a lot of organizations are using CrowdStrike, an American cybersecurity technology company that provides cloud protection, threat intelligence, and cyberattack response services for organizations across the world, and across industries.

Just a few of the key areas that experienced IT outages due to this incident include:

  • Widespread technical disruptions across Microsoft products and Windows 365 Cloud PCs, causing users to encounter “the blue screen of death”. These disruptions were seen around the world, including the US, UK, India, Germany, the Netherlands, and elsewhere.
  • US 911 lines across multiple states went down due to the outage. The US Emergency Alerts System advises anyone experiencing an emergency to call their local police or fire department lines directly.  
  • The London Stock Exchange Group’s workplace platform went down, preventing it from publishing statements, and banks around the world were also affected.

Concentration Risk and the Need for Third-Party Digital Resiliency

The issues seen today with CrowdStrike, a major technology vendor utilized within some of the world’s largest government, infrastructure, and technology companies, highlight the dependence organizations have on their third-party vendors.

Cloud and IT protection is necessary in today’s risk-prone environment, with ransomware and cyber-attacks being reported in the news on a daily basis. But what happens when that critical supplier causes a major disruption?

One of the major factors that played a part in this global outage is concentration risk. This can take a variety of forms:

  • Dependence on a single vendor
  • Geographic concentration
  • Fourth-party concentration

Today’s incident showcases the risk that arises when a large number of major technology, government, and infrastructure organizations rely on the same vendor for critical operations.

Incorporating strategic procurement practices allows organizations to focus on the selection of suppliers that can offer flexibility and reliability under varying conditions, helping to avoid disruptive factors like concentration risk.

Best Practices for Third-Party Digital Resiliency

Without digital resilience processes and systems, we will continue to see outages that can, quite literally, put the world at a standstill.

Taking steps to build your IT and digital resiliency helps mitigate impacts like those we’re seeing with this incident. Some best practices include:

Tier your vendors:

Who are the most critical? Which of your operations are dependent on these third-party services?

Risk scoring:

Utilizing risk intelligence data and automation, gain an understanding of what risks each vendor brings to the relationship. Which are the highest risk? What happens if they experience a disruption like we’ve seen with CrowdStrike?
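As a rough illustration of the tiering and scoring questions above, here is a minimal Python sketch. The `Vendor` fields, the 1-to-5 scales, the tier thresholds, and the "criticality times likelihood" score are all assumptions made for this example, not a prescribed methodology; a real program would draw these values from risk intelligence data.

```python
from dataclasses import dataclass


@dataclass
class Vendor:
    name: str
    criticality: int  # hypothetical scale: 1 (low) to 5 (operations stop without it)
    likelihood: int   # hypothetical scale: 1 (rare) to 5 (frequent) disruption


def risk_score(vendor: Vendor) -> int:
    """Illustrative score only: criticality multiplied by likelihood."""
    return vendor.criticality * vendor.likelihood


def tier(vendor: Vendor) -> str:
    """Assign a tier from criticality alone; the thresholds are assumptions."""
    if vendor.criticality >= 4:
        return "Tier 1"
    if vendor.criticality >= 2:
        return "Tier 2"
    return "Tier 3"


# Hypothetical vendor list for demonstration.
vendors = [
    Vendor("Office supplies", criticality=1, likelihood=3),
    Vendor("Endpoint security provider", criticality=5, likelihood=2),
    Vendor("Payroll service", criticality=3, likelihood=2),
]

# Highest-risk vendors first.
ranked = sorted(vendors, key=risk_score, reverse=True)
for v in ranked:
    print(f"{v.name}: {tier(v)}, score {risk_score(v)}")
```

Sorting by score puts the vendors that deserve the closest monitoring at the top of the list, while the tier label records how dependent operations are on each one.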

Look at your concentration risk:

Are most of your vendors located in one region? Do they use the same software? If they experienced a cyber-attack or update-related disruption, would your operations also go down?
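One simple way to start answering these questions is to count how many vendors share a region or a piece of software. The sketch below is only an illustration: the inventory, the vendor and software names, and the 50% threshold are hypothetical, and a real assessment would also need to cover fourth parties.

```python
from collections import Counter

# Hypothetical inventory: vendor name -> (region, shared software dependency)
inventory = {
    "Vendor A": ("US-East", "EndpointAgentX"),
    "Vendor B": ("US-East", "EndpointAgentX"),
    "Vendor C": ("EU-West", "EndpointAgentX"),
    "Vendor D": ("US-East", "OtherAgent"),
}


def concentration(values, threshold=0.5):
    """Return {value: share} for each value whose share of the total exceeds the threshold."""
    values = list(values)
    counts = Counter(values)
    total = len(values)
    return {v: n / total for v, n in counts.items() if n / total > threshold}


region_hotspots = concentration(region for region, _ in inventory.values())
software_hotspots = concentration(sw for _, sw in inventory.values())

print("Regions over threshold:", region_hotspots)
print("Software over threshold:", software_hotspots)
```

Any entry that appears in the output marks a single point where one regional event or one faulty update could affect most of the vendor base at once.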

Have some backup suppliers in place:

If your vendor goes down, are you debilitated, or can you utilize another option? Having alternate suppliers that can be utilized during a disruption helps to prevent a total shutdown of operations and services. While they may not be able to fulfill everything your primary supplier does, having these plans in place helps to limit the impact of incidents like the one we’ve seen today.

Build an incident response plan:

So, you’ve experienced an outage, and are not able to resume operations at the moment. What is the plan to manage and respond to this? Build an incident response plan with your vendors as part of your TPRM activities to understand how to proceed during an emergency.

The Digital Operational Resilience Act (DORA)

This major disruption showcases the necessity for both improved internal practices and stronger governance around digital resiliency. The Digital Operational Resilience Act (DORA) for financial institutions entered into force in 2023 and will apply within the EU from January 2025.

The primary goals of DORA are:

  • Streamlining the integration of information and communications technology (ICT) risk management processes across all EU regulations
  • Mitigating cybersecurity risks of outsourcing operations to ICT third-party suppliers

While DORA is designed for financial institutions, regulations like this highlight the need for building better, documented digital resilience processes and practices, no matter the industry. Not only that, but in many cases, organizations will be expected to comply with similar governance somewhere down the line, and should begin making compliance-related preparations early.

By putting some of these IT and TPRM practices in place, organizations can avoid, react quickly to, and recover quickly from incidents like the major IT outage experienced today.

Contact Aravo to learn more about improving IT and digital resilience, or how to integrate DORA into your TPRM program.

Hannah Tichansky

Hannah Tichansky is the Senior Content Marketing Manager at Aravo Solutions, the market’s smartest third-party risk and resilience solutions, powered by intelligent automation. At Aravo, she manages all content and thought leadership produced for products and campaigns, and contributes as an author for articles and blog posts.

Hannah holds over 12 years of writing and marketing experience, with 6 years of specialization in the risk management, supply chain, and ESG industries. Hannah holds an MA from Monmouth University and a Certificate in Product Marketing from Cornell University.
