Last week, we witnessed one of the most widespread IT outages the world has ever seen.
CrowdStrike, an American cybersecurity company whose security software runs on Microsoft Windows devices, announced that a bug in a recent software update had impacted companies across the world. From minor inconveniences, such as people not getting their morning coffee, to starker consequences such as grounded planes and unretrievable patient records, some of the largest corporations in the world were forced to a standstill by millions of little blue screens.
The outage is a striking illustration of just how serious third-party risk can get. In this blog, RiskSmart’s experts reflect on how Risk professionals and organisations can learn from the incident.
Unless you have been hiding under a rock, you have probably already seen the coverage of the outage, but just in case, here’s a quick summary of what happened.
The outage is a reminder that a company is only as resilient as its third-party suppliers, or as its plans for managing the problems they might bring. Risk Management teams around the world suddenly found their phones and inboxes full of crisis management requests, and many of them were caught off guard, unsupported or unprepared to deal with a challenge like this.
So, how do you prepare for a global outage or any other disaster a third-party supplier might bring your way?
According to a PwC report, as many as 93% of GRC (Governance, Risk and Compliance) professionals surveyed still manage compliance manually with spreadsheets.
In addition to the well-known weaknesses presented by relying on spreadsheets alone, this outage demonstrated the unique situation of being unable to access risk and governance documentation stored in spreadsheets when needed the most.
The irony wasn’t lost on us.
In the wake of the outage, it’s already been established that insurers could face thousands of business interruption claims, many of them increased in severity due to a lack of a timely response from the broker.
“A common challenge we see amongst Risk professionals is that Risk often takes quite a siloed approach within the business, a tendency often exacerbated by using spreadsheets,” says Ryan Swann, Co-founder of RiskSmart.
Ryan is an accomplished Fintech lawyer and Chief Risk Officer with over 20 years of experience in various organisations. He recommends assessing if the tools used by your team are enough to support your business and its ambitions.
“The cost of implementing a system can sound counterintuitive to business leaders. But if you think about how much time and effort your team puts into manual data entry, resolving issues caused by data duplication and keeping spreadsheets up to date, you start to see how building your risk framework on Excel might be holding you back, or even putting your company at risk.”
His point about Risk teams working in silos leads us to the challenge of building a solid Risk culture throughout your organisation. In a typical scenario, other departments are unaware of the Risk protocols or framework.
Until disaster knocks on the door.
Engaging cross-functional teams to gain diverse perspectives and prioritise investments in contingency planning is vital. The key is proactively identifying Single Points of Failure (SPOFs) and addressing them before they can cause significant disruption.
To make this successful, Risk Management must become something the broader organisation is actively involved in, not just a tick-box exercise.
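Spotting single points of failure can start from something as simple as a map of which critical business functions depend on which suppliers. The sketch below is a minimal illustration of that idea rather than a description of any particular tool: the function names, the supplier names and the `single_points_of_failure` helper are all hypothetical, and a real register would hold far more detail.

```python
from collections import defaultdict

# Hypothetical register: each critical business function mapped to the
# suppliers currently able to deliver it. All names are illustrative only.
FUNCTION_SUPPLIERS = {
    "endpoint_security": ["VendorA"],
    "payments": ["VendorB", "VendorC"],
    "customer_support": ["VendorA"],
    "payroll": ["VendorD", "VendorE"],
}


def single_points_of_failure(function_suppliers):
    """Return {supplier: [functions]} for functions that rely on exactly one supplier."""
    spofs = defaultdict(list)
    for function, suppliers in function_suppliers.items():
        unique_suppliers = set(suppliers)
        # A function with one distinct supplier has no fallback if that supplier fails.
        if len(unique_suppliers) == 1:
            spofs[next(iter(unique_suppliers))].append(function)
    return {supplier: sorted(functions) for supplier, functions in spofs.items()}


if __name__ == "__main__":
    for supplier, functions in single_points_of_failure(FUNCTION_SUPPLIERS).items():
        print(f"{supplier}: sole supplier for {', '.join(functions)}")
```

Even a rough list like this gives cross-functional teams something concrete to challenge: each department can confirm whether its functions really have a fallback, and contingency planning can be prioritised around the suppliers that appear most often.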
“Increasing suppliers' standards by implementing a thorough due-diligence process for third parties is critical. Identifying their adherence to compliance and any processes in place that might put you at risk should not be a tick-box exercise that’s done and forgotten but something that is continuously monitored closely. What part do third parties play in your business strategy, and what could go wrong? Have robust plans for this even if it seems unlikely,” says Emma Bamford, Head of Client Success at RiskSmart.
Emma has over 15 years of experience in the Risk field and works on the front line with our clients, setting them up for success with RiskSmart.
Her advice highlights the critical ability to gather and store information about third parties and connect this to your own risk framework, which is often tricky when using outdated, clunky spreadsheets or tools with limited functionality and customisation options.
An effective incident response plan is critical for mitigating the impact of outages.
“Although the CrowdStrike outage incident revealed gaps in automated recovery systems, and this highlighted the need for clear communication channels, strong governance frameworks and compliance monitoring, it also revealed that the response to risk is often very knee-jerk and in-the-moment, rather than a measured and strategic approach,” says Emma.
She advises having a clear and up-to-date Business Continuity Strategy and an understanding of critical business functions. Recovery objectives can help you respond quicker to incidents and get everyone in the company, from staff to stakeholders, aligned on the next critical actions.
Although it was first suspected that a cyber attack caused the outage, CrowdStrike quickly established that nothing malicious was behind the disturbance.
But what if it had been malicious?
“One of the main concerns we see for tech leaders is Data Security and Single Sign-On (SSO) capabilities,” says Richard Poole, CTO of RiskSmart.
Richard has over twenty years of experience in various sectors, including FinTech, Retail, Insurance, Logistics, Media, Healthcare, Infrastructure and Communications.
According to IBM’s Cost of a Data Breach Report, breaches initiated with stolen or compromised credentials took the longest, on average, to resolve. Richard explains that choosing a tool that strengthens your safety measures should always be at the forefront of organisations' minds.
“Unlike many SaaS products that impose additional charges for SSO, RiskSmart offers this as a complimentary feature. Additionally, we do not charge for additional users, allowing unrestricted access to our Governance, Risk, and Compliance (GRC) platform in a secure and monitored manner.”
The grand outage of 2024 serves as a potent reminder that no system is infallible. For Risk Management and GRC teams in regulated industries, it is an opportunity to revisit and reinforce strategies.
By conducting comprehensive risk assessments, strengthening governance frameworks, promoting training and awareness, and leveraging technology, organisations can better prepare for and mitigate the impact of future outages.
With RiskSmart, organisations gain a powerful ally in managing risks and ensuring compliance. In a world increasingly reliant on cloud services, the ability to learn from and adapt to incidents like the CrowdStrike outage, with the support of advanced tools like RiskSmart, will be a key differentiator for organisations striving to maintain operational continuity and safeguard their business interests while adhering to stringent regulatory requirements.