A global IT outage has caused chaos at airports, banks, railways and businesses around the world as a wide range of services were taken offline and millions of people were affected.
In one of the most widespread IT crashes ever to hit companies and institutions globally, air transport ground to a halt, hospitals were affected and large numbers of workers were unable to access their computers. In the UK Sky News was taken off air temporarily and the NHS GP booking system was down.
Microsoft’s Windows service was at the centre of the outage, with experts linking the problem to a software update from cybersecurity firm Crowdstrike that has affected computer systems around the world. Experts said the outage could take days to recover from because every affected PC may have to be fixed manually.
Overnight, Microsoft confirmed it was investigating an issue with its services and apps, with the organisation’s service health website warning of “service degradation” that meant users may not be able to access many of the company’s most popular services, used by millions of businesses and people around the world.
Among the affected firms are Ryanair, Europe’s largest airline, which said on its website: “Potential disruptions across the network (Fri 19 July) due to a global third party system outage … We advise passengers to arrive at the airport three hours in advance of their flight to avoid any disruptions.”
Having half the world depend on a single company’s proprietary software is the stupidest thing ever. They will learn nothing from this, sadly.
Reminds me of when Canada lost internet for 12 million of its 33 million people because one company messed up doing maintenance.
While you are right, this outage has basically nothing to do with Windows or Microsoft. It’s a Crowdstrike issue.
It also has to do with software updates being performed without the user having any control over them.
Agreed, but again these updates were done by the Crowdstrike software. Nothing to do with Microsoft or Windows.
In this case it was an update to the security component which is specifically designed to protect against exploits on the endpoint. You’d want your security system to be up to date to protect as much as possible against new exploits, so updating this every day is a normal thing. In a corporate environment you do not want your end users to be able to block or postpone security updates.
Microsoft updates get rolled out to different so-called rings, each bigger than the last. This means every update is already in use by a smaller population before it reaches everyone, which greatly reduces the chances of an update destroying the world like this.
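Roughly, the idea looks like this (a toy Python sketch; the ring names, sizes and health check are made up for illustration, not Microsoft’s actual process):

```python
# Toy sketch of a ring-based (staged) rollout: each stage only proceeds
# if the previous, smaller ring looks healthy. Ring sizes and the
# health_check() hook are illustrative assumptions.
import random

RINGS = [
    ("canary", 100),         # internal / opt-in machines
    ("early", 10_000),       # small customer population
    ("broad", 1_000_000),    # most customers
    ("world", None),         # everyone else
]

def health_check(ring_name: str) -> bool:
    """Stand-in for real telemetry: crash rates, boot loops, support tickets."""
    return random.random() > 0.01  # pretend 1% of stages look unhealthy

def rollout(update_id: str) -> None:
    for ring_name, size in RINGS:
        print(f"deploying {update_id} to ring '{ring_name}' "
              f"({size if size else 'all remaining'} machines)")
        if not health_check(ring_name):
            print(f"halting rollout of {update_id}: ring '{ring_name}' unhealthy")
            return  # a bad update never reaches the bigger rings
    print(f"{update_id} fully rolled out")

if __name__ == "__main__":
    rollout("2024-07-19-sensor-update")
```

A staged rollout would not have prevented the bad update entirely, but it would have limited the blast radius to the first ring instead of hitting every machine at once.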
Best part? George Kurtz (crowdstrike CEO) won’t be available for handling the fallout. He’s busy racing this weekend.
Car #04 in the entry list https://www.gt-world-challenge-america.com/event/95/virginia-international-raceway
I absolutely expect vendors to push out new patterns automatically and as fast as possible.
But in this case, a new system driver was rolled out. And when updating system software, I absolutely expect security vendors to use a staged rollout like everyone else.
100% agreed, Crowdstrike fucked up with this one. I’m very interested to hear what went wrong. I assume they test their device drivers before deploying them to millions of customers, so something must have gone wrong between testing and deployment.
Something like this simply cannot happen, and it will cost them customers. Your reputation is everything in the security business: you trust your security provider to protect your systems. If the trust is gone, they are gone.
I’m very interested to hear what went wrong.
We’ll probably never know. Given the impact of this fuck up, the most that Crowdstrike will probably publish is lawyer-corpo-talk about how they did an oopsie doopsie, how complicated, unforeseen and absolutely unavoidable this issue was, and how they are absolutely not responsible for it, but because they are such a great company and such good guys, they will implement measures so that this absolutely, never ever happens again.
If they admit even the smallest wrongdoing whatsoever, they will be piledriven by more lawyers than even they’d be able to handle. That’s a lot of CEO yachts in compensation if they are held responsible.
One time, years ago, Sophos provided an update that blocked every updater on the machine. Each computer had to be manually updated. They are still in business. My point is that this isn’t the first time it happens and won’t be the last.
Yeah, I mean Microsoft can release something like Windows 11 and still be in business, so I don’t expect a lot will change. But if you had any stocks in Crowdstrike, RIP.
It’s great to have alternatives. If it was all Linux, and Linux got hit, then the entire world would be in danger. Too bad M$ is just not good enough for its second most popular position.
Well, we got to see something roughly like this play out with the xz thing. In that case, only Red Hat was going to be impacted because they were the only ones to patch ssh that way.
Most examples I can think of only end up affecting one slice or another of the Linux ecosystem. So a heterogeneous, Linux-based market would likely be more diverse than this.
Of course, this was a relative nothingburger for companies that use Windows but not Crowdstrike, including my own. Well, except for a whole lot fewer emails from clients today compared to a typical Friday…
Windows PC running Crowdstrike.
Shhh
The OS getting fully bricked because of a third-party software update is still very much an OS-level fuck up.
Depends. Since this is security software, it probably has a kernel driver component. I think on Linux a third-party kernel module could do the same. But the community would not accept closed-source security software, especially not in the kernel.
They even have a version for Linux, which is a kernel module.
Except crowdstrike literally doesn’t work like that on Linux.
Everyone is shitting on Windows, yet this thing exists on Linux as well… I’ve also started to dislike Windows, but this is not the time to be against Windows users; this is the time to go against Crowdstrike together for even letting this happen.
Citation needed, my NUC running Fedora made it through this without a hitch
I agree. I also think part of the blame can be placed on the system administrators who failed to make a recovery plan for circumstances like these – it’s not good to blindly place your trust in software that can be remotely updated.
In Linux, this type of scenario could be prevented by configuring servers to make copy-on-write snapshots before every software upgrade (e.g. with BTRFS or LVM), and automatically switching back to the last good snapshot if a kernel panic or other error is detected. Do you know if something similar can be achieved under Windows?
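Something like the following rough Python sketch is what I have in mind (assuming a Btrfs root subvolume, a /.snapshots directory and dnf as the package manager; real setups usually lean on tools like snapper, and the automatic boot-back-into-the-last-good-snapshot part also needs bootloader integration, which isn’t shown here):

```python
# Rough sketch: take a read-only Btrfs snapshot before an upgrade so there is
# a known-good state to roll back to if the upgraded system misbehaves.
# Paths and the upgrade command are assumptions, not any specific distro's tooling.
import subprocess
import sys
from datetime import datetime

ROOT_SUBVOLUME = "/"            # assumed Btrfs subvolume mounted at /
SNAPSHOT_DIR = "/.snapshots"    # assumed snapshot location

def take_snapshot() -> str:
    name = f"{SNAPSHOT_DIR}/pre-upgrade-{datetime.now():%Y%m%d-%H%M%S}"
    # Copy-on-write, read-only snapshot of the root subvolume; cheap and fast.
    subprocess.run(
        ["btrfs", "subvolume", "snapshot", "-r", ROOT_SUBVOLUME, name],
        check=True,
    )
    return name

def run_upgrade() -> None:
    # Hypothetical upgrade step; substitute your distro's package manager.
    subprocess.run(["dnf", "upgrade", "-y"], check=True)

if __name__ == "__main__":
    snapshot = take_snapshot()
    print(f"snapshot taken at {snapshot}")
    try:
        run_upgrade()
    except subprocess.CalledProcessError:
        print(f"upgrade failed; roll back by booting from the snapshot at {snapshot}",
              file=sys.stderr)
        sys.exit(1)
```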
Exactly, the blame here is entirely on Crowdstrike. They could just as easily have made a similar mistake in an update for the Linux agent that would crash the system and bring down half the planet.
I will say, the problem MIGHT have been easier to fix or work around on the Linux systems.