CrowdStrike recovery has become a priority for many organizations following the faulty CrowdStrike update that caused a global IT outage on July 18th and 19th, 2024. The number of users affected is unknown; however, we can glean some insights from news reports:
It is clear the CrowdStrike update bug caused significant disruption for many organizations.
The outage was caused by a faulty update for the Windows version of their Falcon sensor.
Here’s a breakdown of the issue:
Here’s some additional information:
At the time of writing, CrowdStrike has not released an official report detailing the exact cause of the bug within the update. However, based on the available information, it appears that a software error within the update itself caused the system crashes.
CrowdStrike recovery presents a perfect use case for automated system recovery. In an explainer video, Sky News business correspondent Paul Kelso outlines the laborious manual process required to bring systems to a state that allows the disruptive CrowdStrike driver file to be deleted. Users with large server estates that do not use automated system recovery or boot management tools would face a significant amount of manual intervention and downtime to remove the driver from all affected machines.
Cristie Software bare machine recovery (BMR) provides system recovery from leading backup solutions such as Rubrik Security Cloud, Cohesity DataProtect, IBM Storage Protect, and the Dell Technologies backup solutions Avamar and NetWorker. Using Cristie recovery software automation, the following steps would be required to recover affected machines to a point before the disruptive CrowdStrike driver was applied:
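Recovering to "a point before the disruptive CrowdStrike driver was applied" comes down to choosing the most recent recovery point that predates the faulty update. The following is a minimal sketch of that selection logic only. It is not part of any Cristie product or API: the function name `latest_clean_recovery_point` and the sample backup times are hypothetical, and the cutoff of 04:09 UTC on July 19th, 2024 is an assumption based on CrowdStrike's publicly reported release time for the update.

```python
from datetime import datetime, timezone

# Assumption: release time of the faulty update as publicly reported
# by CrowdStrike (04:09 UTC on 19 July 2024). Adjust for your environment.
FAULTY_UPDATE_UTC = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)


def latest_clean_recovery_point(recovery_points, cutoff=FAULTY_UPDATE_UTC):
    """Return the most recent recovery point taken strictly before the
    cutoff, or None if no recovery point predates it."""
    candidates = [point for point in recovery_points if point < cutoff]
    return max(candidates, default=None)


# Hypothetical nightly backup times for a single machine.
points = [
    datetime(2024, 7, 17, 23, 0, tzinfo=timezone.utc),
    datetime(2024, 7, 18, 23, 0, tzinfo=timezone.utc),
    datetime(2024, 7, 19, 23, 0, tzinfo=timezone.utc),
]

chosen = latest_clean_recovery_point(points)
print(chosen.isoformat())  # 2024-07-18T23:00:00+00:00
```

All timestamps are timezone-aware so that comparisons remain valid across sites in different time zones. A machine with no recovery point before the cutoff returns `None` and would need a different recovery route.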
The recovery process for the CrowdStrike Falcon update bug depends on the severity of the issue and your access to the affected system. Users affected by the CrowdStrike update bug should conduct their own due diligence and refer to CrowdStrike support services to verify the procedure for their specific environment. Here are the two main approaches taken from online research:
Additional Tips:
Remember: These are general guidelines taken from online resources. The specific steps may vary depending on your system configuration and the severity of the issue. It’s always best to consult with a qualified IT professional if you are unsure about any of the recovery procedures.
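To illustrate the file-removal step at the heart of the manual workaround, here is a minimal sketch that locates the problematic file and deletes it only when explicitly asked to. The directory `C:\Windows\System32\drivers\CrowdStrike` and the file pattern `C-00000291*.sys` are assumptions taken from the publicly reported workaround, not from this article, and the function names are hypothetical. It also assumes Python is available and that the system volume is accessible, which is often not the case in Safe Mode or the Windows Recovery Environment, so treat it as a description of the logic rather than a ready-made tool.

```python
from pathlib import Path

# Assumptions: location and file name pattern from the publicly reported
# workaround. Verify both with CrowdStrike support before deleting anything.
DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
FAULTY_PATTERN = "C-00000291*.sys"


def find_faulty_files(driver_dir=DRIVER_DIR, pattern=FAULTY_PATTERN):
    """Return the files in driver_dir that match the faulty file pattern."""
    driver_dir = Path(driver_dir)
    if not driver_dir.is_dir():
        return []
    return sorted(p for p in driver_dir.glob(pattern) if p.is_file())


def remove_faulty_files(driver_dir=DRIVER_DIR, dry_run=True):
    """List the matching files and delete them only when dry_run is False."""
    matched = find_faulty_files(driver_dir)
    for path in matched:
        if not dry_run:
            path.unlink()
    return matched


if __name__ == "__main__":
    # Dry run by default: report what would be removed without touching it.
    for path in remove_faulty_files(dry_run=True):
        print(f"Would delete: {path}")
```

Defaulting to a dry run is deliberate: it lets an administrator review the matched files on each machine before anything is removed.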
The CrowdStrike driver update failure has demonstrated how vulnerable enterprises are to system-level driver changes that can disrupt the boot process of an operating system. Most companies invest in data backup solutions to safeguard application data, but many fail to implement system recovery solutions that capture operating system configurations and can restore complete systems to any available point in time. Furthermore, automated system recovery solutions such as the Cristie BMR suite, which offers automation for physical machine recovery, can eliminate manual intervention from the recovery process, potentially saving hours of administrative overhead when large-scale server estate recovery is needed.
Contact the Cristie Software team if you have been affected by the CrowdStrike update failure and would like to learn more about system recovery and recovery automation.
New Mill, Chestnut Lane
Stroud, GL5 3EW
United Kingdom