What is a ransomware playbook?
A ransomware playbook is a documented sequence of decisions and actions an organization follows when ransomware is detected in its environment.
Key takeaways
- A ransomware playbook documents the decisions and actions for the first 24 to 72 hours of a ransomware incident, when the response window is tightest.
- The playbook covers technical containment, communications, the ransom decision framework, insurance coordination, and (for manufacturers) production-stoppage authority.
- For mid-market manufacturers, the playbook has to address OT and production explicitly. Generic IT-only ransomware playbooks miss the most consequential decisions.
- The decision to pay or not pay the ransom should be made in advance with executive leadership, not under pressure during the incident.
- ARG builds ransomware playbooks tuned to specific plant operations as part of the broader incident response plan scaffolding.
What does a ransomware playbook contain, step by step?
A serious ransomware playbook covers nine sections, each addressing a specific aspect of the response.
1. Activation criteria and severity. When the playbook is invoked. Confirmed ransomware execution, suspected staging activity, known indicators of compromise from ransomware affiliate groups. Severity tier criteria: how widespread is the encryption, what systems are affected, has data exfiltration occurred.
2. Immediate containment actions. First-hour technical steps. Isolate affected systems from the network without powering them off (powering off destroys forensic evidence in memory). Block the C2 infrastructure identified by EDR. Revoke active sessions. Disable suspected compromised accounts. The actions are sequenced; each step has an owner and a time target.
3. Incident commander activation and team assembly. Who is the incident commander for this incident, who is on the response team, how the war room is established. The playbook references the incident response plan for role definitions.
4. Notification sequence. Who gets called, in what order, on what channel, in what time window:
- IR retainer partner (within 30 minutes)
- Cyber insurance carrier (within 4 hours; some policies require sooner)
- Executive sponsor (immediately)
- Legal counsel (within 4 hours)
- Bank (if BEC or wire involvement)
- FBI IC3 (within 24 hours; before any ransom payment)
- Customers, regulators, board (per separate communications timeline)
5. Production decision authority. For manufacturers specifically. Who decides whether to stop production, under what conditions, with what authority. Pre-authorized within documented thresholds; escalates to executive sponsor for high-impact decisions. The pre-authorization is what makes the decision possible at 2 a.m.
6. Forensic and evidence preservation. What to capture (memory images of affected systems, network captures, log exports), what not to do (delete malicious files, wipe systems, change configurations beyond containment), how chain-of-custody is documented. Aligned with insurance, regulatory, and (for CUI handlers) CMMC requirements.
7. Communications plan. Internal staff, customers, vendors, regulators, public. Templates for first-hour notifications, first-day updates, first-week communications. Decisions on public disclosure tied to specific severity tiers and breach criteria.
8. Ransom decision framework. Documented criteria for whether to engage in negotiation, when to consider payment, OFAC review process, carrier-approved negotiator engagement. The framework is decided in advance; the application happens during the incident.
9. Recovery sequence. Order of system restoration, dependencies, validation steps before returning systems to production, monitoring during the post-recovery window. The sequence prioritizes systems that other restorations depend on: for example, rebuilding the email server before the engineering file server.
The nine sections together produce a document that a tired incident commander can navigate at hour two of an incident.
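The notification sequence in section 4 lends itself to a simple deadline calculation. The following is a minimal Python sketch, not a prescribed implementation: the time windows are copied from the list above, "immediately" is modeled as a zero-length window, and the function names and the example timestamp are hypothetical. Parties without a fixed window in the list (bank, customers, regulators, board) are left out.

```python
from datetime import datetime, timedelta

# Windows taken from the notification list; "immediately" is modeled as zero.
NOTIFICATION_WINDOWS = [
    ("Executive sponsor", timedelta(0)),
    ("IR retainer partner", timedelta(minutes=30)),
    ("Cyber insurance carrier", timedelta(hours=4)),
    ("Legal counsel", timedelta(hours=4)),
    ("FBI IC3", timedelta(hours=24)),
]


def notification_deadlines(declared_at):
    """Return (party, deadline) pairs sorted by deadline, earliest first."""
    deadlines = [(party, declared_at + window)
                 for party, window in NOTIFICATION_WINDOWS]
    return sorted(deadlines, key=lambda item: item[1])


def overdue(declared_at, now, already_notified):
    """List parties whose deadline has passed and who are not yet notified."""
    return [party
            for party, deadline in notification_deadlines(declared_at)
            if deadline <= now and party not in already_notified]


if __name__ == "__main__":
    declared = datetime(2025, 1, 15, 2, 0)  # hypothetical incident declaration
    for party, deadline in notification_deadlines(declared):
        print(f"{deadline:%Y-%m-%d %H:%M}  {party}")
```

Carrier windows vary by policy, so the four-hour value would be replaced with whatever the organization's own policy requires.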
The first decisions that have to be made
In the first hour, four decisions matter most.
1. Isolate or wait. Aggressive isolation (cutting network connections to affected systems) stops the spread but disrupts operations and may destroy live forensic data. Conservative isolation (firewall-level blocks, leaving systems running for analysis) preserves evidence but allows continued damage. The playbook pre-decides which approach is the default for each severity tier; the incident commander applies it.
2. Notify or hold. Notification of insurance, IR partners, and external counsel usually starts within the first hour, but the exact timing matters. Some carriers require notification within a fixed window; missing that window risks coverage. The playbook treats notification timing as non-discretionary.
3. Production decision (for manufacturers). Continue running, halt at next safe stopping point, halt immediately. The decision depends on which systems are affected, how the attack is spreading, and whether production data is being exfiltrated. Pre-authorized within documented thresholds; manufacturers without pre-authorized thresholds spend hours on this decision.
4. Negotiate or refuse. Whether to engage with the attacker at all. Most professional IR engagements involve some form of communication with the attacker (to establish their identity, learn what data was exfiltrated, understand demands), even if the organization has decided not to pay. The decision to negotiate is not the same as the decision to pay.
Each of the four decisions has implications for the next 24 hours. The playbook makes them faster because the framework is pre-built; the incident commander applies the framework rather than designing it.
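One way to picture a pre-built framework is as a lookup table the incident commander reads rather than designs. The sketch below is illustrative only: the three severity tiers and the defaults assigned to them are hypothetical, and a real playbook would set them per facility. The isolation and production options are the ones named in decisions 1 and 3 above.

```python
from dataclasses import dataclass
from enum import Enum


class Isolation(Enum):
    CONSERVATIVE = "firewall-level blocks, systems left running for analysis"
    AGGRESSIVE = "network connections to affected systems cut"


class Production(Enum):
    CONTINUE = "continue running"
    HALT_AT_SAFE_POINT = "halt at next safe stopping point"
    HALT_NOW = "halt immediately"


@dataclass(frozen=True)
class Default:
    isolation: Isolation
    production: Production
    needs_executive_escalation: bool


# Hypothetical tiers and assignments, for illustration only.
DEFAULTS = {
    1: Default(Isolation.CONSERVATIVE, Production.CONTINUE, False),
    2: Default(Isolation.AGGRESSIVE, Production.HALT_AT_SAFE_POINT, False),
    3: Default(Isolation.AGGRESSIVE, Production.HALT_NOW, True),
}


def default_for(tier):
    """Return the pre-decided default for a severity tier, or fail loudly."""
    try:
        return DEFAULTS[tier]
    except KeyError:
        raise ValueError(f"no pre-decided default for severity tier {tier!r}") from None


if __name__ == "__main__":
    decision = default_for(2)
    print(decision.isolation.value)
    print(decision.production.value)
```

Failing loudly on an unknown tier is deliberate in this sketch: a gap in the table is something to find during a tabletop, not during an incident.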
Why ransomware playbooks for manufacturers must address production downtime explicitly
For most organizations, ransomware is an IT incident. For manufacturers, it is potentially a production incident. The distinction drives the playbook structure.
Generic IT ransomware playbooks address: which servers to isolate, which credentials to rotate, how to restore from backup, how to communicate with customers. Manufacturer-specific playbooks add:
- Production-stoppage authority. Pre-authorized within documented thresholds. Without pre-authorization, the IT team isolates affected systems but production continues until an executive decision is made; ransomware can spread further during the wait.
- OT-side detection and isolation. OT systems often have minimal detection. The playbook describes how OT impact is assessed and what isolation actions are appropriate (often more conservative than IT-side actions because of operational risk).
- Engineering workstation handling. Engineering workstations are the IT-to-OT pivot. The playbook describes specific containment for engineering workstations and the validation required before re-introducing them.
- Supplier and customer notification. Manufacturers in supply chains for defense, automotive, aerospace, or other regulated industries have contractual notification obligations. The playbook lists the specific contracts, contacts, and timing.
- Production restart criteria. What has to be verified before production resumes. Not just "systems restored" but "this specific verification has been completed".
For a defense supplier specifically, the playbook also covers DC3 reporting (within 72 hours of discovering the incident under DFARS 252.204-7012), customer notification under contract terms, and CUI exposure assessment.
The playbook that addresses these manufacturer-specific concerns explicitly is materially different from the playbook used at a SaaS company. Generic templates do not survive contact with a real manufacturing incident.
Examples of ransomware-hit manufacturers and what they did right or wrong
Public ransomware incidents at manufacturers and what they reveal.
- JBS Foods (May 2021). Ransomware attack on the world's largest meat processor. Production halted across multiple plants. Paid $11M ransom to REvil. Lessons: large-scale manufacturing operations have tight tolerance for downtime; aggressive containment plus offline backup restoration was viable but the company paid for speed.
- Colonial Pipeline (May 2021). Ransomware on the IT side; precautionary OT-side shutdown lasted six days; gasoline shortages on the US East Coast followed. Paid $4.4M to DarkSide. FBI recovered a portion. Lessons: OT-IT entanglement makes "IT-only" incidents into operational incidents; backup restoration without good documentation is slower than expected.
- Norsk Hydro (March 2019). Aluminum manufacturer hit by LockerGoga. Refused to pay. Recovered from backups over months. Total cost ~$70M. Lessons: paying and not paying both have costs; reputation and transparency benefits accrued from refusing payment; recovery time without ransom can be measured in months.
- Brunswick Corp (June 2023). Boat manufacturer hit; nine-day operational disruption. Lessons: even mid-market and upper-mid-market manufacturers see multi-day disruptions.
- MKS Instruments (February 2023). Semiconductor-equipment supplier; estimated revenue impact $200M; supply chain disruption affected downstream customers. Lessons: a single supplier ransomware incident propagates through customer base.
- Numerous mid-market manufacturer events (2023-2026, mostly private). Steady cadence of ransomware events affecting tier-2 and tier-3 manufacturers. Few make public news; aggregate impact is large. Most show similar patterns: phishing or credential theft, lateral movement, ransomware deployment in 2-7 days, production impact ranging from minimal to multi-week.
The lessons are consistent: prepared organizations recover faster, earlier detection makes incidents less expensive, and the decision-making framework matters more than the technical response.
How to test a ransomware playbook without taking down production
Playbooks that have not been tested do not work. Testing without operational risk:
- Tabletop exercise on a ransomware scenario. Discussion-based, two to four hours, with the executive team and IT lead. The scenario is realistic for the specific facility; the playbook is in the room; the team walks through their response. See What is a tabletop exercise?
- Functional exercise on backup restoration. Restore a specific system from backup in a controlled environment (not production). Time the restoration. Validate the restored data. Identify gaps in the backup or restoration procedure.
- Detection validation through breach and attack simulation (BAS). Run ransomware-pattern simulations (encryption-like behavior, C2 staging, lateral movement) against the production environment with controlled tooling. Confirm that EDR alerts, the SIEM correlates, and the IR partner is notified.
- Walk-through with the IR partner. Annual or semi-annual walk-through with the retainer-provided IR team. Confirm contacts, escalation, evidence-handling procedures, and response-time SLAs.
- Insurance carrier coordination. Annual review with the carrier on notification timeline, approved negotiators, ransom-payment process. Updates to the playbook reflect carrier changes.
- Post-engagement updates. After every adversarial simulation, purple team exercise, and tabletop, the playbook is updated with lessons learned.
The testing cadence: tabletop quarterly, functional backup test semi-annually, IR-partner walk-through annually. The discipline produces a playbook that works.
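The functional backup test (time the restoration, validate the restored data) can be partly automated. The following is a minimal sketch, assuming the restored system is a directory tree and that a checksum manifest of the known-good data was captured beforehand; the function names are hypothetical and the restore step itself is whatever the backup tooling provides.

```python
import hashlib
import time
from pathlib import Path


def sha256_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file; very large files would need
            # chunked hashing instead.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_restore(reference_manifest, restored_root):
    """Report files missing from, unexpected in, or changed in the restore."""
    restored = sha256_manifest(restored_root)
    expected_names = set(reference_manifest)
    restored_names = set(restored)
    return {
        "missing": sorted(expected_names - restored_names),
        "unexpected": sorted(restored_names - expected_names),
        "changed": sorted(name for name in expected_names & restored_names
                          if reference_manifest[name] != restored[name]),
    }


def timed_restore(restore_fn):
    """Run restore_fn() and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    restore_fn()
    return time.monotonic() - start
```

A matching manifest only shows that the files came back intact. It does not show that the application starts or that the data is current, so the validation step in the exercise still needs a functional check.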
Best practices for ransom-payment decision frameworks
The hardest decision in a ransomware incident is whether to pay. The framework is decided in advance.
- Default posture. The organization has a documented default position (e.g., "do not pay unless specific criteria are met"). The default is the starting point for the decision.
- Decision criteria. What has to be true to deviate from the default. Examples: backups are confirmed unrecoverable, business continuity requires faster recovery than backup restoration allows, regulatory or contractual obligations to customers require immediate restoration.
- OFAC review. Mandatory check that the threat actor is not on the OFAC sanctions list. Paying a sanctioned actor is illegal regardless of business necessity. The carrier-approved negotiator typically handles the screening.
- Carrier coordination. The insurance carrier must be informed before any payment. Coverage typically depends on adherence to carrier procedures. A payment made without carrier coordination is usually uncovered.
- Negotiator engagement. Direct negotiation by the affected organization is almost always a mistake. A professional negotiator (typically engaged through the IR partner or carrier panel) handles the communication, reduces the demand, and confirms the attacker's identity.
- Authorization level. Decisions to pay require executive sponsor authorization at minimum; board notification for large payments; legal review in all cases.
- Documentation. The decision is documented: criteria considered, factors weighed, who authorized, what was paid, what was received. The documentation supports insurance, regulatory, and post-incident review.
- Post-payment posture. What happens after payment (or non-payment). Forensic review continues; the organization assumes the attacker may still have access; rebuilding from a clean state is required regardless of payment.
The framework does not predetermine the answer; it produces a process that produces an informed answer under pressure.
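The process side of the framework can be sketched as a set of gates that must all be cleared before a payment is even eligible for authorization. This is an illustrative Python sketch only: the field names and wording are hypothetical, each gate mirrors one item in the list above, and clearing every gate does not mean the organization should pay, only that the documented process has been followed.

```python
from dataclasses import dataclass


@dataclass
class PaymentReview:
    # Hypothetical field names; each mirrors one item in the framework.
    deviation_criterion_met: bool
    ofac_screening_clear: bool
    carrier_informed: bool
    negotiator_engaged: bool
    executive_authorized: bool
    legal_reviewed: bool


# Ordered so the legal bar (OFAC) is reported first.
GATES = [
    ("ofac_screening_clear", "OFAC screening has not cleared the threat actor"),
    ("deviation_criterion_met", "no documented criterion for deviating from the default posture is met"),
    ("carrier_informed", "insurance carrier has not been informed"),
    ("negotiator_engaged", "no professional negotiator is engaged"),
    ("executive_authorized", "executive sponsor has not authorized"),
    ("legal_reviewed", "legal review is not complete"),
]


def payment_blockers(review):
    """Return the reasons payment cannot proceed; an empty list means no gate blocks it."""
    return [reason for field, reason in GATES if not getattr(review, field)]


if __name__ == "__main__":
    review = PaymentReview(
        deviation_criterion_met=True,
        ofac_screening_clear=True,
        carrier_informed=False,
        negotiator_engaged=True,
        executive_authorized=False,
        legal_reviewed=True,
    )
    for reason in payment_blockers(review):
        print("BLOCKED:", reason)
```

Returning the full list of blockers, rather than stopping at the first, also gives the documentation step a ready record of what was and was not satisfied at decision time.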
Ransomware playbook FAQs
Should we ever pay the ransom?
It depends on the situation, but the decision should not be made under pressure during the incident. The right time to decide the organization's ransom-payment posture is in advance, with executive leadership, legal counsel, and insurance involvement. Considerations include OFAC sanctions (some ransom payments are illegal), business continuity necessity, backup viability, and ethical positioning. Most mature programs avoid payment when possible; some incidents leave no realistic alternative.
Does cyber insurance pay ransoms?
Sometimes, depending on policy. Many cyber policies cover ransom payments through an extortion-coverage component, subject to limits, sub-limits, and conditions. Carriers often require carrier-approved negotiator involvement before paying. OFAC-sanctioned actor payments are generally not covered because the underlying payment is illegal. See What is cyber insurance underwriting?
How long does it take to recover from a ransomware attack in manufacturing?
Three days to three weeks for the technical recovery, depending on backup state and operational complexity. Production resumption depends on the systems affected: if only IT is hit, production may continue; if engineering systems, MES, or SCADA are affected, production stops. The Colonial Pipeline incident is a public example of precautionary OT-side shutdown lasting six days; mid-market events with smaller scope can be shorter or longer.
What is the difference between ransomware and double-extortion ransomware?
Traditional ransomware encrypts data and demands payment for decryption. Double-extortion adds data exfiltration: the attacker takes a copy of the data before encrypting, then threatens to publish it if the ransom is not paid. The attacker has two levers (decryption and data publication); paying for decryption alone does not prevent the data publication. Almost all modern ransomware operates in double-extortion mode.
How ARG builds ransomware playbooks tuned to plant operations
ARG builds ransomware playbooks as part of the incident response plan scaffolding delivered with each engagement. The playbook is tuned to the specific facility: actual systems, actual vendors, actual production dependencies, actual contract obligations.
The work is led by James Wall on the digital and procedural side, with David Ashby contributing on physical and OT scenarios. The output reflects manufacturing reality, not generic IT templates.
The playbook covers:
- Activation and severity. Specific criteria for the client's environment: which systems flag a ransomware incident, what severity tiers apply.
- Containment actions sequenced by system. Engineering workstations isolated first (the IT-OT pivot); file servers next; OT-adjacent systems with explicit decision criteria; production systems with pre-authorized stoppage thresholds.
- Notification sequence with specific contacts. IR partner, insurance, executive, legal, bank (where applicable), prime contractors (for defense work), customers (per contract), FBI IC3, state authorities (per breach laws).
- Production decision matrix. Specific scenarios with pre-authorized actions. "If ransomware reaches the engineering file server, halt production at next safe stopping point" with executive-sponsor escalation authority.
- Forensic preservation aligned with CMMC and insurance requirements. Evidence captured to support both incident review and regulatory reporting.
- Ransom decision framework. Pre-decided default posture, decision criteria, OFAC review path, carrier coordination.
- Recovery sequence prioritizing production resumption. Order of system restoration with documented dependencies and validation steps.
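A recovery sequence with documented dependencies is, in effect, a dependency graph, and a restoration order can be derived from it with a topological sort. The sketch below uses Python's standard-library graphlib (Python 3.9+). The systems and their dependencies are hypothetical examples, not ARG's actual ordering; a real map would come from the facility's own documentation.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each system lists what must be restored
# and validated before it.
DEPENDS_ON = {
    "directory services": [],
    "email server": ["directory services"],
    "engineering file server": ["directory services", "email server"],
    "MES": ["engineering file server"],
}


def restoration_order(depends_on):
    """Return systems in an order where every dependency comes first."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        # graphlib places the detected cycle in the second element of args.
        raise ValueError(f"circular restoration dependency: {exc.args[1]}") from exc


if __name__ == "__main__":
    for step, system in enumerate(restoration_order(DEPENDS_ON), start=1):
        print(step, system)
```

Detecting a cycle is useful in its own right: a circular dependency in the documented map means the recovery sequence cannot be executed as written.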
The playbook is reviewed quarterly in tabletop exercises and updated with material findings from the continuous adversarial simulation. The version in place at the end of Year 1 is materially different from the initial draft because the engagement has surfaced specific facility realities.
For founding clients, the ransomware playbook is part of the monthly retainer alongside the broader IRP scaffolding.
Apply as a founding client or see how the engagement works for the full delivery cycle.