What is incident response (IR)?
Incident response (IR) is the process of preparing for, detecting, containing, eradicating, recovering from, and learning from cybersecurity incidents.
Key takeaways
- Incident response is a process, not a one-time effort. The six phases (prepare, detect, contain, eradicate, recover, lessons learned) repeat for every incident.
- The biggest predictor of incident cost at a mid-market manufacturer is not the sophistication of the attacker; it is the preparation level of the defender. Same incident, different preparation, different outcome.
- A pre-arranged IR retainer with a known team materially reduces response time and total incident cost.
- Cyber insurance and IR are complementary; neither substitutes for the other.
- ARG's continuous engagement produces IR readiness as a byproduct: tested plans, current contact lists, exercised playbooks, evidence packages ready for insurance and regulators.
What are the phases of incident response?
Incident response runs through six phases. The six-step breakdown follows the SANS incident handling model. NIST SP 800-61 Rev. 2 groups the same activities into four phases (preparation; detection and analysis; containment, eradication, and recovery; post-incident activity), and ISO/IEC 27035 uses its own variation.
1. Prepare. Build the capability to respond before the incident happens. Plans, playbooks, contact lists, tools, training, retainer agreements, evidence-handling procedures. The phase that determines how the next five phases go. See "What is an incident response plan (IRP)?"
2. Detect and analyze. Identify that an event is happening and assess what kind of incident it is. Combines technical detection (EDR alerts, SIEM correlation, user reports, OT-aware monitoring) with analyst judgment to confirm or dismiss. See "What is dwell time?"
3. Contain. Stop the incident from spreading. Short-term containment (isolate affected systems, block malicious indicators) is followed by long-term containment (patch the underlying vulnerability, rotate credentials, remove attacker access). Containment decisions involve trade-offs: aggressive containment may stop an attack but disrupt operations; conservative containment may preserve operations but allow continued damage.
4. Eradicate. Remove the attacker and their tooling. Identify all affected systems; remove malware, modified configurations, persistent access mechanisms; close the entry vector. Eradication that misses a foothold often leads to reinfection within days.
5. Recover. Restore normal operations. Rebuild compromised systems, restore data from clean backups, re-establish monitoring, re-introduce systems to production with confirmed clean state. Recovery includes the gradual return to normal, not just the technical restoration.
6. Lessons learned. Post-incident review. What happened, why, what worked in the response, what did not, what changes prevent recurrence. The phase that turns an incident into program improvement. Without it, the same gaps produce the same incidents.
The phases overlap in practice. Detection continues during containment; containment continues during eradication; recovery starts before eradication is complete. The framework is a sequence on paper and a parallel set of streams in execution.
The difference between an incident, an event, and a breach
The vocabulary matters because the response differs.
Event. Anything that happens on a system or network. A user logon, a configuration change, a software update. Events are not inherently good or bad; they are the raw material the detection layer works with.
Security event. An event with security implications. A failed logon, an EDR alert, a firewall block. Most security events are benign; some are the leading edge of an incident.
Incident. A security event (or set of events) that has been confirmed as malicious or as a violation of policy. An incident triggers the response process. The escalation from event to incident is the analyst judgment that something is actually happening.
Breach. An incident where confidential information has been disclosed to unauthorized parties, or where regulatory definitions of breach are satisfied. A breach triggers additional obligations: notification to affected parties, regulators, customers, insurers; legal review; public disclosure where applicable.
The escalation path is event -> security event -> incident -> breach. Most events do not become incidents; most incidents do not become breaches. The response process activates at the incident threshold; breach obligations stack on top.
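The ladder above can be sketched as a small classification routine. This is a minimal illustration, not a triage tool: the `Observation` flags and the function names are hypothetical, and in practice the "confirmed" flag is the analyst judgment described above, not something code decides.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Rungs of the escalation path; a higher value is further along."""
    EVENT = 0
    SECURITY_EVENT = 1
    INCIDENT = 2
    BREACH = 3


@dataclass(frozen=True)
class Observation:
    # Hypothetical flags an analyst or tool would set; not from any product.
    security_relevant: bool = False  # e.g. failed logon, EDR alert, firewall block
    confirmed: bool = False          # confirmed malicious or a policy violation
    data_disclosed: bool = False     # confidential data exposed, or a regulatory definition met


def classify(obs: Observation) -> Level:
    """Return the furthest rung the observation reaches, one rung at a time."""
    if not obs.security_relevant:
        return Level.EVENT
    if not obs.confirmed:
        # A suspected disclosure stays here until an analyst confirms it.
        return Level.SECURITY_EVENT
    return Level.BREACH if obs.data_disclosed else Level.INCIDENT


def response_activated(level: Level) -> bool:
    """The response process starts at the incident threshold."""
    return level >= Level.INCIDENT


def breach_obligations_apply(level: Level) -> bool:
    """Notification and legal obligations stack on top only for a breach."""
    return level == Level.BREACH
```

Using an ordered enum keeps the two thresholds explicit: everything at or above `INCIDENT` activates the response, and only `BREACH` adds the notification obligations.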
For mid-market manufacturers, the distinction matters operationally. An IT alert at 2 a.m. that is later confirmed benign is not an incident. An IT alert at 2 a.m. that turns out to be a real attack is. The vocabulary protects the team from declaring an incident every time something looks suspicious and from waiting too long to declare one when something actually is.
Why mid-market manufacturers without an IR retainer pay 3-5x more during an incident
The cost differential is structural, not anecdotal. Five components drive it.
- First-hour delay. Without a pre-arranged retainer, the first hour after detection is spent finding an IR provider, getting them under contract, and getting them oriented. With a retainer, the same hour is spent containing the incident. The first hour is when an attacker is most likely to be stopped before spreading.
- Premium pricing for unplanned engagement. IR providers price unplanned engagement at higher rates than retainer work. The premium can be 50% to 100% above retainer rates.
- Slower escalation to expert resources. Retainer clients reach senior responders faster. Walk-in clients work through whoever is available.
- No prior knowledge of the environment. A retainer engagement includes pre-incident familiarization (network diagrams, EDR coverage, key contacts, known systems). A walk-in engagement starts from zero, which adds days to the response.
- Insurance friction. Many cyber policies require pre-approved IR providers; using a non-panel provider creates coverage disputes. A retainer with a panel-approved provider sidesteps the issue.
The math for a mid-market manufacturer: an annual retainer at $15,000 to $30,000 versus an incident response that runs from $150,000 to $500,000 for the same event without preparation. The retainer is insurance against the magnitude of an unplanned response.
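To make the comparison concrete, the quoted ranges can be turned into a simple ratio. The sketch below uses only the dollar figures stated above; the function name is illustrative, and the calculation deliberately ignores what a retainer-supported response would itself cost, since no figure is given for that.

```python
# Figures quoted in the text, in US dollars (low, high).
RETAINER_ANNUAL = (15_000, 30_000)
UNPREPARED_INCIDENT = (150_000, 500_000)


def retainer_years(incident_cost: float, annual_fee: float) -> float:
    """How many years of retainer fees one incident cost would cover."""
    return incident_cost / annual_fee


# Smallest gap: cheapest incident against the most expensive retainer.
smallest_gap = retainer_years(UNPREPARED_INCIDENT[0], RETAINER_ANNUAL[1])
# Largest gap: most expensive incident against the cheapest retainer.
largest_gap = retainer_years(UNPREPARED_INCIDENT[1], RETAINER_ANNUAL[0])

print(f"{smallest_gap:.1f} to {largest_gap:.1f} years of retainer fees")
# prints "5.0 to 33.3 years of retainer fees"
```

On these numbers, a single unprepared response costs as much as roughly 5 to 33 years of retainer fees.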
Examples of IR engagements in manufacturing
What ARG and the broader industry see at mid-market manufacturers:
- Ransomware reaching the historian on Friday night. EDR alerts on encryption activity. IR begins within an hour because the retainer is in place. Containment isolates affected systems by Saturday morning; full restoration completes by Tuesday. Production loss: 1.5 days. Without a retainer, an equivalent incident typically produces a 5-7 day production loss.
- BEC wire fraud confirmed Monday morning. Wire went out Friday afternoon; vendor noticed non-payment Monday. IR partner engages bank, FBI IC3, and insurance carrier within four hours. Funds recovered partially through the Financial Fraud Kill Chain. See "What is business email compromise (BEC)?"
- OT vendor remote-access compromise. Vendor's infrastructure breached; attacker has reached customer manufacturers through vendor tools. Coordinated response across affected customers; vendor's IR team and customer IR teams share IOCs. Recovery includes vendor disconnection, vendor-side cleanup verification, and re-establishment with stronger controls.
- Insider misconduct discovered through audit log review. Departed engineer accessed file shares with CAD repositories during the notice period. IR partner conducts forensic review; HR and legal coordinate. No active attacker to contain; eradication is account closure and access revocation; the response focuses on scope of access and notification obligations.
- Deepfake CEO fraud caught at wire-approval step. Finance receives a voice-cloned call requesting a wire; the callback verification habit catches it. Not a successful attack; documented as a near-miss and exercised in the next tabletop.
- Engineering workstation compromise via spear phishing. Worker clicks lure; credential-harvest page captures M365 credentials; OAuth grant follows. Detection on identity-anomaly alert; containment includes revoking the OAuth grant, rotating credentials, and reviewing mailbox rules for attacker-created changes. See "What is consent phishing (OAuth phishing)?"
The pattern: most incidents at mid-market manufacturers are not zero-days or nation-state operations. They are commodity attack chains that succeed because preparation was thin and detection was slow.
How to choose an IR retainer that actually answers the phone
For a mid-market manufacturer selecting an IR retainer:
- Response-time commitments in writing. Initial contact, named responder engagement, on-site or remote-active engagement. The numbers should be in the retainer agreement, not on the marketing site.
- Pre-engagement familiarization. A good retainer includes time before the incident: network diagrams reviewed, EDR coverage documented, key contacts confirmed, escalation path tested. Without familiarization, the retainer is a phone number.
- OT awareness for manufacturers. The IR partner understands OT environments, knows the appropriate caution around production systems, has experience with the vendor stacks the client runs. Generic IT IR partners can do parts of the job; OT-aware partners do all of it. See "What is operational technology (OT) security?"
- Insurance panel approval. Most cyber insurance carriers maintain a list of approved IR providers. The retainer's panel-approval status matters; using a non-panel provider creates coverage disputes.
- Real coordination with the client team. The IR partner does not own the incident; the client does. A retainer that produces an outside team imposing decisions on the client is worse than a retainer that produces an outside team supporting the client's decisions.
- Documented evidence handling. Chain of custody, forensic preservation, evidence reports that hold up in regulatory or legal review. The retainer's evidence-handling procedures should be documented.
- Annual retainer testing. A tabletop with the retainer provider every twelve months confirms the relationship works. The first time the retainer responds should not be the first time the team meets the provider.
- Pricing structure that aligns incentives. Some retainer structures incentivize the provider to keep the engagement running; some incentivize fast resolution. Discuss alignment explicitly during contracting.
Best practices for the first hour of an incident
The first hour determines the next thirty days. Eight things that matter most:
- Confirm the event is an incident. Not every alert is an attack. The first task is analyst judgment, not action. Premature declaration is operationally costly; delayed declaration is more costly.
- Notify the IR partner. Within the first thirty minutes. Even if the situation is ambiguous, get the partner aware.
- Notify insurance. Many policies require notification within specific windows. The first hour is not too early; the third day is sometimes too late.
- Preserve evidence. Do not delete malicious files, do not wipe affected systems, do not change configurations to "fix" the problem. The evidence is essential for both containment effectiveness and post-incident review.
- Activate the IRP. Pull out the incident response plan, follow it, document deviations.
- Establish a single source of truth. A shared document, chat channel, or war-room board where the timeline, decisions, and ownership are visible. Loss of central record is a frequent failure mode.
- Limit notification scope initially. Only people who need to know in the first hour are notified. Broader notification follows in later hours when the situation is clearer.
- Start the timeline. Document timestamps, actions, decisions from the first minute. The timeline is essential for insurance, regulatory, and learning purposes.
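The timeline and single-source-of-truth items above can be sketched as a small append-only log. This is a minimal sketch under stated assumptions: the class names, the three entry kinds, and the rendered line format are all hypothetical, and a shared document or chat channel serves the same purpose. Timestamps are kept in UTC so entries from different people and tools sort consistently.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical entry kinds; adapt to whatever the IRP actually uses.
KINDS = ("observation", "action", "decision")


@dataclass(frozen=True)
class TimelineEntry:
    at: datetime   # timezone-aware timestamp
    actor: str     # person or system the entry is attributed to
    kind: str      # one of KINDS
    note: str      # what was seen, done, or decided


@dataclass
class IncidentTimeline:
    entries: list = field(default_factory=list)

    def record(self, actor: str, kind: str, note: str,
               at: Optional[datetime] = None) -> TimelineEntry:
        """Append an entry; defaults to the current UTC time."""
        if kind not in KINDS:
            raise ValueError(f"unknown kind: {kind!r}")
        when = at if at is not None else datetime.now(timezone.utc)
        if when.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
        entry = TimelineEntry(when, actor, kind, note)
        self.entries.append(entry)
        return entry

    def render(self) -> str:
        """Chronological view, one line per entry, for insurer or regulator packages."""
        ordered = sorted(self.entries, key=lambda e: e.at)
        return "\n".join(
            f"{e.at.isoformat(timespec='seconds')} [{e.kind}] {e.actor}: {e.note}"
            for e in ordered
        )
```

Entries are immutable once recorded, so a correction is added as a new entry instead of overwriting an old one, which keeps the record defensible in later review.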
Incident response FAQs
Should I have an IR retainer if I have cyber insurance?
Yes, usually. Cyber insurance covers loss; an IR retainer is what gets you operational again. Many policies include a panel of approved IR providers, but the relationship is established at incident time, not before. A pre-arranged retainer with a known team produces faster response, and most policies pay for the retainer-provided IR work even when other panel options exist. See "What is cyber insurance underwriting?"
What does an IR retainer cost?
Retainer structures vary. Some providers charge a monthly fee against pre-purchased hours; some charge an annual minimum that converts to incident hours when needed; some offer zero-cost retainers with higher hourly rates during an incident. For a mid-market manufacturer, annual retainer fees typically run mid-four to low-five figures with attached hourly rates during active incidents.
How fast should an IR partner respond?
Initial contact within one hour, named responder engaged within four hours, on-site or remote-active response within twenty-four hours. SLAs vary by provider; the ones that matter are the response-time commitments in the retainer agreement, not the marketing-page response time.
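As an illustration, the three targets above can be checked against what actually happened in a tabletop or a real incident. The sketch below is hypothetical throughout: the milestone names and the `sla_misses` helper are invented for the example, and the thresholds should be replaced with the numbers written into the actual retainer agreement.

```python
from datetime import datetime, timedelta

# Targets taken from the answer above; substitute the contractual values.
SLA = {
    "initial_contact": timedelta(hours=1),
    "named_responder": timedelta(hours=4),
    "active_response": timedelta(hours=24),
}


def sla_misses(reported_at: datetime, milestones: dict) -> list:
    """Return the names of milestones that were late or never reached.

    `milestones` maps a milestone name to the datetime it was reached.
    """
    misses = []
    for name, limit in SLA.items():
        reached = milestones.get(name)
        if reached is None or reached - reported_at > limit:
            misses.append(name)
    return misses
```

For example, with a report at 22:00, first contact at 22:40, and a named responder at 03:30 the next morning, `sla_misses` returns `["named_responder", "active_response"]`: the responder was 5.5 hours out, and active response has not been logged yet.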
Do I need an internal IR team if I have a retainer?
You need internal incident commanders, not an internal IR team. Even with a retainer, the named decision-makers in your organization (executive sponsor, IT lead, legal, operations) own the response. The retainer provides technical capability and external perspective; the internal team owns decisions and operational continuity.
How ARG findings feed directly into IR readiness
ARG does not run incident response engagements during active incidents; the work is done by specialized DFIR firms. ARG's engagement model produces the readiness that determines how a future incident plays out.
The readiness work is led by James Wall on the digital side and David Ashby on the physical and procedural side. The continuous engagement produces, as byproducts:
- A tested IRP. The incident response plan is exercised quarterly through tabletops. Gaps surface; corrective actions land; the plan evolves with the environment.
- Current contact lists. People change roles, leave the organization, change phone numbers. The continuous engagement keeps the contact list current.
- Validated detection coverage. Continuous adversarial simulation exercises the detection layer constantly. Drift surfaces in days, not after the next breach.
- Decision-authority documentation. Who decides what, and when. Decisions are pre-authorized within documented thresholds.
- Insurance-aligned evidence. Documentation supports renewal underwriting and a future claim. The carrier knows what posture the manufacturer maintains because the evidence has been visible all year, not just at renewal.
- Coordination with an external IR retainer. ARG works with the client's IR retainer (or recommends one) and participates in joint tabletops. The retainer arrives at an incident already familiar with the environment because of the prior coordination.
For founding clients, IR readiness is part of the standard engagement, not an add-on. The output of every other ARG service feeds into the IR capability. When an actual incident occurs, the manufacturer is not starting from preparation work; they are activating preparation work that has been live for months.
Apply as a founding client or see how the engagement works for the full delivery cycle.