WarRoom

P1 SLA: 15 minutes to Incident Commander

War Room: Create a dedicated Slack/Teams channel: #inc-ransomware-YYYYMMDD

Notify:

CISO
VP Engineering
Legal
CEO (if data exfil confirmed)

External Contacts:

Cyber insurance carrier hotline
External IR retainer (if contracted)
Law enforcement (FBI IC3 / local CERT)

MTTD (Mean Time to Detect): < 30 minutes (from first encryption to confirmed alert)

MTTC (Mean Time to Contain): < 2 hours (from alert to full network isolation of affected hosts)

MTTR (Mean Time to Recover): < 48 hours (to restore critical business operations)

Industry Benchmark: Median MTTC for ransomware: 4.5 hours (IBM X-Force 2025)

Confirm ransomware indicators. Check for: mass file renames (.encrypted, .locked, .crypt), ransom notes (README.txt, DECRYPT_FILES.html), and recovery-inhibiting commands (vssadmin, bcdedit, wbadmin) used to delete shadow copies and disable recovery

Identify patient zero and blast radius:

CrowdStrike: Go to Investigate > Host Search > sort by 'First Seen' encryption IOC

Sentinel KQL: `DeviceFileEvents | where FileName endswith '.encrypted' | summarize count() by DeviceName | sort by count_ desc`

Splunk: `index=edr sourcetype=file_events (file_extension=encrypted OR file_extension=locked) | stats count by host | sort -count`
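The blast-radius queries above can also be reproduced offline against an exported event list. The following is a minimal sketch, assuming a hypothetical export of `(host, file_name)` pairs; the extension list comes from the indicators in this runbook and should be extended per variant.

```python
from collections import Counter

# Extensions taken from the indicators listed in this runbook; extend per variant.
RANSOM_EXTENSIONS = (".encrypted", ".locked", ".crypt")


def rank_hosts_by_encrypted_files(events):
    """Count file events with a ransomware extension per host, highest first.

    `events` is an iterable of (host, file_name) pairs (a hypothetical export format).
    """
    counts = Counter(
        host
        for host, file_name in events
        if file_name.lower().endswith(RANSOM_EXTENSIONS)
    )
    return counts.most_common()


sample = [
    ("FS01", "budget.xlsx.encrypted"),
    ("FS01", "plan.docx.locked"),
    ("WS17", "notes.txt.encrypted"),
    ("WS17", "notes.txt"),
]
print(rank_hosts_by_encrypted_files(sample))
```

The host with the most hits indicates where the damage is concentrated. Patient zero is the host with the earliest hit, so that question needs event timestamps as well.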

Determine the variant — upload ransom note or encrypted sample to id-ransomware.malwarehunterteam.com

Check if decryptor exists: nomoreransom.org/en/decryption-tools

Decision Tree

IF: Encryption is actively spreading

THEN: SKIP to Contain → Emergency Isolation immediately. Do NOT wait for full scoping.

IF: Encryption stopped / single host only

THEN: Proceed with methodical scoping before containment.

IF: Threat actor is in chat / making demands

THEN: Do NOT engage. Notify Legal + Insurance immediately. They will handle negotiation if needed.

Network-isolate all confirmed infected hosts:

CrowdStrike: Host > Actions > Contain Host (or API: `POST /devices/entities/devices-actions/v2?action_name=contain`)

Defender for Endpoint: Device page > 'Isolate device' (allows Defender comms only)

Manual: Disable switch port — `interface gi1/0/X` → `shutdown` / or disable Wi-Fi via MDM

Disable compromised accounts in AD:

PowerShell: `Get-ADUser -Filter {Name -like '*compromised_user*'} | Disable-ADAccount`

Azure AD: `Set-AzureADUser -ObjectId user@domain.com -AccountEnabled $false`

Block C2 at perimeter:

Palo Alto: `set security profiles anti-spyware block-ip` / Objects > External Dynamic Lists > add C2 IPs

Sentinel: Create TI indicator — Threat Intelligence > Add indicator > type: IP / domain

Suspend automated backups to prevent encryption of backup storage

Preserve evidence — do NOT reboot or wipe infected machines. Memory forensics first.

Decision Tree

IF: Domain admin account is compromised

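THEN: Initiate emergency KRBTGT reset. Reset the `krbtgt` password twice, 12h apart, via AD Users & Computers. Wait at least 10h between resets (the default maximum Kerberos ticket lifetime) and confirm the first reset has replicated to all DCs before the second.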

IF: Backup infrastructure is accessible from compromised network

THEN: Air-gap backups immediately — disconnect backup server NIC or disable backup agent service

Identify initial access vector — common entry points to check:

1. Email: Search email gateway for attachments/links delivered in last 7 days to patient zero

2. RDP: `DeviceLogonEvents | where LogonType == 'RemoteInteractive' | where AccountName == '<compromised>' | sort by Timestamp asc`

3. VPN: Check VPN logs for unusual geo/time patterns pre-incident

4. Vulnerable edge device: Check Shodan/Censys for exposed services on your IP range

Remove persistence mechanisms on every host in attack path:

Scheduled tasks: `schtasks /query /fo LIST /v | findstr /i "<suspicious>"` → `schtasks /delete /tn <name> /f`

Services: `Get-Service | Where-Object {$_.StartType -eq 'Automatic' -and $_.Status -eq 'Running'} | fl Name,DisplayName,BinaryPathName`

Registry Run keys: `reg query HKLM\Software\Microsoft\Windows\CurrentVersion\Run`

WMI subscriptions: `Get-WMIObject -Namespace root\subscription -Class __EventFilter`

Scan all hosts for dormant payloads using variant-specific IOCs

Reset ALL potentially compromised credentials (prioritize: service accounts > admin accounts > user accounts)

Patch the vulnerability used for initial access before bringing systems back online

Verify backup integrity BEFORE restoring:

Mount backup in isolated environment → check for ransom notes / encrypted files → scan with AV

Restore from clean backups or rebuild from golden images

Re-enable network connectivity in stages — start with monitoring/security infrastructure first:

Stage 1: SIEM, EDR, network monitoring

Stage 2: Domain controllers, DNS, DHCP

Stage 3: Critical business applications

Stage 4: User workstations (batches of 10-20, monitor between batches)
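Stage 4 can be planned ahead of time by splitting the workstation inventory into batches. A minimal sketch, assuming a plain list of host names (the `WS###` names are placeholders) and the 10-20 batch size stated above:

```python
def workstation_batches(hosts, batch_size=20):
    """Yield consecutive batches of hosts for staged reconnection."""
    if not 10 <= batch_size <= 20:
        raise ValueError("batch_size should stay within the 10-20 range")
    for start in range(0, len(hosts), batch_size):
        yield hosts[start:start + batch_size]


hosts = [f"WS{i:03d}" for i in range(45)]  # placeholder inventory
for number, batch in enumerate(workstation_batches(hosts), start=1):
    # Reconnect this batch, then watch EDR/SIEM before starting the next one.
    print(f"Batch {number}: {len(batch)} hosts ({batch[0]} .. {batch[-1]})")
```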

Full credential rotation for affected scope — force password change at next logon for all users in affected OUs

Resume backup operations — verify backup jobs complete successfully for 48h

Decision Tree

IF: No clean backup available

THEN: Rebuild from scratch using golden images. DO NOT pay ransom without Legal + Insurance + Law Enforcement guidance.

IF: Recovery time exceeds business tolerance

THEN: Activate business continuity plan — switch to manual/paper processes for critical operations.

Document full timeline: initial access → lateral movement → encryption → detection → containment → recovery

Conduct lessons-learned meeting within 48h — ALL responders must attend

Update detection rules in SIEM based on IOCs and TTPs discovered

Report to authorities:

US: FBI IC3 (ic3.gov), CISA (cisa.gov/report)

EU: National CERT + data protection authority within 72h (GDPR Art. 33)

Publicly traded: SEC Form 8-K (Item 1.05) within 4 business days of determining the incident is material

File cyber insurance claim with full incident timeline

Memory dump of patient zero (before reboot) — use WinPMEM or Magnet RAM Capture
Disk image of patient zero — use FTK Imager or dd
Network capture during active encryption (if available) — from span port / tap
EDR telemetry export for all affected hosts (full timeline)
Email gateway logs for initial access email
Firewall/proxy logs covering C2 communication period
Active Directory logs (Security event log, replication logs)
Ransom note screenshot + cryptocurrency wallet address
Sample of encrypted file + original (for potential decryption)

Internal Notification

SUBJECT: [CRITICAL] Active Security Incident — Ransomware

Team — We are responding to an active ransomware incident affecting [X] systems.

Current status: [Detecting / Containing / Eradicating / Recovering]
War room: #inc-ransomware-YYYYMMDD
Incident Commander: [Name]

DO:
- Report any unusual file activity to the war room immediately
- Continue using approved devices only

DO NOT:
- Attempt to open encrypted files or ransom notes
- Connect personal devices to the corporate network
- Discuss the incident on social media or with external parties

Next update: [time]

External / Customer Notification

SUBJECT: Security Incident Notification

Dear [Customer/Partner],

We are writing to inform you that [Company] experienced a security incident on [date]. 
We detected the incident promptly and our security team contained it within [X hours].

What happened: [Brief factual description — reviewed by Legal]
What data was affected: [Specific data types — reviewed by Legal]
What we're doing: [Remediation steps taken]
What you should do: [Specific guidance for the recipient]

We will provide updates as our investigation progresses. For questions, contact [security contact].

P2 SLA: 30 minutes to SOC Lead

Escalate to P1: If credentials were entered AND MFA was bypassed, escalate to the Account Compromise runbook

Notify:

SOC Manager
Email Admin

External Contacts:

Anti-Phishing Working Group (APWG) — reportphishing@apwg.org

MTTD (Mean Time to Detect): < 5 minutes (from user report to analyst triage)

MTTC (Mean Time to Contain): < 30 minutes (from triage to org-wide email purge)

MTTR (Mean Time to Recover): < 60 minutes (for credential-entry cases: password reset + session revoke)

Industry Benchmark: Target < 10% click rate on phishing simulations

Triage the phishing report — check source: user report, SEG alert, or automated detection

Analyze email headers (extract from .eml or message trace):

Check sender: SPF pass/fail, DKIM signature, DMARC alignment

Look for reply-to mismatch (sender ≠ reply-to is a strong phishing indicator)

Check X-Originating-IP against threat intel
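The header checks above can be scripted with the Python standard library. This is a rough sketch, assuming the receiving mail system stamped an `Authentication-Results` header in the usual `spf=... dkim=... dmarc=...` form; the sample message and addresses are invented for illustration.

```python
import re
from email import message_from_string
from email.utils import parseaddr


def header_findings(raw_message):
    """Extract the triage-relevant header facts from a raw .eml string."""
    msg = message_from_string(raw_message)
    from_addr = parseaddr(msg.get("From", ""))[1].lower()
    reply_to = parseaddr(msg.get("Reply-To", ""))[1].lower()
    auth_results = msg.get("Authentication-Results", "").lower()

    findings = {
        # A Reply-To that differs from the From address is a phishing indicator.
        "reply_to_mismatch": bool(reply_to) and reply_to != from_addr,
        # Value to look up in threat intel; None when the header is absent.
        "originating_ip": msg.get("X-Originating-IP", "").strip("[] ") or None,
    }
    for mechanism in ("spf", "dkim", "dmarc"):
        match = re.search(rf"\b{mechanism}=(\w+)", auth_results)
        findings[mechanism] = match.group(1) if match else None
    return findings


raw = (
    'From: "CEO" <ceo@example.com>\n'
    "Reply-To: attacker@example.net\n"
    "Authentication-Results: mx.example.com; spf=fail smtp.mailfrom=example.com;"
    " dkim=none; dmarc=fail\n"
    "X-Originating-IP: [203.0.113.7]\n"
    "Subject: Urgent\n"
    "\n"
    "Body\n"
)
print(header_findings(raw))
```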

Inspect URLs WITHOUT clicking:

urlscan.io: submit URL → check redirects, final domain, screenshots

VirusTotal: submit URL → check detection ratio

ANY.RUN: submit in sandbox → see full execution chain

Inspect attachments in sandbox:

Hybrid Analysis (hybrid-analysis.com) or Joe Sandbox for file detonation

Check file hash on VirusTotal before opening anything
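Looking up a hash requires computing it without opening the attachment in its native application. A minimal sketch using the standard library; the function name and the example path are hypothetical:

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks as raw bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example (path is a placeholder):
# print(sha256_of_file("/evidence/invoice.docm"))
```

The resulting digest can be pasted into the VirusTotal search box; searching by hash does not upload the file itself.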

Decision Tree

IF: Confirmed phishing — no user clicked

THEN: Proceed to Contain (purge email). Skip Eradicate.

IF: User clicked link but did NOT enter credentials

THEN: Contain + check endpoint for drive-by download. Run EDR scan on user device.

IF: User entered credentials

THEN: URGENT: Contain + Eradicate immediately. Reset password + revoke sessions within 15 min.

IF: User entered credentials AND MFA token was intercepted (AiTM/EvilProxy)

THEN: ESCALATE to P1. Treat as full account compromise. Switch to Account Compromise runbook.
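The decision tree above can be encoded as a small function so that ticketing automation and analysts reach the same outcome. This is only a sketch; the function name and the returned strings are illustrative, and it assumes the email is already confirmed as phishing.

```python
def phishing_triage(clicked, entered_credentials, mfa_intercepted):
    """Map the three triage facts to (priority, next steps) per the decision tree."""
    if entered_credentials and mfa_intercepted:
        return ("P1", "Escalate: switch to Account Compromise runbook")
    if entered_credentials:
        return ("P2", "Contain + Eradicate: reset password and revoke sessions within 15 min")
    if clicked:
        return ("P2", "Contain + run EDR scan for drive-by download")
    return ("P2", "Contain: purge email, skip Eradicate")


print(phishing_triage(clicked=True, entered_credentials=True, mfa_intercepted=False))
```

The most severe branch is checked first, so a case that matches several conditions always lands on the strictest response.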

Find all recipients of the same email:

Exchange Online: `Get-MessageTrace -SenderAddress '<phishing_sender>' -StartDate '<date>' -EndDate '<date>'`

O365 Threat Explorer: Email & collaboration > Explorer > Sender = '<address>' → Select all → Purge

Google Workspace: Admin Console > Investigation Tool > Gmail log events > sender = '<address>'

Purge/quarantine from all mailboxes:

O365: Threat Explorer > Select messages > Move to: Soft Delete / Hard Delete

Exchange: `New-ComplianceSearch -Name 'PhishPurge' -ExchangeLocation All -ContentMatchQuery 'subject:<subject> AND from:<sender>'`

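Then start the search and wait for it to complete: `Start-ComplianceSearch -Identity 'PhishPurge'`, followed by `New-ComplianceSearchAction -SearchName 'PhishPurge' -Purge -PurgeType SoftDelete`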

Google: Admin > Investigation > Select emails > Delete

Block sender domain and malicious URLs:

O365: Email & collaboration > Policies > Anti-phishing > Block sender/domain

SEG (Proofpoint/Mimecast): Add sender/domain to block list

Web proxy: Block phishing URL/domain at proxy level

If user clicked — isolate endpoint for scanning:

CrowdStrike: Contain Host

Defender: Isolate Device

If credentials were submitted:

Azure AD: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>` + `Set-AzureADUser -ObjectId <user> -PasswordProfile @{Password='Temp@' + (Get-Random); ForceChangePasswordNextLogin=$true}`

Google: Admin > Users > Reset password + check 'Require password change'

Check and remove OAuth app consent grants: `Get-AzureADAuditSignInLogs | where {$_.AppDisplayName -eq '<suspicious app>'}`

If attachment was opened — check for malware execution:

Run full EDR scan on affected endpoint

Check process tree in EDR for child processes of Office/PDF apps

Sentinel: `DeviceProcessEvents | where InitiatingProcessFileName in~ ('WINWORD.EXE','EXCEL.EXE','POWERPNT.EXE','AcroRd32.exe') | where Timestamp > ago(2h) | where DeviceName == '<host>'`

Add IOCs to blocklists — sender domain, phishing URL, file hash, C2 domain

Confirm all instances of the phishing email are purged (re-run search to verify zero results)

Verify affected users have reset credentials and MFA is active

If malware was found — rebuild endpoint from clean image

Update email filtering rules with new indicators

Send org-wide awareness notification about this specific campaign

Log incident with IOCs: sender address, reply-to, subject line, URLs, file hashes, C2 domains

Share indicators with ISACs (FS-ISAC, H-ISAC, etc.) if applicable

Submit to APWG: reportphishing@apwg.org

Schedule targeted phishing simulation for users who clicked (within 2 weeks)

Review email gateway rules — could this have been caught automatically?

Original email (.eml file with full headers)
Screenshot of phishing page (if URL was active)
List of all recipients and who clicked
Sandbox analysis report (urlscan.io / ANY.RUN output)
EDR logs from affected endpoints
Email gateway logs showing delivery

P1 SLA: 15 minutes. BEC with an active wire transfer is ALWAYS P1

War Room: #inc-bec-YYYYMMDD

Notify:

CFO
CISO
Legal
Bank fraud department

External Contacts:

FBI IC3 (ic3.gov) — file within 48h for wire recall
Bank fraud department — call within 30 min of discovery

MTTD (Mean Time to Detect): < 1 hour (from first fraudulent email to detection)

MTTC (Mean Time to Contain): < 30 minutes (from detection to account disable + bank notification)

MTTR (Mean Time to Recover): Wire recall success rate drops 50% after 24h, so speed is everything

Industry Benchmark: FBI IC3 Recovery Asset Team recovered $538M in BEC losses in 2024, but only when reported within 48h

Identify BEC indicators:

- Wire transfer / payment request from executive with unusual urgency

- Reply-to address differs from sender display name

- Mailbox rules created to hide replies (auto-forward, auto-delete)

- Login from unusual location / impossible travel
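The "impossible travel" indicator comes down to a speed check between two sign-ins. The sketch below uses the haversine great-circle distance; the 900 km/h ceiling is an assumption (roughly airliner cruise speed), and the `(timestamp, latitude, longitude)` tuple format is hypothetical.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KMH = 900.0  # assumption: roughly airliner cruise speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(first, second):
    """Each argument is (timestamp, latitude, longitude) for one sign-in."""
    hours = abs((second[0] - first[0]).total_seconds()) / 3600
    distance = haversine_km(first[1], first[2], second[1], second[2])
    if hours == 0:
        return distance > 0
    return distance / hours > MAX_PLAUSIBLE_SPEED_KMH


london = (datetime(2025, 3, 3, 9, 0), 51.5074, -0.1278)
new_york = (datetime(2025, 3, 3, 10, 0), 40.7128, -74.0060)
print(is_impossible_travel(london, new_york))
```

IP geolocation is imprecise and VPN egress points move users around, so a hit is a reason to look at the sign-in, not proof of compromise.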

Verify with the alleged sender via OUT-OF-BAND communication:

Call them directly (phone number from contacts, NOT from the email)

Walk to their desk if same office

DO NOT reply to the email or use chat — attacker may control those

Check Azure AD / O365 sign-in logs for compromised account:

Sentinel: `SigninLogs | where UserPrincipalName == '<user>' | where TimeGenerated > ago(30d) | summarize LastSeen=max(TimeGenerated) by IPAddress, Location, AppDisplayName | sort by LastSeen desc`

Azure Portal: Azure AD > Users > <user> > Sign-in logs > Look for: unusual IP, new device, risky sign-in

Check for malicious inbox rules:

PowerShell: `Get-InboxRule -Mailbox <user> | where {$_.ForwardTo -or $_.ForwardAsAttachmentTo -or $_.DeleteMessage -eq $true} | fl Name,ForwardTo,DeleteMessage,MoveToFolder`

O365 Admin: Exchange Admin > Mailboxes > <user> > Manage email apps > Check mail flow rules

Decision Tree

IF: Wire transfer has been initiated but NOT yet completed

THEN: IMMEDIATE: Call bank fraud department to halt transfer. Time is critical — every minute counts.

IF: Wire transfer completed

THEN: File FBI IC3 complaint immediately (ic3.gov). Contact bank for recall attempt. Success rate drops rapidly after 24h.

IF: Email compromise confirmed but no financial impact yet

THEN: Disable account, proceed with standard containment.

Disable the compromised account immediately:

Azure AD: `Set-AzureADUser -ObjectId <user> -AccountEnabled $false`

Revoke ALL sessions: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`

Google: Admin > Users > Suspend user

Contact bank/finance to halt pending wire transfers — use phone, not email

Search for ALL emails sent from the compromised account during attacker's access window:

`Get-MessageTrace -SenderAddress '<compromised>' -StartDate '<attacker_first_access>' -EndDate (Get-Date)`

Remove malicious inbox rules:

`Get-InboxRule -Mailbox <user> | where {$_.ForwardTo -or $_.ForwardAsAttachmentTo} | Remove-InboxRule -Confirm:$false`

Remove delegates and connected apps:

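`Get-MailboxPermission -Identity <user> | where {$_.IsInherited -eq $false -and $_.User -notlike 'NT AUTHORITY\SELF'} | foreach {Remove-MailboxPermission -Identity $_.Identity -User $_.User -AccessRights $_.AccessRights -Confirm:$false}`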

Azure AD > Enterprise apps > User consent > Revoke suspicious app permissions

Determine initial compromise method:

Password spray: Check Azure AD > Risky sign-ins for multiple failed attempts before success

Phishing/AiTM: Check email for credential harvesting link clicked before compromise date

Token theft: Check for suspicious OAuth app grants in Azure AD > Enterprise applications

Reset password with strong random + enforce MFA re-enrollment:

`Set-AzureADUserPassword -ObjectId <user> -Password (ConvertTo-SecureString '<random>' -AsPlainText -Force) -ForceChangePasswordNextLogin $true`

Require MFA re-registration: Azure AD > Users > Authentication methods > Require re-register

Audit ALL inbox rules, delegates, mail forwarding, and connected applications

Check for data exfiltration: mail forwarding to external addresses, large attachment sends

Re-enable account with new credentials + enforced MFA

Notify ALL recipients of fraudulent emails sent during compromise

Provide Finance with list of any payment instructions sent from compromised account

Work with Legal on any completed fraudulent transactions

File FBI IC3 report (ic3.gov) — include: wire details, bank info, email headers, timeline

Implement conditional access policies:

Location-based: Block sign-ins from non-business countries

Device-based: Require compliant/managed device

Risk-based: Block high-risk sign-ins automatically

Enable Azure AD Identity Protection risky sign-in policies

Deploy anti-phishing banner on external emails ('This email was sent from outside your organization')

Implement dual-approval process for wire transfers above $X threshold

Conduct executive-targeted security awareness training within 2 weeks

Fraudulent email(s) with full headers (.eml)
Azure AD sign-in logs for compromised account (full compromise window)
Inbox rules created by attacker (export before deletion)
List of emails sent by attacker from compromised account
Wire transfer details (amount, destination bank, account, SWIFT/BIC)
Bank correspondence regarding recall attempt
FBI IC3 complaint number

Internal Notification

SUBJECT: [URGENT] Business Email Compromise — Wire Fraud Alert

Finance/Accounting teams:

We have confirmed that [executive name]'s email account was compromised.
Any payment or wire transfer instructions received from this account 
between [date] and [date] should be treated as FRAUDULENT.

Action required:
- HALT all pending payments requested by this account
- Verify any completed transfers from this period with the requestor by PHONE
- Report any suspicious payment requests to [security contact]

DO NOT respond to any existing email threads with this account.

P1 SLA: Immediate. Active lateral movement is always P1

War Room: #inc-lateral-YYYYMMDD

Notify:

CISO
Network Team Lead
AD Admin

External Contacts:

External IR retainer (if attacker has domain admin)

MTTD (Mean Time to Detect): < 15 minutes (from first lateral movement event to analyst alert)

MTTC (Mean Time to Contain): < 1 hour (from alert to full containment of attack path)

MTTR (Mean Time to Recover): < 24 hours (rebuild compromised hosts + credential rotation)

Industry Benchmark: Average attacker breakout time (initial access to lateral movement): 62 minutes (CrowdStrike 2024 Global Threat Report)

Identify lateral movement technique in use:

RDP: `DeviceLogonEvents | where LogonType == 'RemoteInteractive' | where Timestamp > ago(1h) | where AccountName !in ('<known_admins>') | summarize by DeviceName, AccountName, RemoteIP`

PsExec/SMB: `DeviceProcessEvents | where FileName =~ 'PSEXESVC.exe' or ProcessCommandLine has 'psexec' | where Timestamp > ago(1h)`

WMI: `DeviceProcessEvents | where ProcessCommandLine has 'wmic' and ProcessCommandLine has '/node:' | where Timestamp > ago(1h)`

WinRM: `DeviceProcessEvents | where FileName == 'wsmprovhost.exe' | where Timestamp > ago(1h)`

Map the full attack path — which hosts and in what order:

Build timeline: `DeviceLogonEvents | where AccountName == '<compromised_account>' | where Timestamp > ago(24h) | project Timestamp, DeviceName, RemoteIP, LogonType | sort by Timestamp asc`
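Once the timeline query results are exported, the order in which hosts were first touched gives the attack path. A minimal sketch, assuming rows exported as dictionaries with the `Timestamp` and `DeviceName` columns from the query above and ISO 8601 timestamps without a timezone suffix; the host names are placeholders.

```python
from datetime import datetime


def attack_path(logon_events):
    """Return (DeviceName, first-seen Timestamp) pairs in order of first logon."""
    ordered = sorted(
        logon_events, key=lambda event: datetime.fromisoformat(event["Timestamp"])
    )
    seen = set()
    path = []
    for event in ordered:
        host = event["DeviceName"]
        if host not in seen:  # keep only the first logon per host
            seen.add(host)
            path.append((host, event["Timestamp"]))
    return path


events = [
    {"Timestamp": "2025-03-01T10:20:00", "DeviceName": "SRV-FILE01"},
    {"Timestamp": "2025-03-01T10:05:00", "DeviceName": "WS-042"},
    {"Timestamp": "2025-03-01T10:45:00", "DeviceName": "DC01"},
    {"Timestamp": "2025-03-01T10:30:00", "DeviceName": "WS-042"},
]
print(" -> ".join(host for host, _ in attack_path(events)))
```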

Identify the compromised credentials being used (check: service accounts, admin accounts, cached creds)

Determine attacker objective — what are they moving toward? (DC, database server, file server, backup)

Decision Tree

IF: Attacker has reached a domain controller

THEN: CRITICAL ESCALATION. Assume full domain compromise. Prepare for KRBTGT double-reset + full AD recovery.

IF: Attacker has domain admin credentials

THEN: They can access anything. Focus on containing exfiltration channels (block outbound). Prepare for full credential rotation.

IF: Movement is limited to user-level credentials

THEN: Contain affected hosts + disable compromised account. Attack path is limited.

Network-isolate ALL confirmed compromised hosts (every host in the attack path):

CrowdStrike: Bulk contain — Host Management > select hosts > Actions > Contain

Defender: Device page > Isolate device (for each host)

Network: ACL block at switch level — `ip access-list extended INCIDENT_BLOCK` → `deny ip host <ip> any` → apply to interface

Disable ALL compromised accounts:

`Get-ADUser -Filter {SamAccountName -like '<account>'} | Disable-ADAccount`

If service account: identify dependent systems first, but disable if attacker has it

If domain admin is compromised — KRBTGT reset:

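Reset 1: `Set-ADAccountPassword -Identity krbtgt -Reset -NewPassword (ConvertTo-SecureString '<random>' -AsPlainText -Force)` (or use AD Users & Computers)

Wait 12 hours: at least the 10-hour default maximum Kerberos ticket lifetime, plus time to confirm replication across all DCs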

Reset 2: Repeat

WARNING: This invalidates ALL Kerberos tickets — all users will need to re-authenticate
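One way to derive the wait between the two resets is from the Kerberos ticket lifetime (10 hours by default in Active Directory) plus a safety margin. The sketch below is illustrative only: the 2-hour margin is an assumption chosen to match the 12-hour figure above, and a domain with a customized ticket lifetime should pass its own value.

```python
from datetime import datetime, timedelta

# Active Directory default for "Maximum lifetime for user ticket".
DEFAULT_MAX_TICKET_LIFETIME = timedelta(hours=10)


def earliest_second_reset(first_reset,
                          ticket_lifetime=DEFAULT_MAX_TICKET_LIFETIME,
                          margin=timedelta(hours=2)):
    """Earliest time for the second KRBTGT reset after the first one."""
    return first_reset + ticket_lifetime + margin


first = datetime(2025, 1, 1, 8, 0)
print(earliest_second_reset(first))
```

The calculation covers timing only. Replication of the first reset to every DC still has to be confirmed before the second reset.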

Segment critical assets — emergency VLAN isolation for crown jewels:

Move critical servers to isolated VLAN via switch config

Block all traffic to critical VLAN except from known management IPs

Remove attacker tooling from every host in the attack path:

Common tools to hunt for: Cobalt Strike (beacons), Mimikatz, PsExec, BloodHound, SharpHound, Rubeus, Impacket

Check: `DeviceFileEvents | where FileName in~ ('mimikatz.exe','beacon.exe','psexec.exe','sharphound.exe','rubeus.exe') | where Timestamp > ago(7d)`

Check process injection: `DeviceEvents | where ActionType == 'CreateRemoteThreadApiCall' | where Timestamp > ago(7d)`

Hunt for persistence on EVERY host in the attack path:

Autoruns: Run Sysinternals Autoruns on each host — compare against known-good baseline

Scheduled tasks: `schtasks /query /fo CSV /v | ConvertFrom-Csv | where {$_.TaskName -notlike '\Microsoft*'} | fl TaskName,'Task To Run',Author`

Services: `Get-WmiObject win32_service | where {$_.PathName -notlike '*Windows*' -and $_.StartMode -eq 'Auto'} | fl Name,PathName`

Credential rotation — reset in this order:

1. KRBTGT (if domain admin was compromised — see Contain phase)

2. All admin accounts used in the attack path

3. All service accounts on compromised hosts

4. Machine accounts of compromised hosts (`Reset-ComputerMachinePassword`)

Patch the initial access vulnerability before bringing anything back online

Rebuild all compromised hosts from golden images (do NOT trust remediated live systems)

Restore network connectivity in monitored stages — security infra first, then DCs, then servers, then workstations

Implement network segmentation improvements based on the attack path that was used

Deploy detection for the specific lateral movement technique on previously unmonitored network paths

Map the full attack chain to MITRE ATT&CK — document every technique observed

Conduct purple team exercise: replay the exact attack path to verify detection and containment

Review and tighten firewall rules between network segments

Implement tiered admin model if not already in place (Tier 0: DC, Tier 1: Servers, Tier 2: Workstations)

Deploy LAPS (Local Administrator Password Solution) to prevent credential reuse across hosts

EDR telemetry from every host in the attack path (process trees, network connections, file events)
AD logon event logs — Security Event ID 4624, 4625, 4768, 4769, 4776
Network flow data between compromised hosts
Memory dumps from compromised hosts (for process injection analysis)
List of attacker tools found with file hashes
Attack path diagram with timestamps

P1 SLA: 30 minutes to Cloud Security Lead

Escalate to P1: If attacker created new IAM users/roles or modified security groups on production, treat as immediate P1

Notify:

Cloud Security Lead
DevOps TL
FinOps (for cost anomalies)

External Contacts:

AWS Support (Enterprise)
Azure Support (if tenant-level compromise)
GCP Support

MTTD (Mean Time to Detect): < 15 minutes (from anomalous API call to alert)

MTTC (Mean Time to Contain): < 30 minutes (from alert to credential rotation + deny policy)

MTTR (Mean Time to Recover): < 4 hours (full resource audit + unauthorized resource cleanup)

Industry Benchmark: Average cloud breach cost: $5.17M (IBM 2025). Crypto-mining costs can reach $10K/day.

Identify suspicious cloud activity:

AWS: `aws cloudtrail lookup-events --lookup-attributes AttributeKey=Username,AttributeValue=<compromised> --start-time <date> --max-results 50`

Azure: `AzureActivity | where Caller == '<compromised>' | where TimeGenerated > ago(24h) | summarize by OperationNameValue, ActivityStatusValue`

GCP: `gcloud logging read 'protoPayload.authenticationInfo.principalEmail="<compromised>"' --limit=50 --format=json`

Determine compromise type:

Access key leaked (GitHub, paste site, logs)

Console password compromised (phishing, credential stuffing)

Session token stolen (SSRF, metadata service exploit)

Scope the blast radius — what resources were accessed or modified:

AWS: `aws cloudtrail lookup-events --lookup-attributes AttributeKey=Username,AttributeValue=<user> --start-time <compromise_start> | jq '.Events[].EventName' | sort | uniq -c | sort -rn`

Look for: RunInstances, CreateUser, CreateAccessKey, PutBucketPolicy, CreateLoginProfile
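The same scoping can be done in Python on saved `lookup-events` output, which keeps a copy of the evidence and flags the high-risk calls listed above. A minimal sketch, assuming the JSON was saved from `aws cloudtrail lookup-events` (which returns a top-level `Events` list with an `EventName` per entry); the sample payload is invented.

```python
import json
from collections import Counter

# Event names called out in this runbook as signs of persistence or abuse.
HIGH_RISK_EVENTS = {
    "RunInstances", "CreateUser", "CreateAccessKey",
    "PutBucketPolicy", "CreateLoginProfile",
}


def summarize_events(lookup_output):
    """Return (event counts sorted by frequency, counts of high-risk events)."""
    events = json.loads(lookup_output)["Events"]
    counts = Counter(event["EventName"] for event in events)
    flagged = {name: n for name, n in counts.items() if name in HIGH_RISK_EVENTS}
    return counts.most_common(), flagged


sample_output = json.dumps({"Events": [
    {"EventName": "DescribeInstances"},
    {"EventName": "DescribeInstances"},
    {"EventName": "DescribeInstances"},
    {"EventName": "RunInstances"},
    {"EventName": "RunInstances"},
    {"EventName": "CreateAccessKey"},
]})
ranked, flagged = summarize_events(sample_output)
print(ranked)
print(flagged)
```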

Decision Tree

IF: Only access keys compromised (console access not affected)

THEN: Rotate keys immediately. Apply explicit deny. Scope is limited to API actions.

IF: Console access compromised (password or SSO)

THEN: Disable user + reset password + revoke sessions + check for MFA changes.

IF: Attacker created new IAM users or roles

THEN: CRITICAL: Attacker has backdoor access. Disable new identities immediately. Full IAM audit required.

IF: Crypto-mining instances detected

THEN: Terminate instances immediately. Check ALL regions — miners often spawn in regions you don't monitor.

Rotate or disable compromised access keys:

AWS: `aws iam update-access-key --user-name <user> --access-key-id <key> --status Inactive` + `aws iam create-access-key --user-name <user>`

Azure: Azure Portal > App registrations > Certificates & secrets > Delete old, create new

GCP: `gcloud iam service-accounts keys disable <key-id> --iam-account=<sa>@<project>.iam.gserviceaccount.com`

Revoke active sessions:

AWS: Attach inline policy to user: `{"Effect":"Deny","Action":"*","Resource":"*","Condition":{"DateLessThan":{"aws:TokenIssueTime":"<NOW>"}}}`

Azure: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`
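The AWS statement above is a fragment; IAM expects it inside a policy document with `Version` and `Statement`, and `<NOW>` must be an ISO 8601 UTC timestamp. The sketch below generates such a document. The function name is hypothetical, and attaching the result (for example with `aws iam put-user-policy`) is left to the operator.

```python
import json
from datetime import datetime, timezone


def revoke_sessions_policy(cutoff=None):
    """Build a deny-all policy for credentials issued before `cutoff` (UTC)."""
    cutoff = (cutoff or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "DateLessThan": {
                    "aws:TokenIssueTime": cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
                }
            },
        }],
    }


print(json.dumps(revoke_sessions_policy(), indent=2))
```

`aws:TokenIssueTime` is only present on requests made with temporary credentials, so this policy cuts off existing sessions but does not replace deactivating long-term access keys.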

Disable any NEW IAM users/roles created by attacker:

AWS: `aws iam list-users --query 'Users[?CreateDate>=`<date>`]'` → delete each

Check for new access keys on existing users: `aws iam list-access-keys --user-name <user>`

Block attacker IPs at security group / NSG / firewall level

Full audit of changes during compromise window:

IAM: New users, roles, policies, access keys, login profiles

Compute: New instances (check ALL regions), Lambda functions, containers

Storage: Bucket policy changes, public access, new buckets

Network: Security group changes, VPC peering, new VPN connections

Remove unauthorized resources:

AWS: `aws ec2 describe-instances --region <region> --filters 'Name=instance-state-name,Values=running' --query 'Reservations[].Instances[].{ID:InstanceId,Type:InstanceType,Launch:LaunchTime}'`

Check ALL regions: `for region in $(aws ec2 describe-regions --query 'Regions[].RegionName' --output text); do echo "==$region=="; aws ec2 describe-instances --region $region --query 'Reservations[].Instances[].InstanceId' --output text; done`

Revert security group and IAM policy changes from CloudTrail diff

Check for backdoor access: cross-account roles, SSO configurations, federation trust changes

Reset credentials and enforce MFA on all affected accounts

Restore modified resources from Infrastructure-as-Code (Terraform state, CloudFormation, etc.)

Verify billing dashboard — check for crypto-mining charges across ALL regions

AWS: Cost Explorer > filter by service = EC2 > group by region > last 7 days

Re-enable monitoring and alerting for affected scope

Set up billing alerts: SNS notification at $X threshold

Implement least-privilege IAM — review and tighten all policies

Enable threat detection services:

AWS: GuardDuty (all regions), CloudTrail (all regions + S3 data events)

Azure: Defender for Cloud (all subscriptions), Sentinel

GCP: Security Command Center, VPC Flow Logs

Implement SCPs (AWS) or Azure Policy to prevent risky actions (e.g., no public S3, no root access keys)

Set up billing anomaly detection — alert on spend > 150% of baseline
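The 150% rule is simple enough to prototype before wiring it into a billing tool. A minimal sketch, assuming the baseline is the mean of recent daily spend (how the baseline is defined is a choice this runbook leaves open):

```python
from statistics import mean


def spend_is_anomalous(today, history, factor=1.5):
    """True when today's spend exceeds `factor` times the mean of `history`."""
    baseline = mean(history)  # raises StatisticsError on an empty history
    return today > factor * baseline


recent_daily_spend = [100, 110, 90]  # illustrative figures in USD
print(spend_is_anomalous(151, recent_daily_spend))
```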

Scan all repos for leaked credentials: truffleHog, git-secrets, GitHub secret scanning

CloudTrail / Activity Log / Audit Log for full compromise window
IAM diff — what changed (users, roles, policies, keys)
Resource inventory diff — what was created/modified/deleted
VPC flow logs showing attacker IP connections
Billing data showing unauthorized resource usage
Source of credential leak (if access key — where was it exposed?)

Data Breach Response

critical

T1048 — Exfiltration Over Alternative Protocol

Full operational runbook for confirmed data exfiltration — technical containment, legal obligations, regulatory notification, and stakeholder communication.

⏱ Days to weeks
Roles: Incident Commander, SOC Analyst L3, Legal / DPO, PR / Communications, Executive Sponsor

Tags: data-breach, exfiltration, gdpr, compliance, notification

P1 SLA: Immediate. All confirmed data breaches are P1

War Room: #inc-breach-YYYYMMDD (restricted access, need-to-know only)

Notify:

CEO
CISO
General Counsel / DPO
Board (if material breach)

External Contacts:

External forensics firm (contract in advance)
Cyber insurance carrier (breach coach)
Outside legal counsel (privilege)
PR / crisis communications firm

MTTD (Mean Time to Detect): < 24 hours (from first exfiltration to detection)

MTTC (Mean Time to Contain): < 4 hours (from detection to exfiltration channel closed)

Notification SLA: GDPR: 72 hours to DPA. HIPAA: 60 days. SEC: 4 business days (material). State laws: varies (15-60 days)

Industry Benchmark: Average cost per breached record: $169 (IBM 2025). Average breach cost: $4.88M.

Confirm data exfiltration — identify what left and through what channel:

DLP alerts: Check DLP dashboard for policy violations (email, cloud storage, USB, web upload)

Network: `DeviceNetworkEvents | where RemoteUrl !in ('<known_domains>') | where ActionType == 'ConnectionSuccess' | summarize TotalBytes=sum(SentBytes) by DeviceName, RemoteUrl | where TotalBytes > 100000000 | sort by TotalBytes desc`

DNS: Look for DNS tunneling — `DnsEvents | where Name has_any ('<suspicious_domains>') | summarize count() by Name, Computer`
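The volume check in the network query can be reproduced against exported connection records from any source that logs bytes sent. A minimal sketch, assuming rows as dictionaries with `DeviceName`, `RemoteUrl` and `SentBytes` keys (a hypothetical export format) and the same 100 MB threshold:

```python
from collections import defaultdict

THRESHOLD_BYTES = 100_000_000  # same 100 MB cut-off as the query above


def large_outbound_transfers(events, known_domains, threshold=THRESHOLD_BYTES):
    """Sum bytes sent per (device, destination) and return totals above threshold."""
    totals = defaultdict(int)
    for event in events:
        if event["RemoteUrl"] in known_domains:
            continue  # skip destinations already known to be legitimate
        totals[(event["DeviceName"], event["RemoteUrl"])] += event["SentBytes"]
    hits = [(dev, url, total) for (dev, url), total in totals.items() if total > threshold]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


records = [
    {"DeviceName": "WS-042", "RemoteUrl": "files.example.net", "SentBytes": 80_000_000},
    {"DeviceName": "WS-042", "RemoteUrl": "files.example.net", "SentBytes": 70_000_000},
    {"DeviceName": "WS-042", "RemoteUrl": "backup.corp.example", "SentBytes": 900_000_000},
    {"DeviceName": "WS-007", "RemoteUrl": "cdn.example.org", "SentBytes": 5_000_000},
]
print(large_outbound_transfers(records, known_domains={"backup.corp.example"}))
```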

Determine volume and sensitivity classification of exfiltrated data:

PII (names, SSN, DOB) → regulatory notification required

PHI (medical records) → HIPAA breach notification required

Financial (credit cards, bank accounts) → PCI DSS notification required

Trade secrets / IP → legal action + competitive damage assessment

Identify affected data subjects — how many individuals are impacted?

Establish timeline: when did exfiltration start and when was it stopped?

Decision Tree

IF: Exfiltration is still active

THEN: IMMEDIATE CONTAINMENT — close the channel before any further scoping.

IF: PII/PHI of EU residents confirmed

THEN: 72-hour GDPR clock starts now. Engage DPO and Legal immediately.

IF: Publicly traded company + material impact

THEN: SEC 4-business-day clock. Engage outside counsel for privilege.

IF: Data posted on dark web / leak site

THEN: Engage crisis communications. Prepare public statement. Notify affected individuals immediately.

Close the exfiltration channel immediately:

Block destination IP/domain at firewall: `set security policy from trust to untrust match destination-address <ip> then deny`

Block at DNS: add to DNS sinkhole / RPZ

Block at web proxy: add URL to block list

If USB: disable USB mass storage via GPO: `Computer Configuration > Administrative Templates > System > Removable Storage Access > All Removable Storage classes: Deny all access`

Isolate affected systems — do NOT wipe (evidence preservation)

Preserve ALL evidence — establish chain of custody:

Move to forensic preservation immediately

NO system changes without IC approval

Engage legal counsel — all communications through counsel for attorney-client privilege

Activate incident communication team — all external communications through PR/legal only

Remove attacker access and all persistence mechanisms (follow Lateral Movement runbook if applicable)

Patch the vulnerability or close the gap that enabled exfiltration

Rotate ALL credentials that may have been exposed in the breach

If insider threat — coordinate with HR/Legal (see Insider Threat runbook)

Rebuild affected systems from clean images

Implement additional DLP controls on the exfiltration channel that was used

Enhanced monitoring on previously affected data stores

Deploy CASB (Cloud Access Security Broker) if cloud storage was the vector

Review and tighten data classification + access control policies

Regulatory notifications (with Legal guidance):

GDPR (EU): Notify supervisory authority within 72 hours of awareness. Notify data subjects 'without undue delay' if high risk.

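HIPAA (US healthcare): Notify affected individuals without unreasonable delay and no later than 60 days after discovery. If 500 or more individuals are affected: notify HHS within the same 60 days and, where more than 500 residents of a state or jurisdiction are affected, notify prominent media outlets. If fewer than 500: report to HHS within 60 days after the end of the calendar year.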

PCI DSS: Notify acquiring bank and card brands.

SEC (public companies): File Form 8-K (Item 1.05) within 4 business days of determining that the incident is material.

State breach notification laws: varies by state (CA: notify AG if >500 residents)
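The clocks above are simple date arithmetic, and computing them once at the start of the incident avoids miscounting under pressure. This is a rough sketch only: the function and key names are hypothetical, business days skip weekends but not public holidays, and Legal decides the actual trigger times and deadlines.

```python
from datetime import datetime, timedelta


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by N business days, skipping Saturdays and Sundays.

    Public holidays are ignored here; a real calendar must include them.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def notification_deadlines(aware_at: datetime, material_at: datetime) -> dict:
    """Outer deadlines for the regimes listed above (illustrative only).

    aware_at:    when the organization became aware of / discovered the breach
    material_at: when the incident was determined to be material (SEC)
    """
    return {
        "gdpr_supervisory_authority": aware_at + timedelta(hours=72),
        "hipaa_individuals": aware_at + timedelta(days=60),
        "sec_form_8k": add_business_days(material_at, 4),
    }
```

For example, a breach discovered and judged material on Thursday 6 March 2025 at 09:00 yields a GDPR deadline of 9 March 09:00, an SEC deadline of Wednesday 12 March, and a HIPAA outer limit of 5 May.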

Notify affected data subjects — content must include:

What happened (factual, reviewed by Legal)

What data was affected (specific data types)

What you're doing about it (remediation steps)

What they should do (credit monitoring, password reset, etc.)

Contact information for questions

Engage external forensics firm for formal investigation report

File cyber insurance claim with full documentation

Publish transparency report if appropriate

Forensic disk images of affected systems (chain of custody documented)
Network captures covering exfiltration period
DLP logs showing what data triggered alerts
Full EDR timeline from affected hosts
Data inventory — exact records/files exfiltrated
Firewall and proxy logs showing exfiltration destination
AD/IAM logs showing attacker access to data stores
All internal and external communications about the breach (for legal)

Affected Individuals Notification

SUBJECT: Important Security Notice from [Company]

Dear [Name],

We are writing to inform you of a security incident that may have affected your personal information.

What happened: On [date], we discovered that an unauthorized party accessed systems containing [type of data].

What information was involved: [Specific data types — e.g., name, email address, phone number]

What we are doing: We have contained the incident, engaged external cybersecurity experts, and notified relevant authorities. We are implementing additional security measures to prevent future incidents.

What you can do:
- Monitor your accounts for unusual activity
- [If credentials involved]: Change your password immediately
- [If financial data]: We are providing [X months] of complimentary credit monitoring through [provider]. Enroll at [URL] using code [CODE].

For questions, contact our dedicated support line: [phone] or [email]

We take the security of your information seriously and sincerely apologize for this incident.

P2 SLA: 4 hours to Security Manager

Escalate to P1: If active data exfiltration confirmed → switch to Data Breach runbook

Notify:

Security Manager
HR Business Partner
Legal

External Contacts:

External forensics (if legal proceedings anticipated)
Law enforcement (if criminal activity confirmed — coordinate with Legal)

MTTD (Mean Time to Detect): Varies; insider threats average 85 days to detect (Ponemon 2025)

MTTC (Mean Time to Contain): < 24 hours from confirmed malicious intent to access revocation

Industry Benchmark: Average insider threat incident cost: $16.2M per organization annually (Ponemon 2025)

Identify insider threat indicators (technical):

Mass file downloads: `DeviceFileEvents | where ActionType == 'FileCreated' and FolderPath startswith @'C:\Users\<user>\Downloads' | where Timestamp > ago(7d) | summarize count() by bin(Timestamp, 1h) | sort by Timestamp desc`

USB activity: `DeviceEvents | where ActionType == 'PnpDeviceConnected' | extend DeviceDescription = tostring(parse_json(AdditionalFields).DeviceDescription) | where DeviceDescription has 'USB' | where Timestamp > ago(30d)`

Cloud uploads: DLP logs for uploads to personal cloud storage (Google Drive, Dropbox, OneDrive personal)

Email: Large attachments to personal email or unusual external recipients

Off-hours access: `SigninLogs | where UserPrincipalName == '<user>' | extend Hour=datetime_part('hour', TimeGenerated) | where Hour < 6 or Hour > 22 | summarize count() by bin(TimeGenerated, 1d)`
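Sign-in timestamps in these logs are in UTC, so an "off-hours" check is only meaningful after converting to the user's local time. A small sketch of that conversion on exported timestamps follows; the function names are hypothetical, the user's UTC offset is assumed to be known and fixed (no daylight-saving handling), and 22:00 onward is treated as off-hours.

```python
from collections import Counter
from datetime import date, datetime, timedelta, timezone
from typing import Iterable


def is_off_hours(ts_utc: datetime, utc_offset_hours: float,
                 start_hour: int = 6, end_hour: int = 22) -> bool:
    """True if the event falls outside [start_hour, end_hour) local time."""
    local = ts_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.hour < start_hour or local.hour >= end_hour


def off_hours_per_day(timestamps_utc: Iterable[datetime],
                      utc_offset_hours: float) -> dict[date, int]:
    """Count off-hours events per local calendar day."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    counts: Counter = Counter()
    for ts in timestamps_utc:
        if is_off_hours(ts, utc_offset_hours):
            counts[ts.astimezone(tz).date()] += 1
    return dict(counts)
```

A user in UTC-5 signing in at 03:30 UTC is working at 22:30 local time the previous day, which a UTC-only filter would attribute to the wrong date.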

Correlate with HR signals:

Resignation notice submitted

Performance improvement plan (PIP)

Passed over for promotion

Workplace disputes or disciplinary action

Unusual financial pressure

IMPORTANT: Do NOT alert the subject. Investigation must remain covert until Legal/HR decide to act.

Decision Tree

IF: Suspicious activity but no confirmed malicious intent

THEN: Increase monitoring (with Legal approval). DO NOT confront the employee.

IF: Confirmed data exfiltration to personal accounts/devices

THEN: Coordinate with Legal + HR for immediate action plan. Prepare for same-day termination meeting.

IF: Employee has already resigned and is in notice period

THEN: Immediate access review. Consider garden leave (paid leave with no system access).

IF: Evidence of criminal activity (selling data, sabotage)

THEN: Engage Legal for law enforcement referral. Preserve evidence with forensic rigor.

CRITICAL: Engage Legal and HR BEFORE taking any technical action

Legal must approve monitoring scope and method

HR must be prepared for employee interaction

Document all approvals in writing

Increase monitoring on the suspected user (with Legal approval):

Enable enhanced DLP monitoring for the user

Enable mailbox audit logging if not already on: `Set-Mailbox -Identity <user> -AuditEnabled $true -AuditLogAgeLimit 365`

Review Azure AD conditional access — ensure user can't access from unmanaged devices

Restrict access to sensitive systems without alerting (if investigation is covert):

Remove from sensitive SharePoint sites / file shares (use 'permissions review' as cover)

Tighten DLP policies for the user's scope

Preserve all evidence with chain-of-custody documentation — every action logged with timestamp and who performed it

Upon Legal/HR decision to act — coordinate simultaneous access revocation:

Disable AD account: `Disable-ADAccount -Identity <user>`

Disable Azure AD: `Set-AzureADUser -ObjectId <user> -AccountEnabled $false`

Revoke all sessions: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`

Disable VPN access

Revoke badge / building access

Disable phone / MDM wipe of corporate data (NOT personal data)

All done within the SAME timeframe — usually during the HR termination meeting

Collect company devices — laptop, phone, external drives (with HR present, documented)

Forensic image of company devices BEFORE any wipe

Review all systems the user had access to for planted backdoors, logic bombs, or time-delayed scripts:

Check crontab / Task Scheduler for delayed-execution tasks

Review recent code commits for malicious changes

Check for personal SSH keys added to servers

Change shared credentials the user had access to (shared accounts, service accounts, API keys)

Review and reassign the user's responsibilities and access to another team member

Audit code, configurations, and infrastructure changes made by the user in last 90 days

Remove user from all groups, distribution lists, shared mailboxes

Document findings in format suitable for legal proceedings (coordinate with Legal)

Preserve forensic images for minimum retention period (coordinate with Legal on hold requirements)

Review and improve access control policies: least privilege, need-to-know

Implement / improve UEBA (User and Entity Behavior Analytics): Microsoft Sentinel UEBA, Exabeam, Securonix

Review offboarding checklist — ensure same-day access revocation is standard

Consider implementing DLP policies that trigger on behavioral patterns, not just content

P1 SLA: 15 minutes if customer-facing services are down

P2 SLA: 30 minutes if degraded but not fully down

Notify:

Network Team Lead
DevOps TL
VP Engineering (if customer-facing outage)

External Contacts:

DDoS mitigation provider (Cloudflare/Akamai/AWS Shield) — activate scrubbing
ISP upstream — request blackhole or traffic filtering

MTTD (Mean Time to Detect): < 5 minutes (automated monitoring should catch traffic anomalies)

MTTC (Mean Time to Contain): < 30 minutes (from alert to mitigation service activated)

MTTR (Mean Time to Recover): < 2 hours (full service restoration and attack terminated)

Industry Benchmark: Average DDoS attack duration: 68 minutes (Cloudflare 2025). Peak attack sizes > 1 Tbps.

Confirm DDoS indicators — check for:

Bandwidth saturation: Network monitoring (PRTG, Grafana, Datadog) shows traffic spike

Connection exhaustion: `netstat -an | awk '{print $6}' | sort | uniq -c | sort -rn` — look for abnormal SYN_RECV, ESTABLISHED counts

Application layer: HTTP 503 rate spike, abnormal request patterns

Identify attack type:

Volumetric (UDP flood, DNS amplification): Massive bandwidth consumption. Check: `tcpdump -i eth0 -nn 'udp' -c 1000 | awk '{print $5}' | sort | uniq -c | sort -rn | head`

Protocol (SYN flood, ACK flood): Connection table exhaustion. Check: `ss -s` for socket state distribution

Application layer (HTTP flood, Slowloris): Normal bandwidth, high request rate. Check: WAF logs or `tail -n 10000 /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head` (a fixed `-n` window is needed; `tail -f` never ends, so `sort` would never print)
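The per-client counting used for the application-layer check can also be done in a few lines of Python, which is handy when logs are pulled off-box for analysis. This sketch assumes the common access-log layout where the client IP is the first whitespace-separated field; `top_talkers` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable


def top_talkers(log_lines: Iterable[str], limit: int = 10) -> list[tuple[str, int]]:
    """Count requests per client IP (first field of each access-log line)."""
    counts: Counter = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)
```

It accepts any iterable of lines, so an open file handle on a copied access log can be passed directly.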

Determine target — which IP/service/URL is being targeted

Check if this is a distraction attack (DDoS used to mask data exfiltration or other attack)

Decision Tree

IF: Attack volume exceeds your bandwidth capacity

THEN: Immediately activate upstream DDoS mitigation (Cloudflare/Akamai/Shield). On-premise mitigation will not help.

IF: Application layer attack (low bandwidth, high request rate)

THEN: WAF rate limiting + geo-blocking + challenge pages. No need for scrubbing service.

IF: Multiple services targeted simultaneously

THEN: Possible coordinated attack or distraction. Check for concurrent intrusion attempts.

IF: Ransom note received (RDoS)

THEN: DO NOT pay. Activate mitigation. Notify law enforcement.

Activate DDoS mitigation service:

Cloudflare: Security > DDoS > Enable 'Under Attack Mode', or via API: `curl -X PATCH 'https://api.cloudflare.com/client/v4/zones/<zone>/settings/security_level' -H 'Authorization: Bearer <token>' -H 'Content-Type: application/json' -d '{"value":"under_attack"}'`

AWS Shield Advanced: Automatically engaged if subscribed. Contact AWS Shield Response Team (SRT) for assistance.

Akamai: Contact SOC to activate scrubbing center

Enable rate limiting:

Nginx: `limit_req_zone $binary_remote_addr zone=ddos:10m rate=10r/s;` in the `http` block, then `limit_req zone=ddos burst=20 nodelay;` in the affected `server` or `location` block

Cloudflare: Security > WAF > Rate limiting rules > Create rule

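AWS WAF: Add a rate-based rule to the web ACL. There is no standalone CLI command for this; the rule is a `RateBasedStatement` (for example `{"Limit": 1000, "AggregateKeyType": "IP"}`) inside the rule list passed to `aws wafv2 update-web-acl`.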
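To reason about what a `rate=10r/s` limit with `burst=20` actually lets through, it helps to model it. The class below is a simplified leaky-bucket approximation of that behaviour, not Nginx's implementation; `LeakyBucket` and its fields are hypothetical names, and time is passed in explicitly so the model is deterministic.

```python
class LeakyBucket:
    """Simplified per-client limiter: steady drain at `rate`, up to `burst` excess."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate      # requests drained per second
        self.burst = burst    # excess requests tolerated above the rate
        self.excess = 0.0     # current backlog above the steady rate
        self.last = None      # time of the last accepted request

    def allow(self, now: float) -> bool:
        if self.last is None:  # first request from this client
            self.last = now
            return True
        drained = self.rate * (now - self.last)
        excess = max(0.0, self.excess - drained) + 1.0
        if excess > self.burst:
            return False       # over the limit: reject, state unchanged
        self.excess = excess
        self.last = now
        return True
```

In this model a client that sends 30 requests in the same instant gets 21 through (the first plus a burst of 20), and one second later only 10 more, which is why burst size matters as much as the rate when tuning during an attack.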

Implement geo-blocking if attack originates from specific regions:

Cloudflare: Security > WAF > Custom rules > Block country

Nginx: Use GeoIP module to block country codes

If all else fails and service is sacrificial — blackhole route the targeted IP:

`ip route add blackhole <target_ip>/32` (last resort — drops ALL traffic including legitimate)

Scale infrastructure if possible: auto-scaling group adjustments, add instances behind LB

Work with ISP to filter attack traffic upstream (BGP blackhole, FlowSpec, or RTBH)

Fine-tune WAF rules based on attack signatures:

Identify common patterns: User-Agent, request rate, URI patterns, query strings

Block bot signatures at WAF level

Block identified attack source ranges (for targeted/non-botnet attacks):

`iptables -A INPUT -s <source_range> -j DROP` (or equivalent in cloud security groups)

Verify mitigation is effective — traffic should normalize while attack continues

Gradually restore service — disable 'Under Attack Mode' and aggressive rate limiting

Monitor for re-attack for 24-48 hours

Clear application caches and request queues: `systemctl restart nginx` or flush CDN cache

Verify service health across all endpoints and regions

Check for backlog in message queues, database connections, etc.

Document attack profile: type, volume, duration, source distribution, targeted service

Review and update DDoS response plan based on lessons learned

Evaluate DDoS mitigation service — was the response fast enough? Right provider?

Implement always-on DDoS protection if currently on-demand only

Set up automated DDoS detection + mitigation activation (many providers support API-triggered activation)

If ransom demand was received — report to law enforcement (FBI IC3, NCSC, etc.)

Conduct capacity planning review — can infrastructure handle future attacks?

Network traffic captures showing attack pattern
Bandwidth graphs for attack duration (before, during, after)
WAF/firewall logs showing blocked requests
Application logs showing error rates during attack
Mitigation provider incident report
Any ransom demands received (screenshot, preserve original)

P2 SLA: 30 minutes to SOC Lead

Escalate to P1: If malware is self-propagating (worm) or involves data exfiltration → escalate to P1 immediately

Notify:

SOC Manager
Endpoint Team Lead

External Contacts:

Submit samples to VirusTotal, MalwareBazaar, ANY.RUN for community benefit

MTTD (Mean Time to Detect): < 15 minutes (from execution to EDR alert)

MTTC (Mean Time to Contain): < 1 hour (from alert to full containment of affected host)

MTTR (Mean Time to Recover): < 4 hours (re-image and restore user to operational state)

Industry Benchmark: EDR auto-containment reduces MTTC to under 1 minute for known threats

Confirm malware alert from EDR/AV — determine detection type:

Real-time block: Malware was blocked before execution → lower severity

Post-execution detection: Malware ran before detection → higher severity

Behavioral detection: Suspicious behavior matched → investigate thoroughly

Gather initial IOCs from EDR alert:

File hash (SHA256), file path, process tree, network connections, dropped files

CrowdStrike: Detection > View full detection details > Process tree

Defender: Incidents > Alert > Device timeline > Filter by alert time

Check file hash against threat intel:

VirusTotal: Search hash → check detection ratio and malware family

MalwareBazaar: Search hash → check tags, family, first seen date

Hybrid Analysis: Search hash → check sandbox report for behavior
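When the EDR alert does not include a hash, or a quarantined copy needs to be verified, the SHA256 can be computed locally and then searched in the services above. A minimal sketch using only the standard library; `sha256_of_file` is a hypothetical helper name, and the file is read in chunks so large samples do not have to fit in memory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the lowercase hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Hash the sample on an isolated analysis host; searching by hash avoids uploading a potentially sensitive file.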

Determine malware category: dropper, RAT, infostealer, ransomware, cryptominer, worm

Decision Tree

IF: Malware was blocked before execution (real-time prevention)

THEN: Lower severity. Verify block was successful. Check if other users received the same file.

IF: Malware executed and established persistence

THEN: Isolate host immediately. Full investigation required.

IF: Malware is spreading to other hosts

THEN: ESCALATE to P1. Switch to Lateral Movement playbook. Network-wide hunt required.

Network-isolate the infected host:

CrowdStrike: Host > Actions > Contain Host

Defender: Device page > Isolate device

Manual: Disable NIC or VLAN isolation via switch port shutdown

Block malware hash across the organization:

CrowdStrike: IOC Management > Add indicator > SHA256 > Action: Block

Defender: Settings > Indicators > Add indicator > File hashes

SIEM: Add hash to custom IOC watchlist for detection

Block C2 domains/IPs at perimeter:

DNS sinkhole the C2 domain

Firewall block rule for C2 IP addresses

Search for the same file across all endpoints:

CrowdStrike: Investigate > Custom IOC > File hash > Search all hosts

Defender: Advanced hunting > `DeviceFileEvents | where SHA256 == '<hash>'`

Kill malicious processes on the infected host

Remove malware files, persistence mechanisms, and dropped payloads

Remove any scheduled tasks, services, or registry entries created by the malware

Check for credential theft — if infostealer, rotate all credentials used on the device

Scan with multiple AV engines to ensure complete removal

If sophisticated or unknown malware — re-image the host instead of manual cleanup

Re-image the host from the golden image if confidence in the manual cleanup is not high

Restore user data from backup (NOT from the infected disk)

Patch the vulnerability exploited for initial delivery (if known)

Monitor the recovered host for 48 hours for re-infection

Re-enable network connectivity after verification

Document malware family, IOCs, delivery method, and affected scope

Update email gateway / web proxy rules to block delivery vector

Submit malware sample to MalwareBazaar and VirusTotal for community sharing

If new malware variant — write custom detection rule for your SIEM

Review endpoint protection configuration — should this have been caught sooner?

Malware sample (quarantined copy with hash)
Full process tree from EDR showing execution chain
Network connections made by the malware (C2 traffic)
List of files dropped, modified, or deleted
Persistence mechanisms created (registry, tasks, services)
List of all hosts where the file was found

Supply Chain Attack Response

critical

T1195.002 — Supply Chain Compromise: Compromise Software Supply Chain

Response playbook for compromised software supply chain — when a trusted vendor, library, or update mechanism is weaponized.

⏱ 4-24 hours | Incident Commander, SOC Analyst L3, DevOps / Engineering, Vendor Management, Legal

supply-chain, third-party, dependency, update-poisoning

P1 SLA: Immediate (supply chain compromises affect the entire organization)

War Room: #inc-supply-chain-YYYYMMDD

Notify:

CISO
CTO
VP Engineering
Legal
Vendor Management

External Contacts:

CISA (cisa.gov/report)
Affected vendor security team
ISAC

MTTD (Mean Time to Detect): < 24 hours (from advisory/detection to confirmed impact assessment)

MTTC (Mean Time to Contain): < 4 hours (from confirmation to blocking the compromised component)

MTTR (Mean Time to Recover): < 48 hours (to patch/replace the compromised component across all systems)

Industry Benchmark: Average supply chain attack goes undetected for 250+ days (Mandiant 2025)

Identify the compromised component — check threat intel and vendor advisories:

CISA KEV catalog (cisa.gov/known-exploited-vulnerabilities)

Vendor security advisory / CVE database

Community reporting (Twitter/X, Reddit r/netsec, security mailing lists)

Determine if the compromised version is in your environment:

Software inventory: Check CMDB/asset management for affected software version

Dependency scan: `npm audit`, `pip-audit`, `dotnet list package --vulnerable`

Container images: `trivy image <image>`, `grype <image>`
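Audit tools only flag a compromised release once it is in their advisory database, so during the first hours it can be quicker to search lock files for the exact versions named in the vendor advisory. The sketch below assumes the npm lockfile v2/v3 layout, where the `packages` object maps install paths to metadata; `find_compromised` and the package names in the usage note are hypothetical.

```python
import json


def find_compromised(lockfile_text: str,
                     bad_versions: dict[str, set[str]]) -> list[tuple[str, str, str]]:
    """Return (install path, package name, version) for each compromised entry.

    bad_versions maps a package name to the set of versions from the advisory.
    """
    lock = json.loads(lockfile_text)
    hits = []
    for path, meta in lock.get("packages", {}).items():
        if not path:  # the empty key is the root project itself
            continue
        # "node_modules/a/node_modules/b" -> "b"; scoped names keep "@scope/name"
        name = path.rsplit("node_modules/", 1)[-1]
        version = meta.get("version", "")
        if version in bad_versions.get(name, set()):
            hits.append((path, name, version))
    return hits
```

Called as `find_compromised(text, {"example-lib": {"2.1.4"}})` for every `package-lock.json` in your repositories, it also catches transitive copies nested under other packages.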

Search for IOCs associated with the supply chain compromise:

Hashes, domains, IPs, behavioral indicators from vendor advisory

Assess blast radius — how many systems have the compromised component?

Decision Tree

IF: Compromised version is NOT in your environment

THEN: Document finding, block the version proactively, monitor for new information.

IF: Compromised version IS installed but no evidence of exploitation

THEN: Proceed to Contain — patch/remove urgently. Search for exploitation IOCs.

IF: Evidence of active exploitation found

THEN: FULL INCIDENT RESPONSE. Treat every system with the component as potentially compromised.

Block the compromised component immediately:

Block the specific version in artifact repositories (npm, PyPI mirrors, Docker registry)

Block update/download URLs at proxy and firewall

Revoke any API keys or tokens associated with the compromised component

Isolate systems showing signs of exploitation

Disable automatic updates for the affected software until a verified safe version is available

If the vendor is the compromise vector — disconnect vendor access (VPN tunnels, API integrations)

Remove or downgrade the compromised component to last-known-good version

For each system with the compromised component:

Check for persistence mechanisms added during compromise window

Verify file integrity of the software installation

Rotate any credentials that the software had access to

Rebuild systems with evidence of exploitation from clean images

Update package lock files / dependency pins to exclude compromised versions

Deploy the patched/clean version of the component

Verify deployment across all systems — zero instances of compromised version remain

Re-enable automatic updates once vendor confirms safe version

Restore any disabled integrations with new credentials

Monitor for re-compromise for 30 days

Implement Software Bill of Materials (SBOM) for all applications

Deploy dependency scanning in CI/CD pipeline (Snyk, Dependabot, Renovate)

Review vendor security assessment process — was this vendor properly vetted?

Implement software signing verification for all deployments

Share IOCs with ISACs and relevant community channels

Vendor advisory and CVE details
List of all systems with the compromised component (version matrix)
IOC search results across the environment
Network logs showing communication with compromised infrastructure
Build/deployment logs showing when the compromised version was introduced
Vendor communication timeline

P2 SLA: 15 minutes to Security Operations

Escalate to P1: If executive account, admin account, or clear data access/exfiltration → escalate to P1

Notify:

SOC Manager
Identity Team Lead

MTTD (Mean Time to Detect): < 10 minutes (from anomalous sign-in to analyst alert)

MTTC (Mean Time to Contain): < 15 minutes (from alert to credential reset + session revocation)

MTTR (Mean Time to Recover): < 2 hours (full investigation + user remediation)

Industry Benchmark: 83% of breaches involve stolen credentials (Verizon DBIR 2025)

Identify compromise indicators:

Impossible travel / unusual location sign-in

Sign-in from anonymous proxy or Tor exit node

New device + new location combination

Anomalous application access patterns

User-reported suspicious activity (password reset they didn't request)
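"Impossible travel" comes down to one calculation: the great-circle distance between two sign-in locations divided by the time between them. Here is a sketch of that check using the haversine formula; the function names and the 900 km/h ceiling (roughly airliner cruising speed) are assumptions, and IP geolocation error or VPN egress points will produce false positives.

```python
import math
from datetime import datetime


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def is_impossible_travel(first: tuple, second: tuple,
                         max_speed_kmh: float = 900.0) -> bool:
    """first/second are (datetime, latitude, longitude) sign-in records."""
    (t1, lat1, lon1), (t2, lat2, lon2) = first, second
    hours = abs((t2 - t1).total_seconds()) / 3600
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if hours == 0:
        return distance > 0  # two places at the same instant
    return distance / hours > max_speed_kmh
```

A London sign-in followed by a New York sign-in one hour later implies a speed of well over 5,000 km/h and is flagged; the same pair ten hours apart is not.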

Check Azure AD / IdP sign-in logs:

`SigninLogs | where UserPrincipalName == '<user>' | where TimeGenerated > ago(7d) | project TimeGenerated, IPAddress, Location, AppDisplayName, DeviceDetail, RiskLevel | sort by TimeGenerated desc`

Check for suspicious activity from the account:

Inbox rules created, emails sent, files accessed/shared, groups joined

Azure AD: audit logs for the user in the past 7 days

Decision Tree

IF: Single anomalous sign-in, no suspicious activity after

THEN: Reset password, revoke sessions, interview user. May be false positive from VPN/travel.

IF: Confirmed unauthorized access with activity (rules, emails, file access)

THEN: Immediate lockout + full investigation. Check all activity during compromise window.

IF: Admin or privileged account compromised

THEN: ESCALATE to P1. Disable account, audit all admin actions, check for persistence.

Reset password immediately — use strong random password:

Azure AD: `Set-AzureADUserPassword -ObjectId <user> -Password (ConvertTo-SecureString (New-Guid).Guid.Substring(0,16) -AsPlainText -Force) -ForceChangePasswordNextLogin $true`

Google: Admin > Users > Reset password
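For the "strong random password" itself, a cryptographically secure generator is preferable to anything derived from predictable or low-variety sources. A minimal sketch with Python's `secrets` module; the helper name, the 20-character default, and the symbol set are assumptions to adjust to your identity provider's password policy.

```python
import secrets
import string

SYMBOLS = "!@#$%^&*-_=+"  # assumed to be accepted by the target IdP


def generate_temp_password(length: int = 20) -> str:
    """Random password containing lowercase, uppercase, digit, and symbol."""
    if length < 4:
        raise ValueError("length must be at least 4 to cover all character classes")
    classes = [string.ascii_lowercase, string.ascii_uppercase, string.digits, SYMBOLS]
    alphabet = "".join(classes)
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class appears at least once.
        if all(any(ch in cls for ch in candidate) for cls in classes):
            return candidate
```

The result is a temporary secret only: hand it over through a secure channel and force a change at next sign-in.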

Revoke ALL active sessions:

Azure AD: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`

Google: Admin > Users > Security > Sign out user from all sessions

Check and remove suspicious inbox rules:

`Get-InboxRule -Mailbox <user> | where {$_.ForwardTo -or $_.ForwardAsAttachmentTo -or $_.RedirectTo -or $_.DeleteMessage} | fl`

Check and revoke suspicious OAuth app consents:

Azure AD: Enterprise Apps > User consent > Review and revoke

Check and remove any delegates added to the mailbox

Determine how the account was compromised:

Phishing: Check email for credential harvesting links

Password reuse: Check if the password matches known breaches (HIBP)

Token theft: Check for AiTM proxy indicators in sign-in logs

Malware: Check endpoint for keyloggers or infostealers

Force MFA re-enrollment — require the user to set up MFA again from scratch

If password reuse — advise user to change passwords on all personal accounts

If endpoint was compromised — run full EDR scan or re-image device

Provide user with new password via secure channel (phone call, in-person, not email)

Walk the user through MFA enrollment

Restore any deleted emails or modified data from backup

Notify anyone who received emails from the account during compromise window

Monitor account for re-compromise for 30 days

If phishing was the vector — check if other users received the same phishing email

Review Conditional Access policies — would location/device restrictions have prevented this?

Implement risk-based Conditional Access if not already in place

Schedule security awareness training for the affected user

Review password policy — enforce complexity and banned password list

Sign-in logs showing unauthorized access times, IPs, locations
Inbox rules created during compromise window
Emails sent from the account during compromise
Files accessed, modified, or shared during compromise
OAuth apps granted access during compromise
EDR logs if endpoint compromise suspected

P2 SLA: 30 minutes to SOC Lead

Escalate to P1: If mining is running on production systems or cloud costs exceed $1K/day

Notify:

Cloud Security Lead
FinOps (for cost anomaly)

MTTD (Mean Time to Detect): < 1 hour (from mining start to detection)

MTTC (Mean Time to Contain): < 30 minutes (from detection to process termination + access revocation)

MTTR (Mean Time to Recover): < 2 hours (cleanup, patch, and credential rotation)

Industry Benchmark: Average cryptojacking cloud cost: $4,600/incident. Can reach $100K+ if undetected.

Identify cryptojacking indicators:

High CPU/GPU utilization on endpoints or servers (sustained 90%+)

Cloud cost anomaly — unexpected compute charges (EC2, GCE, AKS)

Network connections to mining pools (*pool.*, *xmr.*, *nicehash.*, *f2pool.*)

EDR alert for mining-related process names (xmrig, minerd, cgminer, ethminer)

Confirm mining activity:

Linux: `top -bn1 | head -20` + `ps aux | grep -i 'xmr\|mine\|crypto'`

Windows: `Get-Process | Sort-Object CPU -Descending | Select -First 10`

Network: `netstat -an | grep ':3333\|:4444\|:8333\|:14444'` (common mining ports)

AWS: Check for unauthorized EC2 instances: `aws ec2 describe-instances --filters 'Name=instance-state-name,Values=running'`
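A plain `grep` on port numbers also matches local listening ports and unrelated columns. The sketch below parses saved `netstat -an` output and reports only TCP connections whose remote port is on the mining-port list; it assumes the Linux column layout (protocol, queues, local address, foreign address, state), and `mining_connections` is a hypothetical helper name.

```python
MINING_PORTS = {3333, 4444, 8333, 14444}  # the common mining ports listed above


def mining_connections(netstat_output: str,
                       ports: set[int] = MINING_PORTS) -> list[tuple[str, int, str]]:
    """Return (remote host, remote port, state) for connections to mining ports."""
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers, UDP, and UNIX-socket lines
        host, _, port = fields[4].rpartition(":")  # foreign address column
        if port.isdigit() and int(port) in ports:
            hits.append((host, int(port), fields[5]))
    return hits
```

Ports alone are weak evidence, since pools also listen on 443, so correlate any hit with the owning process and its CPU usage.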

Decision Tree

IF: Mining on endpoint — user installed intentionally

THEN: Policy violation. Remove software, document, escalate to HR/management if policy exists.

IF: Mining installed via malware/exploit

THEN: Treat as full compromise. Mining is the payload but access is the real problem.

IF: Mining on cloud infrastructure with spawned instances

THEN: Immediate termination of unauthorized instances. Rotate ALL cloud credentials. Check billing alerts.

Kill mining processes immediately:

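Linux: review matches first with `pgrep -af 'xmrig|minerd|cgminer|ethminer'`, then `pkill -9 -f 'xmrig|minerd|cgminer|ethminer'` (pgrep/pkill patterns are extended regular expressions, so alternation is a plain `|`; broad terms such as `mine` or `crypto` also match legitimate processes)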

Windows: `Stop-Process -Name 'xmrig','minerd' -Force`

Terminate any unauthorized cloud instances:

AWS: `aws ec2 terminate-instances --instance-ids <id>`

Azure: `az vm delete --resource-group <rg> --name <vm> --yes`

Block mining pool domains and IPs at network perimeter

Block mining pool domains in DNS (sinkhole or blacklist)

Disable the compromised access — API keys, IAM roles, user accounts

Remove mining software and associated files

Remove persistence mechanisms (crontabs, systemd services, scheduled tasks)

Determine initial access vector — how did the miner get deployed?

Common vectors: exposed Docker API, vulnerable web app, leaked cloud credentials, SSH brute force

Patch the vulnerability used for access

Rotate all credentials on affected systems — cloud keys, SSH keys, service accounts

Verify mining processes are gone and CPU usage is normal

Review cloud billing for the full impact period — dispute charges if possible

Restore any modified configurations (crontab, systemd, etc.) from backup

Re-deploy affected containers/instances from clean images

Set up cloud billing alerts to catch future anomalies early

Implement cloud cost anomaly alerts (AWS Budgets, Azure Cost Alerts, GCP Budget Alerts)

Block known mining domains at DNS level organization-wide

Review cloud IAM — implement least-privilege access and remove unused credentials

Scan for exposed services (Docker API, Kubernetes API, Redis) on public IPs

Deploy network-level mining detection (Suricata rules for Stratum protocol)

Mining process details (binary hash, command line, configuration)
Network connections to mining pools (IPs, domains)
Cloud billing records showing cost impact
Timeline: when mining started, when detected, when stopped
Initial access evidence (CVE exploited, credential leaked)
List of unauthorized cloud resources created

P2 SLA: 30 minutes to Application Security Lead

Escalate to P1: If RCE confirmed, data exfiltration evident, or active shell on web server

Notify:

Application Security Lead
DevOps Lead
Product Owner

External Contacts:

Web application pen testing firm (if persistent attacker)

MTTD (Mean Time to Detect): < 15 minutes (from attack to WAF/SIEM alert)

MTTC (Mean Time to Contain): < 1 hour (from alert to WAF rule + application patch)

MTTR (Mean Time to Recover): < 4 hours (full vulnerability remediation + validation)

Industry Benchmark: Web application attacks account for 26% of all breaches (Verizon DBIR 2025)

Identify the attack type from WAF/SIEM alerts:

SQL Injection: Look for UNION SELECT, OR 1=1, single quotes in parameters

XSS: Look for <script>, javascript:, onerror= in inputs

RCE: Look for command injection patterns (;, |, &&, backticks) or deserialization payloads

Path Traversal: Look for ../ sequences in URL parameters

File Upload: Check for uploaded files with executable extensions (.php, .jsp, .aspx)
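For a first pass over exported request logs, the request-string indicators above can be turned into a few regular expressions. This is a deliberately naive triage sketch, not a WAF ruleset: the pattern table and `classify_request` are hypothetical, the command-injection pattern in particular is noisy, and encoded or obfuscated payloads will slip past it.

```python
import re
from urllib.parse import unquote_plus

ATTACK_PATTERNS = {
    "sqli": re.compile(r"union\s+select|\bor\s+1\s*=\s*1\b|'\s*(or|and)\b", re.I),
    "xss": re.compile(r"<script\b|javascript:|onerror\s*=", re.I),
    "path_traversal": re.compile(r"\.\./|\.\.\\"),
    "command_injection": re.compile(r"[;|`]|&&"),
}


def classify_request(raw: str) -> list[str]:
    """Return the names of all patterns that match the URL-decoded request."""
    decoded = unquote_plus(raw)  # decode %xx escapes and '+' before matching
    return [name for name, pattern in ATTACK_PATTERNS.items()
            if pattern.search(decoded)]
```

Grouping log lines by the returned labels gives a quick picture of which attack class dominates before opening the WAF console.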

Review WAF logs for the attack pattern:

Cloudflare: Security > Events > Filter by action=challenge/block

AWS WAF: CloudWatch Logs > Filter matched rules

ModSecurity: `grep 'ModSecurity' /var/log/apache2/error.log | tail -50`

Determine if the attack was successful:

Check application logs for errors that indicate successful exploitation

Check for unexpected data in database (SQLi success indicator)

Check for webshells or new files on the web server

Decision Tree

IF: Attack was blocked by WAF — no evidence of success

THEN: Monitor and tune WAF rules. Patch the vulnerability in the next sprint.

IF: Attack bypassed WAF or there's evidence of successful exploitation

THEN: IMMEDIATE: Block attacker IP, patch vulnerability, check for data access.

IF: Webshell or reverse shell found on server

THEN: CRITICAL: Isolate web server. Assume full server compromise. Check for lateral movement.

Block the attacker's IP at WAF and firewall level

If attack is automated/distributed — implement rate limiting and CAPTCHA

Deploy emergency WAF rule to block the specific attack pattern:

Cloudflare: Security > WAF > Custom rule > Block requests matching pattern

ModSecurity: Add deny rule for the specific payload signature

If webshell found: isolate the web server and preserve evidence

If SQL injection — revoke the application's database user and create new credentials

Disable the vulnerable endpoint if possible while patching

Fix the vulnerability in the application code:

SQLi: Use parameterized queries / prepared statements

XSS: Implement output encoding and Content-Security-Policy headers

RCE: Sanitize all user input, disable dangerous functions, update frameworks

File Upload: Validate file types, store outside web root, rename uploaded files
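The SQLi fix is the one most often implemented incorrectly, so a concrete illustration may help. The sketch below uses Python's built-in `sqlite3` with a throwaway in-memory table; the table, column, and function names are hypothetical, and the same bound-parameter idea applies to whatever driver the application actually uses.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list[tuple]:
    """Look up a user with a bound parameter instead of string formatting."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()


def demo() -> tuple[list, list]:
    """Run a legitimate lookup and an injection attempt against test data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO users (username) VALUES (?)",
                     [("alice",), ("bob",)])
    legit = find_user(conn, "alice")
    injected = find_user(conn, "' OR 1=1 --")  # treated as a literal string
    conn.close()
    return legit, injected
```

Because the driver sends the value separately from the SQL text, the injection string is compared as an ordinary username and matches no rows.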

Remove any webshells, backdoors, or modified files on the web server

If database was accessed — assess data exposure and rotate credentials

Deploy patched application version

Run application security scan to verify fix (DAST/SAST)

Verify the patched application is functioning correctly

Restore any modified data from backup (if SQLi modified records)

Re-enable the endpoint with WAF rules still in place

Monitor for re-exploitation attempts for 7 days

Run full vulnerability scan against the application

Add the vulnerability type to the secure coding training curriculum

Implement mandatory code review for all input-handling code

Deploy DAST scanning in CI/CD pipeline to catch similar issues

Review WAF rules — could the attack have been caught earlier?

If data breach occurred — follow Data Breach Response playbook for notifications

WAF logs showing attack requests (full HTTP requests)
Application server access logs for the attack period
Database query logs (if SQL injection)
Webshell or malicious file samples found on server
Evidence of data access or exfiltration
Patched code diff showing the vulnerability fix

Zero-Day / Vulnerability Exploitation

critical

T1190 — Exploit Public-Facing Application

Response playbook for active exploitation of unpatched vulnerabilities — emergency mitigation when no vendor patch is available.

⏱ 2-8 hours (initial mitigation), days to weeks (full remediation)

Incident Commander, SOC Analyst L3, Vulnerability Management, IT Ops, Network Team

zero-day, cve, unpatched, exploit, vulnerability

P1 SLA: Immediate — active zero-day exploitation is always P1

War Room: #inc-zeroday-CVE-YYYY-XXXXX

Notify:

CISO
VP Engineering
Network Team Lead

External Contacts:

CISA
Vendor security team
ISAC
External IR retainer

MTTD (Mean Time to Detect): < 4 hours (from public disclosure to internal vulnerability assessment)

MTTC (Mean Time to Contain): < 8 hours (from assessment to workaround deployment)

MTTR (Mean Time to Recover): Vendor-dependent — track patch availability closely

Industry Benchmark: Average time from CVE disclosure to first exploit: 15 days (Mandiant 2025). Zero-days are, by definition, exploited before disclosure, so this window does not apply to them.

Monitor threat intelligence feeds for zero-day announcements:

CISA KEV (cisa.gov/known-exploited-vulnerabilities) — updated daily

Vendor security advisories (Microsoft MSRC, Apache, Cisco PSIRT, etc.)

Twitter/X: @cikiLabs, @GossiTheDog, @kevbeaumont, @campuscodi
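Monitoring the KEV catalog can be scripted once its JSON feed has been downloaded and parsed. The sketch below filters an already-loaded catalog by vendor and date. It assumes the feed's entries carry `cveID`, `vendorProject`, `product`, and `dateAdded` fields (verify against the current schema before relying on it); `kev_matches` is an illustrative name.

```python
from datetime import date


def kev_matches(catalog, vendors, added_since):
    """Return KEV entries for the given vendors added on or after added_since.

    catalog: dict parsed from the KEV JSON feed (assumed to hold a
             "vulnerabilities" list of entries).
    vendors: vendor names taken from your own asset inventory.
    """
    wanted = {v.lower() for v in vendors}
    matches = []
    for entry in catalog.get("vulnerabilities", []):
        added = date.fromisoformat(entry["dateAdded"])
        if entry["vendorProject"].lower() in wanted and added >= added_since:
            matches.append((entry["cveID"], entry["product"], entry["dateAdded"]))
    # Newest additions first so the on-call analyst sees them at the top.
    return sorted(matches, key=lambda m: m[2], reverse=True)
```

Running this daily against the vendors in the asset inventory gives a short list to triage instead of the full catalog.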

Determine if the vulnerable component exists in your environment:

Check CMDB/asset inventory for the specific software + version

Run vulnerability scan targeting the specific CVE

Check for the component in container images and deployments
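The inventory check reduces to a version-range comparison. The following sketch assumes an inventory export with `host`, `product`, and `version` fields and simple dotted numeric versions. Both are assumptions; real version strings (suffixes, epochs, build tags) usually need a proper version-parsing library.

```python
def parse_version(text, width=4):
    """Turn '2.14.1' into (2, 14, 1, 0); non-numeric parts are treated as 0."""
    parts = [int(p) if p.isdigit() else 0 for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def affected_hosts(inventory, product, first_bad, first_fixed):
    """List hosts running product with first_bad <= version < first_fixed."""
    low, high = parse_version(first_bad), parse_version(first_fixed)
    return sorted(
        row["host"]
        for row in inventory
        if row["product"].lower() == product.lower()
        and low <= parse_version(row["version"]) < high
    )
```

The affected range comes from the vendor advisory. The resulting host list feeds the exposure assessment and, later, the evidence record of affected systems.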

Search for IOCs associated with known exploitation:

Apply vendor-provided detection signatures

Search logs for known exploitation patterns

Assess exposure — is the vulnerable service internet-facing?

Decision Tree

IF: Vulnerable component NOT in environment

THEN: Document, block proactively if possible, monitor for scope expansion.

IF: Vulnerable component exists but no evidence of exploitation

THEN: Implement workaround/mitigation immediately. Begin patching when available.

IF: Evidence of active exploitation found

THEN: FULL INCIDENT RESPONSE. Isolate affected systems. Treat as confirmed breach.

Implement vendor-recommended workaround immediately:

Disable vulnerable feature/protocol

Apply configuration change to mitigate exploitation

Deploy WAF rule to block exploitation patterns

If no workaround available — isolate or disable the vulnerable system:

Remove internet exposure (move behind VPN, restrict to internal only)

Implement network ACL to limit access to the service

If critical system — implement compensating controls (IPS signature, WAF, proxy)

Deploy IDS/IPS signatures for the specific exploit

If exploitation confirmed — isolate compromised systems and hunt for lateral movement

Apply vendor patch as soon as available

If no patch yet — maintain workaround and monitor

Verify the patch resolves the vulnerability (re-scan)

If exploitation occurred — full investigation of compromised systems

Hunt for any persistence established during exploitation window

Restore internet-facing exposure only after patch is confirmed applied

Remove temporary workarounds that may impact functionality

Continue monitoring for exploitation attempts for 30 days

Verify all instances of the vulnerable component are patched (scan again)

Review patch management process — why wasn't this caught earlier?

Implement automated vulnerability scanning for internet-facing assets

Subscribe to vendor security advisory feeds

Review network segmentation — could exploitation have been limited?

Update asset inventory with accurate software versions

CVE details and vendor advisory
List of all affected systems in the environment
Vulnerability scan results (before and after patching)
Log evidence of exploitation attempts
Workaround deployment confirmation (systems and timeline)
Patch deployment confirmation

Stolen / Lost Device Response

P2 SLA: 30 minutes to IT Security

Escalate to P1: If device is unencrypted, contains sensitive data, or belongs to privileged user

Notify:

IT Security
MDM Admin
User's Manager

MTTC (Mean Time to Contain): < 30 minutes (from report to remote wipe command issued)

MTTR (Mean Time to Recover): < 4 hours (new device provisioned and user operational)

Industry Benchmark: 68% of data breaches involve a human element including device loss (Verizon DBIR 2025)

User reports device lost or stolen — gather details:

When was it last seen? Where? Was it locked? Was it encrypted?

What type of device (laptop, phone, tablet)?

Does it have corporate data (email, files, code repos)?

Was the user logged into any sensitive applications?

Check device status in MDM:

Intune: Devices > All devices > Search > Check last check-in, compliance state

Jamf: Computers/Devices > Search > Last inventory update

Verify disk encryption is active:

Windows: BitLocker status in Intune

Mac: FileVault status in Jamf

Mobile: Device encryption in MDM compliance report

Decision Tree

IF: Device is encrypted and was locked

THEN: Lower risk. Proceed with remote wipe and credential rotation.

IF: Device is unencrypted or unlocked

THEN: HIGH RISK. Immediate remote wipe. Assume data is compromised. Full credential rotation.

IF: Device belongs to admin/executive with sensitive access

THEN: Escalate to P1. Full credential rotation, session revocation, and access audit.
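The decision tree and the escalation criteria can be expressed as a small triage helper, for example inside a ticketing automation. This is a sketch under the assumption that the three conditions are known at intake; `triage_lost_device` and the action strings are illustrative.

```python
def triage_lost_device(encrypted, locked, privileged_user, sensitive_data=False):
    """Map the lost-device conditions to a priority and a list of actions."""
    # Escalation criteria: unencrypted device, sensitive data, or privileged owner.
    priority = "P1" if (not encrypted or privileged_user or sensitive_data) else "P2"
    # Remote wipe and credential rotation apply on every branch of the tree.
    actions = ["remote wipe", "credential rotation"]
    if not encrypted or not locked:
        actions.append("assume data compromised")
    if privileged_user:
        actions += ["session revocation", "access audit"]
    return priority, actions
```

For an encrypted, locked laptop of a regular user this returns `("P2", ["remote wipe", "credential rotation"])`.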

Issue remote wipe immediately:

Intune: Device > Wipe (Full wipe, not retire)

Jamf: Device > Management > Send MDM command > Erase Device

Google: Admin > Devices > Wipe device

Apple: Find My > Erase device (if not MDM managed)

Revoke all active sessions for the user:

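Azure AD (Microsoft Entra ID): `Revoke-AzureADUserAllRefreshToken -ObjectId <user>` (legacy AzureAD module), or with Microsoft Graph PowerShell: `Revoke-MgUserSignInSession -UserId <user>`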

Google: Admin > Users > Security > Sign out from all sessions

Disable the device in Azure AD / MDM:

Azure AD: Devices > Search > Disable device

Reset the user's password and require MFA re-enrollment

Revoke any certificate-based authentication from the device

Confirm remote wipe was executed (check MDM for wipe confirmation)

If wipe cannot be confirmed — treat all data on device as compromised

Rotate any credentials that were cached or saved on the device:

Email password, VPN credentials, Wi-Fi certificates

SSH keys stored on the device

Application-specific API tokens or service passwords

Code signing certificates if developer device

Report to local police if stolen (may be required for insurance)

Provision replacement device from clean corporate image

Enroll new device in MDM and verify compliance

Restore user data from cloud backup (OneDrive, iCloud, Google Drive)

Verify user can access all required applications with new credentials

Verify encryption is enforced on all devices via MDM compliance policy

Review auto-lock settings — ensure devices lock after 5 minutes of inactivity

Consider implementing geofencing alerts for managed devices

File insurance claim if applicable

Update asset inventory to reflect device status as lost/stolen

User report: when, where, circumstances of loss
MDM device status before and after wipe command
Remote wipe confirmation (if received)
Encryption status of the device
Last known location (if available from MDM/Find My)
Police report number (if stolen)

API Security Breach Response

P2 SLA: 30 minutes to Platform Security Lead

Escalate to P1: If API key provides write access to production data or customer-facing systems

Notify:

Platform Security Lead
API Team Lead
DevOps

MTTD (Mean Time to Detect): < 30 minutes (from anomalous API usage to detection)

MTTC (Mean Time to Contain): < 15 minutes (from detection to key revocation)

MTTR (Mean Time to Recover): < 2 hours (new key generation, rotation, and consumer notification)

Industry Benchmark: API attacks increased 681% in 2024-2025 (Salt Security)

Identify API compromise indicators:

Sudden spike in API request volume from a single key

API requests from unusual IP addresses or countries

API key found in public repository (GitHub, GitLab, Pastebin)

Error rate spike (testing stolen key against various endpoints)

Unusual data access patterns (bulk export, sequential enumeration)
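Several of these indicators can be checked mechanically over gateway logs. The sketch below assumes each log record has been parsed into a dict with `key_id`, `ip`, and `status` fields, and that a per-key set of previously seen IPs is available. The field names and thresholds are assumptions to tune per environment.

```python
from collections import defaultdict


def flag_suspicious_keys(records, known_ips, max_requests=1000, max_error_rate=0.2):
    """Return {key_id: [reasons]} for keys showing compromise indicators."""
    stats = defaultdict(lambda: {"requests": 0, "errors": 0, "new_ips": set()})
    for rec in records:
        s = stats[rec["key_id"]]
        s["requests"] += 1
        if rec["status"] >= 400:
            s["errors"] += 1
        if rec["ip"] not in known_ips.get(rec["key_id"], set()):
            s["new_ips"].add(rec["ip"])

    flagged = {}
    for key_id, s in stats.items():
        reasons = []
        if s["requests"] > max_requests:
            reasons.append("volume spike")
        if s["errors"] / s["requests"] > max_error_rate:
            reasons.append("error-rate spike")
        if s["new_ips"]:
            reasons.append("unfamiliar source IP")
        if reasons:
            flagged[key_id] = reasons
    return flagged
```

Fixed thresholds are a blunt instrument. Comparing against each key's own historical baseline would cut false positives, at the cost of storing that history.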

Check API gateway/proxy logs:

Request volume, error rates, source IPs, accessed endpoints

OWASP API Security: check for BOLA, Broken Authentication, BFLA patterns

Determine scope of exposed key's permissions:

What endpoints can it access? Read-only or read-write?

What data can it reach? PII, financial data, internal services?

Decision Tree

IF: Key found in public repo but no evidence of abuse

THEN: Revoke key immediately. Create new key. Check logs for any usage from unknown IPs.

IF: Key is being actively abused

THEN: Revoke key immediately. Block abusing IPs. Audit all actions taken with the key.

IF: Customer data accessed via compromised API

THEN: ESCALATE to P1. Invoke Data Breach Response playbook for notification requirements.

Revoke the compromised API key immediately

Block the attacker's IP addresses at API gateway and WAF

If OAuth token — revoke the token and associated refresh tokens

Implement emergency rate limiting on affected endpoints

If key was in code repository — scan git history for other secrets:

`trufflehog git file://. --only-verified`

`gitleaks detect -v`
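To show what such scanners look for, here is a deliberately small Python sketch that matches two well-known secret shapes in text. It is not a substitute for the tools above: the two patterns are illustrative, and dedicated scanners add many more detectors plus live verification.

```python
import re

# Illustrative patterns only; dedicated scanners ship far more detectors.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every pattern hit in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

Any hit in a tracked file means the secret should be treated as exposed and rotated, even if the file is later deleted, because it remains in git history.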

Generate new API key with minimum required permissions

If key was in code — remove from repository AND git history:

`git filter-repo` (which the Git documentation recommends over `git filter-branch`) or BFG Repo-Cleaner to purge from history

Force-push to remote (coordinate with team)

Move secrets to a secrets manager (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault)

Implement pre-commit hooks to prevent secrets in code (`pre-commit` + `detect-secrets`)

Review API permissions — apply least privilege to all API keys

Deploy new API key to all consuming applications

Verify all integrations work with the new key

Notify API consumers about the key rotation

Monitor for continued abuse attempts (IP blocks may not catch distributed attacks)

Implement automated secret scanning in CI/CD (GitHub Advanced Security, GitLab Secret Detection)

Implement API key rotation policy (90-day maximum lifetime)

Deploy API behavioral anomaly detection

Review API authentication — consider moving from API keys to OAuth 2.0 with short-lived tokens

Implement IP allowlisting for sensitive API endpoints
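The 90-day rotation policy is easy to audit if the key inventory records creation dates. A minimal sketch, assuming each key is a dict with `id` and `created` (a `datetime.date`); the structure is hypothetical and would normally come from the API gateway or secrets manager.

```python
from datetime import date, timedelta


def keys_due_for_rotation(keys, today, max_age_days=90):
    """Return ids of keys whose age exceeds max_age_days."""
    limit = timedelta(days=max_age_days)
    return sorted(k["id"] for k in keys if today - k["created"] > limit)
```

Running this on a schedule and opening a ticket per returned key keeps the policy from depending on anyone's memory.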

Compromised API key details (permissions, creation date, last used)
API gateway logs showing unauthorized access patterns
Source of key exposure (public repo, Pastebin, etc.)
List of all endpoints and data accessed with the key
IP addresses involved in abuse
Git history showing when the key was committed

OT/ICS Cyber Incident Response

critical

TA0106 — Impair Process Control (MITRE ATT&CK for ICS tactic)

This playbook outlines the comprehensive steps and procedures for responding to a cyber incident affecting Operational Technology (OT) and Industrial Control Systems (ICS) environments. It prioritizes safety, operational continuity, and data integrity, recognizing the unique challenges and critical nature of these systems. This playbook is designed for incidents ranging from unauthorized access and data exfiltration to malware infection (e.g., ransomware) and direct process manipulation or disruption.

⏱ 24-72 hours

Incident Commander, SOC Analyst L2/L3, OT Engineer / Subject Matter Expert (SME), IT Network Engineer, Security Engineer (Endpoint/Network/Cloud), Legal Counsel, Communications Lead, Executive Leadership, Vendor Representative (OEM, Integrator)

OT, ICS, SCADA, Critical Infrastructure, Ransomware, Malware, Disruption, Process Control, Safety

P1 SLA: 15 minutes

War Room

Dedicated Microsoft Teams Channel: #OT-Incident-WarRoom-[Date] OR Physical Command Center (if required for severe incidents).

Notify:

Incident Commander (On-Call)
Head of OT Operations
CISO
CIO
Legal Counsel
Communications Lead

External Contacts:

CISA (Cybersecurity and Infrastructure Security Agency)
FBI (Federal Bureau of Investigation)
Sector-specific ISAOs/ISACs
Relevant Regulatory Bodies (e.g., FERC, EPA, NERC)
Third-party Incident Response Firm (if retained)

MTTD (Mean Time to Detect): 1 hour

MTTC (Mean Time to Contain): 4 hours

MTTR (Mean Time to Recover): 48 hours

Industry Benchmark: Industry averages are significantly longer (e.g., IBM Cost of a Data Breach Report 2023: average breach lifecycle of 277 days, with 204 days to identify and 73 days to contain). Our targets for critical OT incidents are aggressively shorter due to the high impact on safety and operations.
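For after-action reporting, the timings of a single incident can be compared against these targets. The sketch below assumes four timestamps are recorded per incident and measures containment and recovery from the moment of detection; that starting point is an assumption, since the targets above do not state it.

```python
from datetime import datetime, timedelta

# Targets from this playbook: detect in 1 h, contain in 4 h, recover in 48 h.
TARGETS = {
    "MTTD": timedelta(hours=1),
    "MTTC": timedelta(hours=4),
    "MTTR": timedelta(hours=48),
}


def check_targets(started, detected, contained, recovered):
    """Return {metric: True/False} for whether each target was met."""
    measured = {
        "MTTD": detected - started,
        "MTTC": contained - detected,   # assumption: clock starts at detection
        "MTTR": recovered - detected,   # assumption: clock starts at detection
    }
    return {name: measured[name] <= TARGETS[name] for name in TARGETS}
```

The "mean" in each metric comes from averaging these per-incident durations over a reporting period. A single incident only shows whether that incident met the target.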

1.1. Validate alert source: Confirm the integrity and authenticity of the detection system (e.g., SIEM, IDS/IPS, OT-specific anomaly detection).

1.2. Initial impact assessment: Determine if the anomaly is affecting physical processes, safety, or production. Engage OT Engineer immediately.

1.3. Review SIEM/SCADA logs for anomalous activity (e.g., unauthorized access, unusual commands, failed authentications, changes to PLC programs, unexpected process values).

1.4. Analyze network traffic for indicators of compromise (IOCs) such as C2 communication, lateral movement attempts, or unusual protocol usage (e.g., IT protocols in OT segments).

1.5. Check endpoint detection and response (EDR) logs on connected IT/OT devices (HMIs, engineering workstations) for malicious processes or file changes.

1.6. Correlate findings with threat intelligence feeds for known OT threats.
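Step 1.4's "IT protocols in OT segments" check can be approximated over flow records with a port allowlist. The sketch below assumes flows are dicts with `src`, `dst`, and `dst_port`, and that the OT segment is a single CIDR range. The allowlist (Modbus/TCP 502, DNP3 20000, EtherNet/IP 44818, OPC UA 4840) is an example and must be replaced with the protocols actually used on site.

```python
import ipaddress

# Example allowlist: Modbus/TCP 502, DNP3 20000, EtherNet/IP 44818, OPC UA 4840.
OT_ALLOWED_PORTS = {502, 20000, 44818, 4840}


def unexpected_ot_flows(flows, ot_cidr, allowed_ports=OT_ALLOWED_PORTS):
    """Return flows that start inside the OT subnet and use a non-allowlisted port."""
    ot_net = ipaddress.ip_network(ot_cidr)
    return [
        f for f in flows
        if ipaddress.ip_address(f["src"]) in ot_net
        and f["dst_port"] not in allowed_ports
    ]
```

An OT device opening SMB (445) or RDP (3389) sessions would show up here and is worth raising with the OT Engineer before any isolation step.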

Decision Tree

IF: Anomaly detected in process control system (e.g., unexpected valve closure, motor speed change, abnormal temperature/pressure).

THEN: Immediately alert OT Engineer/SME. Validate with HMI and local controls. Initiate emergency shutdown procedures if safety is compromised or process integrity is at risk, following established safety protocols.

IF: Suspicious network traffic (e.g., C2 beaconing, SMB enumeration) observed originating from an OT segment device.

THEN: Isolate the suspected device's network port at the switch level. Notify IT Network Engineer and Security Engineer for further analysis. DO NOT disrupt critical process communication without OT Engineer's approval.

IF: EDR alert on an HMI/Engineering Workstation indicates malware execution or unauthorized access.

THEN: Contain the workstation immediately (network isolation via EDR or physical disconnection). Initiate forensic acquisition. Alert OT Engineer to assess potential impact on connected PLCs/RTUs.

2.1. Prioritize safety: Ensure all actions taken preserve human safety and environmental protection. Consult with OT Engineer before any network or system modification.

2.2. Isolate affected OT network segments: Use firewalls, managed switches (VLANs, port shutdown), or physical disconnection (air-gapping) as a last resort. Prioritize control plane isolation over data plane if possible.

2.3. Disable compromised accounts: Immediately revoke credentials for any accounts identified as compromised within IT and OT domains.

2.4. Block malicious IPs/domains: Update perimeter firewalls and IDS/IPS with known IOCs to prevent further ingress/egress.

2.5. Prevent lateral movement: Implement host-based firewalls, disable unnecessary services, and segment networks further where possible.

2.6. Secure remaining critical assets: Patch known vulnerabilities, apply temporary mitigations, and enforce strong authentication.

Decision Tree

IF: Malware/ransomware identified spreading laterally within multiple OT network segments, affecting critical assets.

THEN: Execute pre-approved segment isolation procedures. Physically disconnect network cables to air-gap affected segments if logical controls (firewalls, VLANs) are insufficient or compromised. Consult OT Engineer for process implications.

IF: Unauthorized remote access detected to an HMI or engineering workstation from an external IP address.

THEN: Immediately block the source IP at the perimeter firewall. Disconnect the affected HMI/workstation from the network. Force password reset for all associated accounts.

IF: PLC/RTU configuration has been modified without authorization, potentially causing unsafe conditions.

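THEN: Disconnect the affected PLC/RTU from the network immediately. If safe to do so, set the controller's physical mode switch to the position that blocks remote program changes (on many key-switch controllers this is 'Run' rather than 'Program'; confirm with the OEM) to prevent further unauthorized changes. Do NOT attempt to revert configuration without a verified golden image.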

3.1. Remove malware: Scan and clean infected systems using trusted antivirus/EDR solutions. For deeply embedded malware or firmware compromise, re-imaging or re-flashing may be required.

3.2. Patch vulnerabilities: Apply critical security patches to operating systems, applications, and firmware of affected and similar systems.

3.3. Rebuild systems from trusted sources: Restore compromised HMIs, engineering workstations, and servers from known good, clean backups or golden images.

3.4. Reconfigure and harden PLCs/RTUs: Re-flash PLCs/RTUs with verified, uncompromised firmware. Restore configurations from known good backups. Implement secure configurations (e.g., strong passwords, disabled unnecessary services).

3.5. Change all compromised credentials: Force password resets for all accounts potentially exposed, including service accounts and administrative credentials.

3.6. Perform thorough integrity checks: Verify the integrity of all restored and reconfigured systems and devices using checksums, hashes, and functional testing.
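The hash comparison in step 3.6 can be sketched as a check of restored files against a manifest of known-good SHA-256 digests. The manifest format (relative path mapped to hex digest) is an assumption, and the manifest must come from a trusted source kept offline from the compromised environment.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(root, manifest):
    """Compare files under root with a {relative_path: sha256} manifest."""
    problems = {}
    for rel_path, expected in manifest.items():
        full = os.path.join(root, rel_path)
        if not os.path.isfile(full):
            problems[rel_path] = "missing"
        elif sha256_of(full) != expected:
            problems[rel_path] = "hash mismatch"
    return problems
```

An empty result means every listed file matches. It says nothing about extra files that are not in the manifest, and it does not replace the functional testing that step 3.6 also requires.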

4.1. Validate system functionality and safety: Conduct thorough functional testing of all restored OT systems and processes in a controlled environment before reintroducing them to production.

4.2. Monitor for re-infection: Implement enhanced monitoring on recovered systems for any signs of residual compromise or new attack attempts.

4.3. Restore network connectivity systematically: Reconnect isolated segments and devices in a phased approach, starting with the most critical and least affected.

4.4. Verify data integrity: Ensure historian data, logs, and process values are accurate and complete post-recovery.

4.5. Document lessons learned for recovery: Note any challenges, successes, and areas for improvement during the recovery phase.

Decision Tree

IF: Restored OT system or process exhibits intermittent errors, unexpected behavior, or performance degradation during testing.

THEN: Immediately halt reconnection. Re-evaluate restoration source (backup integrity, golden image verification). Conduct root cause analysis on new symptoms. Do NOT reconnect to production until stability and expected behavior are fully confirmed. Escalate to OT Engineer and OEM.

IF: Post-recovery monitoring detects new IOCs or suspicious activity on a 'clean' system.

THEN: Immediately re-isolate the affected system. Re-initiate eradication steps, focusing on persistent threats. Review and strengthen containment measures. Consider a full forensic re-evaluation.

IF: Process parameters are within operational limits but inconsistent with pre-incident historical data.

THEN: Engage OT Engineer and process control specialists to validate data integrity and calibrate sensors/actuators. Verify no lingering manipulation or data corruption. Do not resume full operations until data integrity is assured.
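A value can sit inside operational limits and still be unusual for that process, which is the case the last branch describes. One simple way to surface it is a deviation test against pre-incident historian data, sketched below. The three-standard-deviation threshold is an assumption, and the test ignores seasonality and operating mode, so it supports the OT Engineer's judgment and does not replace it.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, threshold=3.0):
    """True if current is more than threshold standard deviations from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # needs at least two historical samples
    if sigma == 0:
        return current != mu  # constant history: any change is a deviation
    return abs(current - mu) / sigma > threshold
```

For example, with a temperature history of `[70.0, 71.0, 69.5, 70.5, 70.2]`, a reading of 70.3 is not flagged while 78.0 is.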

5.1. Conduct a 'Lessons Learned' session: Involve all stakeholders to review the incident, identify root causes, evaluate the effectiveness of the response, and pinpoint areas for improvement.

5.2. Update playbooks and procedures: Incorporate findings from the lessons learned to enhance existing incident response plans, especially for OT/ICS environments.

5.3. Enhance security controls: Implement new or improved security measures to prevent recurrence, focusing on identified vulnerabilities and gaps (e.g., network segmentation, endpoint hardening, threat intelligence integration).

5.4. Review and update asset inventory: Ensure all OT/ICS assets are accurately documented with their configurations, software versions, and network topology.

5.5. Communicate findings: Share relevant non-confidential insights with industry peers and regulatory bodies to contribute to collective defense (e.g., ISACs, CISA).

5.6. Legal and regulatory review: Ensure all reporting and compliance obligations are met.

Internal Notification

Subject: URGENT: OT/ICS Incident - [Brief Description] - [Date/Time]

Team,

This is an urgent notification regarding a detected cyber incident impacting our Operational Technology (OT) / Industrial Control Systems (ICS) environment.

**Incident Status:** [e.g., 'Initial Detection', 'Containment in Progress', 'Recovery Phase']
**Affected Systems/Areas:** [e.g., 'PLC X in Plant Y', 'HMI Z in Control Room A', 'Segment B of Production Line C']
**Initial Impact Assessment:** [e.g., 'Minor disruption to production in Line C', 'Potential compromise of process data', 'Safety systems remain operational', 'Production halted in Plant Y']
**Current Actions:** The Incident Response Team, in collaboration with OT Engineering, is actively working to contain the threat and ensure safety. Specific actions include: [e.g., 'Network segmentation of affected area', 'Forensic analysis on HMI Z', 'Verification of PLC X integrity'].

All personnel are reminded to adhere strictly to incident response protocols. Please do not attempt to access affected systems or make any changes without explicit direction from the Incident Commander.

Further updates will be provided via [War Room Channel/Email] every [X hours/minutes] or as significant developments occur.

**Incident Commander:** [Name] ([Contact Info])
**OT Lead:** [Name] ([Contact Info])

Your cooperation and vigilance are critical.

[Company Leadership/CISO]

External / Customer Notification

Subject: [Company Name] - Critical Operational Technology Incident Notification

[Date]

FOR IMMEDIATE RELEASE / CONFIDENTIAL - FOR REGULATORY BODIES AND LAW ENFORCEMENT

[Company Name] is providing this notification regarding a detected cybersecurity incident impacting a portion of our Operational Technology (OT) environment.

Upon detection, our robust incident response protocols were immediately activated. Our internal teams, including cybersecurity experts and OT engineers, are actively engaged in managing the situation. We have also engaged [e.g., external cybersecurity specialists, law enforcement (FBI/CISA)] to assist in our efforts.

**Current Status:** Our primary focus remains on ensuring the safety of personnel, protecting the environment, and maintaining the integrity of our critical operations. We are diligently working to contain the incident and restore full operational capabilities securely and efficiently.

**Impact Assessment:** We are currently conducting a thorough investigation to determine the full scope and impact of this incident. At this time, [Company Name] is [e.g., 'experiencing limited operational disruption in X area', 'working to mitigate potential impacts on Y process']. We are taking all necessary measures to prevent further unauthorized activity.

**Our Commitment:** The security and reliability of our operations are paramount. We are committed to a comprehensive and transparent response. We will provide further updates as our investigation progresses and as appropriate.

For further inquiries, please contact:
[Media Contact Name/Email]
[Legal Counsel Name/Email]

[Company Name]
