SCW TOOLS

WarRoom

Operational runbooks for every major attack type. Real tool commands, decision trees, escalation procedures, and communication templates, ready to execute when it matters.

These aren't checklists. They're operational runbooks built for SOC teams, IR leads, and security managers. Every phase has real tool commands (CrowdStrike, Sentinel KQL, Splunk, AWS CLI), decision trees for branching scenarios, escalation matrices with SLAs, and copy-paste communication templates.

All playbooks are free and open. Use them in your IR process today.


17 Runbooks
16 Attack Types
5 Response Phases

RUNBOOKS  By Attack Type

Ransomware Incident Response

critical

T1486 - Data Encrypted for Impact

Operational runbook for active ransomware attacks: real tool commands, decision trees, and escalation procedures.

⏱ 2-8 hours | Incident Commander, SOC Analyst L2/L3, IT Ops / Sysadmin, Legal Counsel, Communications Lead
Tags: ransomware, encryption, business-continuity, extortion
P1 SLA: 15 minutes to Incident Commander
War Room: Create dedicated Slack/Teams channel: #inc-ransomware-YYYYMMDD

Notify:

  • CISO
  • VP Engineering
  • Legal
  • CEO (if data exfil confirmed)

External Contacts:

  • Cyber insurance carrier hotline
  • External IR retainer (if contracted)
  • Law enforcement (FBI IC3 / local CERT)
MTTD (Mean Time to Detect): < 30 minutes (from first encryption to confirmed alert)
MTTC (Mean Time to Contain): < 2 hours (from alert to full network isolation of affected hosts)
MTTR (Mean Time to Recover): < 48 hours (to restore critical business operations)
Industry Benchmark: Industry median MTTC for ransomware: 4.5 hours (IBM X-Force 2025)
Confirm ransomware indicators. Check for: mass file renames (.encrypted, .locked, .crypt), ransom notes (README.txt, DECRYPT_FILES.html), backup and recovery tampering (vssadmin, bcdedit, wbadmin)
Identify patient zero and blast radius:
CrowdStrike: Go to Investigate > Host Search > sort by 'First Seen' encryption IOC
Sentinel KQL: `DeviceFileEvents | where FileName endswith '.encrypted' | summarize count() by DeviceName | sort by count_ desc`
Splunk: `index=edr sourcetype=file_events (file_extension=encrypted OR file_extension=locked) | stats count by host | sort -count`
Determine the variant: upload ransom note or encrypted sample to id-ransomware.malwarehunterteam.com
Check if decryptor exists: nomoreransom.org/en/decryption-tools

Decision Tree

IF: Encryption is actively spreading
THEN: SKIP to Contain → Emergency Isolation immediately. Do NOT wait for full scoping.
IF: Encryption stopped / single host only
THEN: Proceed with methodical scoping before containment.
IF: Threat actor is in chat / making demands
THEN: Do NOT engage. Notify Legal + Insurance immediately. They will handle negotiation if needed.
Network-isolate all confirmed infected hosts:
CrowdStrike: Host > Actions > Contain Host (or API: `POST /devices/entities/devices-actions/v2?action_name=contain`)
Defender for Endpoint: Device page > 'Isolate device' (allows Defender comms only)
Manual: Disable switch port: `interface gi1/0/X` → `shutdown`, or disable Wi-Fi via MDM
Disable compromised accounts in AD:
PowerShell: `Get-ADUser -Filter {Name -like '*compromised_user*'} | Disable-ADAccount`
Azure AD: `Set-AzureADUser -ObjectId user@domain.com -AccountEnabled $false`
Block C2 at perimeter:
Palo Alto: `set security profiles anti-spyware block-ip` / Objects > External Dynamic Lists > add C2 IPs
Sentinel: Create TI indicator: Threat Intelligence > Add indicator > type: IP / domain
Suspend automated backups to prevent encryption of backup storage
Preserve evidence: do NOT reboot or wipe infected machines. Memory forensics first.

Decision Tree

IF: Domain admin account is compromised
THEN: Initiate emergency KRBTGT reset (double reset, 12h apart): reset the `krbtgt` password via AD Users & Computers. Before the second reset, confirm the first has replicated to all DCs and that tickets issued under the old key have expired (default maximum ticket lifetime: 10h)
IF: Backup infrastructure is accessible from compromised network
THEN: Air-gap backups immediately: disconnect backup server NIC or disable backup agent service
Identify initial access vector โ€” common entry points to check:
1. Email: Search email gateway for attachments/links delivered in last 7 days to patient zero
2. RDP: `DeviceLogonEvents | where LogonType == 'RemoteInteractive' | where AccountName == '<compromised>' | sort by Timestamp asc`
3. VPN: Check VPN logs for unusual geo/time patterns pre-incident
4. Vulnerable edge device: Check Shodan/Censys for exposed services on your IP range
Remove persistence mechanisms on every host in attack path:
Scheduled tasks: `schtasks /query /fo LIST /v | findstr /i "<suspicious>"` → `schtasks /delete /tn <name> /f`
Services: `Get-CimInstance Win32_Service | Where-Object {$_.StartMode -eq 'Auto' -and $_.State -eq 'Running'} | fl Name,DisplayName,PathName`
Registry Run keys: `reg query HKLM\Software\Microsoft\Windows\CurrentVersion\Run`
WMI subscriptions: `Get-WMIObject -Namespace root\subscription -Class __EventFilter`
Scan all hosts for dormant payloads using variant-specific IOCs
Reset ALL potentially compromised credentials (prioritize: service accounts > admin accounts > user accounts)
Patch the vulnerability used for initial access before bringing systems back online
Verify backup integrity BEFORE restoring:
Mount backup in isolated environment → check for ransom notes / encrypted files → scan with AV
Restore from clean backups or rebuild from golden images
Re-enable network connectivity in stages, starting with monitoring/security infrastructure:
Stage 1: SIEM, EDR, network monitoring
Stage 2: Domain controllers, DNS, DHCP
Stage 3: Critical business applications
Stage 4: User workstations (batches of 10-20, monitor between batches)
Full credential rotation for affected scope: force password change at next logon for all users in affected OUs
Resume backup operations: verify backup jobs complete successfully for 48h
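Stage 4 of the reconnection plan above brings workstations back in batches of 10-20 with monitoring in between. The sketch below shows one way to split a host list into such batches; the `batches` helper and the host names are hypothetical, and the monitoring check between batches is left to the operator.

```python
def batches(hosts, size=20):
    """Yield consecutive slices of `hosts`, each at most `size` long."""
    if not 10 <= size <= 20:
        raise ValueError("runbook calls for batches of 10-20 hosts")
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]

# Hypothetical workstation inventory, for illustration only.
workstations = [f"WS-{n:03d}" for n in range(1, 46)]
plan = list(batches(workstations, size=20))

for number, group in enumerate(plan, start=1):
    print(f"Batch {number}: {len(group)} hosts ({group[0]} .. {group[-1]})")
    # Reconnect the group here, then review EDR/SIEM alerts before continuing.
```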

Decision Tree

IF: No clean backup available
THEN: Rebuild from scratch using golden images. DO NOT pay ransom without Legal + Insurance + Law Enforcement guidance.
IF: Recovery time exceeds business tolerance
THEN: Activate business continuity plan: switch to manual/paper processes for critical operations.
Document full timeline: initial access → lateral movement → encryption → detection → containment → recovery
Conduct lessons-learned meeting within 48h. ALL responders must attend
Update detection rules in SIEM based on IOCs and TTPs discovered
Report to authorities:
US: FBI IC3 (ic3.gov), CISA (cisa.gov/report)
EU: National CERT + data protection authority within 72h (GDPR Art. 33)
Publicly traded: SEC notification requirements (4 business days)
File cyber insurance claim with full incident timeline

Internal Notification

SUBJECT: [CRITICAL] Active Security Incident - Ransomware

Team, we are responding to an active ransomware incident affecting [X] systems.

Current status: [Detecting / Containing / Eradicating / Recovering]
War room: #inc-ransomware-YYYYMMDD
Incident Commander: [Name]

DO:
- Report any unusual file activity to the war room immediately
- Continue using approved devices only

DO NOT:
- Attempt to open encrypted files or ransom notes
- Connect personal devices to the corporate network
- Discuss the incident on social media or with external parties

Next update: [time]

External / Customer Notification

SUBJECT: Security Incident Notification

Dear [Customer/Partner],

We are writing to inform you that [Company] experienced a security incident on [date]. 
We detected the incident promptly and our security team contained it within [X hours].

What happened: [Brief factual description, reviewed by Legal]
What data was affected: [Specific data types, reviewed by Legal]
What we're doing: [Remediation steps taken]
What you should do: [Specific guidance for the recipient]

We will provide updates as our investigation progresses. For questions, contact [security contact].

Phishing Response

high

T1566 - Phishing

Full operational runbook for phishing triage, from email analysis through org-wide purge and user remediation.

⏱ 30-90 minutes | SOC Analyst L1/L2, Email Admin, Security Awareness Lead
Tags: phishing, email, social-engineering, credential-theft
P2 SLA: 30 minutes to SOC Lead
Escalate to P1: If credentials were entered AND MFA was bypassed → escalate to Account Compromise runbook

Notify:

  • SOC Manager
  • Email Admin

External Contacts:

  • Anti-Phishing Working Group (APWG): reportphishing@apwg.org
MTTD (Mean Time to Detect): < 5 minutes (from user report to analyst triage)
MTTC (Mean Time to Contain): < 30 minutes (from triage to org-wide email purge)
MTTR (Mean Time to Recover): < 60 minutes (for credential-entry cases: password reset + session revoke)
Industry Benchmark: Target < 10% click rate on phishing simulations
Triage the phishing report. Check source: user report, SEG alert, or automated detection
Analyze email headers (extract from .eml or message trace):
Check sender: SPF pass/fail, DKIM signature, DMARC alignment
Look for reply-to mismatch (sender ≠ reply-to is a strong phishing indicator)
Check X-Originating-IP against threat intel
Inspect URLs WITHOUT clicking:
urlscan.io: submit URL → check redirects, final domain, screenshots
VirusTotal: submit URL → check detection ratio
ANY.RUN: submit in sandbox → see full execution chain
Inspect attachments in sandbox:
Hybrid Analysis (hybrid-analysis.com) or Joe Sandbox for file detonation
Check file hash on VirusTotal before opening anything
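The header checks above (SPF, DKIM, DMARC results and reply-to mismatch) can be scripted against a saved .eml file. This is a minimal sketch using only Python's standard `email` package: the sample message and the `header_findings` helper are hypothetical, it compares domains rather than full addresses, and it assumes the receiving gateway stamped an `Authentication-Results` header.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical message headers, for illustration only.
RAW = """\
From: "IT Support" <support@example.com>
Reply-To: helpdesk@examp1e-support.net
Authentication-Results: mx.example.org; spf=fail smtp.mailfrom=example.com; dkim=none; dmarc=fail
Subject: Password expiry

Click here.
"""

def domain_of(header_value):
    """Extract the lower-cased domain from an address header."""
    return parseaddr(header_value)[1].rpartition("@")[2].lower()

def header_findings(raw):
    """Return a list of human-readable red flags found in the headers."""
    msg = message_from_string(raw)
    findings = []
    sender = domain_of(msg.get("From", ""))
    reply_to = domain_of(msg.get("Reply-To", ""))
    if reply_to and reply_to != sender:
        findings.append(f"reply-to mismatch: {sender} vs {reply_to}")
    auth = msg.get("Authentication-Results", "").lower()
    for check in ("spf", "dkim", "dmarc"):
        if f"{check}=pass" not in auth:
            findings.append(f"{check} did not pass")
    return findings

findings = header_findings(RAW)
for line in findings:
    print(line)
```

A clean result here does not clear the message; URL and attachment analysis still apply.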

Decision Tree

IF: Confirmed phishing, no user clicked
THEN: Proceed to Contain (purge email). Skip Eradicate.
IF: User clicked link but did NOT enter credentials
THEN: Contain + check endpoint for drive-by download. Run EDR scan on user device.
IF: User entered credentials
THEN: URGENT: Contain + Eradicate immediately. Reset password + revoke sessions within 15 min.
IF: User entered credentials AND MFA token was intercepted (AiTM/EvilProxy)
THEN: ESCALATE to P1. Treat as full account compromise. Switch to Account Compromise runbook.
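The decision tree above can be encoded as a small function so that ticketing automation or a triage form routes reports consistently. A sketch under the assumption that the three booleans are collected from the user interview; the `triage` function name and return shape are hypothetical.

```python
def triage(clicked, entered_credentials, mfa_intercepted):
    """Map phishing triage answers to (priority, next step) per the decision tree."""
    if entered_credentials and mfa_intercepted:
        return "P1", "Escalate: switch to Account Compromise runbook"
    if entered_credentials:
        return "P2", "Contain + Eradicate: reset password and revoke sessions within 15 min"
    if clicked:
        return "P2", "Contain + run EDR scan for drive-by download"
    return "P2", "Contain (purge email); skip Eradicate"

# Most severe branch is checked first so it cannot be shadowed by a milder one.
priority, step = triage(clicked=True, entered_credentials=True, mfa_intercepted=True)
print(priority, step)
```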
Find all recipients of the same email:
Exchange Online: `Get-MessageTrace -SenderAddress '<phishing_sender>' -StartDate '<date>' -EndDate '<date>'`
O365 Threat Explorer: Email & collaboration > Explorer > Sender = '<address>' → Select all → Purge
Google Workspace: Admin Console > Investigation Tool > Gmail log events > sender = '<address>'
Purge/quarantine from all mailboxes:
O365: Threat Explorer > Select messages > Move to: Soft Delete / Hard Delete
Exchange: `New-ComplianceSearch -Name 'PhishPurge' -ExchangeLocation All -ContentMatchQuery 'subject:<subject> AND from:<sender>'`
Then: `Start-ComplianceSearch -Identity 'PhishPurge'`, and once the search completes: `New-ComplianceSearchAction -SearchName 'PhishPurge' -Purge -PurgeType SoftDelete`
Google: Admin > Investigation > Select emails > Delete
Block sender domain and malicious URLs:
O365: Email & collaboration > Policies > Anti-phishing > Block sender/domain
SEG (Proofpoint/Mimecast): Add sender/domain to block list
Web proxy: Block phishing URL/domain at proxy level
If user clicked, isolate endpoint for scanning:
CrowdStrike: Contain Host
Defender: Isolate Device
If credentials were submitted:
Azure AD: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>` + `Set-AzureADUserPassword -ObjectId <user> -Password (ConvertTo-SecureString '<random>' -AsPlainText -Force) -ForceChangePasswordNextLogin $true`
Google: Admin > Users > Reset password + check 'Require password change'
Check and remove OAuth app consent grants: list grants with `Get-AzureADOAuth2PermissionGrant`, and check sign-ins to a suspicious app with `Get-AzureADAuditSignInLogs -Filter "appDisplayName eq '<suspicious app>'"`
If attachment was opened, check for malware execution:
Run full EDR scan on affected endpoint
Check process tree in EDR for child processes of Office/PDF apps
Sentinel: `DeviceProcessEvents | where InitiatingProcessFileName in ('WINWORD.EXE','EXCEL.EXE','POWERPNT.EXE','AcroRd32.exe') | where Timestamp > ago(2h) | where DeviceName == '<host>'`
Add IOCs to blocklists: sender domain, phishing URL, file hash, C2 domain
Confirm all instances of the phishing email are purged (re-run search to verify zero results)
Verify affected users have reset credentials and MFA is active
If malware was found, rebuild endpoint from clean image
Update email filtering rules with new indicators
Send org-wide awareness notification about this specific campaign
Log incident with IOCs: sender address, reply-to, subject line, URLs, file hashes, C2 domains
Share indicators with ISACs (FS-ISAC, H-ISAC, etc.) if applicable
Submit to APWG: reportphishing@apwg.org
Schedule targeted phishing simulation for users who clicked (within 2 weeks)
Review email gateway rules: could this have been caught automatically?

Business Email Compromise (BEC)

critical

T1534 - Internal Spearphishing

Operational runbook for BEC attacks: account compromise, wire fraud prevention, and evidence preservation for law enforcement.

⏱ 2-6 hours | Incident Commander, SOC Analyst L2/L3, Finance Controller, Legal Counsel
Tags: bec, email, fraud, financial, wire-transfer
P1 SLA: 15 minutes. BEC with active wire transfer is ALWAYS P1
War Room: #inc-bec-YYYYMMDD

Notify:

  • CFO
  • CISO
  • Legal
  • Bank fraud department

External Contacts:

  • FBI IC3 (ic3.gov): file within 48h for wire recall
  • Bank fraud department: call within 30 min of discovery
MTTD (Mean Time to Detect): < 1 hour (from first fraudulent email to detection)
MTTC (Mean Time to Contain): < 30 minutes (from detection to account disable + bank notification)
MTTR (Mean Time to Recover): Wire recall success rate drops 50% after 24h, so speed is everything
Industry Benchmark: FBI IC3 Recovery Asset Team recovered $538M in BEC losses in 2024, but only when reported within 48h
Identify BEC indicators:
- Wire transfer / payment request from executive with unusual urgency
- Reply-to address differs from sender display name
- Mailbox rules created to hide replies (auto-forward, auto-delete)
- Login from unusual location / impossible travel
Verify with the alleged sender via OUT-OF-BAND communication:
Call them directly (phone number from contacts, NOT from the email)
Walk to their desk if same office
DO NOT reply to the email or use chat โ€” attacker may control those
Check Azure AD / O365 sign-in logs for compromised account:
Sentinel: `SigninLogs | where UserPrincipalName == '<user>' | where TimeGenerated > ago(30d) | summarize LastSeen=max(TimeGenerated), SignIns=count() by IPAddress, Location, AppDisplayName | sort by LastSeen desc`
Azure Portal: Azure AD > Users > <user> > Sign-in logs > Look for: unusual IP, new device, risky sign-in
Check for malicious inbox rules:
PowerShell: `Get-InboxRule -Mailbox <user> | where {$_.ForwardTo -or $_.ForwardAsAttachmentTo -or $_.DeleteMessage -eq $true} | fl Name,ForwardTo,DeleteMessage,MoveToFolder`
O365 Admin: Exchange Admin > Mailboxes > <user> > Manage email apps > Check mail flow rules

Decision Tree

IF: Wire transfer has been initiated but NOT yet completed
THEN: IMMEDIATE: Call bank fraud department to halt transfer. Time is critical; every minute counts.
IF: Wire transfer completed
THEN: File FBI IC3 complaint immediately (ic3.gov). Contact bank for recall attempt. Success rate drops rapidly after 24h.
IF: Email compromise confirmed but no financial impact yet
THEN: Disable account, proceed with standard containment.
Disable the compromised account immediately:
Azure AD: `Set-AzureADUser -ObjectId <user> -AccountEnabled $false`
Revoke ALL sessions: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`
Google: Admin > Users > Suspend user
Contact bank/finance to halt pending wire transfers โ€” use phone, not email
Search for ALL emails sent from the compromised account during attacker's access window:
`Get-MessageTrace -SenderAddress '<compromised>' -StartDate '<attacker_first_access>' -EndDate (Get-Date)`
Remove malicious inbox rules:
`Get-InboxRule -Mailbox <user> | where {$_.ForwardTo -or $_.ForwardAsAttachmentTo} | Remove-InboxRule -Confirm:$false`
Remove delegates and connected apps:
`Get-MailboxPermission -Identity <user> | where {$_.IsInherited -eq $false -and $_.User -notlike 'NT AUTHORITY\SELF'} | Remove-MailboxPermission`
Azure AD > Enterprise apps > User consent > Revoke suspicious app permissions
Determine initial compromise method:
Password spray: Check Azure AD > Risky sign-ins for multiple failed attempts before success
Phishing/AiTM: Check email for credential harvesting link clicked before compromise date
Token theft: Check for suspicious OAuth app grants in Azure AD > Enterprise applications
Reset password with strong random + enforce MFA re-enrollment:
`Set-AzureADUserPassword -ObjectId <user> -Password (ConvertTo-SecureString '<random>' -AsPlainText -Force) -ForceChangePasswordNextLogin $true`
Require MFA re-registration: Azure AD > Users > Authentication methods > Require re-register
Audit ALL inbox rules, delegates, mail forwarding, and connected applications
Check for data exfiltration: mail forwarding to external addresses, large attachment sends
Re-enable account with new credentials + enforced MFA
Notify ALL recipients of fraudulent emails sent during compromise
Provide Finance with list of any payment instructions sent from compromised account
Work with Legal on any completed fraudulent transactions
File FBI IC3 report (ic3.gov). Include: wire details, bank info, email headers, timeline
Implement conditional access policies:
Location-based: Block sign-ins from non-business countries
Device-based: Require compliant/managed device
Risk-based: Block high-risk sign-ins automatically
Enable Azure AD Identity Protection risky sign-in policies
Deploy anti-phishing banner on external emails ('This email was sent from outside your organization')
Implement dual-approval process for wire transfers above $X threshold
Conduct executive-targeted security awareness training within 2 weeks

Internal Notification

SUBJECT: [URGENT] Business Email Compromise - Wire Fraud Alert

Finance/Accounting teams:

We have confirmed that [executive name]'s email account was compromised.
Any payment or wire transfer instructions received from this account 
between [date] and [date] should be treated as FRAUDULENT.

Action required:
- HALT all pending payments requested by this account
- Verify any completed transfers from this period with the requestor by PHONE
- Report any suspicious payment requests to [security contact]

DO NOT respond to any existing email threads with this account.

Active Lateral Movement

critical

T1021 - Remote Services

Time-critical runbook for containing an attacker moving between systems: real-time hunt and containment procedures.

⏱ 1-4 hours | Incident Commander, SOC Analyst L3 / Threat Hunter, Network Team, AD Admin
Tags: lateral-movement, network, containment, active-threat
P1 SLA: Immediate. Active lateral movement is always P1
War Room: #inc-lateral-YYYYMMDD

Notify:

  • CISO
  • Network Team Lead
  • AD Admin

External Contacts:

  • External IR retainer (if attacker has domain admin)
MTTD (Mean Time to Detect): < 15 minutes (from first lateral movement event to analyst alert)
MTTC (Mean Time to Contain): < 1 hour (from alert to full containment of attack path)
MTTR (Mean Time to Recover): < 24 hours (rebuild compromised hosts + credential rotation)
Industry Benchmark: Median attacker breakout time (initial access to lateral movement): 62 minutes (CrowdStrike 2025)
Identify lateral movement technique in use:
RDP: `DeviceLogonEvents | where LogonType == 'RemoteInteractive' | where Timestamp > ago(1h) | where AccountName !in ('<known_admins>') | summarize by DeviceName, AccountName, RemoteIP`
PsExec/SMB: `DeviceProcessEvents | where FileName == 'PSEXESVC.exe' or ProcessCommandLine has 'psexec' | where Timestamp > ago(1h)`
WMI: `DeviceProcessEvents | where ProcessCommandLine has 'wmic' and ProcessCommandLine has '/node:' | where Timestamp > ago(1h)`
WinRM: `DeviceProcessEvents | where FileName == 'wsmprovhost.exe' | where Timestamp > ago(1h)`
Map the full attack path (which hosts, and in what order):
Build timeline: `DeviceLogonEvents | where AccountName == '<compromised_account>' | where Timestamp > ago(24h) | project Timestamp, DeviceName, RemoteIP, LogonType | sort by Timestamp asc`
Identify the compromised credentials being used (check: service accounts, admin accounts, cached creds)
Determine attacker objective: what are they moving toward? (DC, database server, file server, backup)
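Once the timeline query above has been exported, the hop order can be derived by sorting on timestamp and keeping the first logon per host. A minimal sketch: the rows below are hypothetical and only mirror the columns the query projects.

```python
from datetime import datetime

# Hypothetical rows exported from the DeviceLogonEvents timeline query.
events = [
    {"Timestamp": "2025-01-10T09:42:00", "DeviceName": "FS01", "RemoteIP": "10.0.0.21"},
    {"Timestamp": "2025-01-10T09:05:00", "DeviceName": "WS-114", "RemoteIP": "203.0.113.7"},
    {"Timestamp": "2025-01-10T09:20:00", "DeviceName": "APP02", "RemoteIP": "10.0.0.14"},
    {"Timestamp": "2025-01-10T09:50:00", "DeviceName": "FS01", "RemoteIP": "10.0.0.21"},
]

def attack_path(rows):
    """Return hosts in the order the compromised account first logged on to them."""
    ordered = sorted(rows, key=lambda row: datetime.fromisoformat(row["Timestamp"]))
    path = []
    for row in ordered:
        if row["DeviceName"] not in path:
            path.append(row["DeviceName"])
    return path

path = attack_path(events)
print(" -> ".join(path))
```

Every host in the resulting path is a containment candidate, and the last hop hints at the attacker's objective.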

Decision Tree

IF: Attacker has reached a domain controller
THEN: CRITICAL ESCALATION. Assume full domain compromise. Prepare for KRBTGT double-reset + full AD recovery.
IF: Attacker has domain admin credentials
THEN: They can access anything. Focus on containing exfiltration channels (block outbound). Prepare for full credential rotation.
IF: Movement is limited to user-level credentials
THEN: Contain affected hosts + disable compromised account. Attack path is limited.
Network-isolate ALL confirmed compromised hosts (every host in the attack path):
CrowdStrike: Bulk contain: Host Management > select hosts > Actions > Contain
Defender: Device page > Isolate device (for each host)
Network: ACL block at switch level: `ip access-list extended INCIDENT_BLOCK` → `deny ip host <ip> any` → `permit ip any any` (otherwise the implicit deny blocks all traffic) → apply to interface
Disable ALL compromised accounts:
`Get-ADUser -Filter {SamAccountName -like '<account>'} | Disable-ADAccount`
If service account: identify dependent systems first, but disable if attacker has it
If domain admin is compromised, reset KRBTGT:
Reset 1: `Set-ADAccountPassword -Identity krbtgt -Reset -NewPassword (ConvertTo-SecureString '<random>' -AsPlainText -Force)` (or use AD Users & Computers)
Wait 12 hours (covers replication across all DCs and the default 10-hour maximum ticket lifetime)
Reset 2: Repeat
WARNING: This invalidates ALL Kerberos tickets, so all users will need to re-authenticate
Segment critical assets: emergency VLAN isolation for crown jewels:
Move critical servers to isolated VLAN via switch config
Block all traffic to critical VLAN except from known management IPs
Remove attacker tooling from every host in the attack path:
Common tools to hunt for: Cobalt Strike (beacons), Mimikatz, PsExec, BloodHound, SharpHound, Rubeus, Impacket
Check: `DeviceFileEvents | where FileName in~ ('mimikatz.exe','beacon.exe','psexec.exe','sharphound.exe','rubeus.exe') | where Timestamp > ago(7d)`
Check process injection: `DeviceEvents | where ActionType == 'CreateRemoteThreadApiCall' | where Timestamp > ago(7d)`
Hunt for persistence on EVERY host in the attack path:
Autoruns: Run Sysinternals Autoruns on each host and compare against known-good baseline
Scheduled tasks: `schtasks /query /fo CSV /v | ConvertFrom-Csv | where {$_.TaskName -notlike '\Microsoft*'} | fl TaskName,'Task To Run',Author`
Services: `Get-WmiObject win32_service | where {$_.PathName -notlike '*Windows*' -and $_.StartMode -eq 'Auto'} | fl Name,PathName`
Credential rotation, reset in this order:
1. KRBTGT (if domain admin was compromised; see Contain phase)
2. All admin accounts used in the attack path
3. All service accounts on compromised hosts
4. Machine accounts of compromised hosts (`Reset-ComputerMachinePassword`)
Patch the initial access vulnerability before bringing anything back online
Rebuild all compromised hosts from golden images (do NOT trust remediated live systems)
Restore network connectivity in monitored stages: security infra first, then DCs, then servers, then workstations
Implement network segmentation improvements based on the attack path that was used
Deploy detection for the specific lateral movement technique on previously unmonitored network paths
Map the full attack chain to MITRE ATT&CK and document every technique observed
Conduct purple team exercise: replay the exact attack path to verify detection and containment
Review and tighten firewall rules between network segments
Implement tiered admin model if not already in place (Tier 0: DC, Tier 1: Servers, Tier 2: Workstations)
Deploy LAPS (Local Administrator Password Solution) to prevent credential reuse across hosts

Cloud Account Compromise

critical

T1078.004 - Valid Accounts: Cloud Accounts

Runbook for compromised AWS/Azure/GCP identities: IAM lockdown, resource audit, and cost containment.

⏱ 2-6 hours | SOC Analyst L2/L3, Cloud Security Engineer, DevOps / Platform Team, FinOps
Tags: cloud, aws, azure, gcp, iam, access-keys
P1 SLA: 30 minutes to Cloud Security Lead
Escalate to P1: If attacker created new IAM users/roles or modified security groups on production → immediate P1

Notify:

  • Cloud Security Lead
  • DevOps TL
  • FinOps (for cost anomalies)

External Contacts:

  • AWS Support (Enterprise)
  • Azure Support (if tenant-level compromise)
  • GCP Support
MTTD (Mean Time to Detect): < 15 minutes (from anomalous API call to alert)
MTTC (Mean Time to Contain): < 30 minutes (from alert to credential rotation + deny policy)
MTTR (Mean Time to Recover): < 4 hours (full resource audit + unauthorized resource cleanup)
Industry Benchmark: Average cloud breach cost: $5.17M (IBM 2025). Crypto-mining costs can reach $10K/day.
Identify suspicious cloud activity:
AWS: `aws cloudtrail lookup-events --lookup-attributes AttributeKey=Username,AttributeValue=<compromised> --start-time <date> --max-results 50`
Azure: `AzureActivity | where Caller == '<compromised>' | where TimeGenerated > ago(24h) | summarize by OperationNameValue, ActivityStatusValue`
GCP: `gcloud logging read 'protoPayload.authenticationInfo.principalEmail="<compromised>"' --limit=50 --format=json`
Determine compromise type:
Access key leaked (GitHub, paste site, logs)
Console password compromised (phishing, credential stuffing)
Session token stolen (SSRF, metadata service exploit)
Scope the blast radius (what resources were accessed or modified):
AWS: `aws cloudtrail lookup-events --lookup-attributes AttributeKey=Username,AttributeValue=<user> --start-time <compromise_start> | jq '.Events[].EventName' | sort | uniq -c | sort -rn`
Look for: RunInstances, CreateUser, CreateAccessKey, PutBucketPolicy, CreateLoginProfile

Decision Tree

IF: Only access keys compromised (console access not affected)
THEN: Rotate keys immediately. Apply explicit deny. Scope is limited to API actions.
IF: Console access compromised (password or SSO)
THEN: Disable user + reset password + revoke sessions + check for MFA changes.
IF: Attacker created new IAM users or roles
THEN: CRITICAL: Attacker has backdoor access. Disable new identities immediately. Full IAM audit required.
IF: Crypto-mining instances detected
THEN: Terminate instances immediately. Check ALL regions, since miners often spawn in regions you don't monitor.
Rotate or disable compromised access keys:
AWS: `aws iam update-access-key --user-name <user> --access-key-id <key> --status Inactive` + `aws iam create-access-key --user-name <user>`
Azure: Azure Portal > App registrations > Certificates & secrets > Delete old, create new
GCP: `gcloud iam service-accounts keys disable <key-id> --iam-account=<sa>@<project>.iam.gserviceaccount.com`
Revoke active sessions:
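AWS: Attach inline policy to the user or role (denies any session whose token was issued before the cutoff): `{"Version":"2012-10-17","Statement":[{"Effect":"Deny","Action":"*","Resource":"*","Condition":{"DateLessThan":{"aws:TokenIssueTime":"<NOW>"}}}]}`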
Azure: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`
Disable any NEW IAM users/roles created by attacker:
AWS: `aws iam list-users --query 'Users[?CreateDate>=`<date>`]'` → delete each
Check for new access keys on existing users: `aws iam list-access-keys --user-name <user>`
Block attacker IPs at security group / NSG / firewall level
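The AWS session-revocation step above needs the `<NOW>` placeholder filled with a UTC timestamp. The sketch below builds the policy document in Python so the cutoff is formatted consistently; the `revoke_sessions_policy` helper is hypothetical, and the document is wrapped in the standard `Version`/`Statement` envelope that IAM policies require.

```python
import json
from datetime import datetime, timezone

def revoke_sessions_policy(cutoff):
    """Build an IAM policy denying all actions for tokens issued before `cutoff`."""
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            # Only temporary credentials carry aws:TokenIssueTime.
            "Condition": {"DateLessThan": {"aws:TokenIssueTime": stamp}},
        }],
    }
    return json.dumps(policy, indent=2)

# Fixed example cutoff; in an incident use datetime.now(timezone.utc).
doc = revoke_sessions_policy(datetime(2025, 1, 10, 9, 30, tzinfo=timezone.utc))
print(doc)
```

The output can be saved to a file and attached with `aws iam put-user-policy` or `aws iam put-role-policy`. Long-term access keys are not affected by this condition, which is why the key rotation step is still needed.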
Full audit of changes during compromise window:
IAM: New users, roles, policies, access keys, login profiles
Compute: New instances (check ALL regions), Lambda functions, containers
Storage: Bucket policy changes, public access, new buckets
Network: Security group changes, VPC peering, new VPN connections
Remove unauthorized resources:
AWS: `aws ec2 describe-instances --region <region> --filters 'Name=instance-state-name,Values=running' --query 'Reservations[].Instances[].{ID:InstanceId,Type:InstanceType,Launch:LaunchTime}'`
Check ALL regions: `for region in $(aws ec2 describe-regions --query 'Regions[].RegionName' --output text); do echo "==$region=="; aws ec2 describe-instances --region $region --query 'Reservations[].Instances[].InstanceId' --output text; done`
Revert security group and IAM policy changes from CloudTrail diff
Check for backdoor access: cross-account roles, SSO configurations, federation trust changes
Reset credentials and enforce MFA on all affected accounts
Restore modified resources from Infrastructure-as-Code (Terraform state, CloudFormation, etc.)
Verify billing dashboard: check for crypto-mining charges across ALL regions
AWS: Cost Explorer > filter by service = EC2 > group by region > last 7 days
Re-enable monitoring and alerting for affected scope
Set up billing alerts: SNS notification at $X threshold
Implement least-privilege IAM: review and tighten all policies
Enable threat detection services:
AWS: GuardDuty (all regions), CloudTrail (all regions + S3 data events)
Azure: Defender for Cloud (all subscriptions), Sentinel
GCP: Security Command Center, VPC Flow Logs
Implement SCPs (AWS) or Azure Policy to prevent risky actions (e.g., no public S3, no root access keys)
Set up billing anomaly detection: alert on spend > 150% of baseline
Scan all repos for leaked credentials: truffleHog, git-secrets, GitHub secret scanning
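The billing anomaly rule above (alert when spend exceeds 150% of baseline) can be prototyped before wiring it into a cloud-native alert. This sketch assumes the baseline is the mean of the previous 7 days, which the runbook does not specify; the `spend_alerts` helper and the daily figures are hypothetical.

```python
from statistics import mean

def spend_alerts(daily_spend, baseline_days=7, threshold=1.5):
    """Return (day_index, spend, baseline) for days above threshold x baseline."""
    alerts = []
    for day in range(baseline_days, len(daily_spend)):
        baseline = mean(daily_spend[day - baseline_days:day])
        if daily_spend[day] > threshold * baseline:
            alerts.append((day, daily_spend[day], baseline))
    return alerts

# Hypothetical daily spend in USD; the last day simulates crypto-mining instances.
spend = [100, 105, 98, 102, 101, 99, 95, 104, 480]
alerts = spend_alerts(spend)
for day, amount, baseline in alerts:
    print(f"day {day}: ${amount} vs baseline ${baseline:.2f}")
```

A rolling baseline adapts to organic growth, but a sustained attack will eventually raise it, so a fixed ceiling alert is a useful complement.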

Data Breach Response

critical

T1048 - Exfiltration Over Alternative Protocol

Full operational runbook for confirmed data exfiltration: technical containment, legal obligations, regulatory notification, and stakeholder communication.

⏱ Days to weeks | Incident Commander, SOC Analyst L3, Legal / DPO, PR / Communications, Executive Sponsor
Tags: data-breach, exfiltration, gdpr, compliance, notification
P1 SLA: Immediate. All confirmed data breaches are P1
War Room: #inc-breach-YYYYMMDD (restricted access, need-to-know only)

Notify:

  • CEO
  • CISO
  • General Counsel / DPO
  • Board (if material breach)

External Contacts:

  • External forensics firm (contract in advance)
  • Cyber insurance carrier (breach coach)
  • Outside legal counsel (privilege)
  • PR / crisis communications firm
MTTD (Mean Time to Detect): < 24 hours (from first exfiltration to detection)
MTTC (Mean Time to Contain): < 4 hours (from detection to exfiltration channel closed)
Notification SLA: GDPR: 72 hours to DPA. HIPAA: 60 days. SEC: 4 business days (material). State laws: varies (15-60 days)
Industry Benchmark: Average cost per breached record: $169 (IBM 2025). Average breach cost: $4.88M.
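The notification clocks above are easy to miscount under pressure, so it helps to compute the deadlines as soon as the trigger times are known. A minimal sketch: the function names and sample dates are hypothetical, the SEC calculation skips weekends only (public holidays are not handled), and Legal decides when each clock actually starts.

```python
from datetime import date, datetime, timedelta

def gdpr_deadline(aware_at):
    """72 hours after the organization became aware of the breach."""
    return aware_at + timedelta(hours=72)

def sec_deadline(determined_on, business_days=4):
    """Count forward the given number of business days, skipping Sat/Sun."""
    day = determined_on
    remaining = business_days
    while remaining > 0:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return day

# Hypothetical trigger times: Thursday 9 January 2025.
gdpr_due = gdpr_deadline(datetime(2025, 1, 9, 14, 0))
sec_due = sec_deadline(date(2025, 1, 9))
print("GDPR notification due:", gdpr_due)
print("SEC 8-K due:", sec_due)
```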
Confirm data exfiltration and identify what left and through what channel:
DLP alerts: Check DLP dashboard for policy violations (email, cloud storage, USB, web upload)
Network: `CommonSecurityLog | where DestinationHostName !in ('<known_domains>') | summarize TotalBytes=sum(SentBytes) by SourceIP, DestinationHostName | where TotalBytes > 100000000 | sort by TotalBytes desc` (run against firewall/proxy logs; Defender's DeviceNetworkEvents table has no byte counters)
DNS: Look for DNS tunneling: `DnsEvents | where Name has_any ('<suspicious_domains>') | summarize count() by Name, Computer`
Determine volume and sensitivity classification of exfiltrated data:
PII (names, SSN, DOB) → regulatory notification required
PHI (medical records) → HIPAA breach notification required
Financial (credit cards, bank accounts) → PCI DSS notification required
Trade secrets / IP → legal action + competitive damage assessment
Identify affected data subjects โ€” how many individuals are impacted?
Establish timeline: when did exfiltration start and when was it stopped?

Decision Tree

IF: Exfiltration is still active
THEN: IMMEDIATE CONTAINMENT. Close the channel before any further scoping.
IF: PII/PHI of EU residents confirmed
THEN: 72-hour GDPR clock starts now. Engage DPO and Legal immediately.
IF: Publicly traded company + material impact
THEN: SEC 4-business-day clock. Engage outside counsel for privilege.
IF: Data posted on dark web / leak site
THEN: Engage crisis communications. Prepare public statement. Notify affected individuals immediately.
Close the exfiltration channel immediately:
Block destination IP/domain at firewall: `set security policy from trust to untrust match destination-address <ip> then deny`
Block at DNS: add to DNS sinkhole / RPZ
Block at web proxy: add URL to block list
If USB: disable removable storage via GPO: `Computer Configuration > Administrative Templates > System > Removable Storage Access > All Removable Storage classes: Deny all access`
Isolate affected systems; do NOT wipe (evidence preservation)
Preserve ALL evidence and establish chain of custody:
Move to forensic preservation immediately
NO system changes without IC approval
Engage legal counsel: all communications go through counsel for attorney-client privilege
Activate incident communication team: all external communications go through PR/legal only
Remove attacker access and all persistence mechanisms (follow Lateral Movement runbook if applicable)
Patch the vulnerability or close the gap that enabled exfiltration
Rotate ALL credentials that may have been exposed in the breach
If insider threat: coordinate with HR/Legal (see Insider Threat runbook)
Rebuild affected systems from clean images
Implement additional DLP controls on the exfiltration channel that was used
Enhanced monitoring on previously affected data stores
Deploy CASB (Cloud Access Security Broker) if cloud storage was the vector
Review and tighten data classification + access control policies
Regulatory notifications (with Legal guidance):
GDPR (EU): Notify supervisory authority within 72 hours of awareness. Notify data subjects 'without undue delay' if high risk.
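HIPAA (US healthcare): Notify affected individuals within 60 days of discovery. If 500 or more individuals are affected: notify HHS and prominent media within the same 60 days; smaller breaches can be reported to HHS annually.
PCI DSS: Notify acquiring bank and card brands.
SEC (public companies): File 8-K (Item 1.05) within 4 business days of determining the incident is material.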
State breach notification laws: varies by state (CA: notify AG if >500 residents)
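Because several clocks run in parallel, it can help to compute the dates once and post them in the war room. The sketch below is illustrative only: the function names are hypothetical, the SEC calculation skips weekends but does NOT account for public holidays, and Legal remains the authority on when each clock actually starts.

```python
from datetime import date, datetime, timedelta

def gdpr_deadline(aware_at: datetime) -> datetime:
    """72 hours from the moment the organization became aware."""
    return aware_at + timedelta(hours=72)

def hipaa_deadline(discovered_on: date) -> date:
    """60 calendar days from discovery."""
    return discovered_on + timedelta(days=60)

def sec_8k_deadline(material_on: date) -> date:
    """4 business days after the materiality determination.

    Assumption: weekends are skipped, public holidays are ignored.
    """
    day = material_on
    remaining = 4
    while remaining > 0:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return day

print(gdpr_deadline(datetime(2025, 3, 3, 9, 0)))
print(hipaa_deadline(date(2025, 3, 3)))
print(sec_8k_deadline(date(2025, 3, 6)))
```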
Notify affected data subjects. Content must include:
What happened (factual, reviewed by Legal)
What data was affected (specific data types)
What you're doing about it (remediation steps)
What they should do (credit monitoring, password reset, etc.)
Contact information for questions
Engage external forensics firm for formal investigation report
File cyber insurance claim with full documentation
Publish transparency report if appropriate

Affected Individuals Notification

SUBJECT: Important Security Notice from [Company]

Dear [Name],

We are writing to inform you of a security incident that may have affected your personal information.

What happened: On [date], we discovered that an unauthorized party accessed systems containing [type of data].

What information was involved: [Specific data types, e.g., name, email address, phone number]

What we are doing: We have contained the incident, engaged external cybersecurity experts, and notified relevant authorities. We are implementing additional security measures to prevent future incidents.

What you can do:
- Monitor your accounts for unusual activity
- [If credentials involved]: Change your password immediately
- [If financial data]: We are providing [X months] of complimentary credit monitoring through [provider]. Enroll at [URL] using code [CODE].

For questions, contact our dedicated support line: [phone] or [email]

We take the security of your information seriously and sincerely apologize for this incident.

Insider Threat Investigation

high

T1078: Valid Accounts

Runbook for investigating suspected malicious insider activity, balancing security operations with legal and HR requirements.

⏱ Days to weeks
SOC Analyst L3, HR Business Partner, Legal Counsel, Incident Commander, Employee Relations
insider-threat, investigation, hr, legal, dlp
P2 SLA: 4 hours to Security Manager
Escalate to P1: If active data exfiltration confirmed → switch to Data Breach runbook

Notify:

  • Security Manager
  • HR Business Partner
  • Legal

External Contacts:

  • External forensics (if legal proceedings anticipated)
  • Law enforcement (if criminal activity confirmed; coordinate with Legal)
MTTD (Mean Time to Detect): Varies. Insider threats average 85 days to detect (Ponemon 2025)
MTTC (Mean Time to Contain): < 24 hours from confirmed malicious intent to access revocation
Industry Benchmark: Average insider threat incident cost: $16.2M per organization annually (Ponemon 2025)
Identify insider threat indicators (technical):
Mass file downloads: `DeviceFileEvents | where ActionType == 'FileCreated' and FolderPath startswith @'C:\Users\<user>\Downloads' | where Timestamp > ago(7d) | summarize count() by bin(Timestamp, 1h) | sort by Timestamp desc` (the `@` prefix makes the path a verbatim string so the backslashes are not treated as escapes)
USB activity: `DeviceEvents | where ActionType == 'PnpDeviceConnected' and DeviceDescription has 'USB' | where Timestamp > ago(30d)`
Cloud uploads: DLP logs for uploads to personal cloud storage (Google Drive, Dropbox, OneDrive personal)
Email: Large attachments to personal email or unusual external recipients
Off-hours access: `SigninLogs | where UserPrincipalName == '<user>' | extend Hour=datetime_part('hour', TimeGenerated) | where Hour < 6 or Hour > 22 | summarize count() by bin(TimeGenerated, 1d)`
Correlate with HR signals:
Resignation notice submitted
Performance improvement plan (PIP)
Passed over for promotion
Workplace disputes or disciplinary action
Unusual financial pressure
IMPORTANT: Do NOT alert the subject. Investigation must remain covert until Legal/HR decide to act.
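The off-hours query above reduces to a simple filter on the sign-in hour. A minimal Python sketch of the same logic on exported timestamps follows; the function name and the sample data are hypothetical, and the 06:00 and 22:00 boundaries are copied from the KQL.

```python
from collections import Counter
from datetime import datetime

def off_hours_by_day(timestamps, start_hour=6, end_hour=22):
    """Count sign-ins per day whose hour is < start_hour or > end_hour."""
    counts = Counter()
    for ts in timestamps:
        if ts.hour < start_hour or ts.hour > end_hour:
            counts[ts.date().isoformat()] += 1
    return dict(counts)

# Hypothetical sign-in times for one user
signins = [
    datetime(2025, 5, 1, 2, 15),
    datetime(2025, 5, 1, 23, 40),
    datetime(2025, 5, 1, 14, 0),
    datetime(2025, 5, 2, 22, 30),  # hour 22 is not > 22, so not counted
    datetime(2025, 5, 2, 5, 59),
]
print(off_hours_by_day(signins))
```

Timestamps are compared as given, so convert them to the user's local time zone first; sign-in logs are typically stored in UTC.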

Decision Tree

IF: Suspicious activity but no confirmed malicious intent
THEN: Increase monitoring (with Legal approval). DO NOT confront the employee.
IF: Confirmed data exfiltration to personal accounts/devices
THEN: Coordinate with Legal + HR for immediate action plan. Prepare for same-day termination meeting.
IF: Employee has already resigned and is in notice period
THEN: Immediate access review. Consider garden leave (paid leave with no system access).
IF: Evidence of criminal activity (selling data, sabotage)
THEN: Engage Legal for law enforcement referral. Preserve evidence with forensic rigor.
CRITICAL: Engage Legal and HR BEFORE taking any technical action
Legal must approve monitoring scope and method
HR must be prepared for employee interaction
Document all approvals in writing
Increase monitoring on the suspected user (with Legal approval):
Enable enhanced DLP monitoring for the user
Enable mailbox audit logging if not already on: `Set-Mailbox -Identity <user> -AuditEnabled $true -AuditLogAgeLimit 365`
Review Azure AD conditional access โ€” ensure user can't access from unmanaged devices
Restrict access to sensitive systems without alerting (if investigation is covert):
Remove from sensitive SharePoint sites / file shares (use 'permissions review' as cover)
Tighten DLP policies for the user's scope
Preserve all evidence with chain-of-custody documentation: every action logged with timestamp and who performed it
Upon Legal/HR decision to act, coordinate simultaneous access revocation:
Disable AD account: `Disable-ADAccount -Identity <user>`
Disable Azure AD: `Set-AzureADUser -ObjectId <user> -AccountEnabled $false`
Revoke all sessions: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`
Disable VPN access
Revoke badge / building access
Disable phone / MDM wipe of corporate data (NOT personal data)
All done within the SAME timeframe, usually during the HR termination meeting
Collect company devices: laptop, phone, external drives (with HR present, documented)
Forensic image of company devices BEFORE any wipe
Review all systems the user had access to for planted backdoors, logic bombs, or time-delayed scripts:
Check crontab / Task Scheduler for delayed-execution tasks
Review recent code commits for malicious changes
Check for personal SSH keys added to servers
Change shared credentials the user had access to (shared accounts, service accounts, API keys)
Review and reassign the user's responsibilities and access to another team member
Audit code, configurations, and infrastructure changes made by the user in last 90 days
Remove user from all groups, distribution lists, shared mailboxes
Document findings in format suitable for legal proceedings (coordinate with Legal)
Preserve forensic images for minimum retention period (coordinate with Legal on hold requirements)
Review and improve access control policies: least privilege, need-to-know
Implement / improve UEBA (User and Entity Behavior Analytics): Microsoft Sentinel UEBA, Exabeam, Securonix
Review offboarding checklist: ensure same-day access revocation is standard
Consider implementing DLP policies that trigger on behavioral patterns, not just content

DDoS Attack Response

high

T1498: Network Denial of Service

Operational runbook for mitigating active DDoS attacks, from traffic analysis through scrubbing activation and service restoration.

⏱ 1-4 hours
SOC Analyst L2, Network Engineer, DevOps / SRE, Communications Lead
ddos, availability, network, cdn, waf
P1 SLA: 15 minutes if customer-facing services are down
P2 SLA: 30 minutes if degraded but not fully down

Notify:

  • Network Team Lead
  • DevOps TL
  • VP Engineering (if customer-facing outage)

External Contacts:

  • DDoS mitigation provider (Cloudflare/Akamai/AWS Shield): activate scrubbing
  • ISP upstream: request blackhole or traffic filtering
MTTD (Mean Time to Detect): < 5 minutes (automated monitoring should catch traffic anomalies)
MTTC (Mean Time to Contain): < 30 minutes (from alert to mitigation service activated)
MTTR (Mean Time to Recover): < 2 hours (full service restoration and attack terminated)
Industry Benchmark: Average DDoS attack duration: 68 minutes (Cloudflare 2025). Peak attack sizes > 1 Tbps.
Confirm DDoS indicators. Check for:
Bandwidth saturation: Network monitoring (PRTG, Grafana, Datadog) shows traffic spike
Connection exhaustion: `netstat -an | awk '{print $6}' | sort | uniq -c | sort -rn` (look for abnormal SYN_RECV, ESTABLISHED counts)
Application layer: HTTP 503 rate spike, abnormal request patterns
Identify attack type:
Volumetric (UDP flood, DNS amplification): Massive bandwidth consumption. Check: `tcpdump -i eth0 -nn 'udp' -c 1000 | awk '{print $5}' | sort | uniq -c | sort -rn | head`
Protocol (SYN flood, ACK flood): Connection table exhaustion. Check: `ss -s` for socket state distribution
Application layer (HTTP flood, Slowloris): Normal bandwidth, high request rate. Check: WAF logs or `tail -n 10000 /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head` (use a fixed line count; with `tail -f` the pipeline never finishes, so `sort` prints nothing)
Determine target: which IP/service/URL is being targeted
Check if this is a distraction attack (DDoS used to mask data exfiltration or other attack)
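The access-log check above (first field, count, sort descending) can be sketched in Python for logs that have been copied off the affected host. This is a minimal illustration, assuming the client IP is the first whitespace-separated field as in the default nginx combined format; `top_clients` is a hypothetical helper name.

```python
from collections import Counter

def top_clients(log_lines, n=10):
    """Count requests per client IP (first whitespace-separated field)."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # ignore blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)

# Hypothetical log excerpt using documentation IP ranges
lines = [
    '203.0.113.5 - - [01/May/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.5 - - [01/May/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '198.51.100.2 - - [01/May/2025:10:00:01 +0000] "GET /login HTTP/1.1" 200 734',
    '192.0.2.9 - - [01/May/2025:10:00:01 +0000] "GET / HTTP/1.1" 503 0',
    '203.0.113.5 - - [01/May/2025:10:00:02 +0000] "GET / HTTP/1.1" 503 0',
    '192.0.2.9 - - [01/May/2025:10:00:02 +0000] "GET / HTTP/1.1" 503 0',
]
print(top_clients(lines, n=2))
```

A flat distribution across many sources suggests a botnet, in which case per-IP blocking is of limited use and rate limiting or challenge pages are the better lever.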

Decision Tree

IF: Attack volume exceeds your bandwidth capacity
THEN: Immediately activate upstream DDoS mitigation (Cloudflare/Akamai/Shield). On-premise mitigation will not help.
IF: Application layer attack (low bandwidth, high request rate)
THEN: WAF rate limiting + geo-blocking + challenge pages. No need for scrubbing service.
IF: Multiple services targeted simultaneously
THEN: Possible coordinated attack or distraction. Check for concurrent intrusion attempts.
IF: Ransom note received (RDoS)
THEN: DO NOT pay. Activate mitigation. Notify law enforcement.
Activate DDoS mitigation service:
Cloudflare: Security > DDoS > Enable 'Under Attack Mode', or via API: `curl -X PATCH 'https://api.cloudflare.com/client/v4/zones/<zone>/settings/security_level' -H 'Authorization: Bearer <token>' -H 'Content-Type: application/json' -d '{"value":"under_attack"}'`
AWS Shield Advanced: Automatically engaged if subscribed. Contact AWS Shield Response Team (SRT) for assistance.
Akamai: Contact SOC to activate scrubbing center
Enable rate limiting:
Nginx: `limit_req_zone $binary_remote_addr zone=ddos:10m rate=10r/s; limit_req zone=ddos burst=20 nodelay;`
Cloudflare: Security > WAF > Rate limiting rules > Create rule
AWS WAF: Add a rate-based rule to the web ACL (console, or `aws wafv2 update-web-acl` with a rule whose statement is a `RateBasedStatement`, for example `{"Limit": 1000, "AggregateKeyType": "IP"}`)
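To make the `rate` and `burst` settings above concrete, here is a simplified token-bucket model in Python. It is a sketch for intuition only: nginx documents `limit_req` as a leaky-bucket method, so this is NOT nginx's implementation, and the class and its injected `now` parameter are illustrative assumptions.

```python
class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate            # tokens added per second
        self.capacity = burst       # maximum stored tokens
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous call, in seconds

    def allow(self, now):
        # Refill according to elapsed time, capped at capacity
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over the limit: reject or delay the request

bucket = TokenBucket(rate=10, burst=20)
burst_results = [bucket.allow(0.0) for _ in range(25)]
print(sum(burst_results), "of 25 simultaneous requests allowed")
```

Passing the clock in as `now` keeps the model deterministic; a real limiter would read a monotonic clock and keep one bucket per client key.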
Implement geo-blocking if attack originates from specific regions:
Cloudflare: Security > WAF > Custom rules > Block country
Nginx: Use GeoIP module to block country codes
If all else fails and service is sacrificial, blackhole route the targeted IP:
`ip route add blackhole <target_ip>/32` (last resort: drops ALL traffic including legitimate)
Scale infrastructure if possible: auto-scaling group adjustments, add instances behind LB
Work with ISP to filter attack traffic upstream (BGP blackhole, FlowSpec, or RTBH)
Fine-tune WAF rules based on attack signatures:
Identify common patterns: User-Agent, request rate, URI patterns, query strings
Block bot signatures at WAF level
Block identified attack source ranges (for targeted/non-botnet attacks):
`iptables -A INPUT -s <source_range> -j DROP` (or equivalent in cloud security groups)
Verify mitigation is effective: legitimate traffic should normalize even while the attack continues
Gradually restore service: disable 'Under Attack Mode' and aggressive rate limiting
Monitor for re-attack for 24-48 hours
Clear application caches and request queues: `systemctl restart nginx` or flush CDN cache
Verify service health across all endpoints and regions
Check for backlog in message queues, database connections, etc.
Document attack profile: type, volume, duration, source distribution, targeted service
Review and update DDoS response plan based on lessons learned
Evaluate DDoS mitigation service: was the response fast enough? Right provider?
Implement always-on DDoS protection if currently on-demand only
Set up automated DDoS detection + mitigation activation (many providers support API-triggered activation)
If ransom demand was received, report to law enforcement (FBI IC3, NCSC, etc.)
Conduct capacity planning review โ€” can infrastructure handle future attacks?

Malware Infection Response

high

T1204: User Execution

Full response playbook for confirmed malware infections: endpoint isolation, malware analysis, remediation, and re-imaging.

⏱ 1-4 hours
SOC Analyst L2, Endpoint Engineer, IT Ops
malware, trojan, dropper, infostealer, worm
P2 SLA: 30 minutes to SOC Lead
Escalate to P1: If malware is self-propagating (worm) or involves data exfiltration → escalate to P1 immediately

Notify:

  • SOC Manager
  • Endpoint Team Lead

External Contacts:

  • Submit samples to VirusTotal, MalwareBazaar, ANY.RUN for community benefit
MTTD (Mean Time to Detect): < 15 minutes (from execution to EDR alert)
MTTC (Mean Time to Contain): < 1 hour (from alert to full containment of affected host)
MTTR (Mean Time to Recover): < 4 hours (re-image and restore user to operational state)
Industry Benchmark: EDR auto-containment reduces MTTC to under 1 minute for known threats
Confirm malware alert from EDR/AV and determine detection type:
Real-time block: Malware was blocked before execution → lower severity
Post-execution detection: Malware ran before detection → higher severity
Behavioral detection: Suspicious behavior matched → investigate thoroughly
Gather initial IOCs from EDR alert:
File hash (SHA256), file path, process tree, network connections, dropped files
CrowdStrike: Detection > View full detection details > Process tree
Defender: Incidents > Alert > Device timeline > Filter by alert time
Check file hash against threat intel:
VirusTotal: Search hash → check detection ratio and malware family
MalwareBazaar: Search hash → check tags, family, first seen date
Hybrid Analysis: Search hash → check sandbox report for behavior
Determine malware category: dropper, RAT, infostealer, ransomware, cryptominer, worm
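When the EDR alert does not include a SHA256 and only a quarantined copy of the file is available, the hash for the threat-intel lookups above can be computed locally. A minimal sketch using Python's standard `hashlib`; the helper name is hypothetical, and it should be run on an analysis workstation rather than the infected host.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat for large samples. Usage: `sha256_of_file("/cases/<case_id>/sample.bin")`, where the path is a placeholder.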

Decision Tree

IF: Malware was blocked before execution (real-time prevention)
THEN: Lower severity. Verify block was successful. Check if other users received the same file.
IF: Malware executed and established persistence
THEN: Isolate host immediately. Full investigation required.
IF: Malware is spreading to other hosts
THEN: ESCALATE to P1. Switch to Lateral Movement playbook. Network-wide hunt required.
Network-isolate the infected host:
CrowdStrike: Host > Actions > Contain Host
Defender: Device page > Isolate device
Manual: Disable NIC or VLAN isolation via switch port shutdown
Block malware hash across the organization:
CrowdStrike: IOC Management > Add indicator > SHA256 > Action: Block
Defender: Settings > Indicators > Add indicator > File hashes
SIEM: Add hash to custom IOC watchlist for detection
Block C2 domains/IPs at perimeter:
DNS sinkhole the C2 domain
Firewall block rule for C2 IP addresses
Search for the same file across all endpoints:
CrowdStrike: Investigate > Custom IOC > File hash > Search all hosts
Defender: Advanced hunting > `DeviceFileEvents | where SHA256 == '<hash>'`
Kill malicious processes on the infected host
Remove malware files, persistence mechanisms, and dropped payloads
Remove any scheduled tasks, services, or registry entries created by the malware
Check for credential theft: if infostealer, rotate all credentials used on the device
Scan with multiple AV engines to ensure complete removal
If sophisticated or unknown malware, re-image the host instead of manual cleanup
Re-image the host from golden image if you are not highly confident that manual cleanup was complete
Restore user data from backup (NOT from the infected disk)
Patch the vulnerability exploited for initial delivery (if known)
Monitor the recovered host for 48 hours for re-infection
Re-enable network connectivity after verification
Document malware family, IOCs, delivery method, and affected scope
Update email gateway / web proxy rules to block delivery vector
Submit malware sample to MalwareBazaar and VirusTotal for community sharing
If new malware variant โ€” write custom detection rule for your SIEM
Review endpoint protection configuration โ€” should this have been caught sooner?

Supply Chain Attack Response

critical

T1195.002: Supply Chain Compromise: Compromise Software Supply Chain

Response playbook for compromised software supply chain, when a trusted vendor, library, or update mechanism is weaponized.

⏱ 4-24 hours
Incident Commander, SOC Analyst L3, DevOps / Engineering, Vendor Management, Legal
supply-chain, third-party, dependency, update-poisoning
P1 SLA: Immediate. Supply chain compromises affect the entire organization
War Room: #inc-supply-chain-YYYYMMDD

Notify:

  • CISO
  • CTO
  • VP Engineering
  • Legal
  • Vendor Management

External Contacts:

  • CISA (cisa.gov/report)
  • Affected vendor security team
  • ISAC
MTTD (Mean Time to Detect): < 24 hours (from advisory/detection to confirmed impact assessment)
MTTC (Mean Time to Contain): < 4 hours (from confirmation to blocking the compromised component)
MTTR (Mean Time to Recover): < 48 hours (to patch/replace the compromised component across all systems)
Industry Benchmark: Average supply chain attack goes undetected for 250+ days (Mandiant 2025)
Identify the compromised component by checking threat intel and vendor advisories:
CISA KEV catalog (cisa.gov/known-exploited-vulnerabilities)
Vendor security advisory / CVE database
Community reporting (Twitter/X, Reddit r/netsec, security mailing lists)
Determine if the compromised version is in your environment:
Software inventory: Check CMDB/asset management for affected software version
Dependency scan: `npm audit`, `pip-audit`, `dotnet list package --vulnerable`
Container images: `trivy image <image>`, `grype <image>`
Search for IOCs associated with the supply chain compromise:
Hashes, domains, IPs, behavioral indicators from vendor advisory
Assess blast radius: how many systems have the compromised component?
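Audit tools only flag versions that already have a published advisory. In the first hours of a supply chain incident it can be quicker to match pinned versions directly against the bad versions named by the vendor. The sketch below is a minimal illustration: it only understands simple `name==version` lines in a requirements file, and the package names and versions are made up.

```python
def find_compromised_pins(requirements_text, compromised):
    """Return (name, version) pins that appear in the known-bad set.

    compromised: dict mapping lowercase package name -> set of bad versions.
    Only simple 'name==version' lines are handled (an assumption of this sketch).
    """
    hits = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # unpinned or range specifier: needs manual review
        name, version = (part.strip() for part in line.split("==", 1))
        if version in compromised.get(name.lower(), set()):
            hits.append((name, version))
    return hits

# Hypothetical advisory data and requirements file
bad_versions = {"examplepkg": {"1.2.3", "1.2.4"}}
reqs = "requests==2.31.0\nExamplePkg==1.2.3  # pinned last sprint\nflask>=2.0\n"
print(find_compromised_pins(reqs, bad_versions))
```

Unpinned entries such as `flask>=2.0` are skipped rather than cleared, so resolve them against the installed environment or lock file separately.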

Decision Tree

IF: Compromised version is NOT in your environment
THEN: Document finding, block the version proactively, monitor for new information.
IF: Compromised version IS installed but no evidence of exploitation
THEN: Proceed to Contain and patch/remove urgently. Search for exploitation IOCs.
IF: Evidence of active exploitation found
THEN: FULL INCIDENT RESPONSE. Treat every system with the component as potentially compromised.
Block the compromised component immediately:
Block the specific version in artifact repositories (npm, PyPI mirrors, Docker registry)
Block update/download URLs at proxy and firewall
Revoke any API keys or tokens associated with the compromised component
Isolate systems showing signs of exploitation
Disable automatic updates for the affected software until a verified safe version is available
If the vendor is the compromise vector, disconnect vendor access (VPN tunnels, API integrations)
Remove or downgrade the compromised component to last-known-good version
For each system with the compromised component:
Check for persistence mechanisms added during compromise window
Verify file integrity of the software installation
Rotate any credentials that the software had access to
Rebuild systems with evidence of exploitation from clean images
Update package lock files / dependency pins to exclude compromised versions
Deploy the patched/clean version of the component
Verify deployment across all systems: zero instances of compromised version remain
Re-enable automatic updates once vendor confirms safe version
Restore any disabled integrations with new credentials
Monitor for re-compromise for 30 days
Implement Software Bill of Materials (SBOM) for all applications
Deploy dependency scanning in CI/CD pipeline (Snyk, Dependabot, Renovate)
Review vendor security assessment process: was this vendor properly vetted?
Implement software signing verification for all deployments
Share IOCs with ISACs and relevant community channels

Account Takeover Response

high

T1078: Valid Accounts

Response playbook for compromised user accounts: password reset, session revocation, damage assessment, and credential hygiene.

⏱ 30-120 minutes
SOC Analyst L1/L2, Identity Team, Help Desk
account-compromise, identity, credential-theft, session-hijack
P2 SLA: 15 minutes to Security Operations
Escalate to P1: If executive account, admin account, or clear data access/exfiltration → escalate to P1

Notify:

  • SOC Manager
  • Identity Team Lead
MTTD (Mean Time to Detect): < 10 minutes (from anomalous sign-in to analyst alert)
MTTC (Mean Time to Contain): < 15 minutes (from alert to credential reset + session revocation)
MTTR (Mean Time to Recover): < 2 hours (full investigation + user remediation)
Industry Benchmark: 83% of breaches involve stolen credentials (Verizon DBIR 2025)
Identify compromise indicators:
Impossible travel / unusual location sign-in
Sign-in from anonymous proxy or Tor exit node
New device + new location combination
Anomalous application access patterns
User-reported suspicious activity (password reset they didn't request)
Check Azure AD / IdP sign-in logs:
`SigninLogs | where UserPrincipalName == '<user>' | where TimeGenerated > ago(7d) | project TimeGenerated, IPAddress, Location, AppDisplayName, DeviceDetail, RiskLevel | sort by TimeGenerated desc`
Check for suspicious activity from the account:
Inbox rules created, emails sent, files accessed/shared, groups joined
Azure AD: audit logs for the user in the past 7 days
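The "impossible travel" indicator above comes down to distance divided by time between two sign-ins. The following sketch computes great-circle distance with the haversine formula and flags pairs whose implied speed exceeds a threshold. The 900 km/h default (roughly airliner cruise speed) and the function names are assumptions, and IP geolocation is imprecise, so treat a hit as a lead, not proof.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def is_impossible_travel(first, second, max_kmh=900.0):
    """first/second are (datetime, lat, lon). True if implied speed > max_kmh."""
    hours = abs((second[0] - first[0]).total_seconds()) / 3600
    distance = haversine_km(first[1], first[2], second[1], second[2])
    if hours == 0:
        return distance > 0  # two places at the same instant
    return distance / hours > max_kmh

# Hypothetical sign-ins: London, then New York one hour later
london = (datetime(2025, 5, 1, 9, 0), 51.5074, -0.1278)
new_york = (datetime(2025, 5, 1, 10, 0), 40.7128, -74.0060)
print(is_impossible_travel(london, new_york))
```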

Decision Tree

IF: Single anomalous sign-in, no suspicious activity after
THEN: Reset password, revoke sessions, interview user. May be false positive from VPN/travel.
IF: Confirmed unauthorized access with activity (rules, emails, file access)
THEN: Immediate lockout + full investigation. Check all activity during compromise window.
IF: Admin or privileged account compromised
THEN: ESCALATE to P1. Disable account, audit all admin actions, check for persistence.
Reset password immediately using a strong random password:
Azure AD: `Set-AzureADUserPassword -ObjectId <user> -Password (ConvertTo-SecureString (New-Guid).Guid.Substring(0,16) -AsPlainText -Force) -ForceChangePasswordNextLogin $true`
Google: Admin > Users > Reset password
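Note that a truncated GUID, as in the Azure AD one-liner above, contains only hexadecimal digits and hyphens, so it carries less entropy than its length suggests and may fail complexity rules. Where a script generates the temporary password, a cryptographically secure source is preferable. A minimal Python sketch using the standard `secrets` module; the alphabet, the length, and the character-class check are illustrative choices, not a policy.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*-_"

def temp_password(length=20):
    """Generate a random password with lower, upper, and digit characters."""
    while True:
        pwd = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until the common complexity classes are all present
        if (any(c.islower() for c in pwd)
                and any(c.isupper() for c in pwd)
                and any(c.isdigit() for c in pwd)):
            return pwd

print(len(temp_password()))
```

The generated value still has to reach the user over a secure channel and be changed at first sign-in.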
Revoke ALL active sessions:
Azure AD: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`
Google: Admin > Users > Security > Sign out user from all sessions
Check and remove suspicious inbox rules:
`Get-InboxRule -Mailbox <user> | where {$_.ForwardTo -or $_.ForwardAsAttachmentTo -or $_.RedirectTo -or $_.DeleteMessage} | fl`
Check and revoke suspicious OAuth app consents:
Azure AD: Enterprise Apps > User consent > Review and revoke
Check and remove any delegates added to the mailbox
Determine how the account was compromised:
Phishing: Check email for credential harvesting links
Password reuse: Check if the password matches known breaches (HIBP)
Token theft: Check for AiTM proxy indicators in sign-in logs
Malware: Check endpoint for keyloggers or infostealers
Force MFA re-enrollment: require the user to set up MFA again from scratch
If password reuse: advise user to change passwords on all personal accounts
If endpoint was compromised: run full EDR scan or re-image device
Provide user with new password via secure channel (phone call, in-person, not email)
Walk the user through MFA enrollment
Restore any deleted emails or modified data from backup
Notify anyone who received emails from the account during compromise window
Monitor account for re-compromise for 30 days
If phishing was the vector: check if other users received the same phishing email
Review Conditional Access policies: would location/device restrictions have prevented this?
Implement risk-based Conditional Access if not already in place
Schedule security awareness training for the affected user
Review password policy: enforce complexity and banned password list

Cryptojacking Response

high

T1496: Resource Hijacking

Response playbook for unauthorized cryptocurrency mining: identifying mining processes, cloud resource abuse, and cost containment.

⏱ 1-3 hours
SOC Analyst L2, Cloud Security Engineer, FinOps
cryptojacking, crypto-mining, resource-hijacking, cloud-abuse
P2 SLA: 30 minutes to SOC Lead
Escalate to P1: If mining is running on production systems or cloud costs exceed $1K/day

Notify:

  • Cloud Security Lead
  • FinOps (for cost anomaly)
MTTD (Mean Time to Detect): < 1 hour (from mining start to detection)
MTTC (Mean Time to Contain): < 30 minutes (from detection to process termination + access revocation)
MTTR (Mean Time to Recover): < 2 hours (cleanup, patch, and credential rotation)
Industry Benchmark: Average cryptojacking cloud cost: $4,600/incident. Can reach $100K+ if undetected.
Identify cryptojacking indicators:
High CPU/GPU utilization on endpoints or servers (sustained 90%+)
Cloud cost anomaly: unexpected compute charges (EC2, GCE, AKS)
Network connections to mining pools (*pool.*, *xmr.*, *nicehash.*, *f2pool.*)
EDR alert for mining-related process names (xmrig, minerd, cgminer, ethminer)
Confirm mining activity:
Linux: `top -bn1 | head -20` + `ps aux | grep -i 'xmr\|mine\|crypto'`
Windows: `Get-Process | Sort-Object CPU -Descending | Select -First 10`
Network: `netstat -an | grep ':3333\|:4444\|:8333\|:14444'` (common mining ports)
AWS: Check for unauthorized EC2 instances: `aws ec2 describe-instances --filters 'Name=instance-state-name,Values=running'`
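The process-name and port checks above can be applied to collected triage output (for example, saved `ps` and `netstat` listings). A minimal sketch, reusing the miner names and ports listed in this runbook; the function names and sample inputs are hypothetical. Both lists are heuristics: renamed binaries and pools on port 443 will not match.

```python
MINER_NAMES = ("xmrig", "minerd", "cgminer", "ethminer")
MINING_PORTS = {3333, 4444, 8333, 14444}

def suspicious_processes(process_names):
    """Return process names or paths containing a known miner name."""
    return [p for p in process_names
            if any(miner in p.lower() for miner in MINER_NAMES)]

def suspicious_connections(remote_endpoints):
    """Return 'host:port' strings whose port is a common mining port."""
    hits = []
    for endpoint in remote_endpoints:
        _host, _, port = endpoint.rpartition(":")
        if port.isdigit() and int(port) in MINING_PORTS:
            hits.append(endpoint)
    return hits

# Hypothetical triage data
print(suspicious_processes(["sshd", "/tmp/.x/XMRig", "nginx"]))
print(suspicious_connections(["10.0.0.5:443", "198.51.100.7:3333"]))
```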

Decision Tree

IF: Mining on endpoint, installed intentionally by the user
THEN: Policy violation. Remove software, document, escalate to HR/management if policy exists.
IF: Mining installed via malware/exploit
THEN: Treat as full compromise. Mining is the payload but access is the real problem.
IF: Mining on cloud infrastructure with spawned instances
THEN: Immediate termination of unauthorized instances. Rotate ALL cloud credentials. Check billing alerts.
Kill mining processes immediately:
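Linux: review matches with `pgrep -af 'xmrig|minerd|cgminer|ethminer'`, then `kill -9 <pid>` for each confirmed miner (`pgrep` uses extended regex, so alternation is a plain `|`; a broad pattern such as 'mine' can also match unrelated processes)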
Windows: `Stop-Process -Name 'xmrig','minerd' -Force`
Terminate any unauthorized cloud instances:
AWS: `aws ec2 terminate-instances --instance-ids <id>`
Azure: `az vm delete --resource-group <rg> --name <vm> --yes`
Block mining pool domains and IPs at network perimeter
Block mining pool domains in DNS (sinkhole or blocklist)
Disable the compromised access: API keys, IAM roles, user accounts
Remove mining software and associated files
Remove persistence mechanisms (crontabs, systemd services, scheduled tasks)
Determine initial access vector: how did the miner get deployed?
Common vectors: exposed Docker API, vulnerable web app, leaked cloud credentials, SSH brute force
Patch the vulnerability used for access
Rotate all credentials on affected systems: cloud keys, SSH keys, service accounts
Verify mining processes are gone and CPU usage is normal
Review cloud billing for the full impact period and dispute charges if possible
Restore any modified configurations (crontab, systemd, etc.) from backup
Re-deploy affected containers/instances from clean images
Set up cloud billing alerts to catch future anomalies early
Implement cloud cost anomaly alerts (AWS Budgets, Azure Cost Alerts, GCP Budget Alerts)
Block known mining domains at DNS level organization-wide
Review cloud IAM: implement least-privilege access and remove unused credentials
Scan for exposed services (Docker API, Kubernetes API, Redis) on public IPs
Deploy network-level mining detection (Suricata rules for Stratum protocol)

Web Application Attack Response

critical

T1190: Exploit Public-Facing Application

Response playbook for attacks against web applications: SQL injection, XSS, RCE, file upload, and other OWASP Top 10 attacks.

⏱ 2-8 hours
SOC Analyst L2, Application Security Engineer, DevOps, DBA (if SQL injection)
web-attack, sqli, xss, rce, owasp, application-security
P1 SLA: 30 minutes to Application Security Lead
Escalate to P1: If RCE confirmed, data exfiltration evident, or active shell on web server

Notify:

  • Application Security Lead
  • DevOps Lead
  • Product Owner

External Contacts:

  • Web application pen testing firm (if persistent attacker)
MTTD (Mean Time to Detect): < 15 minutes (from attack to WAF/SIEM alert)
MTTC (Mean Time to Contain): < 1 hour (from alert to WAF rule + application patch)
MTTR (Mean Time to Recover): < 4 hours (full vulnerability remediation + validation)
Industry Benchmark: Web application attacks account for 26% of all breaches (Verizon DBIR 2025)
Identify the attack type from WAF/SIEM alerts:
SQL Injection: Look for UNION SELECT, OR 1=1, single quotes in parameters
XSS: Look for <script>, javascript:, onerror= in inputs
RCE: Look for command injection patterns (;, |, &&, backticks) or deserialization payloads
Path Traversal: Look for ../ sequences in URL parameters
File Upload: Check for uploaded files with executable extensions (.php, .jsp, .aspx)
Review WAF logs for the attack pattern:
Cloudflare: Security > Events > Filter by action=challenge/block
AWS WAF: CloudWatch Logs > Filter matched rules
ModSecurity: `grep 'ModSecurity' /var/log/apache2/error.log | tail -50`
Determine if the attack was successful:
Check application logs for errors that indicate successful exploitation
Check for unexpected data in database (SQLi success indicator)
Check for webshells or new files on the web server
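For a first pass over exported request logs, the attack-type patterns listed above can be expressed as regular expressions. This is a rough triage sketch, not a detection engine: the patterns are deliberately simple, will produce false positives (a single quote appears in many legitimate values), and are easy for an attacker to evade. The names `PATTERNS` and `classify_request` are hypothetical.

```python
import re
from urllib.parse import unquote_plus

PATTERNS = {
    "sqli": re.compile(r"union\s+select|\bor\s+1\s*=\s*1\b|'", re.IGNORECASE),
    "xss": re.compile(r"<script|javascript:|onerror\s*=", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./"),
    "command_injection": re.compile(r";|\||&&|`"),
}

def classify_request(raw_query):
    """Return sorted names of the patterns matching the URL-decoded query."""
    decoded = unquote_plus(raw_query)  # decode %xx escapes and '+' as space
    return sorted(name for name, rx in PATTERNS.items() if rx.search(decoded))

# Hypothetical query strings
for query in ("id=1%27+OR+1%3D1", "file=../../etc/passwd", "page=2"):
    print(query, classify_request(query))
```

Decoding before matching is the important step; payloads in logs are usually URL-encoded and would otherwise slip past plain string searches.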

Decision Tree

IF: Attack was blocked by WAF with no evidence of success
THEN: Monitor and tune WAF rules. Patch the vulnerability in the next sprint.
IF: Attack bypassed WAF or there's evidence of successful exploitation
THEN: IMMEDIATE: Block attacker IP, patch vulnerability, check for data access.
IF: Webshell or reverse shell found on server
THEN: CRITICAL: Isolate web server. Assume full server compromise. Check for lateral movement.
Block the attacker's IP at WAF and firewall level
If attack is automated/distributed, implement rate limiting and CAPTCHA
Deploy emergency WAF rule to block the specific attack pattern:
Cloudflare: Security > WAF > Custom rule > Block requests matching pattern
ModSecurity: Add deny rule for the specific payload signature
If webshell found, isolate the web server and preserve evidence
If SQL injection, revoke the application's database user and create new credentials
Disable the vulnerable endpoint if possible while patching
Fix the vulnerability in the application code:
SQLi: Use parameterized queries / prepared statements
XSS: Implement output encoding and Content-Security-Policy headers
RCE: Sanitize all user input, disable dangerous functions, update frameworks
File Upload: Validate file types, store outside web root, rename uploaded files
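As an illustration of the SQLi fix above, the sketch below uses Python's built-in `sqlite3` module with a `?` placeholder. The table, column, and function names are made up for the example; the same principle applies to any driver that supports bound parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

def find_user(connection, name):
    """Look up a user by name with a bound parameter."""
    # The value is passed separately from the SQL text, so quotes or
    # keywords inside `name` are treated as data, never as SQL.
    cur = connection.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

print(find_user(conn, "alice"))
print(find_user(conn, "' OR 1=1 --"))
```

Building the same query with string concatenation would let the second input change the WHERE clause; with the placeholder it is just a name that matches no row.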
Remove any webshells, backdoors, or modified files on the web server
If database was accessed, assess data exposure and rotate credentials
Deploy patched application version
Run application security scan to verify fix (DAST/SAST)
Verify the patched application is functioning correctly
Restore any modified data from backup (if SQLi modified records)
Re-enable the endpoint with WAF rules still in place
Monitor for re-exploitation attempts for 7 days
Run full vulnerability scan against the application
Add the vulnerability type to the secure coding training curriculum
Implement mandatory code review for all input-handling code
Deploy DAST scanning in CI/CD pipeline to catch similar issues
Review WAF rules: could the attack have been caught earlier?
If data breach occurred, follow the Data Breach Response playbook for notifications
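For the SQLi fix listed above (parameterized queries / prepared statements), here is a minimal sketch using Python's built-in `sqlite3` module. The `users` table and the `find_user` helper are hypothetical. The same pattern applies to other database drivers, though the placeholder syntax may differ.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list:
    """Look up a user with a bound parameter instead of string concatenation."""
    # The ? placeholder makes the driver pass username as data, never as SQL.
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()


# Hypothetical in-memory table used only to demonstrate the behavior.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))        # a normal lookup
print(find_user(conn, "' OR '1'='1"))  # a classic injection string matches nothing
```

Because the input is bound as a value, the injection string is compared literally against the `username` column and returns no rows.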

Zero-Day / Vulnerability Exploitation

critical

T1190: Exploit Public-Facing Application

Response playbook for active exploitation of unpatched vulnerabilities: emergency mitigation when no vendor patch is available.

⏱ 2-8 hours (initial mitigation), days to weeks (full remediation)
Incident Commander | SOC Analyst L3 | Vulnerability Management | IT Ops | Network Team
zero-day | cve | unpatched | exploit | vulnerability
P1 SLA: Immediate (active zero-day exploitation is always P1)
War Room: #inc-zeroday-CVE-YYYY-XXXXX

Notify:

  • CISO
  • VP Engineering
  • Network Team Lead

External Contacts:

  • CISA
  • Vendor security team
  • ISAC
  • External IR retainer
MTTD (Mean Time to Detect): < 4 hours (from public disclosure to internal vulnerability assessment)
MTTC (Mean Time to Contain): < 8 hours (from assessment to workaround deployment)
MTTR (Mean Time to Recover): Vendor-dependent; track patch availability closely
Industry Benchmark: Average time from CVE disclosure to first exploit: 15 days (Mandiant 2025). Zero-days exploited before disclosure.
Monitor threat intelligence feeds for zero-day announcements:
CISA KEV (cisa.gov/known-exploited-vulnerabilities), updated daily
Vendor security advisories (Microsoft MSRC, Apache, Cisco PSIRT, etc.)
Twitter/X: @cikiLabs, @GossiTheDog, @kevbeaumont, @campuscodi
Determine if the vulnerable component exists in your environment:
Check CMDB/asset inventory for the specific software + version
Run vulnerability scan targeting the specific CVE
Check for the component in container images and deployments
Search for IOCs associated with known exploitation:
Apply vendor-provided detection signatures
Search logs for known exploitation patterns
Assess exposure: is the vulnerable service internet-facing?
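The inventory check above (specific software plus version) can be sketched as a simple version-range filter. Everything here is illustrative: the `INVENTORY` records, the `examplelib` component, and the version bounds are made-up placeholders, and the parser assumes purely numeric dotted versions.

```python
def parse_version(text: str) -> tuple:
    """Turn '2.4.1' into (2, 4, 1). Assumes numeric, dot-separated versions."""
    return tuple(int(part) for part in text.split("."))


def affected_assets(inventory, component, first_bad, first_fixed):
    """Return hosts running `component` in the range [first_bad, first_fixed)."""
    low, high = parse_version(first_bad), parse_version(first_fixed)
    return [
        asset["host"]
        for asset in inventory
        if asset["component"] == component
        and low <= parse_version(asset["version"]) < high
    ]


# Hypothetical CMDB export; real field names will differ.
INVENTORY = [
    {"host": "web-01", "component": "examplelib", "version": "2.4.1"},
    {"host": "web-02", "component": "examplelib", "version": "2.5.0"},
    {"host": "db-01", "component": "otherlib", "version": "2.4.1"},
]

print(affected_assets(INVENTORY, "examplelib", "2.0", "2.5.0"))
```

Versions with suffixes such as `-rc1` or vendor build tags need a real version parser. The point is only that the advisory's affected range should be compared against inventory data and not eyeballed.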

Decision Tree

IF: Vulnerable component NOT in environment
THEN: Document, block proactively if possible, monitor for scope expansion.
IF: Vulnerable component exists but no evidence of exploitation
THEN: Implement workaround/mitigation immediately. Begin patching when available.
IF: Evidence of active exploitation found
THEN: FULL INCIDENT RESPONSE. Isolate affected systems. Treat as confirmed breach.
Implement vendor-recommended workaround immediately:
Disable vulnerable feature/protocol
Apply configuration change to mitigate exploitation
Deploy WAF rule to block exploitation patterns
If no workaround available, isolate or disable the vulnerable system:
Remove internet exposure (move behind VPN, restrict to internal only)
Implement network ACL to limit access to the service
If critical system, implement compensating controls (IPS signature, WAF, proxy)
Deploy IDS/IPS signatures for the specific exploit
If exploitation confirmed, isolate compromised systems and hunt for lateral movement
Apply vendor patch as soon as available
If no patch yet, maintain workaround and monitor
Verify the patch resolves the vulnerability (re-scan)
If exploitation occurred, conduct a full investigation of compromised systems
Hunt for any persistence established during exploitation window
Restore internet-facing exposure only after patch is confirmed applied
Remove temporary workarounds that may impact functionality
Continue monitoring for exploitation attempts for 30 days
Verify all instances of the vulnerable component are patched (scan again)
Review patch management process: why wasn't this caught earlier?
Implement automated vulnerability scanning for internet-facing assets
Subscribe to vendor security advisory feeds
Review network segmentation: could exploitation have been limited?
Update asset inventory with accurate software versions

Stolen / Lost Device Response

high

T1078: Valid Accounts

Response playbook for stolen or lost corporate devices: remote wipe, session revocation, credential rotation, and data exposure assessment.

⏱ 30-90 minutes
SOC Analyst L1 | IT Help Desk | MDM Admin
lost-device | stolen-device | mobile | laptop | remote-wipe
P2 SLA: 30 minutes to IT Security
Escalate to P1: If device is unencrypted, contains sensitive data, or belongs to privileged user

Notify:

  • IT Security
  • MDM Admin
  • User's Manager
MTTC (Mean Time to Contain): < 30 minutes (from report to remote wipe command issued)
MTTR (Mean Time to Recover): < 4 hours (new device provisioned and user operational)
Industry Benchmark: 68% of data breaches involve a human element including device loss (Verizon DBIR 2025)
User reports device lost or stolen. Gather details:
When was it last seen? Where? Was it locked? Was it encrypted?
What type of device (laptop, phone, tablet)?
Does it have corporate data (email, files, code repos)?
Was the user logged into any sensitive applications?
Check device status in MDM:
Intune: Devices > All devices > Search > Check last check-in, compliance state
Jamf: Computers/Devices > Search > Last inventory update
Verify disk encryption is active:
Windows: BitLocker status in Intune
Mac: FileVault status in Jamf
Mobile: Device encryption in MDM compliance report

Decision Tree

IF: Device is encrypted and was locked
THEN: Lower risk. Proceed with remote wipe and credential rotation.
IF: Device is unencrypted or unlocked
THEN: HIGH RISK. Immediate remote wipe. Assume data is compromised. Full credential rotation.
IF: Device belongs to admin/executive with sensitive access
THEN: Escalate to P1. Full credential rotation, session revocation, and access audit.
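The decision tree above can be written down as a small triage helper, for example to pre-fill a ticket. This is a sketch: the function name and return strings are invented for illustration, and checking the privileged-user branch first is an assumption about precedence that the runbook does not state explicitly.

```python
def triage_lost_device(encrypted: bool, locked: bool, privileged_user: bool) -> str:
    """Map the three decision-tree inputs to the recommended response."""
    # Assumption: an admin/executive device escalates regardless of encryption state.
    if privileged_user:
        return "P1: full credential rotation, session revocation, and access audit"
    if not (encrypted and locked):
        return "HIGH RISK: immediate remote wipe, assume data compromised, full credential rotation"
    return "LOWER RISK: remote wipe and credential rotation"


if __name__ == "__main__":
    print(triage_lost_device(encrypted=True, locked=True, privileged_user=False))
```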
Issue remote wipe immediately:
Intune: Device > Wipe (Full wipe, not retire)
Jamf: Device > Management > Send MDM command > Erase Device
Google: Admin > Devices > Wipe device
Apple: Find My > Erase device (if not MDM managed)
Revoke all active sessions for the user:
Azure AD: `Revoke-AzureADUserAllRefreshToken -ObjectId <user>`
Google: Admin > Users > Security > Sign out from all sessions
Disable the device in Azure AD / MDM:
Azure AD: Devices > Search > Disable device
Reset the user's password and require MFA re-enrollment
Revoke any certificate-based authentication from the device
Confirm remote wipe was executed (check MDM for wipe confirmation)
If wipe cannot be confirmed, treat all data on device as compromised
Rotate any credentials that were cached or saved on the device:
Email password, VPN credentials, Wi-Fi certificates
SSH keys stored on the device
Application-specific API tokens or service passwords
Code signing certificates if developer device
Report to local police if stolen (may be required for insurance)
Provision replacement device from clean corporate image
Enroll new device in MDM and verify compliance
Restore user data from cloud backup (OneDrive, iCloud, Google Drive)
Verify user can access all required applications with new credentials
Verify encryption is enforced on all devices via MDM compliance policy
Review auto-lock settings: ensure devices lock after 5 minutes of inactivity
Consider implementing geofencing alerts for managed devices
File insurance claim if applicable
Update asset inventory to reflect device status as lost/stolen

API Security Breach Response

critical

T1106: Native API

Response playbook for compromised API keys, unauthorized API access, or API abuse: rate limiting, key rotation, and access audit.

⏱ 1-4 hours
SOC Analyst L2 | API/Platform Engineer | DevOps
api-key | api-abuse | rate-limiting | key-rotation | oauth
P1 SLA: 30 minutes to Platform Security Lead
Escalate to P1: If API key provides write access to production data or customer-facing systems

Notify:

  • Platform Security Lead
  • API Team Lead
  • DevOps
MTTD (Mean Time to Detect): < 30 minutes (from anomalous API usage to detection)
MTTC (Mean Time to Contain): < 15 minutes (from detection to key revocation)
MTTR (Mean Time to Recover): < 2 hours (new key generation, rotation, and consumer notification)
Industry Benchmark: API attacks increased 681% in 2024-2025 (Salt Security)
Identify API compromise indicators:
Sudden spike in API request volume from a single key
API requests from unusual IP addresses or countries
API key found in public repository (GitHub, GitLab, Pastebin)
Error rate spike (testing stolen key against various endpoints)
Unusual data access patterns (bulk export, sequential enumeration)
Check API gateway/proxy logs:
Request volume, error rates, source IPs, accessed endpoints
OWASP API Security: check for BOLA, Broken Authentication, BFLA patterns
Determine scope of exposed key's permissions:
What endpoints can it access? Read-only or read-write?
What data can it reach? PII, financial data, internal services?
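To make the "sudden spike in API request volume from a single key" indicator concrete, here is a small sketch that counts requests per key in a log window and compares each count to a per-key baseline. The record field `key_id`, the baseline dictionary, and the 5x threshold are all assumptions for illustration; real gateway logs and sensible thresholds will vary.

```python
from collections import Counter


def flag_noisy_keys(requests, baseline, factor=5.0):
    """Return key IDs whose request count exceeds factor x their usual volume.

    requests: iterable of dicts with a 'key_id' field (hypothetical log schema).
    baseline: dict mapping key_id to the typical request count per window.
    """
    counts = Counter(record["key_id"] for record in requests)
    # Keys with no recorded baseline default to 1, so any real volume stands out.
    return sorted(
        key for key, seen in counts.items() if seen > factor * baseline.get(key, 1)
    )


if __name__ == "__main__":
    window = [{"key_id": "key-a"}] * 600 + [{"key_id": "key-b"}] * 50
    print(flag_noisy_keys(window, {"key-a": 100, "key-b": 100}))
```

A volume check like this complements the source-IP and enumeration indicators in the list above and does not replace them. A stolen key used slowly will not trip it.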

Decision Tree

IF: Key found in public repo but no evidence of abuse
THEN: Revoke key immediately. Create new key. Check logs for any usage from unknown IPs.
IF: Key is being actively abused
THEN: Revoke key immediately. Block abusing IPs. Audit all actions taken with the key.
IF: Customer data accessed via compromised API
THEN: ESCALATE to P1. Invoke Data Breach Response playbook for notification requirements.
Revoke the compromised API key immediately
Block the attacker's IP addresses at API gateway and WAF
If OAuth token, revoke the token and associated refresh tokens
Implement emergency rate limiting on affected endpoints
If key was in code repository, scan git history for other secrets:
`trufflehog git file://. --only-verified`
`gitleaks detect -v`
Generate new API key with minimum required permissions
If key was in code, remove from repository AND git history:
`git filter-branch` or BFG Repo-Cleaner to purge from history
Force-push to remote (coordinate with team)
Move secrets to a secrets manager (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault)
Implement pre-commit hooks to prevent secrets in code (`pre-commit` + `detect-secrets`)
Review API permissions: apply least privilege to all API keys
Deploy new API key to all consuming applications
Verify all integrations work with the new key
Notify API consumers about the key rotation
Monitor for continued abuse attempts (IP blocks may not catch distributed attacks)
Implement automated secret scanning in CI/CD (GitHub Advanced Security, GitLab Secret Detection)
Implement API key rotation policy (90-day maximum lifetime)
Deploy API behavioral anomaly detection
Review API authentication: consider moving from API keys to OAuth 2.0 with short-lived tokens
Implement IP allowlisting for sensitive API endpoints
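The 90-day maximum key lifetime mentioned above is easy to audit once key creation dates are available. The sketch below assumes a hypothetical list of key records with `id` and `created` fields; how you export those from your API gateway or secrets manager is outside its scope.

```python
from datetime import date, timedelta

MAX_KEY_AGE = timedelta(days=90)  # rotation policy from the runbook


def keys_due_for_rotation(keys, today):
    """Return the IDs of keys older than the maximum allowed lifetime."""
    return [key["id"] for key in keys if today - key["created"] > MAX_KEY_AGE]


if __name__ == "__main__":
    # Hypothetical key inventory.
    keys = [
        {"id": "billing-svc", "created": date(2025, 1, 1)},
        {"id": "reporting-svc", "created": date(2025, 5, 1)},
    ]
    print(keys_due_for_rotation(keys, today=date(2025, 6, 1)))
```

Passing `today` explicitly keeps the check deterministic, which makes it straightforward to run from a scheduled job or a CI step.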

OT/ICS Cyber Incident Response

critical

T826: Impair Process Control

This playbook outlines the comprehensive steps and procedures for responding to a cyber incident affecting Operational Technology (OT) and Industrial Control Systems (ICS) environments. It prioritizes safety, operational continuity, and data integrity, recognizing the unique challenges and critical nature of these systems. This playbook is designed for incidents ranging from unauthorized access and data exfiltration to malware infection (e.g., ransomware) and direct process manipulation or disruption.

⏱ 24-72 hours
Incident Commander | SOC Analyst L2/L3 | OT Engineer / Subject Matter Expert (SME) | IT Network Engineer | Security Engineer (Endpoint/Network/Cloud) | Legal Counsel | Communications Lead | Executive Leadership | Vendor Representative (OEM, Integrator)
OT | ICS | SCADA | Critical Infrastructure | Ransomware | Malware | Disruption | Process Control | Safety
P1 SLA: 15 minutes
War Room: Dedicated Microsoft Teams Channel: #OT-Incident-WarRoom-[Date] OR Physical Command Center (if required for severe incidents).

Notify:

  • Incident Commander (On-Call)
  • Head of OT Operations
  • CISO
  • CIO
  • Legal Counsel
  • Communications Lead

External Contacts:

  • CISA (Cybersecurity and Infrastructure Security Agency)
  • FBI (Federal Bureau of Investigation)
  • Sector-specific ISAOs/ISACs
  • Relevant Regulatory Bodies (e.g., FERC, EPA, NERC-CIP)
  • Third-party Incident Response Firm (if retained)
MTTD (Mean Time to Detect): 1 hour
MTTC (Mean Time to Contain): 4 hours
MTTR (Mean Time to Recover): 48 hours
Industry Benchmark: Industry averages are significantly longer (e.g., IBM Cost of a Data Breach Report 2023: average breach lifecycle of 277 days, with 204 days to identify and 73 days to contain). Our targets for critical OT incidents are aggressively shorter due to the high impact on safety and operations.
1.1. Validate alert source: Confirm the integrity and authenticity of the detection system (e.g., SIEM, IDS/IPS, OT-specific anomaly detection).
1.2. Initial impact assessment: Determine if the anomaly is affecting physical processes, safety, or production. Engage OT Engineer immediately.
1.3. Review SIEM/SCADA logs for anomalous activity (e.g., unauthorized access, unusual commands, failed authentications, changes to PLC programs, unexpected process values).
1.4. Analyze network traffic for indicators of compromise (IOCs) such as C2 communication, lateral movement attempts, or unusual protocol usage (e.g., IT protocols in OT segments).
1.5. Check endpoint detection and response (EDR) logs on connected IT/OT devices (HMIs, engineering workstations) for malicious processes or file changes.
1.6. Correlate findings with threat intelligence feeds for known OT threats.
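Step 1.4 mentions unusual protocol usage such as IT protocols in OT segments. One rough way to surface this from flow records is to compare destination ports against an allowlist of expected industrial protocols. In the sketch below the subnet, the flow record fields, and the allowlist contents are assumptions; the listed ports are the commonly used defaults for Modbus/TCP (502), DNP3 (20000), and EtherNet/IP (44818).

```python
import ipaddress

# Hypothetical OT subnet and allowlist; replace with your own segment design.
OT_SEGMENT = ipaddress.ip_network("10.10.0.0/16")
ALLOWED_PORTS = {502, 20000, 44818}  # Modbus/TCP, DNP3, EtherNet/IP defaults


def unexpected_flows(flows):
    """Return flows that originate in the OT segment and use a non-allowlisted port."""
    return [
        flow
        for flow in flows
        if ipaddress.ip_address(flow["src"]) in OT_SEGMENT
        and flow["dst_port"] not in ALLOWED_PORTS
    ]


if __name__ == "__main__":
    sample = [
        {"src": "10.10.4.20", "dst": "10.10.4.1", "dst_port": 502},
        {"src": "10.10.4.20", "dst": "10.10.7.9", "dst_port": 445},
        {"src": "192.168.1.5", "dst": "10.10.4.1", "dst_port": 445},
    ]
    for flow in unexpected_flows(sample):
        print(flow)
```

In the sample, only the SMB connection (port 445) from inside the OT segment is reported, which matches the SMB enumeration scenario in the decision tree that follows.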

Decision Tree

IF: Anomaly detected in process control system (e.g., unexpected valve closure, motor speed change, abnormal temperature/pressure).
THEN: Immediately alert OT Engineer/SME. Validate with HMI and local controls. Initiate emergency shutdown procedures if safety is compromised or process integrity is at risk, following established safety protocols.
IF: Suspicious network traffic (e.g., C2 beaconing, SMB enumeration) observed originating from an OT segment device.
THEN: Isolate the suspected device's network port at the switch level. Notify IT Network Engineer and Security Engineer for further analysis. DO NOT disrupt critical process communication without OT Engineer's approval.
IF: EDR alert on an HMI/Engineering Workstation indicates malware execution or unauthorized access.
THEN: Contain the workstation immediately (network isolation via EDR or physical disconnection). Initiate forensic acquisition. Alert OT Engineer to assess potential impact on connected PLCs/RTUs.
2.1. Prioritize safety: Ensure all actions taken preserve human safety and environmental protection. Consult with OT Engineer before any network or system modification.
2.2. Isolate affected OT network segments: Use firewalls, managed switches (VLANs, port shutdown), or physical disconnection (air-gapping) as a last resort. Prioritize control plane isolation over data plane if possible.
2.3. Disable compromised accounts: Immediately revoke credentials for any accounts identified as compromised within IT and OT domains.
2.4. Block malicious IPs/domains: Update perimeter firewalls and IDS/IPS with known IOCs to prevent further ingress/egress.
2.5. Prevent lateral movement: Implement host-based firewalls, disable unnecessary services, and segment networks further where possible.
2.6. Secure remaining critical assets: Patch known vulnerabilities, apply temporary mitigations, and enforce strong authentication.

Decision Tree

IF: Malware/ransomware identified spreading laterally within multiple OT network segments, affecting critical assets.
THEN: Execute pre-approved segment isolation procedures. Physically disconnect network cables to air-gap affected segments if logical controls (firewalls, VLANs) are insufficient or compromised. Consult OT Engineer for process implications.
IF: Unauthorized remote access detected to an HMI or engineering workstation from an external IP address.
THEN: Immediately block the source IP at the perimeter firewall. Disconnect the affected HMI/workstation from the network. Force password reset for all associated accounts.
IF: PLC/RTU configuration has been modified without authorization, potentially causing unsafe conditions.
THEN: Disconnect the affected PLC/RTU from the network immediately. If safe to do so, switch the PLC to 'Program' mode (if applicable) to prevent further unauthorized changes. Do NOT attempt to revert configuration without a verified golden image.
3.1. Remove malware: Scan and clean infected systems using trusted antivirus/EDR solutions. For deeply embedded malware or firmware compromise, re-imaging or re-flashing may be required.
3.2. Patch vulnerabilities: Apply critical security patches to operating systems, applications, and firmware of affected and similar systems.
3.3. Rebuild systems from trusted sources: Restore compromised HMIs, engineering workstations, and servers from known good, clean backups or golden images.
3.4. Reconfigure and harden PLCs/RTUs: Re-flash PLCs/RTUs with verified, uncompromised firmware. Restore configurations from known good backups. Implement secure configurations (e.g., strong passwords, disabled unnecessary services).
3.5. Change all compromised credentials: Force password resets for all accounts potentially exposed, including service accounts and administrative credentials.
3.6. Perform thorough integrity checks: Verify the integrity of all restored and reconfigured systems and devices using checksums, hashes, and functional testing.
4.1. Validate system functionality and safety: Conduct thorough functional testing of all restored OT systems and processes in a controlled environment before reintroducing them to production.
4.2. Monitor for re-infection: Implement enhanced monitoring on recovered systems for any signs of residual compromise or new attack attempts.
4.3. Restore network connectivity systematically: Reconnect isolated segments and devices in a phased approach, starting with the most critical and least affected.
4.4. Verify data integrity: Ensure historian data, logs, and process values are accurate and complete post-recovery.
4.5. Document lessons learned for recovery: Note any challenges, successes, and areas for improvement during the recovery phase.
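The checksum and hash verification in step 3.6 can be sketched as a comparison of restored files against a manifest of known-good SHA-256 digests. The manifest format here (a dictionary of path to hex digest) is an assumption. In practice the digests should come from a trusted source, such as the OEM or an offline golden-image record, and not from the system being verified.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(manifest: dict) -> list:
    """Return the paths whose current digest differs from the expected one."""
    return [
        path
        for path, expected in manifest.items()
        if sha256_of(path) != expected.lower()
    ]
```

An empty list means every file matched its recorded digest. Any returned path should block reconnection until the mismatch is explained.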

Decision Tree

IF: Restored OT system or process exhibits intermittent errors, unexpected behavior, or performance degradation during testing.
THEN: Immediately halt reconnection. Re-evaluate restoration source (backup integrity, golden image verification). Conduct root cause analysis on new symptoms. Do NOT reconnect to production until stability and expected behavior are fully confirmed. Escalate to OT Engineer and OEM.
IF: Post-recovery monitoring detects new IOCs or suspicious activity on a 'clean' system.
THEN: Immediately re-isolate the affected system. Re-initiate eradication steps, focusing on persistent threats. Review and strengthen containment measures. Consider a full forensic re-evaluation.
IF: Process parameters are within operational limits but inconsistent with pre-incident historical data.
THEN: Engage OT Engineer and process control specialists to validate data integrity and calibrate sensors/actuators. Verify no lingering manipulation or data corruption. Do not resume full operations until data integrity is assured.
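For the last branch above (values within operational limits but inconsistent with pre-incident history), a simple statistical comparison can help decide when to involve process specialists. This sketch flags a reading set whose mean drifts more than a chosen number of standard deviations from the historian baseline. The function name, the 3-sigma default, and the sample values are illustrative assumptions, not a validated process-control method.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline, current, max_sigma=3.0):
    """Return True if the current mean is far from the pre-incident mean.

    baseline: pre-incident historian readings (at least two values).
    current: post-recovery readings for the same process parameter.
    """
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change in the mean is a deviation.
        return mean(current) != mu
    return abs(mean(current) - mu) > max_sigma * sigma


if __name__ == "__main__":
    pre_incident = [50.0, 50.5, 49.5, 50.2, 49.8]  # made-up example readings
    print(deviates_from_baseline(pre_incident, [50.1, 49.9]))
    print(deviates_from_baseline(pre_incident, [58.0, 57.5]))
```

A flag from a check like this is a prompt for the OT Engineer to validate sensors and data integrity. It is not evidence of manipulation on its own.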
5.1. Conduct a 'Lessons Learned' session: Involve all stakeholders to review the incident, identify root causes, evaluate the effectiveness of the response, and pinpoint areas for improvement.
5.2. Update playbooks and procedures: Incorporate findings from the lessons learned to enhance existing incident response plans, especially for OT/ICS environments.
5.3. Enhance security controls: Implement new or improved security measures to prevent recurrence, focusing on identified vulnerabilities and gaps (e.g., network segmentation, endpoint hardening, threat intelligence integration).
5.4. Review and update asset inventory: Ensure all OT/ICS assets are accurately documented with their configurations, software versions, and network topology.
5.5. Communicate findings: Share relevant non-confidential insights with industry peers and regulatory bodies to contribute to collective defense (e.g., ISACs, CISA).
5.6. Legal and regulatory review: Ensure all reporting and compliance obligations are met.

Internal Notification

Subject: URGENT: OT/ICS Incident - [Brief Description] - [Date/Time]

Team,

This is an urgent notification regarding a detected cyber incident impacting our Operational Technology (OT) / Industrial Control Systems (ICS) environment.

**Incident Status:** [e.g., 'Initial Detection', 'Containment in Progress', 'Recovery Phase']
**Affected Systems/Areas:** [e.g., 'PLC X in Plant Y', 'HMI Z in Control Room A', 'Segment B of Production Line C']
**Initial Impact Assessment:** [e.g., 'Minor disruption to production in Line C', 'Potential compromise of process data', 'Safety systems remain operational', 'Production halted in Plant Y']
**Current Actions:** The Incident Response Team, in collaboration with OT Engineering, is actively working to contain the threat and ensure safety. Specific actions include: [e.g., 'Network segmentation of affected area', 'Forensic analysis on HMI Z', 'Verification of PLC X integrity'].

All personnel are reminded to adhere strictly to incident response protocols. Please do not attempt to access affected systems or make any changes without explicit direction from the Incident Commander.

Further updates will be provided via [War Room Channel/Email] every [X hours/minutes] or as significant developments occur.

**Incident Commander:** [Name] ([Contact Info])
**OT Lead:** [Name] ([Contact Info])

Your cooperation and vigilance are critical.

[Company Leadership/CISO]

External / Customer Notification

Subject: [Company Name] - Critical Operational Technology Incident Notification

[Date]

FOR IMMEDIATE RELEASE / CONFIDENTIAL - FOR REGULATORY BODIES AND LAW ENFORCEMENT

[Company Name] is providing this notification regarding a detected cybersecurity incident impacting a portion of our Operational Technology (OT) environment.

Upon detection, our robust incident response protocols were immediately activated. Our internal teams, including cybersecurity experts and OT engineers, are actively engaged in managing the situation. We have also engaged [e.g., external cybersecurity specialists, law enforcement (FBI/CISA)] to assist in our efforts.

**Current Status:** Our primary focus remains on ensuring the safety of personnel, protecting the environment, and maintaining the integrity of our critical operations. We are diligently working to contain the incident and restore full operational capabilities securely and efficiently.

**Impact Assessment:** We are currently conducting a thorough investigation to determine the full scope and impact of this incident. At this time, [Company Name] is [e.g., 'experiencing limited operational disruption in X area', 'working to mitigate potential impacts on Y process']. We are taking all necessary measures to prevent further unauthorized activity.

**Our Commitment:** The security and reliability of our operations are paramount. We are committed to a comprehensive and transparent response. We will provide further updates as our investigation progresses and as appropriate.

For further inquiries, please contact:
[Media Contact Name/Email]
[Legal Counsel Name/Email]

[Company Name]

New runbooks ship regularly.

17 operational runbooks and growing. Built for real incident response teams.