Security Automation & SOAR — Deep Dive
CIPHER Training Module — Security Orchestration, Automation, and Response Generated: 2026-03-14
Table of Contents
- SOAR Architecture Patterns
- Playbook Patterns
- Automated IOC Processing Pipelines
- osquery Fleet Queries
- Velociraptor Hunting at Scale
- Security ChatOps Patterns
- Infrastructure-as-Code Security Automation
- Platform Reference Matrix
1. SOAR Architecture Patterns
Core SOAR Components
Every SOAR deployment implements four layers:
+---------------------+
|    TRIGGER LAYER    |  Webhooks, SIEM alerts, email listeners, scheduled polls
+---------------------+
| ORCHESTRATION LAYER |  Workflow engine, decision trees, conditional branching
+---------------------+
|  INTEGRATION LAYER  |  API connectors to security tools (200+ typical)
+---------------------+
|    ACTION LAYER     |  Enrichment, containment, notification, remediation
+---------------------+
Open-Source SOAR Platforms
Shuffle (github.com/Shuffle/Shuffle)
- Architecture: Go backend, React frontend, Orborus/Worker execution layer
- Integration model: OpenAPI-based app creation — any tool with a REST API integrates
- Deployment: Docker self-hosted or Google Cloud Marketplace
- Strengths: Visual workflow editor, multi-tenant (org/sub-org), hybrid cloud/on-prem execution
- Use case: MSSPs needing per-client playbook isolation
Cortex XSOAR (github.com/demisto/content)
- Content model: Pack-based — each pack bundles playbooks, integrations, scripts, layouts, and incident types
- Playbook engine: Visual DAG editor with conditional branching, loops, sub-playbook calls, and manual approval gates
- Script runtime: Python/JavaScript automation scripts for custom logic
- Strengths: Largest community content library, mature incident lifecycle management
- Key packs: Phishing, Malware, Vulnerability Management, Access Investigation, MITRE ATT&CK
WALKOFF (NSA, archived 2023)
- Architecture: Microservices on Docker Swarm — API gateway, workers, umpire (orchestrator), socketio (real-time)
- App model: Containerized plugins via App SDK, internal Docker registry
- Execution: Event-driven triggers, worker-distributed task execution
- Status: Archived — useful as reference architecture, not for new deployments
TheHive + Cortex
- TheHive: Case management with alerts, tasks, observables, and TTPs (now commercial via StrangeBee)
- Cortex: Observable analysis engine — analyzers enrich, responders act
- 39+ public analyzers (VirusTotal, PassiveDNS, sandbox integrations)
- Responders trigger containment actions (block IP, disable account)
- REST API for programmatic bulk analysis
- Python-based analyzer/responder development
- Integration: TheHive sends observables to Cortex for enrichment; Cortex returns structured reports; MISP bidirectional sync
SOAR Design Principles
- Idempotent actions — every automated action must be safe to retry
- Human-in-the-loop gates — destructive actions (quarantine, block, delete) require analyst approval
- Audit trail — every automated decision logged with timestamp, input data, and action taken
- Graceful degradation — if an enrichment API is down, the playbook continues with reduced context rather than failing
- Timeout handling — sandbox detonation, API calls, and approval gates all need configurable timeouts
- Metric collection — track MTTD, MTTR, false positive rate, and analyst intervention rate per playbook
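A minimal sketch of what the retry, graceful-degradation, and audit-trail principles look like in code. The `run_action` wrapper and the audit-entry shape are illustrative, not any SOAR platform's API:

```python
from datetime import datetime, timezone


def run_action(action, *, retries=3, timeout=30, default=None, audit=None):
    """Run a playbook action idempotently: retry on failure, log every
    attempt for the audit trail, and degrade gracefully to `default`."""
    audit = audit if audit is not None else []
    for attempt in range(1, retries + 1):
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "action": action.__name__,
            "attempt": attempt,
        }
        try:
            result = action(timeout=timeout)  # action must be safe to retry
            entry["status"] = "ok"
            audit.append(entry)
            return result
        except Exception as exc:  # enrichment API down, timeout, etc.
            entry["status"] = f"error: {exc}"
            audit.append(entry)
            # backoff between attempts elided in this sketch
    # Graceful degradation: playbook continues with reduced context
    return default


def flaky_lookup(timeout=30):
    raise ConnectionError("enrichment API unreachable")


trail = []
verdict = run_action(flaky_lookup, retries=2,
                     default={"reputation": "unknown"}, audit=trail)
print(verdict)     # falls back to reduced context instead of failing
print(len(trail))  # every attempt recorded
```

Because the wrapper returns a default rather than raising, a downstream enrichment outage reduces context but never aborts the playbook.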
2. Playbook Patterns
2.1 Phishing Response Playbook
Reference: Cortex XSOAR Phishing Pack
TRIGGER: Email reported by user / mail listener / email gateway alert
|
v
[INGEST] Retrieve email from mailbox (EWS, Gmail API, IMAP)
|-- Extract sender, subject, headers, body, attachments, URLs
|-- Create incident record with initial classification
|
v
[ENRICH] Parallel enrichment of all indicators
|-- URLs: VirusTotal, URLhaus, urlscan.io, SSL cert check
|-- Attachments: Sandbox detonation (Cuckoo, Joe Sandbox, ANY.RUN)
|-- Sender domain: WHOIS, passive DNS, domain age, squatting check
|-- Email headers: SPF/DKIM/DMARC validation
|-- Embedded IPs: Reputation check, geolocation, ASN lookup
|-- Screenshot generation of email body and linked pages
|
v
[ANALYZE] Score and classify
|-- Calculate severity from: indicator reputation + email auth results + critical asset involvement
|-- Identify campaign membership (correlate with similar incidents)
|-- Flag known phishing kit indicators (HTML similarity, favicon hash)
|
v
[DECIDE] Severity-based routing
|-- LOW: Auto-close with user notification
|-- MEDIUM: Queue for analyst review with enrichment summary
|-- HIGH/CRITICAL: Escalate + begin containment
|
v
[CONTAIN] (requires approval for HIGH actions)
|-- Block sender domain/IP at email gateway
|-- Block malicious URLs at proxy/firewall
|-- Search-and-destroy: find all instances of the email across mailboxes
|-- Reset credentials if user clicked/submitted credentials
|
v
[NOTIFY]
|-- Inform reporting user (thank them — reinforces reporting behavior)
|-- Alert affected users if credentials compromised
|-- Update threat intel platform (MISP, OpenCTI)
|
v
[CLOSE] Document findings, update detection rules, close incident
Key automations:
- SPF/DKIM/DMARC check scripts (pure Python, no external API needed)
- Domain age calculation (< 30 days = suspicious)
- Levenshtein distance for domain squatting detection
- Attachment hash lookup before sandbox submission (save detonation resources)
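Two of the automations above are simple enough to sketch in pure Python: domain-age flagging and Levenshtein distance for squatting detection. In practice `creation_date` would come from a WHOIS lookup; the helpers here are illustrative:

```python
from datetime import date


def domain_is_young(creation_date, today, threshold_days=30):
    """Domains registered fewer than `threshold_days` ago are suspicious."""
    return (today - creation_date).days < threshold_days


def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


# An edit distance of 1-2 from a protected brand domain is a squatting signal
print(levenshtein("paypal.com", "paypa1.com"))               # 1
print(domain_is_young(date(2026, 3, 1), date(2026, 3, 14)))  # True
```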
2.2 Malware Triage Playbook
TRIGGER: EDR alert / AV detection / sandbox verdict / file submission
|
v
[COLLECT]
|-- Retrieve file hash (MD5, SHA1, SHA256)
|-- Acquire sample if not already available
|-- Collect process tree, parent process, command line, network connections
|-- Snapshot memory of affected endpoint (if high severity)
|
v
[REPUTATION CHECK] (fast path — seconds)
|-- Hash lookup: VirusTotal, Hybrid Analysis, MalwareBazaar
|-- If known malicious (>5 AV detections) → skip to CLASSIFY
|-- If known clean (0 detections, signed by trusted publisher) → auto-close
|-- If unknown → proceed to detonation
|
v
[DETONATE] (slow path — minutes)
|-- Submit to sandbox (Cuckoo, ANY.RUN, Joe Sandbox)
|-- Configure sandbox: appropriate OS version, internet connectivity, analysis time
|-- Extract: dropped files, registry changes, network IOCs, MITRE TTPs
|-- YARA rule matching against sample
|
v
[CLASSIFY]
|-- Map to malware family (if identifiable)
|-- Map behaviors to MITRE ATT&CK techniques
|-- Determine: ransomware, RAT, infostealer, loader, wiper, cryptominer
|-- Assess lateral movement risk and data exfiltration indicators
|
v
[SCOPE] Determine blast radius
|-- Search fleet for same hash (osquery, Velociraptor, EDR)
|-- Search for behavioral IOCs: mutexes, registry keys, C2 domains
|-- Identify patient zero and propagation path
|
v
[CONTAIN]
|-- Isolate affected endpoints (EDR network isolation)
|-- Block C2 domains/IPs at firewall and DNS
|-- Block hash at EDR/AV policy
|-- Disable compromised service accounts
|
v
[ERADICATE]
|-- Remove malware artifacts (files, registry, scheduled tasks, services)
|-- Patch exploitation vector
|-- Rotate compromised credentials
|
v
[RECOVER + REPORT]
|-- Restore from clean backup if needed
|-- Verify endpoint integrity post-remediation
|-- Update YARA rules, Sigma rules, EDR custom detections
|-- Publish IOCs to MISP/OpenCTI
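The fast-path routing in the REPUTATION CHECK step above can be sketched as a small decision function. The verdict dict is a stand-in for a real hash-lookup response (VirusTotal, MalwareBazaar), not an actual API schema:

```python
def route_sample(verdict):
    """Return the next playbook stage for a hash-lookup verdict.

    verdict: {"known": bool, "detections": int, "signed_by_trusted": bool}
    """
    if not verdict["known"]:
        return "DETONATE"        # unknown hash -> slow path (sandbox)
    if verdict["detections"] > 5:
        return "CLASSIFY"        # known malicious -> skip detonation
    if verdict["detections"] == 0 and verdict["signed_by_trusted"]:
        return "AUTO_CLOSE"      # known clean, trusted signer
    return "DETONATE"            # ambiguous -> detonate anyway


print(route_sample({"known": True, "detections": 12, "signed_by_trusted": False}))
print(route_sample({"known": False, "detections": 0, "signed_by_trusted": False}))
```

Checking the hash before submission saves detonation resources, mirroring the "skip to CLASSIFY" and "auto-close" branches in the diagram.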
2.3 IOC Enrichment Playbook
TRIGGER: New IOC ingested (from feed, analyst submission, or alert extraction)
|
v
[CLASSIFY IOC TYPE]
|-- IP → reputation, geolocation, ASN, passive DNS, Shodan
|-- Domain → WHOIS, DNS history, certificate transparency, categorization
|-- Hash → VT, MalwareBazaar, Hybrid Analysis, YARA match
|-- URL → urlscan.io, Google Safe Browsing, PhishTank
|-- Email → breach databases, WHOIS on domain, header analysis
|
v
[MULTI-SOURCE ENRICHMENT] (parallel API calls)
|-- Primary: VirusTotal, AbuseIPDB, Shodan, urlscan.io
|-- Secondary: PassiveTotal, DomainTools, GreyNoise, OTX
|-- Internal: SIEM correlation, historical incident match, asset lookup
|
v
[SCORE]
|-- Aggregate confidence scores across sources
|-- Weight by source reliability (internal > curated feed > community)
|-- Apply TLP classification based on source restrictions
|
v
[CORRELATE]
|-- Link to existing incidents/campaigns
|-- Map to threat actor (if attributable)
|-- Identify related IOCs via pivot (shared infrastructure, registration patterns)
|
v
[DISTRIBUTE]
|-- HIGH confidence → auto-push to blocking (firewall, proxy, DNS sinkhole)
|-- MEDIUM confidence → stage for analyst review, push to monitoring
|-- LOW confidence → store in TIP for future correlation only
|-- Update MISP events, OpenCTI entities, SIEM watchlists
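A minimal sketch of the SCORE and DISTRIBUTE logic above. The reliability weights and tier thresholds are illustrative policy choices, not fixed values:

```python
# Source-reliability weights: internal > curated feed > community
SOURCE_WEIGHTS = {"internal": 1.0, "curated": 0.7, "community": 0.4}


def aggregate_confidence(observations):
    """observations: list of (source_class, confidence 0-100).
    Returns a reliability-weighted average, rounded to an integer."""
    num = sum(SOURCE_WEIGHTS[src] * conf for src, conf in observations)
    den = sum(SOURCE_WEIGHTS[src] for src, _ in observations)
    return round(num / den) if den else 0


def distribution_tier(score):
    """Map the aggregate score onto the DISTRIBUTE branches above."""
    if score >= 80:
        return "auto-block"      # HIGH: push to firewall/proxy/sinkhole
    if score >= 50:
        return "analyst-review"  # MEDIUM: stage + monitoring
    return "store-only"          # LOW: TIP correlation only


obs = [("internal", 90), ("curated", 70), ("community", 40)]
score = aggregate_confidence(obs)
print(score, distribution_tier(score))
```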
2.4 Vulnerability Remediation Playbook
TRIGGER: New vulnerability scan results / CVE advisory / Trivy-operator finding
|
v
[INGEST + DEDUPLICATE]
|-- Parse scan results (Nessus, Qualys, Trivy, Grype)
|-- Deduplicate across scanners and previous scans
|-- Normalize to common schema (CVE, CVSS, affected asset, remediation)
|
v
[PRIORITIZE] (not all criticals are equal)
|-- CVSS score (base)
|-- EPSS score (probability of exploitation)
|-- KEV catalog membership (CISA Known Exploited Vulnerabilities)
|-- Asset criticality (crown jewel, internet-facing, PII-processing)
|-- Compensating controls in place (WAF, network segmentation, EDR)
|-- Exploit maturity (POC available, weaponized, actively exploited)
|
|-- Priority = f(CVSS, EPSS, KEV, asset_criticality, compensating_controls, exploit_maturity)
|
v
[ASSIGN + TRACK]
|-- Route to asset owner via CMDB lookup
|-- Create ticket (Jira, ServiceNow) with SLA based on priority:
| P1 (Critical, exploited): 24 hours
| P2 (High, exploit available): 7 days
| P3 (High, no exploit): 30 days
| P4 (Medium): 90 days
|-- Attach remediation guidance (vendor patch, workaround, config change)
|
v
[VERIFY]
|-- Trigger targeted rescan after remediation window
|-- Confirm vulnerability resolved
|-- If unresolved → escalate to asset owner's manager
|-- If accepted risk → document risk acceptance with expiry date
|
v
[REPORT]
|-- Track mean-time-to-remediate by team, severity, and asset type
|-- Identify systemic issues (same vuln across fleet = process problem)
|-- Generate compliance evidence (PCI DSS 11.3, SOC 2 CC7.1)
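One possible shape for the Priority = f(...) function in the playbook above. The weights and cutoffs are illustrative and would be tuned against local SLA policy:

```python
def prioritize(cvss, epss, in_kev, asset_critical, compensating_controls):
    """Return a P1-P4 ticket priority from the prioritization inputs above.

    cvss: 0-10 base score; epss: 0-1 exploitation probability;
    in_kev: CISA KEV membership; remaining flags are booleans.
    """
    if in_kev and asset_critical:
        return "P1"  # actively exploited on a crown jewel: 24-hour SLA
    score = cvss / 10 * 0.4 + epss * 0.4
    score += 0.2 if asset_critical else 0.0
    score -= 0.1 if compensating_controls else 0.0
    if score >= 0.7 or in_kev:
        return "P2"
    if score >= 0.4:
        return "P3"
    return "P4"


# KEV-listed CVE on an internet-facing crown jewel
print(prioritize(cvss=9.8, epss=0.92, in_kev=True,
                 asset_critical=True, compensating_controls=False))
```

The key property is that a mediocre CVSS with high EPSS or KEV membership outranks a "paper critical" that nobody is exploiting.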
3. Automated IOC Processing Pipelines
Pipeline Architecture
+------------+     +---------+     +-----------+     +---------+     +-----------+
| COLLECTORS | --> | PARSERS | --> | ENRICHERS | --> | SCORERS | --> | EXPORTERS |
+------------+     +---------+     +-----------+     +---------+     +-----------+
  RSS feeds        Normalize to    API-based        Confidence      MISP events
  Twitter          STIX2/internal  enrichment       scoring         Firewall rules
  Email            schema          (VT, Shodan)     Dedup           SIEM watchlists
  Paste sites                      Geolocation      Correlation     DNS sinkhole
  GitHub                           WHOIS                            Slack alerts
  SQS queues
  Threat feeds
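The five stages can be sketched as composable Python generators, so individual collectors, enrichers, and exporters swap independently. Everything below (schema, ASN value, confidence constant) is placeholder logic:

```python
def collector(raw_lines):
    for line in raw_lines:                 # e.g. a feed, paste, or queue
        yield {"raw": line.strip()}

def parser(items):
    for item in items:                     # normalize to a common schema
        yield {"type": "domain", "value": item["raw"].lower()}

def enricher(items):
    for item in items:                     # real API lookups would go here
        item["asn"] = "AS64500"            # placeholder enrichment
        yield item

def scorer(items, seen=None):
    seen = set() if seen is None else seen
    for item in items:                     # dedup + confidence scoring
        if item["value"] in seen:
            continue
        seen.add(item["value"])
        item["confidence"] = 50
        yield item

def exporter(items):
    return list(items)                     # stand-in for MISP/SIEM push

feed = ["Evil.example", "evil.example", "bad.example"]
out = exporter(scorer(enricher(parser(collector(feed)))))
print([i["value"] for i in out])           # duplicates collapsed
```

Because each stage is lazy, the pipeline streams IOCs without buffering whole feeds in memory.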
ThreatIngestor (github.com/InQuest/ThreatIngestor)
Source-to-operator pipeline for automated IOC extraction:
Sources (collectors):
- Twitter feeds (threat researcher accounts)
- RSS/Atom feeds (vendor advisories, threat blogs)
- GitHub searches and Gists (malware config dumps)
- Web pages (generic scraping)
- SQS queues (cloud event integration)
- XML sitemaps
Extraction engine:
- Regex-based IOC identification: IPs, domains, URLs, email addresses, hashes
- YARA rule extraction from blog posts and reports
- OCR-enabled extraction from screenshots/images (critical — threat actors share IOCs as images to evade automated collection)
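The regex extraction step might look like this in miniature. The patterns and defang table are deliberately simplified; real extractors handle many more IOC types and defang styles:

```python
import re

# Common defang styles to reverse before matching
DEFANG = [("hxxp", "http"), ("[.]", "."), ("(.)", "."), ("[:]", ":")]
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_RE = re.compile(r"\b[a-z0-9][a-z0-9-]*(?:\.[a-z0-9-]+)+\b", re.I)


def extract_iocs(text):
    """Refang, then pull IPs and domains out of free text."""
    for bad, good in DEFANG:
        text = text.replace(bad, good)
    ips = set(IP_RE.findall(text))
    # Dotted quads also match the domain pattern, so filter them out
    domains = {d for d in DOMAIN_RE.findall(text) if not IP_RE.fullmatch(d)}
    return {"ips": sorted(ips), "domains": sorted(domains)}


report = "C2 at hxxp://evil[.]example and 203.0.113.50"
print(extract_iocs(report))
```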
Operators (outputs):
- MISP (direct event creation)
- ThreatKB (structured storage)
- MySQL/SQLite/CSV (local persistence)
- SQS (downstream pipeline integration)
Configuration pattern:
# threatingestor.yml
sources:
  - name: twitter-researchers
    module: twitter
    credentials: ...
    q: "#malware OR #phishing"
  - name: abuse-ch-feeds
    module: rss
    url: https://feodotracker.abuse.ch/browse/
    feed_type: messy
operators:
  - name: misp-export
    module: misp
    url: https://misp.internal
    key: $MISP_API_KEY
    tags: ["automated", "threatingestor"]
  - name: local-db
    module: sqlite
    database: /var/lib/threatingestor/iocs.db
IntelMQ (github.com/certtools/intelmq)
Enterprise-grade bot-based threat intelligence processing:
Bot taxonomy:
Collector Bots → Parser Bots → Expert Bots → Output Bots
- Collector bots: Ingest from feeds (CERT feeds, abuse.ch, Shadowserver, custom)
- Parser bots: Normalize heterogeneous formats into IntelMQ's harmonized JSON schema
- Expert bots: Enrich (GeoIP, ASN, RDAP, DNS), deduplicate, filter, transform
- Output bots: Route to PostgreSQL, Elasticsearch, Splunk, MISP, email, REST API, files
Key design properties:
- Persistent message queue (survives crashes)
- JSON message format throughout
- Harmonized data standard across all feeds
- Pipeline topology managed via IntelMQ Manager (web GUI) or API
Deployment pattern:
[ShadowServer Feed] → [Collector] → [Parser] → [Dedup Expert] → [GeoIP Expert] → [PostgreSQL Output]
→ [MISP Output]
[Abuse.ch URLhaus] → [Collector] → [Parser] ─────────────────┘
[Custom CSV Feed] → [Collector] → [Parser] ─────────────────┘
MISP Modules (github.com/MISP/misp-modules)
Modular enrichment/import/export for threat intelligence platforms:
Module types:
| Type | Count | Function | Examples |
|---|---|---|---|
| Expansion | 150+ | Enrich observables via APIs | VirusTotal, PassiveDNS, Shodan, CIRCL hashlookup |
| Import | 15+ | Ingest external formats | CSV, TAXII 2.1, Cuckoo reports, Joe Sandbox, email |
| Export | 10+ | Generate detection artifacts | YARA rules, CEF/syslog, KQL (Defender), osquery packs |
| Action | 5+ | Event-driven automation | Slack/Mattermost notifications, Sentinel/Defender export |
Architecture:
- Standalone REST API (independent of MISP installation)
- Auto-generated OpenAPI spec at /openapi.json, plus Swagger UI at /openapi
- Python 3 modules — straightforward to develop custom modules:
# Minimal MISP expansion module structure
import json

misp_attributes = {
    'input': ['ip-src', 'ip-dst'],
    'output': ['text', 'freetext']
}

def handler(q=False):
    request = json.loads(q)
    ip = request['attribute']['value']
    # Enrichment logic here
    result = enrich_ip(ip)
    return {'results': [{'types': ['text'], 'values': [result]}]}

def introspection():
    return misp_attributes

def version():
    return {'version': '1.0', 'author': 'CIPHER'}
OpenCTI Connectors (github.com/opencti-platform/opencti)
Structured threat intelligence platform with automated connector ecosystem:
- Data model: STIX2 native — all entities and relationships stored as STIX bundles
- Import connectors: MISP, MITRE ATT&CK, CVE/NVD, AlienVault OTX, abuse.ch, VirusTotal
- Export connectors: MISP sync, Elastic, Splunk, custom webhook
- Enrichment connectors: Automated relationship inference from existing data
- Stream connectors: Real-time data flow to downstream consumers
- Key feature: Automated relationship inference — "new relations may be inferred from existing ones" once analysts process data
4. osquery Fleet Queries
Configuration Architecture
{
  "options": {
    "logger_plugin": "tls",
    "config_plugin": "tls",
    "host_identifier": "uuid",
    "schedule_splay_percent": 10,
    "events_expiry": 3600
  },
  "schedule": {
    "query_name": {
      "query": "SELECT ...",
      "interval": 3600,
      "removed": false,
      "snapshot": true,
      "platform": "linux"
    }
  },
  "decorators": {
    "load": [
      "SELECT hostname AS hostname FROM system_info;",
      "SELECT uuid AS host_uuid FROM osquery_info;"
    ]
  },
  "packs": {
    "security-baseline": "/etc/osquery/packs/security-baseline.conf"
  }
}
Differential logging: By default only changes are logged after the first run. Set "snapshot": true for full result sets (compliance audits). Set "removed": false to suppress deletion events (noisy tables).
Sharding: "shard": 10 runs the query on ~10% of fleet — use for expensive queries during rollout.
Discovery queries: Conditionally activate packs (e.g., only monitor MySQL tables if mysqld is running).
Threat Detection Queries
Persistence Mechanisms
-- Linux: Cron-based persistence
-- (crontab_whitelist is a locally maintained allowlist table)
SELECT * FROM crontab
WHERE command NOT IN (SELECT command FROM crontab_whitelist)
  AND (command LIKE '%curl%' OR command LIKE '%wget%'
       OR command LIKE '%python%' OR command LIKE '%bash -c%');
-- Linux: Systemd unit persistence
SELECT * FROM systemd_units
WHERE active_state = 'active'
AND source_path NOT LIKE '/usr/lib/systemd/%'
AND source_path NOT LIKE '/lib/systemd/%';
-- macOS: LaunchDaemon persistence
SELECT * FROM launchd
WHERE run_at_load = 1
AND path NOT LIKE '/System/%'
AND path NOT LIKE '/Library/Apple/%';
-- Windows: Run key persistence
SELECT * FROM registry
WHERE key LIKE 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run%'
OR key LIKE 'HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Run%';
-- Scheduled tasks (Windows)
SELECT name, action, path, enabled, last_run_time
FROM scheduled_tasks
WHERE enabled = 1
AND hidden = 1;
Process Anomalies
-- Processes running from tmp/unusual directories
SELECT p.pid, p.name, p.path, p.cmdline, p.uid, u.username,
p.parent, pp.name AS parent_name
FROM processes p
LEFT JOIN users u ON p.uid = u.uid
LEFT JOIN processes pp ON p.parent = pp.pid
WHERE p.path LIKE '/tmp/%'
OR p.path LIKE '/dev/shm/%'
OR p.path LIKE '/var/tmp/%'
OR p.path LIKE '%/.%'; -- hidden directories
-- Processes with deleted binaries (classic indicator of code injection or fileless malware)
SELECT pid, name, path, cmdline, on_disk
FROM processes
WHERE on_disk = 0;
-- Processes with abnormal parent-child relationships
-- (e.g., cmd.exe spawned by Excel — T1204.002)
SELECT p.pid, p.name, p.cmdline, pp.name AS parent_name, pp.cmdline AS parent_cmdline
FROM processes p
JOIN processes pp ON p.parent = pp.pid
WHERE pp.name IN ('excel.exe', 'winword.exe', 'powerpnt.exe', 'outlook.exe')
AND p.name IN ('cmd.exe', 'powershell.exe', 'wscript.exe', 'cscript.exe', 'mshta.exe');
Network Anomalies
-- Listening ports with process context
SELECT lp.port, lp.protocol, lp.address,
p.pid, p.name, p.path, p.cmdline, h.sha256
FROM listening_ports lp
JOIN processes p ON lp.pid = p.pid
LEFT JOIN hash h ON p.path = h.path
WHERE lp.port NOT IN (22, 80, 443, 8080, 8443, 3306, 5432)
AND lp.address != '127.0.0.1';
-- Active connections to known-bad TLDs
SELECT pa.pid, p.name, p.cmdline, pa.remote_address, pa.remote_port, pa.local_port
FROM process_open_sockets pa
JOIN processes p ON pa.pid = p.pid
WHERE pa.remote_address != '127.0.0.1'
AND pa.remote_address != '::1'
AND pa.remote_port IN (4444, 5555, 8888, 1337, 9001, 6667); -- common C2 ports
-- DNS cache anomalies (Windows)
SELECT * FROM dns_cache
WHERE type = 'A'
AND (name LIKE '%.top' OR name LIKE '%.xyz' OR name LIKE '%.buzz'
OR name LIKE '%.club' OR name LIKE '%.work');
Integrity Monitoring
-- SUID/SGID binaries (privilege escalation surface)
-- osquery reports mode as an octal string, e.g. '4755' for a setuid binary
SELECT f.path, f.filename, f.mode, f.uid, f.gid, h.sha256
FROM file f
JOIN hash h ON f.path = h.path
WHERE f.directory IN ('/usr/bin', '/usr/sbin', '/usr/local/bin')
  AND (f.mode LIKE '4%' OR f.mode LIKE '2%' OR f.mode LIKE '6%');
-- File integrity baseline deviation (files in /etc changed in last 24h)
SELECT f.path, f.filename, f.size, f.mtime, h.sha256
FROM file f
JOIN hash h ON f.path = h.path
WHERE f.directory = '/etc'
  AND f.mtime > CAST(strftime('%s', 'now', '-24 hours') AS INTEGER);
-- Kernel modules (rootkit detection)
-- (baseline_kernel_modules is a locally maintained baseline table)
SELECT name, size, used_by, status
FROM kernel_modules
WHERE status = 'Live'
  AND name NOT IN (SELECT name FROM baseline_kernel_modules);
Container Security
-- Privileged containers
SELECT id, name, image, privileged, security_options
FROM docker_containers
WHERE privileged = 1;
-- Containers with host network
SELECT id, name, image, network_mode
FROM docker_containers
WHERE network_mode = 'host';
-- Container processes running as root
SELECT c.name AS container_name, p.pid, p.name AS process_name, p.uid
FROM docker_container_processes cp
JOIN docker_containers c ON cp.id = c.id
JOIN processes p ON cp.pid = p.pid
WHERE p.uid = 0;
Fleet Management Patterns
Fleet managers: Fleet (fleetdm.com), Kolide, OSCTRL, Zentral
Deployment automation (systemd):
[Unit]
Description=osquery daemon
After=network.target

[Service]
ExecStart=/usr/bin/osqueryd \
  --config_path=/etc/osquery/osquery.conf \
  --config_plugin=tls \
  --tls_hostname=fleet.internal:8412 \
  --enroll_secret_path=/etc/osquery/enroll_secret \
  --logger_plugin=tls \
  --host_identifier=uuid
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
Query pack distribution strategy:
- Base pack: runs everywhere (process inventory, listening ports, user accounts)
- Server pack: web server configs, database processes, certificate expiry
- Workstation pack: browser extensions, USB devices, screen lock policy
- Incident pack: activated during IR — high-frequency queries for specific IOCs
- Compliance pack: CIS benchmark checks, mapped to control IDs
5. Velociraptor Hunting at Scale
VQL Fundamentals
VQL is the query language powering Velociraptor's collection and hunting capabilities. Unlike osquery's SQL, VQL supports:
- Event queries: Non-terminating queries that emit rows asynchronously as events occur
- Artifact packaging: Queries bundled with metadata, parameters, and descriptions
- Plugin ecosystem: Extensible via Go/Python plugins
- Server-side and client-side execution: Queries run on endpoints or server
Core VQL patterns:
-- Basic process listing with hash
SELECT Pid, Name, Exe, CommandLine,
hash(path=Exe) AS Hash
FROM pslist()
-- File search with YARA (yara() is a plugin, so drive it with foreach)
LET YaraRule = 'rule test { strings: $a = "malware" condition: $a }'
SELECT * FROM foreach(
    row={ SELECT FullPath FROM glob(globs="/tmp/**") },
    query={ SELECT FileName, Rule FROM yara(files=FullPath, rules=YaraRule) })
-- Event monitoring (non-terminating — runs until cancelled)
SELECT * FROM watch_etw(guid="{some-etw-provider-guid}")
WHERE EventID = 1 -- process creation
Artifact Structure
Artifacts are the fundamental unit of Velociraptor automation:
name: Custom.Windows.Detection.SuspiciousProcess
description: |
  Detect processes spawned from unusual locations.
  Maps to MITRE T1059 (Command and Scripting Interpreter).
parameters:
  - name: SuspiciousPaths
    type: csv
    default: |
      Path
      C:\Users\*\AppData\Local\Temp\*
      C:\Windows\Temp\*
      C:\PerfLogs\*
sources:
  - query: |
      LET suspicious_paths <= SELECT Path FROM parse_csv(
          filename=SuspiciousPaths, accessor="data")
      SELECT Pid, Name, Exe, CommandLine, Username,
             hash(path=Exe) AS Hash,
             timestamp(epoch=CreateTime) AS StartTime
      FROM pslist()
      WHERE Exe =~ join(array=suspicious_paths.Path, sep="|")
Hunt Patterns
Lateral Movement Detection
-- WMI process creation (T1047)
SELECT EventTime, Computer, UserName, CommandLine, ParentImage
FROM source(artifact="Windows.EventLogs.EvtxHunter")
WHERE Channel = "Microsoft-Windows-Sysmon/Operational"
AND EventID = 1
AND ParentImage =~ "WmiPrvSE"
-- PsExec detection (T1570)
SELECT Name, PathName, StartMode, StartName
FROM wmi(query="SELECT * FROM Win32_Service")
WHERE Name =~ "PSEXESVC"
OR PathName =~ "psexe"
-- RDP session enumeration (T1021.001)
SELECT * FROM source(artifact="Windows.EventLogs.RDPAuth")
WHERE LogonType = 10
Credential Access Detection
-- LSASS access (T1003.001)
SELECT EventTime, SourceImage, TargetImage, GrantedAccess
FROM source(artifact="Windows.EventLogs.EvtxHunter")
WHERE Channel = "Microsoft-Windows-Sysmon/Operational"
AND EventID = 10
AND TargetImage =~ "lsass.exe"
AND GrantedAccess =~ "0x1010|0x1038|0x1FFFFF"
-- SAM registry hive access (T1003.002)
SELECT EventTime, Image, TargetObject
FROM source(artifact="Windows.EventLogs.EvtxHunter")
WHERE Channel = "Microsoft-Windows-Sysmon/Operational"
AND EventID IN (12, 13)
AND TargetObject =~ "SAM|SECURITY|SYSTEM"
Persistence Detection
-- Scheduled task creation (T1053.005)
SELECT * FROM source(artifact="Windows.System.TaskScheduler")
WHERE ActionType = "Exec"
AND (ActionArgs =~ "powershell|cmd|wscript|cscript|mshta|rundll32"
OR TaskPath NOT LIKE '\Microsoft\%')
-- WMI event subscription persistence (T1546.003)
SELECT * FROM wmi(
query="SELECT * FROM __EventConsumer",
namespace="root/subscription"
)
-- Startup folder monitoring
SELECT FullPath, Size, Mtime, hash(path=FullPath) AS Hash
FROM glob(globs="C:/Users/*/AppData/Roaming/Microsoft/Windows/Start Menu/Programs/Startup/*")
Event Monitoring (Real-Time Detection)
VQL event queries power continuous endpoint monitoring:
-- Real-time failed-login monitoring via syslog (Linux)
SELECT * FROM watch_syslog()
WHERE Facility = "auth"
  AND Message =~ "failed password"
-- File integrity monitoring via event queries
SELECT * FROM watch_monitoring(artifact="Windows.Events.FileCreation")
WHERE FullPath =~ "System32|SysWOW64"
AND NOT FullPath =~ "\\.log$|\\.tmp$"
-- USB device insertion monitoring
SELECT * FROM watch_monitoring(artifact="Windows.Events.USBDevices")
Event query properties:
- Non-terminating: run indefinitely until cancelled or timeout
- Asynchronous: rows emitted immediately as events occur
- Batched delivery: client buffers rows before sending to server (network efficiency)
- Client-side execution: reduces server load, enables offline detection
Fleet Hunt Workflow
1. Create artifact (or use community artifact exchange)
2. Launch hunt targeting specific labels/OS/criteria
3. Monitor progress in real-time via hunt dashboard
4. Review results, pivot on findings
5. Remediate via response artifacts (kill process, quarantine file, collect forensic image)
Scale considerations:
- Use labels to segment fleet (production, development, DMZ, crown jewels)
- Stagger hunts to avoid endpoint resource exhaustion
- Use LIMIT clauses in VQL to cap per-endpoint results
- Schedule hunts during low-activity periods for resource-intensive collection
6. Security ChatOps Patterns
Architecture
+-------------------+ +------------------+ +------------------+
| CHAT PLATFORM | <-> | BOT / WEBHOOK | <-> | SOAR / TOOLING |
| (Slack, Teams, | | (custom bot, | | (Shuffle, XSOAR, |
| Mattermost, | | n8n, Errbot) | | Wazuh, osquery) |
| Discord) | | | | |
+-------------------+ +------------------+ +------------------+
Command Patterns
Enrichment commands (read-only, safe to automate fully):
/ioc lookup 8.8.8.8 → VirusTotal + AbuseIPDB + Shodan summary
/hash check abc123def... → VT + MalwareBazaar + sandbox status
/domain info evil.example → WHOIS + DNS + cert transparency + reputation
/cve info CVE-2024-1234 → CVSS, EPSS, KEV status, affected products
Investigation commands (read-only, analyst context):
/hunt process --name "mimikatz" --fleet production
/osquery "SELECT * FROM processes WHERE name = 'nc'" --label webservers
/siem search "src_ip=10.1.2.3 AND action=blocked" --last 24h
/case create --title "Suspicious auth" --severity medium --assignee @analyst
Response commands (require approval/RBAC):
/block ip 203.0.113.50 --duration 24h --reason "C2 activity" # requires SOC-L2 role
/isolate host WORKSTATION-42 --reason "malware confirmed" # requires SOC-L2 role
/disable-user jsmith --reason "credential compromise" # requires SOC-L3 role
/quarantine hash abc123 --scope enterprise # requires SOC-L3 role
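A sketch of the RBAC and second-analyst approval gate implied by the role comments above. The role ranking and policy table are local policy choices, not any chat platform's API:

```python
ROLE_RANK = {"viewer": 0, "SOC-L1": 1, "SOC-L2": 2, "SOC-L3": 3}

# Destructive commands require a minimum role plus second-analyst approval
COMMAND_POLICY = {
    "/block":        {"min_role": "SOC-L2", "needs_approval": True},
    "/isolate":      {"min_role": "SOC-L2", "needs_approval": True},
    "/disable-user": {"min_role": "SOC-L3", "needs_approval": True},
    "/ioc":          {"min_role": "viewer", "needs_approval": False},
}


def authorize(command, user_role, approver_role=None):
    """Return (allowed, reason) for a ChatOps command invocation."""
    policy = COMMAND_POLICY.get(command)
    if policy is None:
        return False, "unknown command"
    if ROLE_RANK[user_role] < ROLE_RANK[policy["min_role"]]:
        return False, f"requires {policy['min_role']}"
    if policy["needs_approval"]:
        if approver_role is None:
            return False, "awaiting second-analyst approval"
        if ROLE_RANK[approver_role] < ROLE_RANK[policy["min_role"]]:
            return False, "approver lacks required role"
    return True, "ok"


print(authorize("/ioc", "viewer"))                # read-only: allowed
print(authorize("/isolate", "SOC-L2"))            # pending approval
print(authorize("/isolate", "SOC-L2", "SOC-L2"))  # approved
```

In a real bot, the approval would arrive as a button click or emoji reaction from a second analyst, and every decision would be written to the audit log.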
n8n Security Workflow Patterns
n8n (github.com/n8n-io/n8n) as lightweight security automation:
Alert enrichment webhook:
Webhook trigger (SIEM alert JSON)
→ Extract IOCs (Function node, regex)
→ Parallel HTTP requests (VT, AbuseIPDB, Shodan)
→ Aggregate results (Function node)
→ Decision node (severity threshold)
→ HIGH: Create Jira ticket + Slack alert + block at firewall
→ LOW: Log to Elasticsearch + weekly digest email
Scheduled threat feed sync:
Cron trigger (every 6 hours)
→ HTTP Request: fetch abuse.ch URLhaus CSV
→ CSV Parse node
→ Deduplicate against existing IOCs (database lookup)
→ New IOCs → push to MISP API + update firewall blocklist
→ Slack notification: "Added X new IOCs from URLhaus"
Security ChatOps bot via n8n:
Slack trigger (slash command /security-lookup)
→ Parse user input (IP, domain, hash)
→ Route to appropriate enrichment chain
→ Format results as Slack block kit message
→ Reply in thread
ChatOps Security Best Practices
- RBAC enforcement — map chat platform roles to action permissions (viewer, analyst, responder, admin)
- Audit logging — log every command: who, when, what, from which channel
- Approval workflows — destructive actions require second analyst confirmation via reaction/button
- Rate limiting — prevent abuse of enrichment APIs via bot commands
- Channel separation — dedicated channels per severity/team; don't mix alerts with discussion
- Redaction — auto-redact sensitive data (credentials, PII) from bot responses
- Ephemeral responses — sensitive lookups return ephemeral messages (visible only to requester)
- Runbook links — every alert message includes link to relevant response runbook
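One way to implement the redaction principle above: scrub obvious credential patterns from bot output before posting. The two patterns shown are illustrative, not an exhaustive credential-detection set:

```python
import re

REDACTIONS = [
    # key=value / key: value credential assignments
    (re.compile(r"(?i)(password|passwd|secret|token)\s*[:=]\s*\S+"),
     r"\1=<redacted>"),
    # AWS access key IDs (AKIA + 16 uppercase alphanumerics)
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "<redacted-aws-key>"),
]


def redact(message):
    """Apply every redaction pattern before the bot posts `message`."""
    for pattern, replacement in REDACTIONS:
        message = pattern.sub(replacement, message)
    return message


print(redact("login ok, password: hunter2, key AKIAIOSFODNN7EXAMPLE"))
```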
7. Infrastructure-as-Code Security Automation
Wazuh Active Response Automation
Wazuh (github.com/wazuh/wazuh) provides rule-triggered automated response:
Response types:
- Stateful: Execute action, automatically revert after timeout (e.g., block IP for 1 hour)
- Stateless: One-time action, no automatic reversion (e.g., collect forensic snapshot)
Configuration pattern:
<!-- ossec.conf — server side -->
<command>
<name>firewall-drop</name>
<executable>firewall-drop</executable>
<timeout_allowed>yes</timeout_allowed>
</command>
<active-response>
<command>firewall-drop</command>
<location>local</location>
<rules_id>5712</rules_id> <!-- SSH brute force rule -->
<timeout>3600</timeout> <!-- Block for 1 hour -->
<repeated_offenders>30,60,120</repeated_offenders> <!-- Escalating blocks -->
</active-response>
Custom active response script (Python):
#!/usr/bin/env python3
"""Custom Wazuh active response: isolate host via EDR API."""
import os
import sys
import json
import requests

EDR_TOKEN = os.environ["EDR_TOKEN"]  # injected via the service environment

def main():
    # Wazuh passes alert data via stdin
    alert = json.loads(sys.stdin.read())
    src_ip = alert.get("parameters", {}).get("alert", {}).get("data", {}).get("srcip")
    agent_id = alert.get("parameters", {}).get("alert", {}).get("agent", {}).get("id")
    if not src_ip:
        sys.exit(1)

    # Call EDR API to isolate the endpoint
    response = requests.post(
        "https://edr.internal/api/v1/isolate",
        json={"agent_id": agent_id, "reason": "Wazuh active response"},
        headers={"Authorization": f"Bearer {EDR_TOKEN}"},
    )

    # Log action for audit trail
    with open("/var/ossec/logs/active-responses.log", "a") as f:
        f.write(f"Isolated agent {agent_id} (IP: {src_ip}) - Status: {response.status_code}\n")

if __name__ == "__main__":
    main()
Common automated responses:
| Trigger | Action | Timeout |
|---|---|---|
| SSH brute force (rule 5712) | firewall-drop | 1 hour, escalating |
| Web attack (rule 31100+) | firewall-drop + WAF rule | 24 hours |
| Malware detection (rule 554) | Isolate endpoint + collect artifacts | Manual revert |
| File integrity violation (rule 550+) | Alert + snapshot changed files | N/A |
| Unauthorized port scan (rule 581) | firewall-drop | 6 hours |
| Rootkit detection (rule 510+) | Isolate endpoint + forensic collection | Manual revert |
Container Security Automation
apko — Distroless Image Pipeline (github.com/chainguard-dev/apko)
Declarative, reproducible container builds with minimal attack surface:
# apko.yaml — security-hardened base image
contents:
  repositories:
    - https://packages.wolfi.dev/os
  packages:
    - wolfi-baselayout
    - ca-certificates-bundle
    - python-3.12
    # No shell, no package manager, no debug tools
accounts:
  groups:
    - groupname: app
      gid: 65532
  users:
    - username: app
      uid: 65532
      gid: 65532
  run-as: 65532  # Non-root
entrypoint:
  command: /usr/bin/python3
archs:
  - x86_64
  - aarch64
Security properties:
- Reproducible: identical inputs produce bitwise-identical images (supply chain integrity)
- Distroless: no shell, no package manager — drastically reduced attack surface
- SBOM auto-generation: every build produces SPDX/CycloneDX SBOM
- Non-root by default: user/group configuration enforced at build time
CI/CD integration pattern:
# GitHub Actions — build + sign + scan
- name: Build image
  run: apko build apko.yaml app:latest app.tar
- name: Generate SBOM
  run: apko build --sbom-path sbom.spdx.json apko.yaml app:latest app.tar
- name: Scan with Trivy
  run: trivy image --input app.tar --severity HIGH,CRITICAL --exit-code 1
- name: Sign with Cosign
  run: cosign sign --key cosign.key app:latest
- name: Attest SBOM
  run: cosign attest --key cosign.key --predicate sbom.spdx.json --type spdx app:latest
Trivy-Operator — Kubernetes Continuous Security (github.com/aquasecurity/trivy-operator)
Kubernetes-native security scanning automation:
What it scans (automatically, continuously):
- Vulnerability reports: Container images in running pods + control plane components
- ConfigAudit reports: Kubernetes resource configurations against best practices / OPA policies
- RBAC assessment: Access rights analysis across cluster resources
- Secret scanning: Exposed secrets in cluster resources
- Compliance reports: NSA/CISA hardening guide, CIS Kubernetes Benchmark v1.23, Pod Security Standards
- SBOM generation: Software bill of materials for running workloads
- Infra assessment: etcd, API server, scheduler, controller-manager scanning
How it works:
- Watches Kubernetes API for state changes (pod creation, deployment updates)
- Triggers Trivy scans automatically on changes
- Stores results as Kubernetes CRDs (queryable via kubectl)
- Integrates with Prometheus for metrics and alerting
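Because the results land in CRDs as plain JSON, downstream automation can triage them in a playbook step without touching Trivy itself. This sketch tallies critical findings per namespace; the `.report.summary.criticalCount` path matches trivy-operator's VulnerabilityReport schema, and the sample records are fabricated.

```python
from collections import Counter

def critical_by_namespace(reports: list[dict]) -> Counter:
    """Sum criticalCount per namespace across VulnerabilityReport items."""
    totals = Counter()
    for item in reports:
        ns = item["metadata"]["namespace"]
        totals[ns] += item["report"]["summary"].get("criticalCount", 0)
    return totals

# Fabricated items shaped like `kubectl get vulnerabilityreports -A -o json`
sample = [
    {"metadata": {"namespace": "prod", "name": "deploy-api"},
     "report": {"summary": {"criticalCount": 2, "highCount": 7}}},
    {"metadata": {"namespace": "prod", "name": "deploy-web"},
     "report": {"summary": {"criticalCount": 1, "highCount": 0}}},
    {"metadata": {"namespace": "dev", "name": "deploy-api"},
     "report": {"summary": {"criticalCount": 0, "highCount": 3}}},
]
print(critical_by_namespace(sample))  # Counter({'prod': 3, 'dev': 0})
```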
Deployment:
```shell
helm repo add aqua https://aquasecurity.github.io/helm-charts/
helm install trivy-operator aqua/trivy-operator \
  --namespace trivy-system \
  --create-namespace \
  --set trivy.severity=HIGH,CRITICAL \
  --set compliance.cron="0 */6 * * *"
```
Query results:
```shell
# List all vulnerability reports
kubectl get vulnerabilityreports -A -o wide

# Find critical vulnerabilities
kubectl get vulnerabilityreports -A -o json | \
  jq '.items[] | select(.report.summary.criticalCount > 0) |
      {namespace: .metadata.namespace, name: .metadata.name,
       critical: .report.summary.criticalCount}'

# Check compliance status
kubectl get clustercompliancereports -o wide
```
IaC Security Scanning Pipeline
```yaml
# GitLab CI / GitHub Actions pattern
stages:
  - scan-iac
  - scan-secrets
  - scan-containers
  - scan-dependencies
  - gate

scan-terraform:
  stage: scan-iac
  script:
    - tfsec . --format json --out tfsec-results.json
    - checkov -d . --framework terraform --output json > checkov-results.json
    - trivy config . --severity HIGH,CRITICAL --exit-code 1

scan-secrets:
  stage: scan-secrets
  script:
    - trufflehog git file://. --only-verified --json > secrets.json
    - gitleaks detect --source . --report-path gitleaks.json
    # Fail the pipeline if verified secrets were found
    # (trufflehog emits newline-delimited JSON, so slurp with -s before counting)
    - '[ "$(jq -s length secrets.json)" -eq 0 ]'

scan-containers:
  stage: scan-containers
  script:
    - trivy image $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA --severity HIGH,CRITICAL
    - grype $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA --fail-on high

scan-dependencies:
  stage: scan-dependencies
  script:
    - trivy fs . --scanners vuln --severity HIGH,CRITICAL
    - pip-audit --format json --output pip-audit.json || true
    - npm audit --json > npm-audit.json || true

security-gate:
  stage: gate
  script:
    - python3 scripts/aggregate-scan-results.py  # Custom: aggregate, deduplicate, enforce policy
    # Policy: 0 critical, <=5 high (with exceptions list)
```
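A minimal sketch of what the custom aggregation script might do (the real `scripts/aggregate-scan-results.py` is the pipeline author's own code): normalize findings from the scanners, deduplicate by (ID, target), and enforce the stated policy. The field names and the exceptions mechanism are assumptions.

```python
def passes_gate(findings, exceptions=frozenset(), max_high=5):
    """Enforce the 0-critical / <=5-high policy over deduplicated findings.

    `findings` are normalized dicts like {"id": "CVE-...", "target": "...",
    "severity": "CRITICAL"}; `exceptions` holds accepted-risk IDs.
    """
    seen, critical, high = set(), 0, 0
    for f in findings:
        key = (f["id"], f["target"])
        if key in seen or f["id"] in exceptions:
            continue  # duplicate across scanners, or an accepted risk
        seen.add(key)
        if f["severity"] == "CRITICAL":
            critical += 1
        elif f["severity"] == "HIGH":
            high += 1
    return critical == 0 and high <= max_high
```

Deduplication matters because Trivy, Grype, and checkov routinely report the same CVE against the same artifact; without it, one finding can trip the high-severity threshold several times over.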
8. Platform Reference Matrix
| Platform | Type | Language | Deployment | Best For | Status |
|---|---|---|---|---|---|
| Shuffle | SOAR | Go/React | Docker, Cloud | Open-source SOAR, MSSPs | Active |
| Cortex XSOAR | SOAR | Python/JS | Commercial | Enterprise SOC, largest content library | Active (commercial) |
| TheHive/Cortex | Case Mgmt + Enrichment | Scala/Python | Docker | IR teams, observable analysis | Commercial (StrangeBee) |
| Wazuh | HIDS + Response | C/Python | Packages, Docker, K8s | Fleet monitoring, compliance, active response | Active |
| n8n | Workflow Automation | TypeScript | Docker, npm | Lightweight security automation, ChatOps | Active |
| osquery | Endpoint Visibility | C++ | Agent | Fleet-wide SQL queries, compliance | Active |
| Velociraptor | DFIR + Hunting | Go/VQL | Binary, Docker | Threat hunting at scale, IR | Active |
| MISP | TIP | Python/PHP | Packages, Docker | Threat intel sharing, IOC management | Active |
| OpenCTI | TIP | TypeScript/Python | Docker | Structured CTI, STIX2-native | Active |
| IntelMQ | TI Processing | Python | Packages, Docker | Automated feed processing, CERTs | Active |
| ThreatIngestor | IOC Extraction | Python | pip | Automated IOC collection from OSINT | Maintenance mode |
| Trivy-Operator | K8s Security | Go | Helm | Continuous K8s security scanning | Active |
| apko | Container Build | Go | Binary, CI | Distroless, supply chain security | Active |
| WALKOFF | SOAR | Python/TS | Docker Swarm | Reference architecture (archived) | Archived (2023) |
Integration Topology
┌─────────────┐
│ CHAT OPS │
│ Slack/Teams │
└──────┬──────┘
│
┌──────────┐ ┌──────────┐ ┌──────────┴──────────┐ ┌──────────┐
│ THREAT │ │ SIEM │ │ SOAR │ │ TICKET │
│ INTEL │◄──►│ Elastic │───►│ Shuffle / XSOAR │───►│ SYSTEM │
│ MISP │ │ Splunk │ │ - Playbooks │ │ Jira │
│ OpenCTI │ │ Wazuh │ │ - Enrichment │ │ SNOW │
└────┬─────┘ └──────────┘ │ - Response │ └──────────┘
│ └──────────┬───────────┘
│ │
│ ┌──────────────────────────┴───────────────────┐
│ │ │
┌────┴─────┐ │ ┌──────────┐ ┌──────────┐ ┌───────────┐ │
│ IOC │ │ │ osquery │ │ Veloci- │ │ EDR │ │
│ PIPELINE │ │ │ Fleet │ │ raptor │ │ Crowdstrike│ │
│ IntelMQ │ │ │ Queries │ │ Hunts │ │ Defender │ │
│ Ingestor │ │ └──────────┘ └──────────┘ └───────────┘ │
└──────────┘ │ ENDPOINTS │
└───────────────────────────────────────────────┘
Key Takeaways
- Playbooks are DAGs, not scripts — design for parallel enrichment, conditional branching, and human-in-the-loop gates
- Enrich before you block — automated blocking on low-confidence IOCs causes self-inflicted denial of service
- Differential response by confidence — HIGH auto-blocks, MEDIUM stages for review, LOW stores for correlation
- Event-driven > scheduled polling — Velociraptor event queries and webhook-triggered SOAR beat cron-based checks
- Measure automation ROI — track MTTD, MTTR, analyst hours saved, false positive rate per playbook
- Start with enrichment, graduate to response — build trust in automation before enabling auto-containment
- Pipeline resilience — every integration point needs timeout handling, retry logic, and graceful degradation
- Audit everything — automated actions without audit trails are operational and legal liabilities