Detect Gather Victim Host Information in Sumo Logic CSE
This detection identifies adversary attempts to enumerate victim host information during pre-compromise reconnaissance. T1592 belongs to the Reconnaissance tactic (the former PRE-ATT&CK scope) and mostly takes place outside the victim network, so it can rarely be observed directly. This rule instead targets second-order indicators visible from the defender side: automated scanning tools and fingerprinting bots making requests to internet-facing web servers, User-Agent rotation patterns consistent with OS/browser profiling, and rapid enumeration of host-revealing paths such as /robots.txt, /.env, /phpinfo.php, and similar disclosure endpoints. The primary data source is web server access logs (IIS W3C or Apache combined log format), which record the client IP, User-Agent, and requested path of each probe an adversary sends while profiling target host configurations before launching phishing, supply chain, or watering hole operations.
MITRE ATT&CK
- Tactic: Reconnaissance
- Technique: T1592 Gather Victim Host Information
- Canonical reference: https://attack.mitre.org/techniques/T1592/
Sumo Detection Query
(_sourceCategory=*web* OR _sourceCategory=*iis* OR _sourceCategory=*apache*)
| parse regex "(?<client_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| where useragent matches "*masscan*" or useragent matches "*zgrab*" or useragent matches "*shodan*" or useragent matches "*nuclei*" or uri matches "*/.env*" or uri matches "*phpinfo*"
| count by client_ip, useragent
| where _count > 5
| if(_count > 50, "High", if(_count > 20, "Medium", "Low")) as RiskScore
Sumo Logic detection for Gather Victim Host Information (T1592). Uses _sourceCategory wildcard filtering for flexible log routing compatibility, with regex extraction of the client IP and count aggregation by client IP and User-Agent to surface host-information gathering patterns. The query assumes the useragent and uri fields are already extracted, for example by a Field Extraction Rule. Designed for the Sumo Logic Cloud SIEM platform.
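The query logic above can be replayed offline against sample access log lines before deploying the rule. The following is a minimal Python sketch, not part of the Sumo Logic rule itself. It assumes Apache-style combined log lines; the names `COMBINED`, `is_suspicious`, `risk`, and `replay` are hypothetical, and the case-insensitive matching is a choice of this sketch rather than a statement about how Sumo's `matches` operator behaves.

```python
import re
from collections import Counter

# Assumption: Apache/NGINX "combined" log format. IIS W3C logs need a different parser.
COMBINED = re.compile(
    r'^(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ \[[^\]]+\] '
    r'"(?:\S+) (?P<uri>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<useragent>[^"]*)"'
)

# Same indicators as the where clause of the query.
SCANNER_UAS = ("masscan", "zgrab", "shodan", "nuclei")
DISCLOSURE_PATHS = ("/.env", "phpinfo")


def is_suspicious(useragent, uri):
    """True if the User-Agent names a scanner or the URI hits a disclosure path."""
    ua = useragent.lower()
    path = uri.lower()
    return any(s in ua for s in SCANNER_UAS) or any(p in path for p in DISCLOSURE_PATHS)


def risk(count):
    """Mirror the nested if() tiers of the query."""
    if count > 50:
        return "High"
    if count > 20:
        return "Medium"
    return "Low"


def replay(lines, min_count=5):
    """Count suspicious requests per (client_ip, useragent) and keep counts above min_count."""
    counts = Counter()
    for line in lines:
        m = COMBINED.match(line)
        if m and is_suspicious(m["useragent"], m["uri"]):
            counts[(m["client_ip"], m["useragent"])] += 1
    return {key: (n, risk(n)) for key, n in counts.items() if n > min_count}
```

Feeding `replay` the access log captured during a test run shows which (client IP, User-Agent) pairs would cross the `_count > 5` threshold and which risk tier each would receive.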
False Positives & Tuning
- Legitimate SEO crawlers such as Googlebot, Bingbot, or commercial crawlers (Screaming Frog, Ahrefs, Semrush) may trigger on path enumeration rules — allowlist known crawler IP ranges and User-Agent prefixes
- Internal vulnerability scanners (Nessus, Qualys, Rapid7) run by the security team against web assets will generate identical patterns — exclude known scanner IP ranges via watchlist
- Developer tooling such as curl, wget, or Python requests used legitimately by CI/CD pipelines or deployment scripts may match scanner User-Agent patterns — baseline known build server IPs
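The tuning advice above can be prototyped as a pre-filter before porting it to a watchlist or lookup. This is a minimal sketch under stated assumptions: the network ranges are placeholder documentation ranges, not real scanner or crawler ranges, and the names `SCANNER_NETWORKS`, `CRAWLER_NETWORKS`, `CRAWLER_UA_PREFIXES`, and `is_allowlisted` are hypothetical. Because User-Agent strings are trivially spoofed, the sketch only trusts a crawler User-Agent when the source IP is also in a known crawler range.

```python
import ipaddress

# Placeholder ranges (assumptions): replace with your own scanner and build-server
# ranges and with the crawler ranges published by each search engine.
SCANNER_NETWORKS = [ipaddress.ip_network("10.20.0.0/24")]
CRAWLER_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]

CRAWLER_UA_PREFIXES = (
    "Mozilla/5.0 (compatible; Googlebot/",
    "Mozilla/5.0 (compatible; bingbot/",
)


def is_allowlisted(client_ip, useragent):
    """Return True if the request should be excluded from the detection."""
    ip = ipaddress.ip_address(client_ip)
    # Internal scanners and build servers are excluded on IP alone.
    if any(ip in net for net in SCANNER_NETWORKS):
        return True
    # Crawlers must match both a known User-Agent prefix and a known IP range.
    crawler_ua = useragent.startswith(CRAWLER_UA_PREFIXES)
    return crawler_ua and any(ip in net for net in CRAWLER_NETWORKS)
```

Requiring both conditions for crawlers keeps an attacker from bypassing the rule simply by sending a Googlebot User-Agent from an arbitrary address.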
Testing Methodology
Validate this detection against 3 adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see.
- Test 1: Web Server Fingerprinting via Automated Scanner User-Agent
Expected signal: Web server access log entries (IIS W3C or Apache combined format) showing requests from the test host IP with User-Agents matching masscan, python-requests, Go-http-client, and curl patterns against multiple disclosure paths. ReconScore should trigger at >= 2 given multiple fingerprinting paths and multiple scanner User-Agents.
- Test 2: OS-Targeted User-Agent Rotation for Victim Profiling
Expected signal: Access log entries from test IP showing 5 distinct User-Agents representing Windows, Linux, macOS, Android, and iOS across the same path set within a short window. The User-Agent rotation hunting query should fire with UniqueOSHints = 5.
- Test 3: Nmap Service and OS Version Detection Scan
Expected signal: Network flow records showing TCP SYN packets to multiple ports from test host. Web server access logs showing nmap User-Agent requests (if HTTP ports scanned). IDS/IPS logs showing port scan detection. Network baseline anomaly for rapid sequential port connections.
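For Test 2, the expected UniqueOSHints value can be sanity-checked offline. The hunting query that produces UniqueOSHints is not shown in this document, so the following is only a sketch of one plausible way to derive such a count. The token table `OS_HINTS` and the functions `os_hint` and `unique_os_hints` are hypothetical, and the caller is assumed to have already restricted the events to a single short time window.

```python
from collections import defaultdict

# Ordered so that the more specific token wins: Android User-Agents also contain
# "Linux", and iPhone/iPad User-Agents also contain "Mac OS X".
OS_HINTS = (
    ("Android", "Android"),
    ("iPhone", "iOS"),
    ("iPad", "iOS"),
    ("Windows NT", "Windows"),
    ("Macintosh", "macOS"),
    ("Linux", "Linux"),
)


def os_hint(useragent):
    """Map a User-Agent string to a coarse OS label, or None if no token matches."""
    for token, name in OS_HINTS:
        if token in useragent:
            return name
    return None


def unique_os_hints(events):
    """events: iterable of (client_ip, useragent) pairs from one time window.

    Returns the number of distinct OS labels seen per client IP.
    """
    seen = defaultdict(set)
    for client_ip, useragent in events:
        hint = os_hint(useragent)
        if hint:
            seen[client_ip].add(hint)
    return {ip: len(hints) for ip, hints in seen.items()}
```

A single client IP that presents all five operating systems within minutes is unlikely to be a real browser population behind NAT hitting the same disclosure paths, which is why a high distinct count is a useful hunting signal.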