Gather Victim Host Information
This detection identifies adversary attempts to enumerate victim host information during reconnaissance (MITRE ATT&CK T1592, Gather Victim Host Information). T1592 belongs to the Reconnaissance tactic, and most of the activity happens outside the victim network, so it usually cannot be observed directly. This rule instead targets second-order indicators visible from the defender side: automated scanning tools and fingerprinting bots making requests to internet-facing web servers, User-Agent rotation patterns consistent with OS/browser profiling, and rapid enumeration of host-revealing paths such as /robots.txt, /.env, /phpinfo.php, and similar disclosure endpoints.
The primary data source is web server access logs (IIS W3C or common log format), which record the client IP, User-Agent, and requested path of each probe an adversary sends while profiling host configurations ahead of phishing, supply chain, or watering hole operations. The query as written reads IIS logs from the W3CIISLog table; other log formats need an equivalent field mapping.
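To make the scoring used by the query below easier to reason about and to unit-test offline, here is a minimal Python sketch of the same additive ReconScore. The names `recon_score`, `should_alert`, and `ALERT_THRESHOLD` are illustrative and not part of the rule; the thresholds are copied from the query, and the inputs are assumed to be per-client-IP aggregates over a one-hour window.

```python
ALERT_THRESHOLD = 2  # mirrors `where ReconScore >= 2` in the query


def recon_score(request_count: int, unique_user_agents: int,
                unique_paths: int, http_404_count: int) -> int:
    """Return the additive score for one client IP in one time window."""
    # Request volume: 0 to 3 points
    if request_count > 100:
        volume = 3
    elif request_count > 30:
        volume = 2
    elif request_count > 5:
        volume = 1
    else:
        volume = 0

    # User-Agent rotation: 0 to 2 points
    rotation = 2 if unique_user_agents > 5 else 1 if unique_user_agents > 2 else 0

    # Spread of requested disclosure paths: 0 to 2 points
    spread = 2 if unique_paths > 20 else 1 if unique_paths > 8 else 0

    # Many 404 responses suggest blind path guessing: 0 or 1 point
    misses = 1 if http_404_count > 20 else 0

    return volume + rotation + spread + misses


def should_alert(score: int) -> bool:
    """Apply the same cut-off the query uses."""
    return score >= ALERT_THRESHOLD


if __name__ == "__main__":
    score = recon_score(request_count=31, unique_user_agents=3,
                        unique_paths=9, http_404_count=0)
    print(score, should_alert(score))
```

Under these thresholds, a client with six matching requests and nothing else scores 1 and is not alerted, while more than 30 matching requests is enough on its own to reach the cut-off.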
let ScannerUserAgents = dynamic([
    "masscan", "nmap", "zgrab", "nikto", "sqlmap", "nuclei",
    "dirbuster", "gobuster", "wfuzz", "ffuf", "whatweb",
    "python-requests", "go-http-client", "libwww-perl",
    "shodan", "censys", "binaryedge", "wget/", "lwp-trivial",
    "apachebench", "java/", "ruby"
]);
let FingerPrintPaths = dynamic([
    "/robots.txt", "/.git/", "/.env", "/.env.local", "/.env.production",
    "/phpinfo.php", "/server-status", "/server-info",
    "/crossdomain.xml", "/clientaccesspolicy.xml", "/sitemap.xml",
    "/wp-admin", "/wp-login.php", "/xmlrpc.php",
    "/.well-known/security.txt", "/CHANGELOG.txt", "/readme.html",
    "/web.config", "/WEB-INF/web.xml"
]);
W3CIISLog
| where TimeGenerated > ago(1h)
// Keep requests that match a known scanner User-Agent or a host-disclosure path
| where csUserAgent has_any (ScannerUserAgents)
    or csUriStem has_any (FingerPrintPaths)
| extend
    ClientIP = cIP,
    // scStatus may be ingested as a string, so cast before numeric comparison
    StatusCode = toint(scStatus)
| summarize
    RequestCount = count(),
    UniqueUserAgents = dcount(csUserAgent),
    UniquePaths = dcount(csUriStem),
    HTTP200Count = countif(StatusCode == 200),
    HTTP404Count = countif(StatusCode == 404),
    FirstRequest = min(TimeGenerated),
    LastRequest = max(TimeGenerated),
    SampledUserAgents = make_set(csUserAgent, 10),
    SampledPaths = make_set(csUriStem, 15)
    by ClientIP, bin(TimeGenerated, 1h)
| extend
    DurationMinutes = datetime_diff('minute', LastRequest, FirstRequest),
    // Additive score: volume (0-3) + User-Agent rotation (0-2) + path spread (0-2) + 404 noise (0-1)
    ReconScore = case(RequestCount > 100, 3, RequestCount > 30, 2, RequestCount > 5, 1, 0)
        + case(UniqueUserAgents > 5, 2, UniqueUserAgents > 2, 1, 0)
        + case(UniquePaths > 20, 2, UniquePaths > 8, 1, 0)
        + case(HTTP404Count > 20, 1, 0)
| where ReconScore >= 2
| project
    TimeGenerated, ClientIP, RequestCount, UniqueUserAgents, UniquePaths,
    HTTP200Count, HTTP404Count, DurationMinutes, ReconScore,
    SampledUserAgents, SampledPaths, FirstRequest, LastRequest
| sort by ReconScore desc, RequestCount desc
Data Sources
Web server access logs (IIS W3C format)
Required Tables
W3CIISLog
False Positives
- Legitimate SEO crawlers such as Googlebot, Bingbot, or commercial crawlers (Screaming Frog, Ahrefs, Semrush) may trigger the path match, since they routinely fetch /robots.txt and /sitemap.xml. Allowlist known crawler IP ranges and User-Agent prefixes.
- Internal vulnerability scanners (Nessus, Qualys, Rapid7) run by the security team against web assets will generate identical patterns. Exclude known scanner IP ranges via a watchlist.
- Developer tooling such as wget or Python requests, used legitimately by CI/CD pipelines or deployment scripts, matches the scanner User-Agent list (curl would too if it were added to that list). Baseline known build server IPs.
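The allowlisting described in the bullets above can be sketched as a small suppression check applied to each alert before it is raised. This is an illustrative Python sketch, not part of the rule: the function name `is_allowlisted`, the three allowlist constants, and the example values are assumptions. The IP ranges are RFC 5737 documentation networks standing in for real scanner, build server, and crawler ranges.

```python
import ipaddress

# Hypothetical allowlists; replace the placeholder ranges with real ones.
TRUSTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]     # internal scanners, build servers
CRAWLER_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]  # published crawler ranges
# IIS W3C logs replace spaces in the User-Agent with '+', so prefixes use '+' too.
CRAWLER_UA_PREFIXES = (
    "Mozilla/5.0+(compatible;+Googlebot/",
    "Mozilla/5.0+(compatible;+bingbot/",
)


def is_allowlisted(client_ip: str, user_agent: str) -> bool:
    """Return True when an alert for this client should be suppressed."""
    try:
        ip = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable address: never suppress

    # Internal scanners and build servers are trusted by source IP alone.
    if any(ip in net for net in TRUSTED_NETWORKS):
        return True

    # Crawlers must match both the User-Agent prefix and a known crawler range,
    # because a User-Agent string by itself is trivial to spoof.
    ua_matches = user_agent.startswith(CRAWLER_UA_PREFIXES)
    return ua_matches and any(ip in net for net in CRAWLER_NETWORKS)


if __name__ == "__main__":
    googlebot = "Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)"
    print(is_allowlisted("192.0.2.10", "python-requests/2.31"))
    print(is_allowlisted("203.0.113.5", googlebot))
```

Requiring both a crawler User-Agent and a crawler IP range is a design choice made here so that a scanner posing as Googlebot from an unknown address is still reported.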