T1593

Search Open Websites/Domains

Reconnaissance

This detection identifies automated reconnaissance against your organization's public-facing web assets, which may indicate an adversary gathering pre-attack intelligence (T1593). Because T1593 activity takes place externally (adversaries querying social media, search engines, and public websites), it cannot be observed directly from within the victim environment. This detection instead focuses on second-order indicators: anomalous automated scraping of your web infrastructure (IIS, Apache, Nginx, Azure WAF), known OSINT and reconnaissance tool user agents in web access logs, high-velocity enumeration from single source IPs, and probing of sensitive disclosure paths such as /.git/, /robots.txt, /sitemap.xml, and /admin. These patterns are consistent with the pre-compromise reconnaissance workflows of groups including Volt Typhoon, Mustang Panda, and Kimsuky ahead of phishing or initial access operations.

What is T1593 Search Open Websites/Domains?

Search Open Websites/Domains (T1593) is a MITRE ATT&CK technique under the Reconnaissance tactic, in which the adversary gathers information they can use to plan future operations.

This page provides production-ready detection logic for Search Open Websites/Domains, covering the following data sources: Microsoft Sentinel (IIS logs via W3CIISLog) and Azure Application Gateway WAF. The queries are rated medium severity at low confidence and ship in seven query languages: KQL, SPL, Elastic, QRadar, Sumo, YARA-L, and LogScale.

MITRE ATT&CK

Tactic: Reconnaissance
Technique: T1593 Search Open Websites/Domains
Canonical reference: https://attack.mitre.org/techniques/T1593/
Microsoft Sentinel / Defender
kusto
let KnownReconUserAgents = dynamic(["python-requests", "python-urllib", "go-http-client", "curl/", "wget/", "nuclei", "nikto", "dirbuster", "gobuster", "feroxbuster", "ffuf", "sqlmap", "scrapy", "zgrab", "masscan", "shodan", "censys", "binaryedge", "nmap", "burpsuite", "zap", "httpx", "katana", "subfinder", "amass", "theHarvester", "mechanize", "httplib2", "libwww-perl"]);
let SensitivePaths = dynamic(["/.git", "/.env", "/wp-admin", "/phpmyadmin", "/admin", "/robots.txt", "/sitemap.xml", "/.htaccess", "/web.config", "/backup", "/config", "/.well-known", "/xmlrpc.php", "/wp-login"]);
W3CIISLog
| where TimeGenerated > ago(1h)
| where isnotempty(cIP)
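// has_any is case-insensitive, so no lowercased copy of the user agent is needed.
// W3C logs commonly record a missing user agent as "-", so treat that as empty too.
| extend IsReconUA = iff(
    csUserAgent has_any (KnownReconUserAgents) or isempty(csUserAgent) or csUserAgent == "-",
    true, false)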
| extend IsSensitivePath = iff(
    csUriStem has_any (SensitivePaths),
    true, false)
| summarize
    TotalRequests = count(),
    UniqueURIs = dcount(csUriStem),
    UniquePaths = make_set(csUriStem, 30),
    ReconUARequests = countif(IsReconUA == true),
    SensitivePathHits = countif(IsSensitivePath == true),
    StatusCodes = make_set(scStatus),
    UserAgents = make_set(csUserAgent, 10),
    FirstRequest = min(TimeGenerated),
    LastRequest = max(TimeGenerated)
    by cIP, bin(TimeGenerated, 1h)
| where TotalRequests > 30 or ReconUARequests > 5 or SensitivePathHits > 3 or UniqueURIs > 25
| extend RiskScore = case(
    ReconUARequests > 20 and SensitivePathHits > 5, "High",
    ReconUARequests > 5 or SensitivePathHits > 3 or UniqueURIs > 50, "Medium",
    "Low")
| project
    TimeGenerated,
    SourceIP = cIP,
    TotalRequests,
    UniqueURIs,
    ReconUARequests,
    SensitivePathHits,
    SampledPaths = UniquePaths,
    UserAgents,
    StatusCodes,
    RiskScore,
    FirstRequest,
    LastRequest
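// RiskScore is a string, so sorting on it directly orders alphabetically (High, Low, Medium).
// Rank it numerically to surface the highest-risk sources first.
| extend RiskRank = case(RiskScore == "High", 3, RiskScore == "Medium", 2, 1)
| order by RiskRank desc, TotalRequests desc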

Detects automated reconnaissance against public-facing web assets by correlating known OSINT and scanning tool user agents in IIS access logs with high-velocity enumeration patterns, sensitive path probing (/.git, /.env, /admin, /wp-admin), and anomalously high unique URI counts from single source IPs. Targets pre-compromise intelligence gathering consistent with T1593 sub-techniques (social media, search engine dorking, code repository searches) that manifest as automated scraping when adversaries pivot to directly probing your infrastructure.
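
For offline triage of exported access logs, the same thresholds and risk tiers can be sketched in Python. This is a minimal illustration, not the shipped detection: the function names and the abbreviated token lists are assumptions, and it uses plain substring matching, which is looser than KQL's term-based `has_any` (for example, `/admin` would also match `/administrator`).

```python
from collections import defaultdict

# Abbreviated, illustrative copies of the lists in the query; extend as needed.
RECON_UA_TOKENS = ("python-requests", "curl/", "gobuster", "nikto", "sqlmap")
SENSITIVE_PATHS = ("/.git", "/.env", "/wp-admin", "/admin", "/phpmyadmin")


def is_recon_ua(user_agent):
    """True for an empty or "-" user agent, or one containing a known tool token."""
    ua = (user_agent or "").lower()
    return ua in ("", "-") or any(tok in ua for tok in RECON_UA_TOKENS)


def is_sensitive_path(uri_stem):
    """True when the URI stem contains one of the sensitive path fragments."""
    stem = uri_stem.lower()
    return any(p in stem for p in SENSITIVE_PATHS)


def score_sources(requests):
    """Aggregate (client_ip, uri_stem, user_agent) tuples per source IP and
    apply the query's alert thresholds and risk tiers.
    Returns {client_ip: risk} for flagged sources only."""
    stats = defaultdict(lambda: {"total": 0, "uris": set(), "recon_ua": 0, "sensitive": 0})
    for ip, stem, ua in requests:
        s = stats[ip]
        s["total"] += 1
        s["uris"].add(stem)
        s["recon_ua"] += is_recon_ua(ua)
        s["sensitive"] += is_sensitive_path(stem)

    results = {}
    for ip, s in stats.items():
        unique = len(s["uris"])
        # Same alerting condition as the query's final `where` clause.
        if not (s["total"] > 30 or s["recon_ua"] > 5 or s["sensitive"] > 3 or unique > 25):
            continue
        if s["recon_ua"] > 20 and s["sensitive"] > 5:
            results[ip] = "High"
        elif s["recon_ua"] > 5 or s["sensitive"] > 3 or unique > 50:
            results[ip] = "Medium"
        else:
            results[ip] = "Low"
    return results
```

Unlike the query, this sketch does not bin by hour; it assumes the caller passes in the requests for a single time window.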

Medium severity, low confidence

Data Sources

Microsoft Sentinel (IIS Logs via W3CIISLog), Azure Application Gateway WAF

Required Tables

W3CIISLog

False Positives

  • Legitimate commercial web crawlers and search engine bots (Googlebot, Bingbot, DuckDuckGo) routinely fetch /robots.txt and /sitemap.xml and can exceed the request-volume and unique-URI thresholds; allowlist verified crawler IP ranges from the respective ASNs
  • Security vendors running authorized external attack surface scans (Qualys, Tenable, Rapid7) will produce reconnaissance-like patterns; maintain an allowlist of authorized scanner IPs
  • Developers or internal teams using curl, Python requests, or httpx for legitimate API testing or load testing against production endpoints
  • Content delivery networks and uptime monitoring services (Pingdom, UptimeRobot, StatusCake) making frequent automated HEAD/GET requests
  • Partners or customers running automated integrations that access your web endpoints at high frequency
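
The allowlisting recommended above can be sketched as a post-filter on alert results. This is a hedged example: the CIDR ranges are documentation-range placeholders rather than real vendor ranges, and `is_allowlisted` and `drop_allowlisted` are hypothetical helper names.

```python
import ipaddress

# Hypothetical allowlist: documentation-range CIDRs standing in for your
# authorized scanner, monitoring, and partner ranges.
ALLOWLISTED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),    # e.g. contracted vulnerability scanner
    ipaddress.ip_network("2001:db8:100::/48"),  # e.g. uptime monitoring service
]


def is_allowlisted(source_ip):
    """Return True when source_ip falls inside any allowlisted network.
    Unparseable values (such as a "-" placeholder) are never allowlisted."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        return False
    return any(addr in net for net in ALLOWLISTED_NETWORKS)


def drop_allowlisted(alerts):
    """Keep only alert dicts (each with a 'SourceIP' key) from non-allowlisted sources."""
    return [a for a in alerts if not is_allowlisted(a["SourceIP"])]
```

Filtering on verified IP ranges rather than user agent strings is deliberate here, since a user agent is trivially spoofed by the client.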

Sigma rule & cross-platform mapping

The detection logic for Search Open Websites/Domains (T1593) above is provided in a vendor-neutral form so you can deploy it on any SIEM. The same logic ships as native queries in KQL (Microsoft Sentinel / Defender), SPL (Splunk), EQL (Elastic Security), AQL (IBM QRadar), Sumo Logic CSE, YARA-L (Google Chronicle / SecOps), and CQL (CrowdStrike LogScale). In Sigma terms, this detection targets the following logsource:

logsource:
  product: azure


Testing Methodology

Validate this detection against 3 adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see. Executable commands and cleanup steps are available with Pro.

  1. Test 1: Automated Web Reconnaissance with Python Requests

    Expected signal: Web server access logs will show 25+ requests from 127.0.0.1 with user agent 'python-requests/2.x.x' hitting sensitive paths including /.git/config, /.env, /wp-admin, and /wp-config.php. IIS W3CIISLog or Apache access_combined logs will capture all requests.

  2. Test 2: Directory Enumeration with Gobuster (DNS/HTTP Mode)

    Expected signal: Web server logs will show rapid sequential requests from 127.0.0.1 with user agent 'gobuster/3.x'. Each wordlist entry appears as a separate GET request. Requests arrive at ~5 concurrent requests/second. Response codes 200, 301, 302, 403, and 404 will be visible, depending on what exists on the target.

  3. Test 3: OSINT Reconnaissance with theHarvester Against Your Own Domain

    Expected signal: DNS resolver logs and network flow logs will show multiple DNS queries for subdomains of the target domain originating from the test host. If your DNS logging infrastructure captures queries, these appear as sequential lookups for www.example.com, mail.example.com, api.example.com, etc. theHarvester's search engine queries go to external infrastructure such as Bing and are logged there, not by the victim, which illustrates the external nature of T1593.
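
To confirm the expected web log signal for Tests 1 and 2, a raw IIS log file can be checked before the data reaches the SIEM. The following is a minimal sketch that assumes W3C extended format with a `#Fields:` directive; the function names are hypothetical, and which fields are present depends on your IIS logging configuration.

```python
from collections import Counter


def parse_w3c_log(lines):
    """Yield one dict per request from W3C extended log lines, using the
    most recent '#Fields:' directive to name the columns."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split(" ")
            # Skip malformed rows whose column count does not match the header.
            if len(values) == len(fields):
                yield dict(zip(fields, values))


def summarize_test_traffic(lines, source_ip, ua_prefix):
    """Count requests and status codes from source_ip whose user agent
    starts with ua_prefix (case-insensitive), and list the paths requested."""
    statuses = Counter()
    paths = set()
    for row in parse_w3c_log(lines):
        ua = row.get("cs(User-Agent)", "").lower()
        if row.get("c-ip") == source_ip and ua.startswith(ua_prefix.lower()):
            statuses[row.get("sc-status", "?")] += 1
            paths.add(row.get("cs-uri-stem", ""))
    return {
        "requests": sum(statuses.values()),
        "statuses": dict(statuses),
        "paths": sorted(paths),
    }
```

For Test 1, calling `summarize_test_traffic(open(log_path), "127.0.0.1", "python-requests")` with `log_path` pointing at the current IIS log should report the 25+ requests and the sensitive paths listed in the expected signal.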
