Detect Search Open Websites/Domains in Splunk
This detection identifies automated reconnaissance activity against your organization's public-facing web assets, which may indicate an adversary gathering pre-attack intelligence (MITRE ATT&CK T1593). Because T1593 occurs externally (adversaries querying social media, search engines, and public websites), it cannot be observed directly at the network level from within the victim environment. This detection instead focuses on second-order indicators: anomalous automated scraping patterns against your web infrastructure (IIS, Apache, Nginx, Azure WAF), known OSINT/reconnaissance tool user agents in web access logs, high-velocity enumeration from single source IPs, and probing of sensitive disclosure paths such as /.git/, /robots.txt, /sitemap.xml, and /admin. These patterns correlate with pre-compromise reconnaissance workflows used by groups including Volt Typhoon, Mustang Panda, and Kimsuky prior to phishing or initial access operations.
MITRE ATT&CK
- Tactic: Reconnaissance
- Technique: T1593 Search Open Websites/Domains
- Canonical reference: https://attack.mitre.org/techniques/T1593/
SPL Detection Query
index=web (sourcetype="iis" OR sourcetype="apache:access" OR sourcetype="nginx:plus:access" OR sourcetype="access_combined" OR sourcetype="access_combined_wcookie")
| eval ua_lower=lower(coalesce(http_user_agent, "-"))
| eval is_recon_ua=if(
    match(ua_lower, "(python-requests|python-urllib|go-http-client|nuclei|nikto|dirbuster|gobuster|feroxbuster|ffuf|sqlmap|scrapy|zgrab|masscan|shodan|censys|binaryedge|nmap|burpsuite|zaproxy|httpx|katana|subfinder|amass|theharvester|mechanize|libwww-perl|curl\/|wget\/)")
    OR ua_lower="-",
    1, 0)
| eval is_sensitive_path=if(
    match(uri_path, "(\/\.git|\/\.env|\/wp-admin|\/phpmyadmin|\/admin|\/robots\.txt|\/sitemap\.xml|\/\.htaccess|\/web\.config|\/backup|\/config|\/xmlrpc\.php|\/wp-login|\/\.well-known)"),
    1, 0)
| eval event_time=_time
| bin _time span=1h
| stats
    count as total_requests,
    dc(uri_path) as unique_uris,
    sum(is_recon_ua) as recon_ua_requests,
    sum(is_sensitive_path) as sensitive_path_hits,
    values(http_user_agent) as user_agents,
    values(status) as status_codes,
    values(uri_path) as sampled_paths,
    min(event_time) as first_request,
    max(event_time) as last_request
    by _time, src_ip
| where total_requests > 30 OR recon_ua_requests > 5 OR sensitive_path_hits > 3 OR unique_uris > 25
| eval risk_score=case(
    recon_ua_requests > 20 AND sensitive_path_hits > 5, "High",
    recon_ua_requests > 5 OR sensitive_path_hits > 3 OR unique_uris > 50, "Medium",
    1=1, "Low")
| eval risk_rank=case(risk_score="High", 3, risk_score="Medium", 2, 1=1, 1)
| sort - risk_rank, - total_requests
| table _time, src_ip, total_requests, unique_uris, recon_ua_requests, sensitive_path_hits, user_agents, status_codes, sampled_paths, risk_score, first_request, last_request

The query correlates web server access logs (IIS, Apache, Nginx) against known reconnaissance tool user agent strings and sensitive path enumeration patterns. It groups requests by source IP in 1-hour windows and scores each window on request volume, known OSINT tool fingerprints, and hits against sensitive disclosure endpoints. The per-event timestamp is copied to event_time before binning so that first_request and last_request reflect actual request times rather than the start of the hour, and results are sorted on a numeric risk_rank because sorting the text labels would order them alphabetically. The result surfaces automated pre-attack reconnaissance consistent with adversary OSINT collection prior to phishing or initial access.
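The windowing and scoring logic can be mirrored offline to sanity-check the thresholds before deploying the search. The following is a minimal Python sketch, not part of the detection itself: the function names `is_recon_ua` and `score_windows` are hypothetical, the regexes are abbreviated copies of the ones in the query, and events are assumed to arrive as `(epoch, src_ip, user_agent, uri_path)` tuples.

```python
import re
from collections import defaultdict

# Abbreviated copies of the patterns in the SPL query (assumption: extend as needed).
RECON_UA = re.compile(
    r"(python-requests|python-urllib|go-http-client|nuclei|nikto|gobuster|ffuf|sqlmap|curl/|wget/)"
)
SENSITIVE_PATH = re.compile(
    r"(/\.git|/\.env|/wp-admin|/phpmyadmin|/admin|/robots\.txt|/sitemap\.xml|/backup|/config)"
)


def is_recon_ua(user_agent):
    """Return 1 for a missing, "-", or known-tool user agent, else 0."""
    if user_agent is None or user_agent == "-":
        return 1
    return 1 if RECON_UA.search(user_agent.lower()) else 0


def score_windows(events):
    """Aggregate events per (hour, src_ip) and return a risk label per window."""
    buckets = defaultdict(lambda: {"total": 0, "uris": set(), "recon": 0, "sensitive": 0})
    for epoch, src_ip, user_agent, uri_path in events:
        # Truncate the timestamp to the hour, like `bin _time span=1h`.
        b = buckets[(epoch - epoch % 3600, src_ip)]
        b["total"] += 1
        b["uris"].add(uri_path)
        b["recon"] += is_recon_ua(user_agent)
        b["sensitive"] += 1 if SENSITIVE_PATH.search(uri_path) else 0

    results = {}
    for key, b in buckets.items():
        unique = len(b["uris"])
        # Same alerting condition as the `where` clause.
        if not (b["total"] > 30 or b["recon"] > 5 or b["sensitive"] > 3 or unique > 25):
            continue
        # Same tiers as the `case` expression.
        if b["recon"] > 20 and b["sensitive"] > 5:
            results[key] = "High"
        elif b["recon"] > 5 or b["sensitive"] > 3 or unique > 50:
            results[key] = "Medium"
        else:
            results[key] = "Low"
    return results
```

Feeding the sketch a handful of hand-built events per source IP makes it easy to see which threshold a given traffic pattern trips, which is useful when tuning the numbers for a busy site.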
Data Sources
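Required Sourcetypes
- iis
- apache:access
- nginx:plus:access
- access_combined
- access_combined_wcookie
The query also expects the fields http_user_agent, uri_path, status, and src_ip to be extracted from these sourcetypes; alias them if your add-ons use different field names.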
False Positives & Tuning
- Authorized security assessment firms conducting external vulnerability scans; coordinate with the security team to maintain a scanner IP allowlist
- Search engine crawlers (Googlebot, Bingbot) with legitimate high-frequency enumeration patterns
- Internal CI/CD pipelines or smoke test frameworks that perform automated HTTP checks against production endpoints
- API monitoring tools (Postman, Insomnia automated runners) used by developers for production health checks
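One way to apply the scanner allowlist mentioned above is to filter alert rows by source network after the search runs. The sketch below is a hypothetical post-processing step in Python: the CIDR ranges are documentation-range placeholders, and `is_allowlisted` and `filter_alerts` are illustrative names, not part of Splunk. Matching on source IP rather than user agent is deliberate, since user agent strings such as Googlebot are trivially spoofed.

```python
import ipaddress

# Hypothetical allowlist using documentation ranges; replace with your real scanner ranges.
ALLOWLISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/28"),      # e.g. contracted assessment firm
    ipaddress.ip_network("198.51.100.10/32"),  # e.g. CI/CD smoke-test runner
]


def is_allowlisted(src_ip):
    """Return True if src_ip parses as an IP address inside an allowlisted network."""
    try:
        addr = ipaddress.ip_address(src_ip)
    except ValueError:
        # Unparseable values are never allowlisted, so they still alert.
        return False
    return any(addr in net for net in ALLOWLISTED_NETWORKS)


def filter_alerts(alerts):
    """Drop alert rows (dicts with a "src_ip" key) that come from allowlisted sources."""
    return [a for a in alerts if not is_allowlisted(a["src_ip"])]
```

The same ranges could instead be kept in a Splunk lookup and excluded inside the search; the Python form is shown only because it is easy to test in isolation.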
Testing Methodology
Validate this detection against three adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see.
- Test 1: Automated Web Reconnaissance with Python Requests
  Expected signal: Web server access logs will show 25+ requests from 127.0.0.1 with user agent 'python-requests/2.x.x' hitting sensitive paths including /.git/config, /.env, /wp-admin, and /wp-config.php. IIS W3CIISLog or Apache access_combined logs will capture all requests.
- Test 2: Directory Enumeration with Gobuster (DNS/HTTP Mode)
  Expected signal: Web server logs will show rapid sequential requests from 127.0.0.1 with user agent 'gobuster/3.x'. Each wordlist entry appears as a separate GET request, arriving at a rate of roughly 5 requests per second. Response codes 200, 301, 302, 403, and 404 will be visible depending on what exists on the target.
- Test 3: OSINT Reconnaissance with theHarvester Against Your Own Domain
  Expected signal: If your DNS logging infrastructure captures queries, DNS resolver logs and network flow logs will show sequential lookups for subdomains of the target domain (www.example.com, mail.example.com, api.example.com, etc.) originating from the test host. theHarvester's search-engine queries are external to the target and are logged by Bing and other search infrastructure, not by the victim, which validates the external nature of T1593. Because the SPL query reads only web access logs, this test is not expected to trigger it.
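If running the tools against a live server is not practical, the expected signal for Test 1 can be approximated by ingesting synthetic access log lines into a test index. The sketch below generates lines in Apache combined log format; the function name `combined_log_lines`, the 404 status, the 196-byte response size, and the concrete user agent version in the usage example are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Sensitive paths listed in the Test 1 expected signal.
PATHS = ["/.git/config", "/.env", "/wp-admin", "/wp-config.php"]


def combined_log_lines(src_ip, user_agent, paths, count, start):
    """Build `count` combined-format log lines, one request per second from `start`."""
    lines = []
    for i in range(count):
        # Combined log timestamp, e.g. 15/Jan/2024:12:00:00 +0000 (start must be tz-aware).
        ts = (start + timedelta(seconds=i)).strftime("%d/%b/%Y:%H:%M:%S %z")
        path = paths[i % len(paths)]
        # Status 404 and size 196 are placeholder values, not observed data.
        lines.append(
            f'{src_ip} - - [{ts}] "GET {path} HTTP/1.1" 404 196 "-" "{user_agent}"'
        )
    return lines


if __name__ == "__main__":
    start = datetime(2024, 1, 15, 12, 0, 0, tzinfo=timezone.utc)
    for line in combined_log_lines("127.0.0.1", "python-requests/2.31.0", PATHS, 25, start):
        print(line)
```

Writing the output to a file and indexing it with the access_combined sourcetype should yield 25 requests from one source within a single hour. Whether the search then fires depends on your field extractions, in particular on the client address being available as src_ip.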
References
- https://attack.mitre.org/techniques/T1593/
- https://attack.mitre.org/techniques/T1593/001/
- https://attack.mitre.org/techniques/T1593/002/
- https://attack.mitre.org/techniques/T1593/003/
- https://www.cisa.gov/news-events/cybersecurity-advisories/aa24-038a
- https://www.microsoft.com/en-us/security/blog/2023/05/24/volt-typhoon-targets-us-critical-infrastructure-with-living-off-the-land-techniques/
- https://www.greynoise.io/blog/understanding-mass-internet-scanners
- https://securitytrails.com/blog/google-hacking-techniques