Which SIEM platforms have a T1594 detection rule?

df00tech provides T1594 (Search Victim-Owned Websites) detection queries for 7 SIEM platforms: Microsoft Sentinel / Defender, Splunk, Elastic Security (EQL), IBM QRadar (AQL), Sumo Logic CSE, Google Chronicle / SecOps, CrowdStrike LogScale (CQL).

What data sources are required to detect Search Victim-Owned Websites (T1594)?

Detecting Search Victim-Owned Websites requires the following data sources: Microsoft Sentinel (W3C IIS Logs), Azure WAF Logs.

What severity and confidence is the T1594 detection?

The T1594 detection is rated medium severity with low confidence.

What are common false positives for T1594?

Common false positives for T1594 include:
Legitimate search engine crawlers (Googlebot, Bingbot, DuckDuckBot) with high request volumes: filter by known crawler IP ranges and UA strings
Authorized penetration testing or red team engagements scheduled by the organization: cross-reference with change management records
Web archiving services such as archive.org (Internet Archive) performing scheduled snapshots

Response playbooks, investigation guides, and Atomic Red Team tests are Pro-only. Upgrade to unlock the full detection package for T1594.


T1594 Splunk · SPL

Detect Search Victim-Owned Websites in Splunk

This detection identifies adversary reconnaissance against victim-owned websites, including automated crawling, directory enumeration, and harvesting of sensitive pages such as robots.txt, sitemap.xml, staff/contact directories, and hidden paths. Because T1594 is a Reconnaissance technique that takes place before the adversary has any foothold in the victim network, detection relies on web server access logs, WAF telemetry, and CDN logs ingested into the SIEM. The query focuses on high-volume requests from a single source IP, enumeration of employee/contact pages, known scraping-tool user agents, and sequential access patterns indicative of the automated reconnaissance tooling used by groups such as Kimsuky, Volt Typhoon, Silent Librarian, and Sandworm Team.

MITRE ATT&CK

Tactic: Reconnaissance
Technique: T1594 Search Victim-Owned Websites
Canonical reference: https://attack.mitre.org/techniques/T1594/

SPL Detection Query

Splunk (SPL)


index=* (sourcetype="access_combined" OR sourcetype="access_combined_wcookie" OR sourcetype="iis" OR sourcetype="ms:iis:auto" OR sourcetype="stream:http")
| eval ua_lower=lower(coalesce(http_user_agent, cs_User_Agent, useragent))
| eval uri=coalesce(uri_path, cs_uri_stem, url)
| eval status_code=coalesce(status, sc_status)
| eval src=coalesce(src_ip, c_ip, clientip)
| eval is_recon_ua=if(match(ua_lower, "(scrapy|python-requests|python-urllib|wget\/|nikto|masscan|nmap|zgrab|go-http-client|libwww-perl|dirbuster|gobuster|feroxbuster|wfuzz|ffuf|fuzz faster u fool|sqlmap|whatweb)"), 1, 0)
| eval is_recon_path=if(match(uri, "(?i)(robots\.txt|sitemap.*\.xml|\.well-known|/\.git|/\.env|/wp-admin|/phpmyadmin)"), 1, 0)
| eval is_employee_page=if(match(uri, "(?i)/(staff|team|employees|people|directory|contact|about|leadership|management|board)"), 1, 0)
| eval is_404=if(status_code="404" OR status_code=404, 1, 0)
| eval is_403=if(status_code="403" OR status_code=403, 1, 0)
| where is_recon_ua=1 OR is_recon_path=1 OR is_404=1 OR is_employee_page=1
| bin _time span=10m
| stats
    count as total_requests,
    sum(is_404) as count_404,
    sum(is_403) as count_403,
    dc(uri) as unique_paths,
    sum(is_recon_path) as recon_path_hits,
    sum(is_employee_page) as employee_page_hits,
    sum(is_recon_ua) as suspicious_ua_hits,
    values(ua_lower) as user_agents,
    values(uri) as accessed_paths
    by _time, src, host
| eval recon_score=0
| eval recon_score=recon_score + case(count_404 > 50, 3, count_404 > 20, 2, count_404 > 5, 1, true(), 0)
| eval recon_score=recon_score + case(total_requests > 500, 3, total_requests > 200, 2, total_requests > 100, 1, true(), 0)
| eval recon_score=recon_score + case(recon_path_hits > 3, 2, recon_path_hits >= 1, 1, true(), 0)
| eval recon_score=recon_score + if(suspicious_ua_hits > 0, 2, 0)
| eval recon_score=recon_score + case(employee_page_hits > 5, 2, employee_page_hits >= 1, 1, true(), 0)
| where recon_score >= 3
| sort -recon_score, -total_requests
| table _time, src, host, total_requests, count_404, count_403, unique_paths, recon_path_hits, employee_page_hits, suspicious_ua_hits, recon_score, user_agents, accessed_paths

Severity: medium
Confidence: low

Detects website reconnaissance patterns in web access logs by identifying high 404 rates (directory enumeration), access to sensitive paths (robots.txt, sitemap.xml, .git), employee/contact page harvesting, and known scraping-tool user agents. A composite score is computed per source IP and host over 10-minute windows, and only windows scoring 3 or higher are returned, so the sources most likely to be performing reconnaissance surface first.
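The composite score can be hard to reason about from the SPL alone. The sketch below mirrors the thresholds in the query's recon_score evals so they can be checked offline against one window's counters. The names recon_score, _tier, and ALERT_THRESHOLD are illustrative and not part of the df00tech rule.

```python
ALERT_THRESHOLD = 3  # mirrors `where recon_score >= 3` in the SPL query


def _tier(value, tiers):
    """Return the points for the first (minimum, points) tier that value exceeds."""
    for minimum, points in tiers:
        if value > minimum:
            return points
    return 0


def recon_score(count_404, total_requests, recon_path_hits,
                suspicious_ua_hits, employee_page_hits):
    """Recompute the query's composite score for one 10-minute window."""
    score = _tier(count_404, [(50, 3), (20, 2), (5, 1)])
    score += _tier(total_requests, [(500, 3), (200, 2), (100, 1)])
    score += _tier(recon_path_hits, [(3, 2), (0, 1)])      # >3 -> 2, >=1 -> 1
    score += 2 if suspicious_ua_hits > 0 else 0
    score += _tier(employee_page_hits, [(5, 2), (0, 1)])   # >5 -> 2, >=1 -> 1
    return score


def would_alert(**counters):
    """True when the window's score reaches the query's alert threshold."""
    return recon_score(**counters) >= ALERT_THRESHOLD
```

For example, a window with 12 requests, 12 of them 404s, two sensitive-path hits, no flagged user agent, and one employee-page hit scores 1 + 0 + 1 + 0 + 1 = 3, which just meets the threshold. Note that total_requests in the query is counted after the where filter, so it reflects only flagged events, not all traffic from the source.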

Data Sources

Web Access Logs (Apache/Nginx/IIS)
Splunk Stream HTTP

Required Sourcetypes

access_combined
iis
stream:http
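Because these sourcetypes name the same fields differently, the query begins by coalescing them into four common fields. A minimal Python sketch of that normalization step follows, useful for checking sample events before onboarding a new log source. The alias lists are taken from the query's coalesce() calls; FIELD_ALIASES and normalize_event are hypothetical names, and which sourcetype emits which field depends on the add-ons in your environment.

```python
# Field aliases taken from the coalesce() calls in the SPL query.
FIELD_ALIASES = {
    "user_agent": ("http_user_agent", "cs_User_Agent", "useragent"),
    "uri": ("uri_path", "cs_uri_stem", "url"),
    "status_code": ("status", "sc_status"),
    "src": ("src_ip", "c_ip", "clientip"),
}


def normalize_event(event):
    """Map a raw event dict onto the four common fields the detection uses."""
    normalized = {}
    for target, candidates in FIELD_ALIASES.items():
        # Like SPL coalesce(): take the first candidate that is present and not null.
        normalized[target] = next(
            (event[name] for name in candidates if event.get(name) is not None),
            None,
        )
    if normalized["user_agent"] is not None:
        # The query lowercases the user agent before pattern matching.
        normalized["user_agent"] = normalized["user_agent"].lower()
    return normalized
```

An event that yields None for src or uri after normalization would not be grouped or matched as intended by the query, which is a quick sign that the source needs a field alias or extraction.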

False Positives & Tuning

Legitimate search engine crawlers (Googlebot, Bingbot) generating high request volumes against public-facing pages
Authorized security scanning or bug bounty researchers operating within disclosed scope
Web performance monitoring tools (Pingdom, New Relic Synthetics, Datadog) performing health checks
Content Delivery Network (CDN) origin pull requests that appear as high-volume single-source traffic
Marketing analytics bots and SEO auditing tools used by internal teams
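One way to tune out the crawler and monitoring false positives above is to allowlist a source only when both its user agent and its IP range check out, since user-agent strings are trivially spoofed. The sketch below assumes you maintain your own list of trusted ranges. The networks shown are RFC 5737 documentation placeholders, not real crawler ranges, and is_allowlisted_crawler is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks). Replace with ranges
# you maintain for crawlers and monitoring services you actually trust.
ALLOWLISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]
CRAWLER_UA_TOKENS = ("googlebot", "bingbot")


def is_allowlisted_crawler(src_ip, user_agent):
    """True only when the UA names a known crawler AND the IP is in a trusted range."""
    ua = (user_agent or "").lower()
    if not any(token in ua for token in CRAWLER_UA_TOKENS):
        return False
    try:
        address = ipaddress.ip_address(src_ip)
    except ValueError:
        return False  # unparseable source values are never allowlisted
    return any(address in network for network in ALLOWLISTED_NETWORKS)
```

A request claiming to be Googlebot from an address outside the trusted ranges returns False and stays in scope for the detection. In Splunk the same idea is usually expressed as a lookup-based exclusion before the stats stage.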

Other platforms for T1594

Microsoft Sentinel (KQL)
Elastic Security (Elastic)
IBM QRadar (QRadar)
Sumo Logic CSE (Sumo)
Google Chronicle (YARA-L)
CrowdStrike LogScale (LogScale)
All platforms (combined)

Testing Methodology

Validate this detection against 3 adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see. Executable commands and cleanup steps are available with Pro.

Test 1: Automated Website Crawling with wget Spider Mode
Expected signal: Web server access logs showing rapid sequential GET requests from a single IP with a wget user agent. Multiple 200, 301, and 404 responses across diverse URL paths. Request rate 20-100 req/min.
Test 2: Reconnaissance Path Enumeration with robots.txt and sitemap.xml Harvest
Expected signal: Sequential requests to robots.txt and sitemap.xml, then employee-related paths. User agent 'python-requests' in all requests. Mix of 200 and 404 responses across a 60-second window.
Test 3: Directory Enumeration with ffuf Wordlist Scanning
Expected signal: Burst of 404 responses (12 requests, 1 per path) within 90 seconds. ffuf or spoofed browser UA. Requests for paths like /admin, /staff, /.git, /.env. Rate approximately 10 req/min.
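To sanity-check an expected signal before running a test, the query's per-event flags can be replayed offline. The sketch below applies the path and employee-page patterns from the SPL query, plus an abbreviated user-agent pattern, to a hypothetical set of Test 2 style events. The sample paths, status codes, and user-agent version are invented for illustration and are not the Atomic Red Team test itself.

```python
import re

# Abbreviated from the query's user-agent pattern (only the tokens needed here).
RECON_UA = re.compile(r"(scrapy|python-requests|python-urllib|wget/)")
# Path patterns as written in the query's match() calls.
RECON_PATH = re.compile(
    r"(robots\.txt|sitemap.*\.xml|\.well-known|/\.git|/\.env|/wp-admin|/phpmyadmin)",
    re.IGNORECASE,
)
EMPLOYEE_PAGE = re.compile(
    r"/(staff|team|employees|people|directory|contact|about|leadership|management|board)",
    re.IGNORECASE,
)

# Hypothetical events shaped like Test 2: (uri, status, user_agent).
UA = "python-requests/2.31.0"
TEST_2_EVENTS = [
    ("/robots.txt", 200, UA), ("/sitemap.xml", 200, UA),
    ("/staff", 200, UA), ("/team", 404, UA), ("/about", 200, UA),
    ("/contact", 200, UA), ("/leadership", 404, UA), ("/directory", 404, UA),
]


def summarize(events):
    """Aggregate events into the per-window counters the query computes."""
    counters = {"total_requests": 0, "count_404": 0, "recon_path_hits": 0,
                "employee_page_hits": 0, "suspicious_ua_hits": 0}
    for uri, status, user_agent in events:
        is_ua = bool(RECON_UA.search(user_agent.lower()))
        is_path = bool(RECON_PATH.search(uri))
        is_employee = bool(EMPLOYEE_PAGE.search(uri))
        is_404 = str(status) == "404"
        if not (is_ua or is_path or is_404 or is_employee):
            continue  # same filter as the query's `where` clause
        counters["total_requests"] += 1
        counters["count_404"] += is_404
        counters["recon_path_hits"] += is_path
        counters["employee_page_hits"] += is_employee
        counters["suspicious_ua_hits"] += is_ua
    return counters
```

For these sample events the counters come out as 8 requests, 3 404s, 2 sensitive-path hits, 6 employee-page hits, and 8 flagged user-agent hits. Applying the query's thresholds gives 0 (404s) + 0 (volume) + 1 (sensitive paths) + 2 (user agent) + 2 (employee pages) = 5, above the alert threshold of 3.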

Last updated: 2026-03-19
Research depth: deep

Unlock playbooks & atomic tests with Pro

Get the full detection package for T1594 — response playbook and atomic red team tests, plus investigation guidance and hunting queries.

df00tech Pro — £29/user/month

Includes: Response Playbook, Investigation Guide, Hunting Queries, Atomic Red Team Tests, Tuning Guidance
