Search Engines
Adversaries may use search engines to collect information about victims that can be used during targeting. Search engine services typically crawl online sites to index content and may provide users with specialized syntax to search for specific keywords or specific types of content (e.g., file types). Adversaries may craft search engine queries, commonly called 'Google dorks', to harvest general information about victims, and may use specialized queries to look for spillages or leaks of sensitive information such as network details, credentials, or exposed configuration files. Information from these sources may reveal opportunities for other forms of reconnaissance, establishing operational resources, and/or initial access. The Kimsuky threat group (G0094) has been documented using Google searches to identify target vulnerabilities, tools, and geopolitical trends.
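To make the operator syntax concrete, the following is a minimal Python sketch of how a search query could be recovered from a Referer URL and checked for dork operators. It is an offline illustration, not part of the detection itself. The function names, the shortened operator and keyword lists, and the per-engine parameter names (`q`, `p`, `text`) are assumptions chosen for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Illustrative, shortened lists (assumption); tune them for your environment.
DORK_OPERATORS = ("filetype:", "ext:", "inurl:", "intitle:", "intext:", "site:")
SENSITIVE_TERMS = ("password", "credential", "api_key", "secret", "token", ".env")
# Query-string parameters assumed to carry the search text
# (q for Google/Bing/DuckDuckGo, p for Yahoo, text for Yandex).
QUERY_PARAMS = ("q", "p", "text")


def extract_search_query(referer: str) -> str:
    """Return the decoded search text from a search-engine Referer URL, or ''."""
    params = parse_qs(urlsplit(referer).query)
    for name in QUERY_PARAMS:
        if name in params:
            return params[name][0]
    return ""


def classify_query(query: str) -> dict:
    """Flag dork operators and sensitive keywords in a search query."""
    lowered = query.lower()
    return {
        "has_dork_operator": any(op in lowered for op in DORK_OPERATORS),
        "has_sensitive_term": any(term in lowered for term in SENSITIVE_TERMS),
    }


if __name__ == "__main__":
    referer = "https://www.google.com/search?q=site%3Aexample.com+filetype%3Aenv+password"
    query = extract_search_query(referer)
    print(query, classify_query(query))
```

Many browsers and search engines now send only the origin in the Referer header, so an empty result from `extract_search_query` is common and does not by itself indicate benign traffic.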
let DorkPatterns = dynamic(["filetype:", "ext:", "inurl:", "intitle:", "intext:", "site:", "cache:", "allintitle:", "allinurl:"]);
let SensitiveTerms = dynamic(["password", "passwd", "credential", "api_key", "apikey", "secret", "token", "config", "backup", ".env", "admin", "vpn", "portal", "jira", "confluence", "database"]);
let SearchEngineReferrers = dynamic(["google.com/search", "bing.com/search", "search.yahoo.com", "duckduckgo.com", "yandex.com/search"]);
let RefererDorks = W3CIISLog
| where TimeGenerated > ago(24h)
| where isnotempty(csReferer) and csReferer has_any (SearchEngineReferrers)
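// Google, Bing, and DuckDuckGo pass the query in "q"; Yahoo uses "p" and Yandex uses "text"
| extend SearchQuery = url_decode(extract(@"[?&](?:q|p|text)=([^&]+)", 1, csReferer))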
| where isnotempty(SearchQuery)
| where SearchQuery has_any (DorkPatterns) or SearchQuery has_any (SensitiveTerms)
| extend HasDorkOperator = SearchQuery has_any (DorkPatterns)
| extend HasSensitiveTerm = SearchQuery has_any (SensitiveTerms)
| extend DetectionBranch = "referer_dork"
| project TimeGenerated, cIP, csUsername, csMethod, csUriStem, csUriQuery, SearchQuery, csReferer, HasDorkOperator, HasSensitiveTerm, scStatus, DetectionBranch;
let SensitivePathAccess = W3CIISLog
| where TimeGenerated > ago(24h)
| where csUriStem has_any (".env", ".git/config", ".git/HEAD", "wp-config.php", "web.config", "config.php", ".htpasswd", "/backup", "database.sql", "/credentials", "/.aws/credentials", "/.ssh/id_rsa", "phpinfo.php", "/server-status", "/elmah.axd", "/.DS_Store")
| extend HasDorkOperator = false
| extend HasSensitiveTerm = false
| extend SearchQuery = ""
| extend DetectionBranch = "sensitive_path_access"
| project TimeGenerated, cIP, csUsername, csMethod, csUriStem, csUriQuery, SearchQuery, csReferer, HasDorkOperator, HasSensitiveTerm, scStatus, DetectionBranch;
RefererDorks
| union SensitivePathAccess
| sort by TimeGenerated desc
Data Sources
Required Tables
- W3CIISLog
False Positives
- Legitimate users reaching public web content via normal search engine queries that happen to contain sensitive keywords (e.g., searching for 'admin portal login guide' and landing on your documentation)
- Security researchers and authorized penetration testers performing scheduled reconnaissance assessments against your domains
- Search engine crawlers (Googlebot, Bingbot, DuckDuckBot) probing robots.txt, sitemap.xml, and other indexed paths as part of normal site indexing
- Automated vulnerability scanners (Qualys, Nessus, Burp Suite Enterprise) probing for sensitive file paths during authorized scheduled scans
- Web monitoring and uptime services that access known paths for availability checks, potentially triggering the sensitive path branch
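The false positive sources above can be partly separated out during triage. The sketch below, in Python, labels an event by source IP and user agent. It assumes the user agent is available alongside the client IP (the query does not project it). The allowlisted networks are documentation-only placeholder ranges, and `triage_label` is a hypothetical helper name. A user-agent match only shows what the client claims to be, so it should lower priority rather than suppress the alert.

```python
import ipaddress

# Hypothetical allowlist: replace with the source ranges of your authorized
# scanners and monitoring services. These are documentation-only ranges.
AUTHORIZED_SCANNER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]
# Crawler names taken from the false positive list; user agents are spoofable.
CRAWLER_TOKENS = ("googlebot", "bingbot", "duckduckbot")


def triage_label(client_ip: str, user_agent: str) -> str:
    """Return a coarse triage label for one web log event."""
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in AUTHORIZED_SCANNER_NETWORKS):
        return "authorized_scanner"
    if any(token in user_agent.lower() for token in CRAWLER_TOKENS):
        return "claimed_crawler"
    return "review"


if __name__ == "__main__":
    print(triage_label("192.0.2.44", "Mozilla/5.0"))
    print(triage_label("203.0.113.9", "curl/8.0"))
```

Checking the IP allowlist first is a deliberate choice in this sketch: a source address is harder to forge than a user-agent string, so it is treated as the stronger signal.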
References
- https://attack.mitre.org/techniques/T1593/002/
- https://www.exploit-db.com/google-hacking-database
- https://www.recordedfuture.com/threat-intelligence-101/threat-analysis-techniques/google-dorks
- https://securitytrails.com/blog/google-hacking-techniques
- https://github.com/laramies/theHarvester
- https://developers.google.com/search/docs/crawling-indexing/robots/intro
- https://search.google.com/search-console/about
- https://learn.microsoft.com/en-us/azure/sentinel/data-connectors/iis-logs
- https://learn.microsoft.com/en-us/azure/azure-monitor/reference/tables/w3ciislog
- https://www.sans.org/blog/google-hacking-finding-vulnerabilities/