Detect Employee Names in Sumo Logic CSE

Adversaries may gather employee names for use during targeting. Names can be used to derive email addresses, guide other reconnaissance efforts, and craft more believable lures. They are easy to collect because they are often exposed in public sources such as social media (for example LinkedIn), corporate websites, and press releases. Threat actors including Kimsuky, Sandworm Team, and Silent Librarian have been observed collecting victim employee names to support subsequent phishing campaigns, credential attacks, and social engineering operations.

Detection is inherently challenging because this activity occurs primarily outside the victim's environment, on public platforms. Effective detection therefore pivots to three things: monitoring organization-owned web properties for automated scraping, tracking OSINT tool execution on monitored endpoints, and identifying downstream artifacts such as systematic user enumeration against authentication systems.

MITRE ATT&CK

Tactic: Reconnaissance
Technique: T1589 Gather Victim Identity Information
Sub-technique: T1589.003 Employee Names
Canonical reference: https://attack.mitre.org/techniques/T1589/003/

Sumo Detection Query

// ── Branch 1: Web directory scraping via proxy telemetry ──────────────────
(_sourceCategory=*proxy* OR _sourceCategory=*web_gateway* OR _sourceCategory=*zscaler*
 OR _sourceCategory=*bluecoat* OR _sourceCategory=*cisco_wsa* OR _sourceCategory=*squid*)
// The first IPv4 address in the raw message is treated as the client IP
| parse regex "(?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" nodrop
// Keep only requests whose path looks like a staff or team directory page
| parse regex "(?<url_path>(?:https?://[^/]+)?(?:/(?:team|about(?:-us)?|staff|employees|directory|people|our-team|leadership|management|bios|meet-the-team|board|partners)(?:[/?#]|$)[^\s\"]*))"
    nodrop
| where !isNull(url_path)
// Capture only the hostname, without the optional scheme
| parse regex "(?:https?://)?(?<dest_host>[a-z0-9\-\.]+\.[a-z]{2,})" nodrop
| timeslice 1m
| count as EventCount,
    count_distinct(url_path) as UniquePages,
    first(url_path) as URLSample
    by src_ip, dest_host, _timeslice
| where EventCount > 25
| EventCount as RequestsPerMin
// NOTE: with a 1m timeslice and the EventCount > 25 filter above, RequestsPerMin
// always exceeds 15, so every surviving row is tiered HIGH. Raise the HIGH
// threshold if you want the MEDIUM tiers to be reachable.
| if(RequestsPerMin > 15, "HIGH - rapid automated scraping",
    if(EventCount > 60 and UniquePages > 8, "MEDIUM - high volume enumeration",
    if(EventCount > 40, "MEDIUM - elevated directory access", "LOW"))) as ScrapeRisk
| where ScrapeRisk != "LOW"
| "Web_Directory_Scraping" as DetectionType
| fields _timeslice, DetectionType, src_ip, dest_host, EventCount, UniquePages, RequestsPerMin, ScrapeRisk, URLSample
| sort by EventCount desc

// ── Branch 2: OSINT tool execution on managed endpoints (run as separate saved search) ──
// (_sourceCategory=*sysmon* OR _sourceCategory=*winevent* OR _sourceCategory=*endpoint*)
// | where %EventID in ("1", "4688")
// | parse field=%CommandLine "*" as CommandLine nodrop
// | parse field=%Image "*" as Image nodrop
// | where CommandLine matches "*theHarvester*"
//     OR CommandLine matches "*recon-ng*"
//     OR CommandLine matches "*CrossLinked*"
//     OR CommandLine matches "*linkedin2username*"
//     OR CommandLine matches "*spiderfoot*"
//     OR Image matches "*theHarvester*"
//     OR Image matches "*crosslinked*"
// | "HIGH - known OSINT tool on managed endpoint" as ScrapeRisk
// | "Harvesting_Tool_Execution" as DetectionType
// | fields _messageTime, DetectionType, %Computer, CommandLine, Image, ScrapeRisk
// | sort by _messageTime desc
Severity: high · Confidence: medium

Sumo Logic detection for T1589.003 employee name harvesting, in two branches. Branch 1 is the active query: it searches proxy source categories, extracts source IP, destination host, and URL path via regex, aggregates per source IP and destination host in 1-minute buckets using timeslice, then applies volume- and rate-based risk tiering, flagging sources with more than 25 directory page requests in a bucket. Because RequestsPerMin equals EventCount in a 1-minute bucket, every row that passes the >25 filter also exceeds the 15-requests-per-minute HIGH threshold, so the MEDIUM tiers only take effect if you raise that threshold. Branch 2 (commented out) is a companion saved search targeting Sysmon/WinEvent source categories for EventID 1 or 4688, matching OSINT tool names in the CommandLine or Image fields. Deploy both as separate Scheduled Searches with alert thresholds appropriate for your environment's volume.
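Before scheduling Branch 1, it can help to replay its bucketing and tiering offline against a few sample records. The following is a minimal Python sketch, not part of the Sumo query: the function names `tier` and `scrape_buckets`, the tuple layout of `events`, and the sample IPs and hosts are all assumptions made for illustration. It mirrors the directory-path alternation, the 1-minute bucket, the >25 filter, and the tier thresholds from the query.

```python
import re
from collections import defaultdict

# Directory-style paths, mirroring the alternation in the Branch 1 url_path regex.
DIRECTORY_PATH = re.compile(
    r"/(?:team|about(?:-us)?|staff|employees|directory|people|our-team|"
    r"leadership|management|bios|meet-the-team|board|partners)(?:[/?#]|$)"
)


def tier(event_count, unique_pages):
    """Apply the ScrapeRisk tiers in the same order as the query."""
    if event_count > 15:  # RequestsPerMin equals EventCount in a 1-minute bucket
        return "HIGH - rapid automated scraping"
    if event_count > 60 and unique_pages > 8:
        return "MEDIUM - high volume enumeration"
    if event_count > 40:
        return "MEDIUM - elevated directory access"
    return "LOW"


def scrape_buckets(events, min_count=25):
    """Group (epoch_seconds, src_ip, dest_host, url_path) tuples into 1-minute buckets.

    Returns {(src_ip, dest_host, minute): (event_count, unique_pages, risk)}
    for buckets with more than min_count directory-page requests.
    """
    buckets = defaultdict(list)
    for ts, src_ip, dest_host, path in events:
        if DIRECTORY_PATH.search(path):
            buckets[(src_ip, dest_host, ts // 60)].append(path)
    results = {}
    for key, paths in buckets.items():
        if len(paths) > min_count:
            unique = len(set(paths))
            results[key] = (len(paths), unique, tier(len(paths), unique))
    return results


# Hypothetical sample: one noisy source (30 hits) and one quiet source (5 hits).
sample = [(i, "10.0.0.5", "example.com", f"/team/member{i}") for i in range(30)]
sample += [(i, "10.0.0.9", "example.com", "/about-us") for i in range(5)]
for key, value in scrape_buckets(sample).items():
    print(key, value)
```

Running the sketch over a range of counts makes the tier ordering easy to inspect: any bucket that clears the >25 filter already satisfies the first `if`, which is why only the HIGH tier is ever returned with the thresholds as written.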

Data Sources

  • Sumo Logic Installed Collector: proxy log source categories (Squid, Zscaler, Bluecoat, Cisco WSA)
  • Sumo Logic Windows Agent: Sysmon Operational / Windows Security Event Log
  • Sumo Logic Cloud-to-Cloud source for Zscaler or similar SaaS proxy

Required Tables

  • _sourceCategory=*proxy*
  • _sourceCategory=*web_gateway*
  • _sourceCategory=*sysmon*
  • _sourceCategory=*winevent*

False Positives & Tuning

  • Automated HR platform or CRM integrations that scrape internal or third-party employee directories for org chart population — these will produce high volumes of directory path hits from a consistent service account IP
  • Corporate SEO audit tools or web performance monitoring services that crawl all pages including /about and /team on a scheduled basis, generating burst traffic from monitoring infrastructure IPs
  • Internal IT asset inventory or intranet search indexers that recursively crawl company web properties and hit staff directory pages as part of site-wide indexing jobs
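Each of the false-positive sources above comes from a consistent, known set of IPs, so one tuning approach is to suppress alerts from an allowlist of crawler networks. The sketch below illustrates the idea in Python for offline triage; in Sumo Logic the equivalent would more likely be a lookup or an exclusion clause in the search. The network ranges, the `KNOWN_CRAWLER_NETWORKS` name, and the alert dictionary shape are hypothetical placeholders.

```python
import ipaddress

# Hypothetical allowlist: replace with the networks of your own crawlers.
KNOWN_CRAWLER_NETWORKS = [
    ipaddress.ip_network("10.20.30.0/28"),  # e.g. intranet search indexer
    ipaddress.ip_network("192.0.2.10/32"),  # e.g. SEO audit service
]


def is_known_crawler(src_ip):
    """Return True if src_ip falls inside any allowlisted network."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in KNOWN_CRAWLER_NETWORKS)


def suppress_known_crawlers(alerts):
    """Drop alerts (dicts with a 'src_ip' key) raised by allowlisted sources."""
    return [a for a in alerts if not is_known_crawler(a["src_ip"])]


alerts = [
    {"src_ip": "10.20.30.5", "EventCount": 120},   # indexer, suppressed
    {"src_ip": "203.0.113.7", "EventCount": 45},   # unknown source, kept
]
print(suppress_known_crawlers(alerts))
```

Keeping the allowlist as explicit networks rather than broad ranges limits the chance of hiding a real scraper that happens to share address space with monitoring infrastructure.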
Testing Methodology

Validate this detection against the 5 Atomic Red Team tests below. Each test lists the behaviour to exercise and the telemetry you should expect to see.

  1. Test 1: theHarvester Employee Name and Email Enumeration

    Expected signal: Sysmon Event ID 1 (Linux auditd equivalent): process creation for 'theHarvester' or 'python3' with command line arguments '-d example.com -b google'. Sysmon Event ID 3 / auditd SYSCALL: outbound network connections to Google APIs and search endpoints. Sysmon Event ID 11: creation of /tmp/harvest_output.json. On Windows endpoints: DeviceProcessEvents with FileName=python.exe and ProcessCommandLine containing 'theHarvester' and '-b google'.

  2. Test 2: CrossLinked LinkedIn Employee Name to Email Permutation

    Expected signal: Sysmon Event ID 1: process create for python3 with CommandLine containing 'CrossLinked' or 'crosslinked' and '-f' and '{first}.{last}'. Sysmon Event ID 3: outbound DNS and TCP connections to linkedin.com and www.linkedin.com on port 443. Sysmon Event ID 11: file creation at /tmp/crosslinked_names.txt. DeviceProcessEvents (MDE): ProcessCommandLine containing 'crosslinked' or '{first}.{last}'.

  3. Test 3: Corporate Team Page Automated Scraping Simulation

    Expected signal: Sysmon Event ID 3 (Network Connect): repeated outbound connections to httpbin.org:443. Process creation for curl. In a real environment targeting a corporate web property: WAF/proxy logs showing 30+ requests to /team, /about-us, /staff URLs from the same source IP within 60 seconds with User-Agent 'Python-urllib/3.9'. CommonSecurityLog entries with RequestURL matching directory patterns.

  4. Test 4: recon-ng LinkedIn Contacts Module Employee Enumeration

    Expected signal: Sysmon Event ID 1: process create for recon-ng binary or python3 with recon-ng in command path. Sysmon Event ID 11: file creation in ~/.recon-ng/workspaces/employee_hunt/ including SQLite database data.db. Sysmon Event ID 3: outbound connections to linkedin.com, api.linkedin.com on port 443. DeviceProcessEvents: FileName containing 'recon-ng' or ProcessCommandLine containing 'recon-ng'.

  5. Test 5: Hunter.io API Employee Name and Email Harvesting

    Expected signal: Sysmon Event ID 3: outbound DNS query for api.hunter.io and TCP connection to api.hunter.io:443. Process creation for curl or python3 with api.hunter.io in command line arguments. In proxy/web access logs: GET requests to api.hunter.io/v2/domain-search with domain parameter. If monitoring DNS (Sysmon Event ID 22): DNS query for api.hunter.io.
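To see which of the tests above the Branch 2 tool-name patterns would catch, the matching can be approximated offline. This is a rough Python sketch under stated assumptions: the sample command lines are hypothetical reconstructions based on the arguments named in the expected signals, matching is done case-insensitively (which may differ from how your Sumo search behaves), and `matches_osint_tool` is an illustrative name rather than anything defined by the detection.

```python
import re

# Tool names matched against CommandLine in Branch 2.
COMMAND_LINE_TOOLS = re.compile(
    r"theHarvester|recon-ng|CrossLinked|linkedin2username|spiderfoot",
    re.IGNORECASE,  # assumption: case-insensitive matching
)
# Tool names matched against Image in Branch 2.
IMAGE_TOOLS = re.compile(r"theHarvester|crosslinked", re.IGNORECASE)


def matches_osint_tool(command_line, image=""):
    """Return True if either field contains a Branch 2 tool name."""
    return bool(COMMAND_LINE_TOOLS.search(command_line) or IMAGE_TOOLS.search(image))


# Hypothetical command lines, loosely based on the expected signals above.
samples = {
    "test1": "python3 theHarvester.py -d example.com -b google",
    "test2": "python3 crosslinked.py -f '{first}.{last}@example.com' example",
    "test4": "recon-ng -w employee_hunt",
    "test5": "curl https://api.hunter.io/v2/domain-search?domain=example.com",
}
for name, cmd in samples.items():
    print(name, matches_osint_tool(cmd))
```

Under these assumptions the Hunter.io request in Test 5 is not matched by the tool-name list, and its URL path does not fall under the Branch 1 directory patterns either, so covering it would likely need an additional condition on the api.hunter.io destination.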
