T1119

Automated Collection

Once established within a system or network, an adversary may use automated techniques for collecting internal data. Methods for performing this technique could include use of a Command and Scripting Interpreter to search for and copy information fitting set criteria such as file type, location, or name at specific time intervals. In cloud-based environments, adversaries may also use cloud APIs, data pipelines, command line interfaces, or ETL services to automatically collect data. This functionality could also be built into remote access tools. This technique may incorporate use of other techniques such as File and Directory Discovery and Lateral Tool Transfer to identify and move files, as well as Cloud Service Dashboard and Cloud Storage Object Discovery to identify resources in cloud environments.

What is T1119 Automated Collection?

Automated Collection (T1119) maps to the Collection tactic in MITRE ATT&CK, in which the adversary is trying to gather data of interest to their goal.

This page provides production-ready detection logic for Automated Collection, covering the data sources and telemetry it touches: Process: Process Creation; Command: Command Execution; and Microsoft Defender for Endpoint. The queries below are rated high severity at medium confidence and ship in formats for 7 SIEM platforms: KQL, SPL, Elastic, QRadar, Sumo, YARA-L, and LogScale.

MITRE ATT&CK

Tactic: Collection
Technique: T1119 Automated Collection
Canonical reference: https://attack.mitre.org/techniques/T1119/

Microsoft Sentinel / Defender (KQL)

let SensitiveExtensions = dynamic([
  ".doc", ".docx", ".xls", ".xlsx", ".pdf", ".ppt", ".pptx",
  ".mdb", ".accdb", ".csv", ".pst", ".ost", ".kdbx", ".pfx",
  ".pem", ".p12", ".key", ".rtf", ".txt"
]);
DeviceProcessEvents
| where Timestamp > ago(24h)
| where (
    // PowerShell recursive document search and collection
    (FileName in~ ("powershell.exe", "pwsh.exe") and
     ProcessCommandLine has_any ("-Recurse", "Get-ChildItem", "gci ") and
     ProcessCommandLine has_any (SensitiveExtensions))
    or
    // CMD recursive file enumeration targeting document types
    (FileName =~ "cmd.exe" and
     ProcessCommandLine has "dir" and ProcessCommandLine has "/s" and
     ProcessCommandLine has_any (SensitiveExtensions))
    or
    // forfiles automated file processing
    (ProcessCommandLine has "forfiles" and
     ProcessCommandLine has_any (SensitiveExtensions))
    or
    // Mass file copy with recursive flags (bulk staging); has/has_any are case-insensitive
    (FileName =~ "robocopy.exe" and
     ProcessCommandLine has_any ("/s", "/e", "/mir"))
    or
    (FileName =~ "xcopy.exe" and ProcessCommandLine has "/s")
    or
    // Archive tools ingesting document collections (pre-exfiltration staging)
    (FileName in~ ("rar.exe", "winrar.exe") and
     ProcessCommandLine has_any (" a ", "-a", "/a") and
     ProcessCommandLine has_any (SensitiveExtensions))
    or
    (FileName =~ "7z.exe" and
     ProcessCommandLine has_any (" a ", "a ") and
     ProcessCommandLine has_any (SensitiveExtensions))
    or
    // Python file traversal and collection scripts
    (FileName in~ ("python.exe", "python3.exe") and
     ProcessCommandLine has_any ("os.walk", "glob.glob", "shutil.copy", "os.listdir", "scandir"))
    or
    // VBScript/JScript file collection via Scripting.FileSystemObject
    (FileName in~ ("wscript.exe", "cscript.exe") and
     ProcessCommandLine has_any ("GetFolder", "GetFile", "CopyFile", "MoveFile", "Files"))
)
| extend AutoColl_RecursiveSearch = ProcessCommandLine has_any ("-Recurse", "/s", "os.walk", "forfiles", "Get-ChildItem")
| extend AutoColl_SensitiveExt = ProcessCommandLine has_any (SensitiveExtensions)
| extend AutoColl_ArchiveTool = FileName in~ ("rar.exe", "winrar.exe", "7z.exe")
| extend AutoColl_MassCopy = FileName in~ ("robocopy.exe", "xcopy.exe")
| extend AutoColl_CredentialFiles = ProcessCommandLine has_any (".pfx", ".pem", ".p12", ".key", ".kdbx")
| project Timestamp, DeviceName, AccountName, FileName, ProcessCommandLine,
         InitiatingProcessFileName, InitiatingProcessCommandLine,
         AutoColl_RecursiveSearch, AutoColl_SensitiveExt, AutoColl_ArchiveTool,
         AutoColl_MassCopy, AutoColl_CredentialFiles
| sort by Timestamp desc

Detects automated data collection activity using Microsoft Defender for Endpoint DeviceProcessEvents. Identifies key patterns: PowerShell Get-ChildItem recursive searches targeting sensitive file extensions (documents, spreadsheets, PDFs, credential stores), CMD dir /s bulk enumeration, forfiles automated file processing, mass copy tools (robocopy/xcopy) with recursive flags, archive tools (RAR/7z) staging document collections pre-exfiltration, and Python/VBScript file traversal scripts. Enrichment fields classify the collection type by category to assist analyst triage and prioritization.
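
One way an analyst might use the enrichment fields for prioritization is to collapse them into a single score. The following is a minimal sketch and not part of the shipped detection: the SampleHits rows, the AutoColl_Score column, and the weights (credential-file hits counted double) are all assumptions chosen for illustration.

```kusto
// Illustrative only: SampleHits stands in for the output of the detection query.
// The device names, the weights, and the AutoColl_Score column are assumptions.
let SampleHits = datatable(
    DeviceName: string,
    AutoColl_RecursiveSearch: bool,
    AutoColl_SensitiveExt: bool,
    AutoColl_ArchiveTool: bool,
    AutoColl_MassCopy: bool,
    AutoColl_CredentialFiles: bool
)
[
    "host-a", true,  true,  false, false, false,
    "host-b", true,  true,  true,  false, true,
    "host-c", false, false, false, true,  false
];
SampleHits
// Each flag adds 1 to the score; credential-file hits add 2.
| extend AutoColl_Score =
      iff(AutoColl_RecursiveSearch, 1, 0)
    + iff(AutoColl_SensitiveExt, 1, 0)
    + iff(AutoColl_ArchiveTool, 1, 0)
    + iff(AutoColl_MassCopy, 1, 0)
    + iff(AutoColl_CredentialFiles, 2, 0)
| sort by AutoColl_Score desc
```

With these sample rows, host-b scores 5, host-a scores 2, and host-c scores 1. In practice the same extend and sort steps could be appended to the detection query in place of the final sort, with weights adjusted to local priorities.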

Severity: high. Confidence: medium.

Data Sources

  • Process: Process Creation
  • Command: Command Execution
  • Microsoft Defender for Endpoint

Required Tables

DeviceProcessEvents

False Positives

  • Backup software agents (Veeam, Acronis, Windows Backup) performing scheduled recursive file enumeration and copy operations using robocopy or xcopy with standard recursive flags
  • Enterprise file sync and DLP agents (OneDrive sync client, SharePoint sync, Varonis, Symantec DLP) scanning for specific document types as part of classification and policy enforcement
  • IT administrators running robocopy or PowerShell Get-ChildItem for bulk file migrations, server decommissions, or departmental data reorganization projects
  • Software developers using Python scripts with os.walk or glob.glob for build processes, automated test data preparation, or log parsing pipelines
  • Anti-virus and endpoint security products performing scheduled content-inspection scans that enumerate files by extension type across user directories
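
The false positives listed above can often be reduced with an allowlist on the parent process and the account. The sketch below applies that idea to a simplified version of the mass-copy branch only. The process names and account names are hypothetical placeholders, not real product binaries, and would need to be replaced with values observed in your environment.

```kusto
// Hypothetical allowlist: replace the placeholder names with the backup, sync,
// or DLP binaries actually observed in your environment.
let AllowedParents = dynamic(["backup-agent.exe", "dlp-scanner.exe"]);
// Hypothetical service accounts used for sanctioned backups and migrations.
let AllowedAccounts = dynamic(["svc-backup", "svc-migration"]);
DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName in~ ("robocopy.exe", "xcopy.exe")
| where ProcessCommandLine has_any ("/s", "/e", "/mir")
// Drop events launched by an allowlisted parent or run under an allowlisted account.
| where InitiatingProcessFileName !in~ (AllowedParents)
| where AccountName !in~ (AllowedAccounts)
| project Timestamp, DeviceName, AccountName, FileName, ProcessCommandLine,
          InitiatingProcessFileName
```

An allowlist keyed only on file name is easy to evade by renaming a binary, so it may be worth pairing it with a check on InitiatingProcessFolderPath where that is practical.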

Sigma rule & cross-platform mapping

The detection logic for Automated Collection (T1119) above can also be expressed in vendor-neutral Sigma form for deployment on other SIEMs. The same logic is shipped here as native queries in KQL (Microsoft Sentinel / Defender), SPL (Splunk), EQL (Elastic Security), AQL (IBM QRadar), Sumo Logic CSE, YARA-L (Google Chronicle / SecOps), and CQL (CrowdStrike LogScale). In Sigma terms, this detection targets the following logsource:

logsource:
  category: process_creation
  product: windows

Testing Methodology

Validate this detection against the 4 tests below, from Atomic Red Team. Each test lists the behaviour to exercise and the telemetry you should expect to see.

  1. Test 1: PowerShell Recursive Document Collection to Staging Directory

    Expected signal: Sysmon Event ID 1: Process Create with Image=powershell.exe, CommandLine containing 'Get-ChildItem', '-Recurse', '.docx', '.xlsx', '.pdf'. Sysmon Event ID 11: Multiple file creation events in %TEMP%\df00tech-stage for each copied file. PowerShell ScriptBlock Log Event ID 4104 with the full collection script. Security Event ID 4663 (if SACL auditing enabled) for each source document read.

  2. Test 2: CMD Recursive File Enumeration with dir /s

    Expected signal: Sysmon Event ID 1: Process Create with Image=cmd.exe, CommandLine containing 'dir /s /b' and '.docx', '.xlsx', '.pdf'. Sysmon Event ID 11: File creation event for %TEMP%\df00tech-filelist.txt. Security Event ID 4688 (if process creation auditing and command line logging are enabled) with full command line including extension targets.

  3. Test 3: forfiles Automated Document Enumeration

    Expected signal: Sysmon Event ID 1: Process Create for the shell executing forfiles with CommandLine containing 'forfiles', '/S', and '.docx'. Child cmd.exe process creation events as forfiles spawns a cmd.exe instance per matching file. Sysmon Event ID 11: File creation for %TEMP%\df00tech-forfiles.txt. The child cmd.exe processes with 'echo @PATH' are also logged individually.

  4. Test 4: 7-Zip Archive Collection (Document Staging Pre-Exfiltration)

    Expected signal: Sysmon Event ID 1: Process Create with Image=7z.exe, CommandLine containing 'a' (add to archive), target paths with '.docx', '.xlsx', '.pdf', '-r' (recursive), and '-p' (password). Sysmon Event ID 11: File creation event for %TEMP%\df00tech-archive.7z. Security Event ID 4663 (if auditing enabled) for each source document file opened by 7z.exe during archiving.
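
The expected signals above are described in Sysmon terms. In Microsoft Defender for Endpoint, the rough counterpart of Sysmon Event ID 11 is a FileCreated record in DeviceFileEvents. The query below is a minimal sketch for confirming that the tests left their staging artifacts. It assumes the tests ran within the last hour and that the artifacts carry the df00tech- prefix named in the expected signals. Defender may not record every file creation, so a low count does not by itself mean a test failed.

```kusto
// Assumption: the tests ran within the last hour and wrote artifacts whose
// names or folders contain "df00tech-", as described in the expected signals.
DeviceFileEvents
| where Timestamp > ago(1h)
| where ActionType == "FileCreated"
| where FolderPath contains "df00tech-" or FileName contains "df00tech-"
// Group by device and creating process to see which test produced which files.
| summarize FilesCreated = count(), SampleFiles = make_set(FileName, 5)
    by DeviceName, InitiatingProcessFileName
| sort by FilesCreated desc
```

If the tests ran as intended, rows for powershell.exe, cmd.exe, and 7z.exe as the initiating process would be expected; the detection query itself can then be re-run over the same window to confirm that the matching process events were flagged.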
