Detect Archive via Library in Microsoft Sentinel
Adversaries may compress or encrypt collected data prior to exfiltration using third-party or built-in programming libraries rather than standalone archival utilities. Libraries such as Python's zlib, bz2, gzip, and zipfile modules and the third-party rarfile package; .NET's System.IO.Compression (GZipStream, DeflateStream, ZipArchive); the C libraries libzip and zlib; and platform-native libraries let adversaries compress and encrypt data programmatically within a running process. Because no separate archival utility process (7-Zip, WinRAR, tar) is spawned, this technique evades detections that focus on command-line archivers. Malware families including TajMahal, LunarWeb, SeaDuke, BBSRAT, InvisiMole, and Denis have all used library-based compression to stage and exfiltrate collected data.
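To make the evasion concrete, here is a minimal Python sketch that compresses a buffer entirely in-process using only the standard library's zlib and gzip modules. The payload and the function name `compress_in_process` are illustrative assumptions and are not taken from any malware sample; the point is only that no archiver executable is launched.

```python
import gzip
import zlib


def compress_in_process(data: bytes) -> dict:
    """Compress data with two stdlib codecs; no child process is spawned."""
    return {
        "zlib": zlib.compress(data, 9),
        "gzip": gzip.compress(data),
    }


# Illustrative payload (assumption): repetitive text that compresses well.
sample = b"collected-record;" * 200
blobs = compress_in_process(sample)

# Round-trip to confirm both outputs are valid compressed streams.
assert zlib.decompress(blobs["zlib"]) == sample
assert gzip.decompress(blobs["gzip"]) == sample
```

From a telemetry point of view, the only process involved is the interpreter itself, which is why the detection has to rely on module loads, command-line content, and file artifacts.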
MITRE ATT&CK
- Tactic: Collection
- Technique: T1560 Archive Collected Data
- Sub-technique: T1560.002 Archive via Library
- Canonical reference: https://attack.mitre.org/techniques/T1560/002/
KQL Detection Query
let CompressionDlls = dynamic(["zlib.dll", "zlib1.dll", "zlibwapi.dll", "bzip2.dll", "libbzip2.dll", "libzip.dll", "minizip.dll"]);
let ScriptingInterpreters = dynamic(["python.exe", "python3.exe", "ruby.exe", "perl.exe", "node.exe", "java.exe", "javaw.exe", "wscript.exe", "cscript.exe", "mshta.exe"]);
let StagingPaths = dynamic(["\\Temp\\", "\\tmp\\", "\\AppData\\Local\\Temp\\", "\\AppData\\Roaming\\", "\\ProgramData\\", "\\Users\\Public\\"]);
// Branch 1: Compression DLLs loaded by unexpected processes
let CompDllLoads =
DeviceImageLoadEvents
| where Timestamp > ago(24h)
| where FileName in~ (CompressionDlls)
| where not(FolderPath has_any ("C:\\Windows\\System32\\", "C:\\Windows\\SysWOW64\\", "C:\\Program Files\\7-Zip\\", "C:\\Program Files (x86)\\7-Zip\\", "C:\\Program Files\\WinRAR\\"))
| extend DetectionSource = "CompressionDllLoad"
| project Timestamp, DeviceName, AccountName = InitiatingProcessAccountName, DetectionSource,
ProcessName = InitiatingProcessFileName,
ProcessCmdLine = InitiatingProcessCommandLine,
LibraryLoaded = FileName, LibraryPath = FolderPath,
ParentProcess = InitiatingProcessParentFileName;
// Branch 2: PowerShell using .NET compression classes
let PSCompression =
DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName in~ ("powershell.exe", "pwsh.exe")
| where ProcessCommandLine has_any (
"IO.Compression", "GZipStream", "DeflateStream", "ZipFile", "ZipArchive",
"BinaryWriter", "MemoryStream", "Compress", "System.IO.Compression",
"ICSharpCode.SharpZipLib", "DotNetZip"
)
| extend DetectionSource = "PowerShellLibraryCompression"
| project Timestamp, DeviceName, AccountName, DetectionSource,
ProcessName = FileName,
ProcessCmdLine = ProcessCommandLine,
LibraryLoaded = "System.IO.Compression (in-process)", LibraryPath = "",
ParentProcess = InitiatingProcessFileName;
// Branch 3: Python processes invoking compression library functions inline or via script args
let PythonCompression =
DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName in~ ("python.exe", "python3.exe", "python3.10.exe", "python3.11.exe", "python3.12.exe")
| where ProcessCommandLine has_any (
"import zlib", "import bz2", "import gzip", "import zipfile", "import rarfile",
"import lzma", "import tarfile", "zlib.compress", "bz2.compress", "gzip.open",
"zlib", "bzip2", "rarfile"
)
| extend DetectionSource = "PythonLibraryCompression"
| project Timestamp, DeviceName, AccountName, DetectionSource,
ProcessName = FileName,
ProcessCmdLine = ProcessCommandLine,
LibraryLoaded = "Python compression module", LibraryPath = "",
ParentProcess = InitiatingProcessFileName;
// Branch 4: Compressed files written to staging paths by scripting interpreters
let StagedCompressedFiles =
DeviceFileEvents
| where Timestamp > ago(24h)
| where ActionType == "FileCreated"
// Match on the file extension itself; has_any is term-based and is not anchored to the end of the name
| where FileName matches regex @"(?i)\.(gz|bz2|zlib|lz|lzma|xz)$"
| where FolderPath has_any (StagingPaths)
| where InitiatingProcessFileName in~ (ScriptingInterpreters)
| extend DetectionSource = "StagedCompressedFileCreation"
| project Timestamp, DeviceName, AccountName = InitiatingProcessAccountName, DetectionSource,
ProcessName = InitiatingProcessFileName,
ProcessCmdLine = InitiatingProcessCommandLine,
LibraryLoaded = "File artifact: " + FileName, LibraryPath = FolderPath,
ParentProcess = InitiatingProcessParentFileName;
CompDllLoads
| union PSCompression
| union PythonCompression
| union StagedCompressedFiles
| sort by Timestamp desc
Multi-branch detection covering library-based archival across four signal sources: (1) compression DLLs (zlib.dll, bzip2.dll, libzip.dll) loaded by processes outside known-good installation paths; (2) PowerShell invoking .NET System.IO.Compression classes (GZipStream, DeflateStream, ZipArchive) programmatically; (3) Python interpreter processes with compression library imports visible in command-line arguments; (4) compressed file artifacts (.gz, .bz2, .zlib) written to staging paths by scripting interpreters. Designed to detect in-process compression that leaves no archival utility process trail.
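The command-line branches (2 and 3) reduce to keyword matching, so the keyword lists can be sanity-checked offline before deployment. The Python sketch below approximates that logic; the function name `classify` is hypothetical, the lists are copied from the query, and plain case-insensitive substring matching is an assumption that is looser than KQL's term-based `has_any`.

```python
from typing import Optional

POWERSHELL_NAMES = {"powershell.exe", "pwsh.exe"}
PYTHON_NAMES = {"python.exe", "python3.exe", "python3.10.exe",
                "python3.11.exe", "python3.12.exe"}

# Keyword lists copied from Branch 2 and Branch 3 of the query.
PS_KEYWORDS = ("IO.Compression", "GZipStream", "DeflateStream", "ZipFile",
               "ZipArchive", "BinaryWriter", "MemoryStream", "Compress",
               "System.IO.Compression", "ICSharpCode.SharpZipLib", "DotNetZip")
PY_KEYWORDS = ("import zlib", "import bz2", "import gzip", "import zipfile",
               "import rarfile", "import lzma", "import tarfile",
               "zlib.compress", "bz2.compress", "gzip.open",
               "zlib", "bzip2", "rarfile")


def classify(process_name: str, command_line: str) -> Optional[str]:
    """Return the DetectionSource label a command line would receive, or None.

    Substring matching only approximates KQL has_any (assumption).
    """
    name = process_name.lower()
    cmd = command_line.lower()
    if name in POWERSHELL_NAMES and any(k.lower() in cmd for k in PS_KEYWORDS):
        return "PowerShellLibraryCompression"
    if name in PYTHON_NAMES and any(k.lower() in cmd for k in PY_KEYWORDS):
        return "PythonLibraryCompression"
    return None
```

Running sample command lines from known-good scripts through a helper like this is a quick way to see which broad keywords (for example "Compress" or "MemoryStream") are likely to generate noise.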
Data Sources
Required Tables
- DeviceImageLoadEvents
- DeviceProcessEvents
- DeviceFileEvents
False Positives & Tuning
- Legitimate Python data science or ETL pipelines that compress output files using zlib or gzip before writing to storage (Spark, Pandas, Airflow DAGs)
- Software installers and update agents that use zlib/bzip2 for package decompression and compression during installation
- PowerShell-based backup or log rotation scripts that use System.IO.Compression.GZipStream to compress old logs or create compressed archives
- Developer workstations running build tools (Maven, Gradle, npm) that link against or load compression libraries during compilation and packaging
- Monitoring and APM agents (Datadog, New Relic, Elastic APM) that compress telemetry payloads before sending to collection endpoints
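One way to handle these benign sources is to filter exported hits against an environment-specific allowlist before they reach an analyst. The Python sketch below illustrates the idea; the `Hit` record, the allowlist values, and the device names are all hypothetical placeholders, and a production deployment would more likely express the same exclusions inside the query itself.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    device_name: str
    process_name: str
    command_line: str


# Hypothetical allowlist entries; replace with values observed in your environment.
ALLOWED_COMMAND_FRAGMENTS = ("\\etl\\jobs\\", "rotate-logs.ps1")
ALLOWED_DEVICE_PREFIXES = ("build-",)


def is_expected(hit: Hit) -> bool:
    """True when the hit matches a known-benign script path or device group."""
    cmd = hit.command_line.lower()
    device = hit.device_name.lower()
    return (
        any(fragment in cmd for fragment in ALLOWED_COMMAND_FRAGMENTS)
        or device.startswith(ALLOWED_DEVICE_PREFIXES)
    )


def triage(hits: Iterable[Hit]) -> List[Hit]:
    """Keep only hits that are not explained by the allowlist."""
    return [h for h in hits if not is_expected(h)]


hits = [
    Hit("build-07", "java.exe", "java -jar gradle-wrapper.jar assemble"),
    Hit("fin-ws-12", "python.exe", "python C:\\ETL\\jobs\\nightly.py"),
    Hit("hr-ws-03", "python.exe", 'python -c "import zlib; ..."'),
]
remaining = triage(hits)
```

Keeping the allowlist narrow (specific script paths rather than whole interpreters) avoids creating a blind spot an adversary could hide in.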
Testing Methodology
Validate this detection against 4 adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see.
- Test 1: Python zlib Compression of Sensitive File Collection
Expected signal: Sysmon Event ID 1: Process Create with Image=python.exe, CommandLine containing 'import zlib' and 'zlib.compress'. Sysmon Event ID 11: File Create for %TEMP%\stage_df00.zlib. DeviceFileEvents: FileCreated action for the .zlib output file. DeviceFileEvents: FileRead action for the hosts file.
- Test 2: PowerShell GZipStream Compression via System.IO.Compression
Expected signal: Sysmon Event ID 1: Process Create with Image=powershell.exe, CommandLine containing 'IO.Compression', 'GZipStream', 'MemoryStream', and 'CompressionMode'. Sysmon Event ID 11: File Create for %TEMP%\stage_df00.gz. PowerShell ScriptBlock Log Event ID 4104 with full .NET compression code.
- Test 3: Python bzip2 Multi-File Collection and Compression
Expected signal: Syslog/auditd: python3 process execution with command line containing 'import bz2', 'tarfile', and output path in /tmp. File creation event for /tmp/stage_df00.tar.bz2. Auditd syscall events: openat for source files, write for output file. Linux process accounting: python3 with suspicious file access pattern.
- Test 4: Python rarfile Library Compression (Third-Party Library)
Expected signal: Sysmon Event ID 1: python.exe with CommandLine containing 'import zipfile', 'ZIP_DEFLATED', and temp path. Sysmon Event ID 11: File Create for %TEMP%\stage_df00_lib.zip. Potential child process for pip install subprocess. DeviceFileEvents shows file read of hosts file and write of .zip artifact.
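As a benign stand-in for the file-artifact portion of Test 1, the Python sketch below writes a zlib-compressed file named stage_df00.zlib into the system temp directory and then removes it. This is not the Atomic Red Team test itself: the synthetic payload replaces any real file collection, and the function names are hypothetical.

```python
import os
import tempfile
import zlib


def stage_test_artifact(payload: bytes, name: str = "stage_df00.zlib") -> str:
    """Write a zlib-compressed payload to the temp directory and return its path."""
    path = os.path.join(tempfile.gettempdir(), name)
    with open(path, "wb") as fh:
        fh.write(zlib.compress(payload))
    return path


def cleanup(path: str) -> None:
    """Remove the staged artifact if it is still present."""
    if os.path.exists(path):
        os.remove(path)


# Synthetic payload (assumption) so the test touches no sensitive files.
payload = b"synthetic test data\n" * 50
artifact = stage_test_artifact(payload)

# Verify the artifact is a valid zlib stream, then clean up.
with open(artifact, "rb") as fh:
    restored = zlib.decompress(fh.read())
cleanup(artifact)
```

Run from a script file, this should exercise only the staged-file branch of the query; the Python command-line branch matches on `ProcessCommandLine`, so it fires only when the code is passed inline (for example with `python -c`).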
References
- https://attack.mitre.org/techniques/T1560/002/
- https://github.com/madler/zlib
- https://libzip.org/
- https://pypi.org/project/rarfile/
- https://learn.microsoft.com/en-us/dotnet/api/system.io.compression
- https://securelist.com/kaspersky-lab-discovers-the-tajmahal-apt-framework/90240/
- https://www.welivesecurity.com/2024/05/23/eset-research-unveils-lunar-toolset-diplomatic-espionage/
- https://unit42.paloaltonetworks.com/bbsrat-attacks-targeting-russian-organizations-linked-to-roaming-tiger/
- https://www.welivesecurity.com/2018/06/07/invisimole-equipped-spyware-undercover/
- https://github.com/redcanaryco/atomic-red-team/blob/master/atomics/T1560.002/T1560.002.md
- https://learn.microsoft.com/en-us/defender-endpoint/advanced-hunting-deviceimageloadevents-table
- https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon