Detect Non-Standard Encoding in Sumo Logic CSE
Adversaries may encode data with a non-standard data encoding system to make the content of command and control traffic more difficult to detect. Non-standard encoding schemes diverge from existing protocol specifications — for example, modified Base64 using a custom alphabet, XOR encoding with a static or rolling key, character substitution (replacing '/' with '-s', '+' with '-p'), or custom binary serialization. Real-world examples include OceanSalt (NOT operation on bytes), Small Sieve (hex byte swapping), TONESHELL (XOR with 32/256-byte key), NightClub (modified Base64 in DNS subdomains), RDAT (Base64 with character substitutions in DNS), InvisiMole (modified Base32 in DNS subdomains), and Uroburos (custom Base62/Base32). Detection focuses on anomalous DNS subdomain lengths and entropy, unusual encoded patterns in network traffic, and scripting processes generating high-entropy outbound data.
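To make the encoding families above concrete, here is a minimal Python sketch of a custom-alphabet Base64, the '/' to '-s' and '+' to '-p' substitution, a rolling XOR, and a byte-wise NOT, plus a Shannon entropy helper of the kind an entropy-based detection might use. The rotated alphabet, the padding stripping, and all function names are illustrative assumptions, not the actual implementations used by the malware families named above.

```python
import base64
import math
from collections import Counter

STANDARD_B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Hypothetical custom alphabet: the standard alphabet rotated by 13 positions.
CUSTOM_B64 = STANDARD_B64[13:] + STANDARD_B64[:13]


def custom_alphabet_b64(data: bytes) -> str:
    """Base64-encode, then remap each symbol into the custom alphabet.

    The '=' padding character is not in the table and is left unchanged.
    """
    table = str.maketrans(STANDARD_B64, CUSTOM_B64)
    return base64.b64encode(data).decode("ascii").translate(table)


def substituted_b64(data: bytes) -> str:
    """Base64 with '/' replaced by '-s' and '+' replaced by '-p'.

    Stripping '=' padding is an assumption made here so the result can
    be used as a DNS label.
    """
    encoded = base64.b64encode(data).decode("ascii").rstrip("=")
    return encoded.replace("/", "-s").replace("+", "-p")


def xor_rolling(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key, cycling through the key bytes."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def not_bytes(data: bytes) -> bytes:
    """Bitwise NOT of every byte (equivalent to XOR with 0xFF)."""
    return bytes(b ^ 0xFF for b in data)


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())
```

Both `xor_rolling` and `not_bytes` are their own inverses, which is one reason such schemes are cheap for an implant to implement. Their raw output is binary, so it still needs a text encoding before it can travel in a DNS label.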
MITRE ATT&CK
- Tactic: Command and Control
- Technique: T1132 Data Encoding
- Sub-technique: T1132.002 Non-Standard Encoding
- Canonical reference: https://attack.mitre.org/techniques/T1132/002/
Sumo Detection Query
/* Branch 1: Sysmon DNS queries (EventCode 22) with encoded subdomain labels */
(_sourceCategory=*windows* OR _sourceCategory=*sysmon*)
| where EventCode = "22"
| parse regex "QueryName:\s+(?<DNSQuery>[^\r\n]+)" nodrop
| parse regex "Image:\s+(?<ProcessImage>[^\r\n]+)" nodrop
| parse regex "User:\s+(?<UserName>[^\r\n]+)" nodrop
| parse regex field=DNSQuery "^(?<FirstLabel>[^.]+)\." nodrop
| where !(isNull(FirstLabel)) and !(isEmpty(FirstLabel))
| where length(FirstLabel) > 50
or FirstLabel matches /^[A-Za-z0-9+\/=]{40,}$/
or FirstLabel matches /^[A-Za-z0-9_\-]{40,}$/
or FirstLabel matches /^[0-9a-fA-F]{40,}$/
| eval EncodingType = if(FirstLabel matches /^[0-9a-fA-F]{40,}$/, "HexEncoded",
if(FirstLabel matches /^[A-Za-z0-9+\/=]{40,}$/, "Base64Like",
"ModifiedBase64URLSafe"))
| eval DetectionBranch = "DNS_NonStandardEncoding"
| fields _time, _sourceHost, UserName, ProcessImage, DNSQuery, FirstLabel, EncodingType, DetectionBranch
/* Branch 2: Sysmon network connects (EventCode 3) — scripting engine beaconing */
/* Run as a separate query:
(_sourceCategory=*sysmon*)
| where EventCode = "3"
| parse regex "Image:\s+(?<ProcessImage>[^\r\n]+)" nodrop
| parse regex "DestinationIp:\s+(?<DestIP>[^\r\n]+)" nodrop
| parse regex "DestinationPort:\s+(?<DestPort>[^\r\n]+)" nodrop
| where ProcessImage matches /(?i).*(python|perl|ruby|wscript|cscript|mshta|powershell|pwsh).*/
| where !(DestIP matches /^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.|127\.).*/)
| eval DetectionBranch = "Beaconing_ScriptingEngine"
| timeslice 1h
| stats count as ConnCount, dcount(DestIP) as UniqueIPs, values(DestIP) as DestIPs
by _timeslice, _sourceHost, ProcessImage
| where ConnCount > 10 and UniqueIPs < 3
*/
This Sumo Logic detection for T1132.002 uses Sysmon EventCode 22 (DNS query) events to extract the first subdomain label of each query and classify it. A label is flagged when it is longer than 50 characters, or when it is at least 40 characters long and consists only of Base64, URL-safe Base64, or hexadecimal characters. Raw message parsing via `parse regex` extracts fields from Sysmon XML event payloads. A secondary branch, commented out and intended to run as a separate query, covers EventCode 3 (Network Connect) and flags scripting engines that make more than 10 connections to fewer than 3 distinct external destination IPs within a 1-hour window. Both branches are suitable for Sumo Logic Cloud SIEM rule creation.
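For offline reasoning about what the query will and will not flag, the logic of both branches can be approximated in Python. This is a sketch that mirrors the thresholds and label names in the query above; the function names and the tuple layout of the input events are assumptions for illustration, and the beaconing helper assumes its input has already been filtered to scripting engines and external destination IPs.

```python
import re
from collections import defaultdict
from typing import Iterable, List, Optional, Tuple

# Same character classes and 40-character minimum as the query's `matches` clauses.
HEX_RE = re.compile(r"[0-9a-fA-F]{40,}")
B64_RE = re.compile(r"[A-Za-z0-9+/=]{40,}")
URLSAFE_RE = re.compile(r"[A-Za-z0-9_\-]{40,}")


def classify_first_label(query_name: str) -> Optional[str]:
    """Return the EncodingType the query would assign, or None if not flagged."""
    first_label, sep, _rest = query_name.partition(".")
    if not sep or not first_label:
        return None  # the query's regex requires a non-empty label followed by a dot
    # Hex is tested first because every hex string also matches the Base64 class.
    if HEX_RE.fullmatch(first_label):
        return "HexEncoded"
    if B64_RE.fullmatch(first_label):
        return "Base64Like"
    # As in the query, any other flagged label (including labels flagged only
    # for being longer than 50 characters) falls through to this value.
    if URLSAFE_RE.fullmatch(first_label) or len(first_label) > 50:
        return "ModifiedBase64URLSafe"
    return None


# Hypothetical event shape: (hour_bucket, host, process_image, dest_ip)
Event = Tuple[int, str, str, str]


def beaconing_candidates(events: Iterable[Event]) -> List[Tuple[int, str, str]]:
    """Group by (hour, host, image); keep groups with >10 connections and <3 unique IPs."""
    groups = defaultdict(list)
    for hour, host, image, dest_ip in events:
        groups[(hour, host, image)].append(dest_ip)
    return [
        key for key, ips in groups.items()
        if len(ips) > 10 and len(set(ips)) < 3
    ]
```

For example, `classify_first_label("a" * 40 + ".example.com")` returns `"HexEncoded"`, while an ordinary name such as `www.example.com` returns `None`.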
Data Sources
Required Tables
False Positives & Tuning
- IT management platforms using wscript.exe or cscript.exe for legitimate administrative tasks communicating with management endpoints using token-encoded URLs
- DNS-based geographic load balancers or service discovery systems generating long hash-based first labels for endpoint routing (e.g., Route 53 weighted routing, Cloudflare DNS)
- Automated enterprise PowerShell scripts performing scheduled telemetry collection or reporting to cloud services using long session-scoped query parameters
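One way to tune out the false positives listed above is an explicit allowlist applied before alerting. The sketch below is only an illustration: the parent domain suffixes, the process name, and the IP address are placeholder values (reserved example domains and a documentation address range), and the function names are hypothetical. Real entries would come from baselining your own environment.

```python
# Placeholder allowlists; every value here is hypothetical.
ALLOWED_PARENT_SUFFIXES = ("lb.example.net", "svc-discovery.example.org")
ALLOWED_BEACON_PAIRS = {("cscript.exe", "203.0.113.10")}


def is_allowed_dns(query_name: str) -> bool:
    """True if the query falls under an allowlisted parent domain."""
    name = query_name.rstrip(".").lower()
    # Match on a label boundary so "evil-lb.example.net" is not allowlisted.
    return any(
        name == suffix or name.endswith("." + suffix)
        for suffix in ALLOWED_PARENT_SUFFIXES
    )


def is_allowed_beacon(process_image: str, dest_ip: str) -> bool:
    """True if this (executable name, destination IP) pair is allowlisted."""
    # Sysmon's Image field holds a full path; compare on the file name only.
    exe = process_image.replace("/", "\\").rsplit("\\", 1)[-1].lower()
    return (exe, dest_ip) in ALLOWED_BEACON_PAIRS
```

Pairing the process with a specific destination keeps the exclusion narrow. Matching on the executable file name alone is weak because a binary can be renamed, so a stricter variant might compare the full image path or a signer instead.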
Testing Methodology
Validate this detection against 4 adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see.
- Test 1: Simulate DNS Tunneling with Modified Base64 Subdomain Encoding
Expected signal: Sysmon Event ID 22 (DNS Query): QueryName will contain a long alphanumeric subdomain label (length > 30 characters) matching the pattern [a-z0-9ps]{30,}\.df00tech-test\.local. Sysmon Event ID 1 (Process Create): powershell.exe with command line containing Base64, Replace, and Resolve-DnsName. PowerShell ScriptBlock Log Event ID 4104 capturing the encoding logic.
- Test 2: XOR-Encoded C2 Data Transmission Simulation (TONESHELL Pattern)
Expected signal: Sysmon Event ID 1: powershell.exe with CommandLine containing -bxor, New-Object System.Net.WebClient, and UploadString. Sysmon Event ID 3: Network connection attempt to 127.0.0.1:8080 (connection will be refused but event fires). PowerShell ScriptBlock Log Event ID 4104 capturing the full XOR encoding loop and WebClient upload code.
- Test 3: High-Volume DNS Query Burst Simulating DNS Tunneling Data Transfer
Expected signal: 25x Sysmon Event ID 22 (DNS Query) events within ~5 seconds, each with a unique QueryName containing a long base64-like subdomain label (length 40-70 characters) under df00tech-dnstest.local. All queries initiated by powershell.exe. The burst pattern with unique subdomains matches DNS tunneling telemetry.
- Test 4: HTTP C2 with Custom Base64 Alphabet Encoding (Neo-reGeorg Pattern)
Expected signal: Sysmon Event ID 1: powershell.exe with CommandLine containing IndexOf, ToCharArray, WebClient, and UploadString — all indicators of custom encoding implementation. Sysmon Event ID 3: Network connection to 127.0.0.1:8080. PowerShell ScriptBlock Log Event ID 4104 capturing the full custom alphabet encoding logic. If stream:http is available, the POST body will contain d=<60+ char custom-alphabet string>.
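The burst signal described for Test 3 (many unique subdomain labels under one parent domain within a few seconds) can be checked against exported DNS events with a sliding window. The following is a rough sketch, not the Atomic Red Team test itself: the function name, the `(timestamp, query_name)` input shape, and the default thresholds (taken from the 25 queries in about 5 seconds stated for Test 3) are assumptions.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Set, Tuple


def find_query_bursts(
    events: Iterable[Tuple[float, str]],
    window_seconds: float = 5.0,
    min_unique: int = 25,
) -> Set[str]:
    """Return parent domains that saw >= min_unique distinct first labels
    within any window of window_seconds.

    `events` must be (timestamp_in_seconds, query_name) pairs sorted by time.
    """
    windows: Dict[str, Deque[Tuple[float, str]]] = {}
    flagged: Set[str] = set()
    for ts, name in events:
        label, sep, parent = name.rstrip(".").partition(".")
        if not sep:
            continue  # single-label names have no parent domain
        parent = parent.lower()  # labels keep their case: encoded data is case-sensitive
        window = windows.setdefault(parent, deque())
        window.append((ts, label))
        # Drop entries that have aged out of the window.
        while window and ts - window[0][0] > window_seconds:
            window.popleft()
        if len({lbl for _, lbl in window}) >= min_unique:
            flagged.add(parent)
    return flagged
```

Fed 25 unique labels under `df00tech-dnstest.local` spaced 0.2 seconds apart, the function flags that parent domain. The same 25 queries spaced one second apart are not flagged with the default settings.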
References (11)
- https://attack.mitre.org/techniques/T1132/002/
- https://en.wikipedia.org/wiki/Binary-to-text_encoding
- https://en.wikipedia.org/wiki/Character_encoding
- https://www.welivesecurity.com/2023/08/10/moustachedbouncer-espionage-targeted-isp-level-adversary-in-the-middle-attacks-against-belarus/
- https://unit42.paloaltonetworks.com/rdat-used-to-target-middle-eastern-energy-company/
- https://www.welivesecurity.com/2020/06/18/digging-up-invismole-hidden-arsenal/
- https://media.kasperskycontenthub.com/wp-content/uploads/sites/43/2017/08/07172148/ShadowPad_technical_description_PDF.pdf
- https://www.mandiant.com/sites/default/files/2022-02/rt-apt41-dual-operation.pdf
- https://github.com/L-codes/Neo-reGeorg
- https://www.ncsc.gov.uk/files/NCSC-GCHQ-Small-Sieve-Malware-Analysis-Report.pdf
- https://www.cisa.gov/sites/default/files/2022-02/aa22-055a-iranian-government-sponsored-actors-conduct-cyber-operations.pdf