Detect Exfiltration to Code Repository in Microsoft Sentinel
Adversaries may exfiltrate data to a code repository rather than over their primary command and control channel. Code repositories are often accessible via an API (e.g., https://api.github.com). Access to these APIs is often over HTTPS, which gives the adversary an additional level of protection. Exfiltration to a code repository can also provide significant cover if the repository is a popular service already used by hosts within the network. Tools such as Empire have been observed using GitHub for data exfiltration, leveraging the GitHub API to stage and retrieve data as part of a C2 channel.
MITRE ATT&CK
- Tactic: Exfiltration
- Technique: T1567 Exfiltration Over Web Service
- Sub-technique: T1567.001 Exfiltration to Code Repository
- Canonical reference: https://attack.mitre.org/techniques/T1567/001/
KQL Detection Query
let CodeRepoDomains = dynamic(["github.com", "api.github.com", "gitlab.com", "api.gitlab.com", "bitbucket.org", "api.bitbucket.org", "dev.azure.com", "raw.githubusercontent.com", "gist.github.com", "codeberg.org"]);
// Signal 1: Large outbound data transfers from scripting/git tools to code repository domains
let NetworkSignal = DeviceNetworkEvents
| where Timestamp > ago(24h)
| where RemoteUrl has_any (CodeRepoDomains)
| where InitiatingProcessFileName has_any ("git", "curl", "wget", "python", "powershell", "pwsh", "node", "ruby", "perl")
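// Assumption: BytesSent and BytesReceived are available on this table in your workspace; confirm the schema before deploying.
| where BytesSent > 524288 // 512 KB threshold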
| extend Signal = "LargeUploadToCodeRepo"
| extend BytesSentMB = round(toreal(BytesSent) / 1048576, 2)
| project Timestamp, DeviceName, AccountName = InitiatingProcessAccountName, Signal, RemoteUrl, RemotePort, BytesSent, BytesSentMB, BytesReceived, InitiatingProcessFileName, InitiatingProcessCommandLine, InitiatingProcessParentFileName;
// Signal 2: Git push commands explicitly targeting external repository URLs
let GitPushSignal = DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName =~ "git.exe" or FileName =~ "git"
| where ProcessCommandLine has "push"
| where ProcessCommandLine has_any (CodeRepoDomains)
| extend Signal = "GitPushToExternalRepo"
| extend BytesSentMB = 0.0
// Surface the git process itself (not its parent) so the push command line appears in the results.
| project Timestamp, DeviceName, AccountName, Signal, RemoteUrl="", RemotePort=int(0), BytesSent=tolong(0), BytesSentMB, BytesReceived=tolong(0), InitiatingProcessFileName=FileName, InitiatingProcessCommandLine=ProcessCommandLine, InitiatingProcessParentFileName=InitiatingProcessFileName;
// Signal 3: Direct API calls to repository REST APIs using PUT/POST (file upload without git client)
let ApiUploadSignal = DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName has_any ("curl", "wget", "python", "powershell", "pwsh", "node", "ruby")
| where ProcessCommandLine has_any ("api.github.com", "api.gitlab.com", "api.bitbucket.org", "gist.github.com")
| where ProcessCommandLine has_any ("-X PUT", "-X POST", "method='PUT'", "method='POST'", "requests.put", "requests.post", "Invoke-RestMethod", "Invoke-WebRequest", "contents", "gists")
| extend Signal = "CodeRepoAPIUpload"
| extend BytesSentMB = 0.0
| project Timestamp, DeviceName, AccountName, Signal, RemoteUrl="", RemotePort=int(443), BytesSent=tolong(0), BytesSentMB, BytesReceived=tolong(0), InitiatingProcessFileName=FileName, InitiatingProcessCommandLine=ProcessCommandLine, InitiatingProcessParentFileName=InitiatingProcessFileName;
union NetworkSignal, GitPushSignal, ApiUploadSignal
| sort by Timestamp desc
Detects potential data exfiltration to external code repositories via three complementary signals: (1) large outbound transfers (more than 512 KB) from git or scripting tools to known repository domains, from DeviceNetworkEvents; (2) git push commands whose command line contains an external repository URL, from DeviceProcessEvents; and (3) direct REST API calls to repository APIs (GitHub Contents API, GitLab API, Gist API) using PUT/POST from scripting tools, a technique used by Empire-style frameworks to upload data without the git client. All three signals feed a unified result set for triage.
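For triage, it can help to collapse the unified result set so that devices raising more than one signal surface first. The following is a minimal sketch, not part of the original query: `SampleResults` and its rows are made-up stand-ins for the output of the detection query, and in practice the `summarize` stage would be applied to the `union` of the three signals instead.

```kql
// Made-up sample rows standing in for the output of the detection query.
let SampleResults = datatable(Timestamp: datetime, DeviceName: string, AccountName: string, Signal: string)
[
    datetime(2024-01-01 10:00:00), "ws-01", "alice", "GitPushToExternalRepo",
    datetime(2024-01-01 10:02:00), "ws-01", "alice", "LargeUploadToCodeRepo",
    datetime(2024-01-01 11:30:00), "ws-02", "bob",   "CodeRepoAPIUpload"
];
SampleResults
// One row per device and account, listing which signals fired and when.
| summarize
    Signals = make_set(Signal),
    SignalCount = dcount(Signal),
    EventCount = count(),
    FirstSeen = min(Timestamp),
    LastSeen = max(Timestamp)
    by DeviceName, AccountName
// Devices that triggered several distinct signals are listed first.
| sort by SignalCount desc, EventCount desc
```

Ranking by distinct signal count is one possible prioritisation; a single high-volume upload may still deserve attention on its own.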
Data Sources
Required Tables
- DeviceNetworkEvents
- DeviceProcessEvents
False Positives & Tuning
- Software developers legitimately pushing code to GitHub or GitLab as part of normal development workflow — especially on developer workstations
- CI/CD pipeline agents (Jenkins build servers, GitHub Actions self-hosted runners, GitLab CI runners) performing automated builds and deployments that push artifacts or release assets
- Developer IDEs with integrated Git (VS Code, IntelliJ, Visual Studio) performing background sync, auto-push on save, or pull request creation via API
- Backup and configuration management scripts that legitimately use GitHub/GitLab as a storage backend for infrastructure-as-code or configuration files
- Security tools such as Dependabot, Renovate, or Snyk that create automated pull requests by pushing fix branches to repositories
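The false-positive sources above can be sketched as exclusions on the git push signal. This is an illustrative example only: the host names, account names, and the `github.com/contoso/` namespace are hypothetical placeholders, and the allowlists would normally come from a watchlist or asset inventory in your environment.

```kql
// Hypothetical allowlists; replace with values from your environment.
// Entries must match DeviceName exactly as logged (often a fully qualified name).
let KnownBuildHosts = dynamic(["build-agent-01.contoso.com", "ci-runner-02.contoso.com"]);
let KnownAutomationAccounts = dynamic(["svc-jenkins", "svc-gitlab-runner"]);
DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName in~ ("git.exe", "git")
| where ProcessCommandLine has "push"
// Drop activity from expected CI/CD hosts and service accounts.
| where DeviceName !in~ (KnownBuildHosts)
| where AccountName !in~ (KnownAutomationAccounts)
// Drop pushes to the organisation's own namespace (placeholder value).
| where ProcessCommandLine !contains "github.com/contoso/"
| project Timestamp, DeviceName, AccountName, FileName, ProcessCommandLine, InitiatingProcessFileName
| sort by Timestamp desc
```

Excluding by organisation namespace rather than by whole domain keeps pushes to personal or unknown repositories visible, which is usually the behaviour of interest.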
Testing Methodology
Validate this detection against 4 adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see.
- Test 1: Exfiltrate Data via GitHub Contents API Using PowerShell
Expected signal: Sysmon Event ID 1: Process Create with Image=powershell.exe and CommandLine containing 'api.github.com', 'Invoke-RestMethod', 'PUT', and 'token'. Sysmon Event ID 3: Network Connection from powershell.exe to api.github.com:443. PowerShell ScriptBlock Log Event ID 4104 with full API request including the encoded content. Network proxy logs show HTTPS PUT to api.github.com with outbound payload.
- Test 2: Git Push Sensitive Files to External Repository from Command Shell
Expected signal: Sysmon Event ID 1: Multiple process creates — cmd.exe, git.exe (init), git.exe (add), git.exe (commit), git.exe (push) with CommandLine containing 'github.com' and '--force'. Sysmon Event ID 3: Network Connection from git.exe to github.com:443. Sysmon Event ID 11: File Create for stolen_creds.txt in %TEMP%\df00tech-exfil. Security Event ID 4688 if command line auditing enabled.
- Test 3: Exfiltrate Data via GitHub Gist API Using curl
Expected signal: Sysmon Event ID 1: Process Create with Image=powershell.exe spawning curl.exe, CommandLine containing 'api.github.com/gists', '-X POST', and the Authorization header with PAT. Sysmon Event ID 3: Network Connection from curl.exe to api.github.com:443. Proxy logs show HTTPS POST to api.github.com/gists with outbound JSON payload.
- Test 4: Automated Data Exfiltration via Python GitHub API Script
Expected signal: Sysmon Event ID 1: Process Create with Image=python.exe and CommandLine containing 'api.github.com', 'PUT', 'urllib.request', and the Authorization token. Sysmon Event ID 3: Network Connection from python.exe to api.github.com:443. No PowerShell ScriptBlock logging (Python process); look for Python audit hooks or endpoint DLP alerts on the data access pattern (reading hosts file).
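After running the tests, a quick check of process telemetry can confirm which of them left a trace. The sketch below assumes the test host reports to the same DeviceProcessEvents table the detection uses (the expected signals above are described in Sysmon terms, so this mapping is an assumption), and the command-line markers are taken from the expected signals listed for each test.

```kql
DeviceProcessEvents
| where Timestamp > ago(1h)
| where ProcessCommandLine has_any ("api.github.com", "github.com")
// Label each event with the test it most likely belongs to.
| extend TestMatch = case(
    FileName in~ ("powershell.exe", "pwsh.exe") and ProcessCommandLine has "Invoke-RestMethod", "Test 1: Contents API via PowerShell",
    FileName =~ "git.exe" and ProcessCommandLine has "push", "Test 2: git push",
    FileName =~ "curl.exe" and ProcessCommandLine has "gists", "Test 3: Gist API via curl",
    FileName startswith "python" and ProcessCommandLine has "api.github.com", "Test 4: Python API script",
    "Unmatched")
| where TestMatch != "Unmatched"
// One row per test and device, with the number of matching events.
| summarize Events = count(), LastSeen = max(Timestamp) by TestMatch, DeviceName
| sort by TestMatch asc
```

A test missing from the output suggests either a telemetry gap on the host or a command line that differs from the markers assumed here.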
References
- https://attack.mitre.org/techniques/T1567/001/
- https://github.com/EmpireProject/Empire
- https://docs.github.com/en/rest/repos/contents
- https://docs.github.com/en/rest/gists/gists
- https://docs.github.com/en/organizations/keeping-your-organization-secure/managing-security-settings-for-your-organization/reviewing-the-audit-log-for-your-organization
- https://learn.microsoft.com/en-us/defender-endpoint/advanced-hunting-devicenetworkevents-table
- https://learn.microsoft.com/en-us/defender-endpoint/advanced-hunting-deviceprocessevents-table
- https://github.com/redcanaryco/atomic-red-team/blob/master/atomics/T1567.001/T1567.001.md
- https://github.com/SigmaHQ/sigma/tree/master/rules/windows/network_connection
- https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-347a