T1213.003 Microsoft Sentinel · KQL

Detect Code Repositories in Microsoft Sentinel

Adversaries may leverage code repositories to collect valuable information including proprietary source code and unsecured credentials embedded within software. Code repositories such as GitHub, GitLab, Bitbucket, and Azure DevOps store source code and automate software builds, and may be hosted internally or externally. Once adversaries gain access via compromised credentials, stolen OAuth tokens, or insider access, they may bulk-clone repositories, run automated secret-scanning tools (trufflehog, gitleaks) to harvest embedded API keys and passwords, or enumerate organizational repositories at scale via API calls. LAPSUS$ searched victim networks for GitLab and GitHub instances to discover high-privilege credentials; Scattered Spider enumerated internal GitHub repositories as part of broader data theft operations; APT41 cloned victim Git repositories during intrusions. Successful exploitation provides adversaries with source code for developing targeted exploits, service credentials for lateral movement, and intellectual property for competitive or financial gain.

MITRE ATT&CK

Tactic: Collection
Technique: T1213 Data from Information Repositories
Sub-technique: T1213.003 Code Repositories
Canonical reference: https://attack.mitre.org/techniques/T1213/003/

KQL Detection Query

Microsoft Sentinel (KQL)
let SecretScanTools = dynamic(["trufflehog", "gitleaks", "git-secrets", "gitrob", "shhgit", "detect-secrets", "gitallsecrets", "git-hound"]);
let RepoAPIPatterns = dynamic([
    "api.github.com/orgs", "api.github.com/users/", "api.github.com/repos", "api.github.com/search/repositories",
    "gitlab.com/api/v4/projects", "gitlab.com/api/v4/groups",
    "api.bitbucket.org/2.0/repositories",
    "dev.azure.com" 
]);
DeviceProcessEvents
| where Timestamp > ago(24h)
| where (
    // Bulk git clone / archive / bundle operations
    (FileName in~ ("git.exe", "git") and ProcessCommandLine has_any ("clone", "archive", "bundle"))
    // Secret scanning tools targeting repositories
    or FileName has_any (SecretScanTools)
    or ProcessCommandLine has_any (SecretScanTools)
    // API-based repository enumeration via scripting tools
    or (
        FileName in~ ("powershell.exe", "pwsh.exe", "python.exe", "python3.exe", "curl.exe", "curl", "wget.exe", "wget")
        and ProcessCommandLine has_any (RepoAPIPatterns)
    )
)
| extend IsBulkClone = iff(FileName in~ ("git.exe", "git") and ProcessCommandLine has "clone", true, false)
| extend IsSecretScan = iff(FileName has_any (SecretScanTools) or ProcessCommandLine has_any (SecretScanTools), true, false)
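// Require a scripting runtime or HTTP client here, so that a plain git clone from a matching host (e.g. dev.azure.com) is not labelled as API enumeration
| extend IsAPIEnum = iff(FileName in~ ("powershell.exe", "pwsh.exe", "python.exe", "python3.exe", "curl.exe", "curl", "wget.exe", "wget") and ProcessCommandLine has_any (RepoAPIPatterns), true, false)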
| extend IsBulkExtract = iff(FileName in~ ("git.exe", "git") and (ProcessCommandLine has "archive" or ProcessCommandLine has "bundle"), true, false)
| summarize
    EventCount = count(),
    CommandSamples = make_set(ProcessCommandLine, 15),
    IsBulkClone = max(toint(IsBulkClone)),
    IsSecretScan = max(toint(IsSecretScan)),
    IsAPIEnum = max(toint(IsAPIEnum)),
    IsBulkExtract = max(toint(IsBulkExtract)),
    FirstSeen = min(Timestamp),
    LastSeen = max(Timestamp)
    by DeviceName, AccountName, FileName, bin(Timestamp, 1h)
| where IsSecretScan == 1
    or IsAPIEnum == 1
    or IsBulkExtract == 1
    or (IsBulkClone == 1 and EventCount >= 5)
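| extend DetectionType = case(
    // Test for combined signals first; as a trailing default this label could never be reached
    IsSecretScan + IsAPIEnum + IsBulkExtract + toint(IsBulkClone == 1 and EventCount >= 5) > 1, "MultipleSignals",
    IsSecretScan == 1, "SecretScanningToolExecution",
    IsAPIEnum == 1, "RepositoryAPIEnumeration",
    IsBulkExtract == 1, "GitBulkExtraction",
    IsBulkClone == 1 and EventCount >= 5, "BulkRepositoryCloning",
    "Unclassified"
)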
| project FirstSeen, LastSeen, DeviceName, AccountName, FileName, DetectionType, EventCount, CommandSamples
| sort by FirstSeen desc
Severity: high. Confidence: high.

Detects adversarial collection from code repositories using Microsoft Defender for Endpoint (MDE) DeviceProcessEvents. Identifies four key attack patterns: (1) bulk git clone operations: five or more clone executions within a one-hour window from the same account on the same device, consistent with automated repository harvesting (APT41, Scattered Spider); (2) secret scanning tool execution: known credential-harvesting tools such as trufflehog, gitleaks, gitrob, and shhgit run against local or remote repositories; (3) API-based repository enumeration: scripting runtimes and HTTP clients (PowerShell, Python, curl, wget) calling GitHub, GitLab, Bitbucket, or Azure DevOps endpoints; (4) git archive and bundle commands used to bulk-extract complete repository content as a single artifact. Results are summarized per device, account, and process name in one-hour bins, with a DetectionType label for analyst triage.
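
The bulk-clone pattern can also be examined on its own. The following is a minimal sketch rather than part of the published rule: it counts only clone executions per device and account per hour, reusing the threshold of five from the main query. The name CloneThreshold and the extra DistinctCommands column are illustrative additions.

```kusto
// Sketch: isolate the bulk-clone signal from the main query.
// CloneThreshold is an illustrative name; 5 mirrors the main query's threshold.
let CloneThreshold = 5;
DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName in~ ("git.exe", "git") and ProcessCommandLine has "clone"
| summarize
    CloneCount = count(),
    // Many distinct command lines suggests many different repositories
    DistinctCommands = dcount(ProcessCommandLine),
    CommandSamples = make_set(ProcessCommandLine, 15)
    by DeviceName, AccountName, bin(Timestamp, 1h)
| where CloneCount >= CloneThreshold
| sort by CloneCount desc
```

Comparing CloneCount with DistinctCommands may help separate a retry loop against one repository from a sweep across many.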

Data Sources

  • Process: Process Creation
  • Command: Command Execution
  • Microsoft Defender for Endpoint

Required Tables

DeviceProcessEvents

False Positives & Tuning

  • CI/CD pipeline agents (Jenkins, GitHub Actions runners, Azure DevOps build agents) that perform bulk repository clones as part of legitimate build orchestration
  • Security engineering teams running authorized secret scanning (gitleaks, trufflehog) as part of AppSec pipeline or pre-commit hooks
  • Developer onboarding scripts that clone multiple repositories simultaneously to set up a local development environment
  • Backup and archival automation jobs that use git bundle or git archive to create scheduled snapshots of organizational repositories
  • Supply chain security tools (Dependabot, Renovate, Snyk) that enumerate repositories to check for vulnerable dependencies
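
One way to suppress the sources above is an allowlist applied before aggregation. This is a sketch under stated assumptions: the host and account names are hypothetical placeholders, and only the git branch of the detection is shown. MDE typically records DeviceName as a fully qualified domain name, so allowlist entries may need the full name.

```kusto
// Hypothetical allowlists: replace with your own build agents and service accounts.
let KnownBuildHosts = dynamic(["build-agent-01.example.com", "build-agent-02.example.com"]);
let KnownServiceAccounts = dynamic(["svc-jenkins", "svc-backup"]);
DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName in~ ("git.exe", "git") and ProcessCommandLine has_any ("clone", "archive", "bundle")
// Drop events from approved CI/CD and backup sources before counting
| where DeviceName !in~ (KnownBuildHosts)
| where AccountName !in~ (KnownServiceAccounts)
| summarize
    EventCount = count(),
    CommandSamples = make_set(ProcessCommandLine, 15)
    by DeviceName, AccountName, bin(Timestamp, 1h)
```

Filtering before the summarize step keeps allowlisted activity out of EventCount, so it cannot push a mixed group over the clone threshold.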

Testing Methodology

Validate this detection against 4 adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see.

  1. Test 1: Bulk Repository Cloning via Shell Loop

    Expected signal: Sysmon Event ID 1 (on Linux, only where Sysmon for Linux is deployed): multiple git process creation events with CommandLine containing 'clone' within seconds. On Windows endpoints: DeviceProcessEvents entries for git.exe with ProcessCommandLine matching 'clone https://github.com/'. The detection fires on five or more clone events from the same AccountName within a 1-hour bin.

  2. Test 2: Secret Scanning with Trufflehog Against Local Repository

    Expected signal: Sysmon Event ID 1: Process Create with Image containing 'trufflehog' or CommandLine containing 'trufflehog'. If trufflehog is not installed, the which command still creates a process event. DeviceProcessEvents: FileName='trufflehog' or ProcessCommandLine has 'trufflehog'.

  3. Test 3: GitHub Organization Repository Enumeration via API

    Expected signal: Sysmon Event ID 1: Process Create for curl with CommandLine containing 'api.github.com/orgs'. Sysmon Event ID 3: Network Connection to api.github.com:443. DeviceProcessEvents: FileName='curl' with ProcessCommandLine has 'api.github.com/orgs'. DeviceNetworkEvents: RemoteUrl containing 'api.github.com'.

  4. Test 4: Git Archive Bulk Content Extraction

    Expected signal: Sysmon Event ID 1: Process Create with Image containing 'git' and CommandLine containing 'archive --format'. Sysmon Event ID 11: File Create events for the extracted files in /tmp. DeviceProcessEvents: FileName='git' with ProcessCommandLine has 'archive'. DeviceFileEvents: multiple file creation events from the git process in output directory.
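
The network-side signal expected from Test 3 can be checked with a short query against DeviceNetworkEvents. This is a sketch, assuming the test was run with curl within the last hour; RemoteUrl is not always populated for every connection, so an empty result does not by itself mean the test failed.

```kusto
// Sketch: confirm Test 3 produced network telemetry alongside the process event.
// The one-hour lookback and the curl filter are assumptions about how the test was run.
DeviceNetworkEvents
| where Timestamp > ago(1h)
| where RemoteUrl has "api.github.com"
| where InitiatingProcessFileName in~ ("curl", "curl.exe")
| project Timestamp, DeviceName, InitiatingProcessAccountName, InitiatingProcessFileName,
          InitiatingProcessCommandLine, RemoteUrl, RemoteIP, RemotePort
| sort by Timestamp desc
```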
