Detect Data from Cloud Storage in Splunk
Adversaries access data from cloud storage services, including IaaS object stores (Amazon S3, Azure Blob Storage, Google Cloud Storage) and SaaS platform storage (OneDrive, SharePoint, Google Drive, Dropbox). Attack vectors include exploiting misconfigured public bucket access, using compromised credentials or SAS tokens, abusing overly permissive IAM roles, and running automated tools such as Rclone, Pacu, and AADInternals for bulk extraction. Threat actors observed using this technique include Fox Kitten, APT42, HAFNIUM, Scattered Spider, and Storm-0501. Storm-0501 in particular has modified Azure Storage account configurations to expose accounts that were not previously remotely accessible, enabling data exfiltration. Misconfigurations that allow anonymous or overly broad access have exposed PII, medical records, and financial data at scale.
MITRE ATT&CK
- Tactic: Collection
- Technique: T1530 Data from Cloud Storage
- Canonical reference: https://attack.mitre.org/techniques/T1530/
SPL Detection Query
((index=aws sourcetype=aws:cloudtrail eventSource="s3.amazonaws.com"
(eventName="GetObject" OR eventName="ListBucket" OR eventName="ListObjects"
OR eventName="ListObjectsV2" OR eventName="GetBucketAcl" OR eventName="GetBucketPolicy"
OR eventName="GetBucketPublicAccessBlock"))
OR
(index=o365 sourcetype=o365:management:activity
(Workload="OneDrive" OR Workload="SharePoint")
(Operation="FileDownloaded" OR Operation="FileSyncDownloadedFull"
OR Operation="FileAccessed" OR Operation="FileCopied")))
| eval Platform=case(
sourcetype="aws:cloudtrail", "AWS_S3",
sourcetype="o365:management:activity", "Microsoft365",
true(), "Unknown")
| eval BucketOrSite=coalesce('requestParameters.bucketName', SiteUrl, "unknown")
| eval ObjectKey=coalesce('requestParameters.key', SourceFileName, "unknown")
| eval ActorIdentity=coalesce('userIdentity.arn', 'userIdentity.userName', UserId, "unknown")
| eval SourceIP=coalesce(sourceIPAddress, ClientIP, "unknown")
| eval IsAnonymous=case(
'userIdentity.type'="Anonymous", 1,
'userIdentity.principalId'="anonymous", 1,
'userIdentity.accountId'="ANONYMOUS_PRINCIPAL", 1,
true(), 0)
| eval IsListOp=if(eventName IN ("ListBucket","ListObjects","ListObjectsV2",
"GetBucketAcl","GetBucketPolicy","GetBucketPublicAccessBlock"), 1, 0)
| eval IsGetOp=if(eventName="GetObject" OR Operation IN
("FileDownloaded","FileSyncDownloadedFull"), 1, 0)
| eval UserAgentValue=coalesce(userAgent, UserAgent, "")
| eval ToolSignature=if(match(UserAgentValue,
"(?i)(rclone|pacu|boto3|aadInternals|aws-cli\/|python-requests|curl)"), 1, 0)
| bin _time span=30m
| stats
count as TotalRequests,
sum(IsGetOp) as DownloadCount,
sum(IsListOp) as EnumCount,
max(IsAnonymous) as HasAnonymousAccess,
max(ToolSignature) as KnownToolDetected,
dc(ObjectKey) as UniqueObjects,
values(BucketOrSite) as TargetResources,
first(ActorIdentity) as Actor,
first(UserAgentValue) as UserAgentSample
by _time, SourceIP, Platform
| where DownloadCount > 50 OR HasAnonymousAccess=1 OR EnumCount > 20
| eval SuspicionScore=
if(HasAnonymousAccess=1, 3, 0) +
if(DownloadCount > 500, 3, if(DownloadCount > 100, 2, if(DownloadCount > 50, 1, 0))) +
if(EnumCount > 20, 1, 0) +
if(KnownToolDetected=1, 2, 0)
| eval Severity=case(
HasAnonymousAccess=1 OR SuspicionScore >= 5, "High",
SuspicionScore >= 3, "Medium",
true(), "Low")
| where SuspicionScore > 0
| table _time, Platform, Actor, SourceIP, TargetResources, DownloadCount,
EnumCount, UniqueObjects, HasAnonymousAccess, UserAgentSample,
SuspicionScore, Severity
| sort - _time
Detects cloud storage data collection across AWS S3 and Microsoft 365. The query combines AWS CloudTrail S3 data events (GetObject, ListBucket, ListObjects) with Microsoft 365 OneDrive and SharePoint file access events in a single detection, aggregated per source IP and platform in 30-minute windows. It computes a suspicion score across four dimensions: bulk download volume (more than 50 download operations in a window), bucket or site enumeration (more than 20 list operations), anonymous or unauthenticated access, and user agent signatures of tools often used for exfiltration (Rclone, Pacu, Boto3, AADInternals, plus general-purpose clients such as aws-cli, python-requests, and curl). Requires AWS CloudTrail S3 data event logging to be enabled and forwarded to Splunk, and the Splunk Add-on for Microsoft Office 365 to be configured.
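The scoring rules in the query can be mirrored offline to sanity-check thresholds before deploying them. The following is a minimal Python sketch, not part of the detection itself; the class and function names (`WindowStats`, `passes_threshold`, `suspicion_score`, `severity`) are hypothetical, and the numbers are copied from the SPL above.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Aggregates for one (30-minute window, SourceIP, Platform) group."""
    download_count: int = 0
    enum_count: int = 0
    has_anonymous_access: bool = False
    known_tool_detected: bool = False


def passes_threshold(s: WindowStats) -> bool:
    # Mirrors: where DownloadCount > 50 OR HasAnonymousAccess=1 OR EnumCount > 20
    return s.download_count > 50 or s.has_anonymous_access or s.enum_count > 20


def suspicion_score(s: WindowStats) -> int:
    # Anonymous access contributes 3 points.
    score = 3 if s.has_anonymous_access else 0
    # Download volume contributes 1, 2, or 3 points depending on the tier.
    if s.download_count > 500:
        score += 3
    elif s.download_count > 100:
        score += 2
    elif s.download_count > 50:
        score += 1
    # Enumeration contributes 1 point, a known tool user agent 2 points.
    score += 1 if s.enum_count > 20 else 0
    score += 2 if s.known_tool_detected else 0
    return score


def severity(s: WindowStats) -> str:
    # Mirrors the case() expression: anonymous access is always High.
    score = suspicion_score(s)
    if s.has_anonymous_access or score >= 5:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    sample = WindowStats(download_count=120, known_tool_detected=True)
    print(passes_threshold(sample), suspicion_score(sample), severity(sample))
```

One consequence worth noting: a known tool user agent alone never surfaces a row, because the threshold filter runs first and only considers download volume, enumeration volume, and anonymous access.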
Data Sources
Required Sourcetypes
- aws:cloudtrail (AWS CloudTrail S3 events, searched in index=aws)
- o365:management:activity (Microsoft 365 OneDrive and SharePoint audit events, searched in index=o365)
False Positives & Tuning
- Automated backup systems (AWS Backup, Veeam for AWS) performing scheduled S3 object downloads will trigger bulk download thresholds. These typically run from known backup service account ARNs
- Data warehouse ETL pipelines and analytics platforms (Glue, EMR, Databricks) performing high-volume S3 reads as part of normal batch processing jobs
- Content delivery and replication workflows that legitimately use anonymous access for public S3 buckets hosting static assets, open datasets, or software distribution content
- Microsoft 365 eDiscovery and compliance export operations that bulk-access SharePoint or OneDrive content for legal review, typically run from compliance service accounts
- Developer tool user agents (boto3, aws-cli) used legitimately by engineers accessing their own team's storage resources during normal development workflows
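One way to handle the benign sources listed above is an allowlist applied to result rows before alerting. The following Python sketch illustrates the idea under stated assumptions: the ARN prefixes, bucket name, and function names are all hypothetical placeholders, and the row keys reuse the output field names of the query (Actor, TargetResources, HasAnonymousAccess).

```python
from typing import Iterable, List, Mapping

# Hypothetical allowlist entries; replace with values from your own environment.
ALLOWED_ACTOR_PREFIXES = (
    "arn:aws:iam::111122223333:role/backup-service",
    "arn:aws:iam::111122223333:role/etl-",
)
ALLOWED_PUBLIC_BUCKETS = frozenset({"example-public-static-assets"})


def as_list(value) -> List[str]:
    """Normalise a single value or a multivalue field to a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def is_expected_activity(row: Mapping[str, object]) -> bool:
    """Return True when a result row matches a known benign pattern."""
    actor = str(row.get("Actor", ""))
    # Known backup or ETL service identities.
    if actor.startswith(ALLOWED_ACTOR_PREFIXES):
        return True
    # Anonymous access is expected only if every target is an approved public bucket.
    targets = as_list(row.get("TargetResources"))
    anonymous = row.get("HasAnonymousAccess") == 1
    return anonymous and bool(targets) and all(t in ALLOWED_PUBLIC_BUCKETS for t in targets)


def filter_alerts(rows: Iterable[Mapping[str, object]]) -> List[Mapping[str, object]]:
    """Keep only the rows that do not match the allowlist."""
    return [row for row in rows if not is_expected_activity(row)]
```

Requiring every target resource to be on the public-bucket list is a deliberate choice in this sketch: a window that touches one approved bucket and one sensitive bucket still alerts.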
Testing Methodology
Validate this detection against four adversary techniques from Atomic Red Team. Each test below lists the behaviour to exercise and the telemetry you should expect to see.
- Test 1: AWS S3 Anonymous Bucket Enumeration and Download
Expected signal: AWS CloudTrail will record ListObjects and GetObject events with userIdentity.type=Anonymous and userIdentity.principalId=anonymous. sourceIPAddress will be the tester's public IP. No ARN is present in the identity block. S3 server access logs (if enabled) will show - as the requester.
- Test 2: AWS S3 Bulk Object Download with Valid Credentials
Expected signal: CloudTrail records a list operation (eventName=ListObjects or ListObjectsV2) and multiple GetObject events from the same source IP within a short window. userIdentity.type is IAMUser or AssumedRole, with the ARN of the test key. requestParameters.bucketName contains the target bucket. More than 50 GetObject events inside one 30-minute window trips the DownloadCount threshold in the SPL detection.
- Test 3: Rclone Cloud Storage Sync (Exfiltration Tool Pattern)
Expected signal: CloudTrail GetObject and ListObjectsV2 events with userAgent containing an 'rclone/' version string (e.g., 'rclone/v1.65.0'). High-volume sequential GetObject events, one for each file in the bucket. The rclone.conf file at ~/.config/rclone/rclone.conf will contain plaintext cloud credentials, which makes it a forensic artifact.
- Test 4: AADInternals OneDrive File Collection
Expected signal: Microsoft 365 audit events (sourcetype o365:management:activity in Splunk; the OfficeActivity table in Microsoft Sentinel): FileDownloaded operations with UserId matching the authenticated account and Workload=OneDrive (OfficeWorkload in Sentinel). UserAgent will identify AADInternals. ClientIP will be the tester's IP. Events typically appear in the Unified Audit Log within minutes, though delivery can lag by an hour or more. Entra ID sign-in logs will show the authentication used to obtain the access token.
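When planning Tests 2 and 3, it helps to check that the generated events will land inside a single 30-minute window, because the query counts downloads per window. The Python sketch below builds synthetic CloudTrail-like records and counts GetObject events per window and source IP. The record layout is simplified, the bucket name, IP address, and helper names are hypothetical, and the window alignment to :00 and :30 is an assumption about how `bin _time span=30m` buckets events.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Same pattern the SPL query applies to the user agent.
TOOL_PATTERN = re.compile(
    r"(?i)(rclone|pacu|boto3|aadInternals|aws-cli/|python-requests|curl)"
)


def make_events(n, start, user_agent="rclone/v1.65.0",
                ip="203.0.113.10", bucket="example-test-bucket"):
    """Return one ListObjectsV2 event followed by n GetObject events, 1s apart."""
    base = {
        "eventSource": "s3.amazonaws.com",
        "sourceIPAddress": ip,
        "userAgent": user_agent,
        "requestParameters": {"bucketName": bucket},
    }
    events = [dict(base, eventName="ListObjectsV2", eventTime=start)]
    for i in range(n):
        events.append(dict(
            base,
            eventName="GetObject",
            eventTime=start + timedelta(seconds=i + 1),
            requestParameters={"bucketName": bucket, "key": f"file-{i}.txt"},
        ))
    return events


def window_start(ts):
    """Floor a timestamp to its 30-minute window (assumed :00 / :30 alignment)."""
    return ts.replace(minute=ts.minute - ts.minute % 30, second=0, microsecond=0)


def downloads_per_window(events):
    """Count GetObject events per (window, source IP)."""
    counts = Counter()
    for e in events:
        if e["eventName"] == "GetObject":
            counts[(window_start(e["eventTime"]), e["sourceIPAddress"])] += 1
    return counts


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    for key, count in downloads_per_window(make_events(60, start)).items():
        print(key, count, "exceeds threshold" if count > 50 else "below threshold")
```

A run of 60 downloads that starts at 12:29:30 splits into 29 and 31 events across two windows, so neither window exceeds 50. Starting a test shortly after a window boundary, or generating well over 100 downloads, avoids that false negative.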
References
- https://attack.mitre.org/techniques/T1530/
- https://aws.amazon.com/premiumsupport/knowledge-center/secure-s3-resources/
- https://docs.microsoft.com/en-us/azure/storage/common/storage-security-guide
- https://redcanary.com/blog/rclone-mega-extortion/
- https://www.trendmicro.com/vinfo/us/security/news/virtualization-and-cloud/a-misconfigured-amazon-s3-exposed-almost-50-thousand-pii-in-australia
- https://github.com/RhinoSecurityLabs/pacu
- https://github.com/Gerenios/AADInternals
- https://learn.microsoft.com/en-us/azure/storage/blobs/monitor-blob-storage-reference
- https://learn.microsoft.com/en-us/microsoft-365/compliance/audit-log-activities
- https://docs.aws.amazon.com/AmazonS3/latest/userguide/cloudtrail-logging.html
- https://www.microsoft.com/en-us/security/blog/2024/09/26/storm-0501-ransomware-attacks-expanding-to-hybrid-cloud-environments/