Sensitive data theft is among adversaries’ most common goals. For defenders, data exfiltration can lead to the compromise of customer data, public exposure of trade secrets, and potentially permanent business and reputational damage. Victims of data exfiltration may also face legal consequences for non-compliance with data protection laws. Preventing and detecting exfiltration must therefore be a top concern for businesses.
In this blog, we examine various data exfiltration methods affecting cloud environments, networks, and physical media. We also share examples of how eCrime adversaries, including SCATTERED SPIDER, INDRIK SPIDER, and GRACEFUL SPIDER, conduct data exfiltration through cloud targeting, ransomware, zero-day exploitation, and other tactics. Finally, we demonstrate how the capabilities of the CrowdStrike Falcon® platform, including CrowdStrike Falcon® Next-Gen SIEM and CrowdStrike Falcon® Adversary Intelligence, detect this activity.
Understanding Exfiltration Techniques
MITRE ATT&CK® catalogs the techniques adversaries use to achieve exfiltration. In general, exfiltration channels can be grouped into three categories: cloud-based, network-based, and physical media-based. Below is a brief overview of each category.
Cloud-Based Exfiltration
Cloud-based exfiltration is becoming increasingly popular among attackers as it provides a convenient and often overlooked channel for data exfiltration. Common cloud-based exfiltration patterns include:
- Cloud storage services: Adversaries may use cloud storage services such as Dropbox, Google Drive, or OneDrive to exfiltrate data. These services typically use HTTPS, so uploads are encrypted and blend in with legitimate traffic, which makes malicious activity difficult to detect. Furthermore, if the adversary has remote desktop (RDP) access to an infected system, they can use pre-installed applications such as web browsers or file-sharing clients to upload data, making detection even more challenging.
- Cloud-based email services: Adversaries may also use cloud-based email services such as Gmail, Outlook, or Yahoo Mail to exfiltrate data. Email services typically use common protocols like SMTP, IMAP, or POP3, making it difficult to distinguish between legitimate and malicious email activity. If the adversary has access to the infected system’s email client or web-based email interface, they can use pre-existing email accounts or create new ones to send data to external email addresses.
- Cloud-based data migration services: Although normally used for administrative tasks, cloud data migration services can be abused by adversaries with unauthorized access to transfer data from a private location to a location under the adversary’s control, leading to data compromise.
Proper configuration of IAM policies, implementation of least-privilege access, and continuous monitoring of data transfer activities are essential controls to prevent abuse of these services. However, detection of anomalous data transfer activities must supplement these policies.
Network-Based Exfiltration
Adversaries often transfer data from a compromised system or network to an external location within their control, using various network protocols. Two common techniques include:
- Exfiltration over alternative protocols, such as HTTP, FTP, or DNS, which can make it difficult to distinguish between legitimate and malicious traffic
- Exfiltration over command-and-control (C2) channels, which can be used to transmit data back to the attacker’s server
To detect these patterns, defenders generally look for outbound traffic to unknown or otherwise suspicious external destinations.
Physical Media-Based Exfiltration
Physical media-based exfiltration involves the use of physical devices or media to exfiltrate data. This type of exfiltration is often associated with, but not limited to, insider threats, as it requires physical access to the device or media. Common patterns include:
- USB drives and removable media: Adversaries may use USB drives or other removable media to exfiltrate data. This can be done by simply plugging in a USB drive and copying sensitive data, or by using more sophisticated techniques like USB-based malware.
- Printed or written documents: Adversaries may use printed or written documents to exfiltrate data. This can be done by printing sensitive information and physically removing it from the premises, or by writing down sensitive information and transmitting it through other means.
In all cases, correlation of endpoint and network events at scale is crucial to detecting and preventing exfiltration.
Real-World Examples
As outlined in the CrowdStrike 2024 Global Threat Report, multiple threat actors used exfiltration techniques, with a greater focus on cloud environments. As the threat landscape continues to shift, it’s essential for organizations to stay vigilant by implementing robust security measures to protect their sensitive data. Here’s a snapshot of some of the cases observed by CrowdStrike.
SCATTERED SPIDER: Cloud-Conscious Exfiltration
SCATTERED SPIDER was a prominent threat actor in 2024. They demonstrated advanced tradecraft in targeted cloud environments and focused on maintaining persistence, obtaining credentials, moving laterally, and exfiltrating data. One notable tactic involved using the open-source S3 browser to exfiltrate data to an external adversary-controlled S3 bucket.
The adversary also accessed credentials stored in various cloud services, including AWS Secrets Manager, HashiCorp Vault, and SharePoint. In one instance, they located a domain controller inside a victim’s Azure tenant, copied the disks, and created a new adversary-controlled virtual machine (VM) to mount the disk copies. From these disk copies, they dumped the Active Directory (AD) database file, NTDS.dit.
INDRIK SPIDER and BITWISE SPIDER: Ransomware and Exfiltration
CrowdStrike Services responded to an incident involving INDRIK SPIDER and BITWISE SPIDER’s LockBit RED ransomware. During this incident, INDRIK SPIDER exfiltrated credentials from the cloud-based credential manager Azure Key Vault. This example highlights the growing trend of ransomware groups using data exfiltration to increase pressure on victims.
GRACEFUL SPIDER: Zero-Day Exploitation
GRACEFUL SPIDER, a group operating since 2016, exploited three zero-days to exfiltrate data from hundreds of victims worldwide. This campaign resulted in the second-highest number of dedicated leak site (DLS) posts that year. GRACEFUL SPIDER’s use of zero-days demonstrates their growing sophistication in their pursuit of sensitive data.
VICE SPIDER: Targeted Exfiltration
VICE SPIDER used PowerShell scripting to automate data exfiltration by searching for specific directory and file names containing sensitive keywords. The exfiltrated data was later used in an attempt to extort victim organizations.
Data Exfiltration Warning Signs
It is essential to recognize the warning signs of data exfiltration in order to contain its effects. These can easily translate into investigative action points or threat hunting leads. Let’s go through some key behavior patterns associated with an exfiltration operation.
Provisioning Patterns
Adversaries may tamper with organizations’ infrastructure to facilitate data exfiltration. Some examples include:
- Activity from a user account that has been inactive or is no longer affiliated with the organization
- Unusual creation of new accounts or changes in access privileges
- Unexpected deployment of new virtual machines in a cloud environment
- Changes to access control lists (ACLs) for sensitive resources or systems
For example, the following CrowdStrike Query Language (CQL) query identifies attempts to change the attributes of an AWS Amazon Machine Image (AMI), such as sharing it publicly or with another AWS account.
| #Vendor="aws" event.provider="ec2.amazonaws.com"
| #event.kind="event"
| event.action="ModifyImageAttribute"
| Vendor.requestParameters.launchPermission.add.items[0].group="all" OR Vendor.requestParameters.launchPermission.add.items[0].userId=*
| ReadableEventTime := formatTime("%Y-%m-%dT %H:%M:%S", field=@timestamp, locale=en_US)
// Perform aggregation
| groupBy([Vendor.requestParameters.imageId], function=collect([source.ip, user.name, cloud.region, ReadableEventTime]))
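The query above inspects only the first entry (items[0]) of the launch-permission list. As an offline cross-check, the same logic can be sketched in Python against exported CloudTrail records, scanning every added entry. This is a minimal sketch: the record layout is assumed to mirror the field names used in the query, and the function name and sample record are hypothetical.

```python
from typing import Any


def ami_sharing_targets(record: dict[str, Any]) -> list[str]:
    """Return who an AMI was shared with, or [] if the record is not a share event."""
    if record.get("eventSource") != "ec2.amazonaws.com":
        return []
    if record.get("eventName") != "ModifyImageAttribute":
        return []
    params = record.get("requestParameters") or {}
    permission = params.get("launchPermission") or {}
    added = (permission.get("add") or {}).get("items") or []
    targets = []
    for item in added:
        if item.get("group") == "all":
            # Image made launchable by any AWS account
            targets.append("public")
        elif item.get("userId"):
            # Image shared with one specific AWS account
            targets.append(f"account:{item['userId']}")
    return targets


# Hypothetical record: shared with one account and also made public
sample = {
    "eventSource": "ec2.amazonaws.com",
    "eventName": "ModifyImageAttribute",
    "requestParameters": {
        "imageId": "ami-0123456789abcdef0",
        "launchPermission": {
            "add": {"items": [{"userId": "111122223333"}, {"group": "all"}]}
        },
    },
}
print(ami_sharing_targets(sample))
```

Checking every entry rather than only the first avoids missing a public share that appears later in the same request.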
Anti-Forensic Patterns
Attackers may attempt to disable or tamper with security tools to avoid detection. Some examples include:
- Removal of shell history
- Deletion of event logs
- Changes in security settings or tampering with active monitoring tools
- Actions taken in incognito browser sessions
As an example, the following CQL query identifies attempts to clear or disable the ESXi Shell command line history from an interactive ESXi terminal:
#Vendor="vmware" #event.module="esxi"
| #event.kind="event" | array:contains("event.category[]", value="process")
| array:contains("event.type[]", value="start")
| case {
//direct manipulation of .ash_history file.;
process.command_line=/(rm|>|\/dev\/null|vi)\s+.+?\.ash_history/ ;
//clearing via history command;
process.command_line=/history\s+-c/ ;
//disable history for current session
process.command_line=/set\s+\+o\s+history/ ;
}
| ReadableEventTime := formatTime("%Y-%m-%dT %H:%M:%S", field=@timestamp, locale=en_US)
// Perform aggregation
| groupBy([host.name], function=[collect([process.command_line, ReadableEventTime])])
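The three regular expressions in the case block can be exercised offline before deployment. The following Python sketch copies the patterns from the query and classifies sample command lines. The function name and sample commands are hypothetical, and Python's re module is assumed to behave like the query engine for these simple patterns.

```python
import re
from typing import Optional

# Patterns copied from the case block above, checked in the same order.
HISTORY_TAMPERING = {
    "history_file_manipulation": re.compile(r"(rm|>|/dev/null|vi)\s+.+?\.ash_history"),
    "history_cleared": re.compile(r"history\s+-c"),
    "history_disabled": re.compile(r"set\s+\+o\s+history"),
}


def classify(command_line: str) -> Optional[str]:
    """Return the name of the first pattern found in the command line, else None."""
    for name, pattern in HISTORY_TAMPERING.items():
        if pattern.search(command_line):
            return name
    return None


# Hypothetical command lines for a quick sanity check
for cmd in [
    "rm -f /.ash_history",
    "cat /dev/null > /.ash_history",
    "history -c",
    "set +o history",
    "esxcli network ip interface list",
    "rm .ash_history",
]:
    print(f"{cmd!r} -> {classify(cmd)}")
```

In this sketch, the relative-path form `rm .ash_history` is not classified, because the first pattern requires at least one character between the whitespace and `.ash_history`. Teams adapting the query may want to test such variants against their own telemetry.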
Data Staging Patterns
In preparation for the actual exfiltration, an adversary typically collects and stages data. Signs of staging include:
- Anomalous search queries or file access patterns across multiple systems within a short time frame
- Unusual bulk compression or encryption of files
- Large-scale data retrieval from repositories or network shares by new users with recently granted access
- Changes to file or directory permissions or sharing attributes without a legitimate business purpose
- Unexpected copies of sensitive files
- Screenshots taken of sensitive files
One can, for example, identify suspicious search queries of emails and documents containing sensitive keywords using the following CQL query:
#Vendor="microsoft"
| in(event.action, values=["SearchQueryInitiatedExchange", "SearchQueryInitiatedSharePoint"])
// for more keywords suggestions, review https://github.com/peass-ng/PEASS-ng/blob/6a98d4698779a863d7dba3aa7f30260bcb45e263/winPEAS/winPEASps1/winPEAS.ps1#L477
| Vendor.QueryText=/(?:invoice|ach|password|reset|keys|wire|transfer|login|bank|payment|payroll|deposit|wallet|violence|abuse|theft|steal|harassment)/i
| ReadableEventTime := formatTime("%Y-%m-%dT %H:%M:%S", field=@timestamp, locale=en_US)
| groupBy([user.email], function=collect([ReadableEventTime, Vendor.QueryText]))
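Because the keyword pattern in the query is an unanchored, case-insensitive match, short keywords such as "ach" also match inside longer words. The following Python sketch reuses the keyword list from the query to compare substring matching with a whole-word variant. The variable names and sample search strings are hypothetical, and the whole-word variant is an illustration rather than a recommended replacement.

```python
import re

# Keyword list copied from the query above
KEYWORDS = [
    "invoice", "ach", "password", "reset", "keys", "wire", "transfer",
    "login", "bank", "payment", "payroll", "deposit", "wallet",
    "violence", "abuse", "theft", "steal", "harassment",
]

alternation = "|".join(KEYWORDS)

# Same behavior as the query: match a keyword anywhere in the text
SUBSTRING = re.compile(f"(?:{alternation})", re.IGNORECASE)

# Stricter variant: match only when the keyword stands alone as a word
WHOLE_WORD = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

# Hypothetical search strings
for text in ["teacher approach notes", "ACH payroll file", "Q3 invoices"]:
    print(
        f"{text!r}: substring={bool(SUBSTRING.search(text))}, "
        f"whole_word={bool(WHOLE_WORD.search(text))}"
    )
```

Substring matching favors recall and will flag benign searches such as "teacher", while the whole-word variant reduces that noise at the cost of missing plurals such as "invoices". Which trade-off fits depends on the volume of search events in the environment.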
Anomalous Traffic Patterns
While an actual data exfiltration is in progress, one may observe:
- Unusual data transfers between systems, especially large files or volumes of data
- Increased data traffic to unknown or suspicious IP addresses or to an unusual geographic location
- Email messages with unusually large attachments sent to external recipients
- Bursts of outbound network traffic with large data payloads in fixed chunks from multiple internal sources
- Unusual protocol usage, such as UDP, SSH, Telnet, or FTP, to new destinations or on non-standard ports
- Multiple file uploads from network-attached storage (NAS) devices
- Data transmitted over insecure protocols (e.g., HTTP instead of HTTPS) or unencrypted data found in network captures
- Data transfers to cloud resources using unsigned or unverified URLs
- Spikes in traffic to or from unmanaged hosts
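As an example, the following CQL query identifies internal hosts that send 5 MB or more of ICMP traffic to a single external destination matched by CrowdStrike threat intelligence. Run it over a one-hour search window to apply the threshold per hour: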
| array:contains(array="event.category[]", value="network")
| network.transport="icmp" network.bytes>=0
| cidr(source.ip, subnet=["127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "::1"])
| !cidr(destination.ip, subnet=["224.0.0.0/4", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8", "169.254.0.0/16", "::1"])
// Extend with common exclusions
| destination.ip =~ !in(values=["1.1.1.1", "8.8.4.4", "8.8.8.8", "9.9.9.9"])
//
// Include only matches from CrowdStrike TI. Variation: Configure 'confidenceThreshold' as "unverified"
| ioc:lookup(field=destination.ip,type="ip_address",confidenceThreshold="high")
// Perform aggregation
| groupBy([source.ip, destination.ip], function=sum(network.bytes, as="total_bytes"))
| total_mega_bytes := total_bytes/1024/1024
| total_mega_bytes >= 5
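The filtering and aggregation steps of this query can be sketched in Python for testing against exported flow records. This is a minimal sketch under stated assumptions: the flow record keys (src, dst, transport, bytes) and the function names are hypothetical, and the threat intelligence lookup performed by ioc:lookup is omitted.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# Source ranges treated as internal, as in the query
INTERNAL = [
    ip_network(n)
    for n in ("127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "::1/128")
]
# Destination ranges excluded by the query (internal, multicast, link-local)
NON_EXTERNAL_DST = INTERNAL + [ip_network("224.0.0.0/4"), ip_network("169.254.0.0/16")]
# Common public resolvers excluded by the query
EXCLUDED_DST = {"1.1.1.1", "8.8.4.4", "8.8.8.8", "9.9.9.9"}
THRESHOLD_BYTES = 5 * 1024 * 1024  # 5 MB


def in_any(ip: str, networks) -> bool:
    """True if the address falls inside any of the given networks."""
    addr = ip_address(ip)
    return any(addr in net for net in networks)


def heavy_icmp_pairs(flows):
    """Sum ICMP bytes per (source, destination) pair and keep pairs at or over 5 MB."""
    totals = defaultdict(int)
    for flow in flows:
        if flow["transport"] != "icmp":
            continue
        if not in_any(flow["src"], INTERNAL):
            continue
        if flow["dst"] in EXCLUDED_DST or in_any(flow["dst"], NON_EXTERNAL_DST):
            continue
        totals[(flow["src"], flow["dst"])] += flow["bytes"]
    return {pair: total for pair, total in totals.items() if total >= THRESHOLD_BYTES}


# Hypothetical flow records
MB = 1024 * 1024
flows = [
    {"src": "10.0.0.5", "dst": "203.0.113.7", "transport": "icmp", "bytes": 3 * MB},
    {"src": "10.0.0.5", "dst": "203.0.113.7", "transport": "icmp", "bytes": 3 * MB},
    {"src": "10.0.0.5", "dst": "8.8.8.8", "transport": "icmp", "bytes": 10 * MB},
    {"src": "10.0.0.9", "dst": "10.0.0.1", "transport": "icmp", "bytes": 10 * MB},
]
print(heavy_icmp_pairs(flows))
```

Aggregating per source-destination pair matters because ICMP tunneling often sends many small packets, none of which looks large on its own.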
Anomalous Endpoint Patterns
Common behaviors that indicate potentially compromised accounts or systems prior to exfiltration include:
- Installation of new software on systems without a legitimate business purpose, such as remote desktop applications
- Excessive browser uploads or unusual network traffic on virtual hosts
- User logins from unusual locations or at odd hours, or otherwise abnormal activity levels
- Multiple failed login attempts
- File system anomalies, such as modified timestamps, unexpected file deletions, or the creation of new and unexpected files or directories
- File uploads conducted with unknown, unusual binaries or tools that have just been installed/downloaded
Using CQL, one can, for example, detect anomalous remote login activity by establishing a baseline of normal login patterns and flagging current activity that exceeds 2.5 standard deviations above the established average:
defineTable(
query={
#event_simpleName="UserLogon" LogonType=10
| bucket(field=["UserName", "LogonServer"], span=1h, function=count(as=hourly_login_count), limit=500)
| hourly_login_count > 0
| groupBy(["UserName", "LogonServer"], function=[
avg(hourly_login_count, as=hourly_average_login_count),
stdDev(hourly_login_count, as=hourly_stddev_login_count)
])
},
include=[*],
name="user_login_baseline",
start=7d,
end=1h
)
| #event_simpleName=UserLogon LogonType=10
| groupBy(["UserName", "LogonServer"], function=[count(field="AuthenticationId", as=total_logins, distinct=true)])
| match(file="user_login_baseline", field=["UserName", "LogonServer"], strict=false)
| threshold := hourly_average_login_count + (2.5 * hourly_stddev_login_count)
| case {
threshold!=* | threshold := "0"; // User not in baseline
*;
}
| test(total_logins>threshold)
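The threshold arithmetic in this query can be illustrated with a short Python sketch. It assumes the baseline is a list of hourly login counts for one user and server pair, and it uses the population standard deviation; whether the query's stdDev function uses the population or sample form is not stated here, so treat that choice as an assumption. The function names and sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def login_threshold(hourly_counts, k=2.5):
    """Baseline average plus k standard deviations; 0 when there is no baseline."""
    if not hourly_counts:
        # Mirrors the query's case branch for users missing from the baseline
        return 0.0
    return mean(hourly_counts) + k * pstdev(hourly_counts)


def is_anomalous(current_logins, hourly_counts, k=2.5):
    """True when the current login count exceeds the baseline threshold."""
    return current_logins > login_threshold(hourly_counts, k)


# Hypothetical baseline: average 3 logins per active hour, standard deviation 1
baseline = [2, 2, 2, 2, 4, 4, 4, 4]
print(login_threshold(baseline))   # 3 + 2.5 * 1 = 5.5
print(is_anomalous(6, baseline))   # True, 6 > 5.5
print(is_anomalous(5, baseline))   # False, 5 <= 5.5
print(is_anomalous(1, []))         # True, any login from an unseen user is flagged
```

As in the query, a user with no baseline gets a threshold of zero, so every remote login from a previously unseen account and server pair is surfaced for review.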
Physical Medium Patterns
These patterns indicate potential data exfiltration via physical media:
- Violations of local storage policies, such as unauthorized use of external devices
- Large amounts of data written to external devices, such as USB drives
- Writing events to external devices, initiated by unusual processes or applications
- Large amounts of data transferred to external devices via wireless protocols (e.g., AirDrop)
For example, through CQL, one can create a query that reviews files written to an arbitrary external device:
#repo="base_sensor"
| (#event_simpleName="DcUsbDeviceConnected") OR (#event_simpleName=/(.*)FileWritten$/ IsOnRemovableDisk=1)
| DeviceId := DeviceInstanceId | DeviceId := DiskParentDeviceInstanceId
//
// Allowlist for known Device IDs
// | DeviceId!=/XYZ/
//
| selfJoinFilter([aid, DeviceId],
where=[
{ #event_simpleName=DcUsbDeviceConnected },
{ #event_simpleName=/(.*)FileWritten$/ IsOnRemovableDisk=1 }
]
)
| ReadableEventTime := formatTime("%Y-%m-%dT %H:%M:%S", field=@timestamp, locale=en_US)
| case{
#event_simpleName=*FileWritten
| FileSizeKB:= unit:convert(Size, binary=true, to=k)
| format("%,d KB",field=["FileSizeKB"], as="FileSizeKB")
| format("%s - %s (%s)", field=[ReadableEventTime, FileName,FileSizeKB], as=FileDetails);
#event_simpleName=DcUsbDeviceConnected
| format("%s, %s, %s",field=[DeviceProduct, DeviceManufacturer, DevicePropertyClassName], as=DeviceInformation);
*;
}
// Perform aggregation
| groupBy([aid, DeviceId],
function=([
collect([ComputerName, DeviceInformation, FileDetails])
])
)
| drop([DeviceId])
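The correlation performed by selfJoinFilter, which keeps only host and device pairs that have both a connection event and at least one file write, can be sketched in Python as follows. This is a simplified illustration: the event dictionaries, the "event" key, and the function name are hypothetical, and DeviceId is assumed to be already normalized as in the query.

```python
from collections import defaultdict


def usb_write_summary(events):
    """Group events by (aid, DeviceId) and keep pairs with a connect and a write."""
    by_key = defaultdict(lambda: {"devices": [], "files": []})
    for event in events:
        key = (event["aid"], event["DeviceId"])
        name = event["event"]
        if name == "DcUsbDeviceConnected":
            by_key[key]["devices"].append(event["DeviceProduct"])
        elif name.endswith("FileWritten") and event.get("IsOnRemovableDisk") == 1:
            by_key[key]["files"].append((event["FileName"], event["Size"]))
    # Equivalent of the self-join: both event types must exist for the same key
    return {
        key: value
        for key, value in by_key.items()
        if value["devices"] and value["files"]
    }


# Hypothetical events: one device with a file write, one without
events = [
    {"event": "DcUsbDeviceConnected", "aid": "a1", "DeviceId": "dev-1",
     "DeviceProduct": "Flash Drive"},
    {"event": "PdfFileWritten", "aid": "a1", "DeviceId": "dev-1",
     "IsOnRemovableDisk": 1, "FileName": "plan.pdf", "Size": 2048},
    {"event": "DcUsbDeviceConnected", "aid": "a2", "DeviceId": "dev-2",
     "DeviceProduct": "Keyboard"},
]
print(usb_write_summary(events))
```

Requiring both event types for the same key filters out devices that were merely plugged in, such as keyboards, and leaves only devices that actually received files.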
Detect Exfiltration Attacks with Falcon Next-Gen SIEM
CrowdStrike’s Falcon platform provides robust protection against data exfiltration through its comprehensive suite of first-party solutions, including endpoint detection and response (EDR), cloud security, and identity protection. For organizations seeking to further extend their threat hunting and detection capabilities, Falcon Next-Gen SIEM combined with Falcon Adversary Intelligence offers an additional layer of visibility and analysis. By ingesting and correlating data from third-party sources alongside CrowdStrike’s native telemetry and adversary insights, this solution enables more expansive threat hunting and deeper contextual insights. This leads to more accurate threat detection, reduces false positives, and helps identify potential exfiltration risks with greater precision.
See Falcon Next-Gen SIEM in action in these fast-paced demos.