SANS Digital Forensics and Incident Response Blog

Cloud Storage Acquisition from Endpoint Devices

Over the past several years, multiple tools have been released to enable API-based collection of cloud storage data. While this is an important capability, it has an often fatal limitation: API-based collection requires valid user credentials (and multi-factor authentication). An often overlooked area of cloud forensics is the data and metadata stored on the local device. Unsurprisingly, endpoints that have synchronized to a cloud storage service may contain a wealth of information relevant to an investigation. Devices regularly record metadata for locally synchronized files as well as for files present only in the cloud. Deleted items may still be recoverable, and files may be present in cloud storage cache folders even when they were not selected for local synchronization. In short, cloud storage data can be more accessible on the local device and can include files and metadata distinctly different from those in the current cloud repository. In the case of encryption technologies designed to keep data encrypted while in the cloud, the local archive may be the best available copy (even if incomplete). However, endpoint collection comes with its own set of challenges.

On-demand Cloud Synchronization

The first generation of cloud drive applications was largely designed to synchronize local folders to backup copies in a cloud instance. Applications using this model are easy to investigate and collect evidence from, since the files present in the local folder store are often identical to what can be found in the cloud. Recently a new model has emerged, allowing users to see and interact with all files present in the cloud drive, even when they are not present locally. Box Drive was one of the first to implement this feature, quickly followed by OneDrive and Dropbox Smart Sync. Implementing this feature has required application developers to use certain tricks to integrate cloud contents into the local view of the filesystem. As an example, the image below shows files from a OneDrive instance. OneDrive includes a new "Status" column and populates it with icons to distinguish files that exist only in the cloud (blue cloud icon) from files available locally (green checkmark).

Figure 1: OneDrive On-Demand File Status Example


In OneDrive, when a file is cached locally, it is present in the OneDrive folder and will be captured by ordinary forensic acquisition of the volume (though the cloud-only files will not be present). However, other implementations differ. Box Drive uses the "callback filesystem" Windows capability, allowing it to create a virtual filesystem accessed via a reparse point set on the Box folder within the user profile. Think of a reparse point as a fancy symbolic link that redirects any filesystem activity in the folder to another location. In the case of Box Drive, this alternate location is a virtualized volume made up of locally cached files. A forensic image of the C volume would only capture the Box folder reparse tag information, making it seem as if the folder were empty. If the user is logged off or the application is not running, the Box folder will not be present at all. Similarly, G Suite File Stream uses a virtual mount point only present when the application is running and the user is authenticated. The G Suite File Stream mounted file system is FAT32 and uses a separate drive letter, easily missed by some acquisition tools and processes.
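On a live system, one quick way to spot folders that redirect elsewhere is to check the Windows reparse point attribute. The following is a minimal sketch, not part of any tool discussed here: the function names are hypothetical, the flag value 0x400 is the documented FILE_ATTRIBUTE_REPARSE_POINT constant, and the script only reports anything when run on Windows, where Python exposes st_file_attributes.

```python
import os
from pathlib import Path

# Windows file attribute flag for reparse points (value from the Win32 headers).
FILE_ATTRIBUTE_REPARSE_POINT = 0x400


def is_reparse_point(attributes: int) -> bool:
    """Return True when the attribute bitmask has the reparse point flag set."""
    return bool(attributes & FILE_ATTRIBUTE_REPARSE_POINT)


def find_reparse_folders(profile_root: str) -> list[str]:
    """List entries directly inside each user profile that are reparse points.

    Only meaningful on Windows; elsewhere st_file_attributes is absent and
    nothing is reported.
    """
    hits = []
    root = Path(profile_root)
    if not root.is_dir():
        return hits
    for profile in root.iterdir():
        try:
            children = list(profile.iterdir())
        except OSError:
            continue  # not a directory, or access denied
        for child in children:
            try:
                info = os.lstat(child)  # lstat inspects the entry itself
            except OSError:
                continue
            if is_reparse_point(getattr(info, "st_file_attributes", 0)):
                hits.append(str(child))
    return hits


if __name__ == "__main__":
    for folder in find_reparse_folders(r"C:\Users"):
        print(folder)
```

Expect legacy Windows junctions inside each profile (for example "Application Data") to appear in the output alongside any cloud application folders, so the list is a starting point for review rather than a verdict.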

How should we deal with this issue of virtualized filesystems? If the investigator has access to a live system while the user is currently logged into the application, the cloud app folder can be logically acquired. Oleg Skulkin and Igor Mikhaylov previously introduced a solution using FTK Imager (employing the "Contents of a Folder" image option). In this article I will focus on the KAPE triage collection tool. Either option can capture the local files available in a virtualized filesystem. Interestingly, in some cloud storage applications, acquisition might also trigger the automatic retrieval and caching of cloud-only files (OneDrive and Box are two examples).

Cloud Storage Collection with KAPE

KAPE is one of the more exciting recent projects released to the digital forensics community. Written by Eric Zimmerman, it simplifies acquisition and processing of forensic artifacts and can be run on a live system, a mounted image, or across the network using SFTP for collection or in conjunction with a tool like F-Response. Live triage collection is a powerful capability and is now required in an increasing number of cases to capture critical evidence. Nowhere is that clearer than in this new breed of cloud storage applications where files are only available when the system is live and the user is authenticated.

Acquisition is accomplished in KAPE using target, or "tkape", files. These files define the location and acquisition options for each artifact. As an example, the contents of the publicly available OneDrive.tkape file follow:

Description: Microsoft OneDrive Storage Files and Metadata
Author: Chad Tilbury
Version: 1
Id: f3c680ca-0646-48cc-a471-5f484e22b1cf
RecreateDirectories: true
Targets:
    -
        Name: OneDrive User Files
        Category: Apps
        Path: C:\Users\*\OneDrive*\
        IsDirectory: True
        Recursive: True
        FollowReparsePoint: True
        FollowSymbolicLinks: True
        Comment: ""
    -
        Name: OneDrive Metadata Logs
        Category: Apps
        Path: C:\Users\*\AppData\Local\Microsoft\OneDrive\logs\
        IsDirectory: True
        Recursive: True
        Comment: ""
    -
        Name: OneDrive Metadata Settings
        Category: Apps
        Path: C:\Users\*\AppData\Local\Microsoft\OneDrive\settings\
        IsDirectory: True
        Recursive: True
        Comment: ""

Figure 2: KAPE Target File for OneDrive

The target file represented in Figure 2 demonstrates how easy it is to create (and modify) acquisition targets. Two new collection options were added to KAPE, in collaboration with Eric Zimmerman, to support cloud storage application collection. Recall that some "on-demand" cloud applications use reparse points and other symbolic link redirection to show users files and folders that really exist in a different location (including only in the cloud). The options FollowReparsePoint and FollowSymbolicLinks follow the redirection implemented by these NTFS features and collect the cached files in the redirected locations. They are particularly useful for the collection of Box and OneDrive files, but I suspect the community will find many other cloud applications requiring these options for collection. Of the most popular cloud storage applications, only G Suite File Stream cannot be collected with KAPE. This is due to the FAT32 volume it creates. In this case, FTK Imager is a good backup, as it will successfully image a G Suite File Stream volume (using the "Contents of a Folder" image option).
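To illustrate how little is involved in adding a target, here is a small sketch that prints a new entry using only the field names shown in Figure 2. The helper name, the example target name, and the D:\Cloud\OneDrive\ path are hypothetical; the output still has to be placed into a target file by hand, with indentation and placement matching the existing entries.

```python
def render_target(name: str, path: str, follow_links: bool = True) -> str:
    """Render one target entry using the field names shown in Figure 2."""
    lines = [
        f"Name: {name}",
        "Category: Apps",
        f"Path: {path}",
        "IsDirectory: True",
        "Recursive: True",
    ]
    if follow_links:
        # Options needed when the folder is backed by reparse points or links.
        lines += ["FollowReparsePoint: True", "FollowSymbolicLinks: True"]
    lines.append('Comment: ""')
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical relocated OneDrive folder identified during review.
    print(render_target("OneDrive User Files (relocated)", "D:\\Cloud\\OneDrive\\"))
```

Metadata folders such as the logs and settings paths in Figure 2 do not use the two follow options, so follow_links=False would mirror those entries.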

The KAPE GitHub repository includes target files for Box, Dropbox, Google Drive, and OneDrive. The meta-target file "CloudStorage.tkape" references these individual targets, so a single collection can use all of the cloud app targets at once (and will include any new cloud application targets added in the future). It is important to note that the current KAPE target files only collect cloud storage files from default locations. If a user has renamed or moved their cloud storage folder, it will be missed by these target files (notice that the example in Figure 2 expects a folder whose name begins with "OneDrive" directly under C:\Users\<username>\). Target files can be easily modified to collect non-standard folder locations once the investigator has identified them. Thus, a review of the subject file system is recommended before you perform your final triage collection.
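During that review, it can help to check candidate folders against the default path pattern. The sketch below approximates the wildcard in the Figure 2 path with Python's fnmatch, comparing one path component at a time and ignoring case. This is an assumption for illustration only: KAPE's own wildcard handling may differ, and the function name and example folders are hypothetical.

```python
from fnmatch import fnmatchcase

# Path pattern of the "OneDrive User Files" target in Figure 2.
DEFAULT_PATTERN = "C:\\Users\\*\\OneDrive*\\"


def covered_by_pattern(folder: str, pattern: str = DEFAULT_PATTERN) -> bool:
    """Approximate check: does the folder match the pattern, component by component?"""
    folder_parts = [p.lower() for p in folder.rstrip("\\").split("\\")]
    pattern_parts = [p.lower() for p in pattern.rstrip("\\").split("\\")]
    if len(folder_parts) != len(pattern_parts):
        return False  # folder sits at a different depth than the pattern expects
    return all(fnmatchcase(f, p) for f, p in zip(folder_parts, pattern_parts))


if __name__ == "__main__":
    candidates = [
        r"C:\Users\alice\OneDrive",
        r"C:\Users\alice\OneDrive - Contoso",
        r"C:\Users\alice\Documents\OneDrive",
        r"D:\Cloud\OneDrive",
    ]
    for folder in candidates:
        status = "covered" if covered_by_pattern(folder) else "NOT covered"
        print(f"{status}: {folder}")
```

Under these assumptions the first two folders are reported as covered and the last two are not, which is the kind of gap a modified target file would need to close.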


A word of caution: when files are requested via reparse points, the cloud application may download them from the cloud as part of the collection process. In other words, collecting through the reparse points of some cloud applications may retrieve files that do not currently exist on the device. This could cause storage space and scope of authority issues, and there is no way to specify only "local" files in these folders. Note that a logical image made with a tool like FTK Imager will exhibit the same behavior. If this is a problem for your investigation, an excellent mitigation (and a good practice in general) is to isolate the subject system from the network. Alternatively, the examiner can take screenshots and manually copy the local files from the live system. This is a great example of why you should know and understand your tools before using them in a real investigation. In this brave new world for investigators, detailed planning before performing an acquisition will pay dividends.
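As part of that planning, an examiner might estimate how much data a collection could pull down. The sketch below counts files whose Windows attributes suggest the content is not fully on disk. It rests on several assumptions: the application marks placeholders with the documented OFFLINE or RECALL attribute flags (not every cloud application does), the function names are hypothetical, and metadata-only reads are expected, but not guaranteed, to avoid triggering a download. Verify the behavior on a test system first.

```python
import os

# Windows attribute flags (values from the Win32 headers) that commonly mark
# files whose content is not fully present on disk.
FILE_ATTRIBUTE_OFFLINE = 0x1000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x40000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x400000
CLOUD_ONLY_MASK = (
    FILE_ATTRIBUTE_OFFLINE
    | FILE_ATTRIBUTE_RECALL_ON_OPEN
    | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
)


def looks_cloud_only(attributes: int) -> bool:
    """Return True when any of the offline/recall flags is set."""
    return bool(attributes & CLOUD_ONLY_MASK)


def survey(folder: str) -> tuple[int, int]:
    """Return (file_count, total_bytes) for files flagged as cloud-only.

    Reads metadata only (os.walk plus os.lstat); file contents are never opened.
    """
    count = 0
    size = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or access denied
            if looks_cloud_only(getattr(info, "st_file_attributes", 0)):
                count += 1
                size += info.st_size
    return count, size


if __name__ == "__main__":
    files, total = survey(r"C:\Users\alice\OneDrive")  # hypothetical folder
    print(f"{files} cloud-only files, {total} bytes reported")
```

A large total is a cue to weigh storage space and scope of authority before collecting, or to isolate the system from the network as described above.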

Speaking of knowing your tools, the SANS Institute FOR498 and FOR500 digital forensics classes now cover KAPE and cloud collection and analysis in depth!



Cloud Storage Forensics: Endpoint Evidence

  • When: Wednesday, December 3rd, 2019 at 3:30 PM EST (20:30:00 UTC)
  • Conducted by Chad Tilbury
  • Register here

(Note: A recording and slides will be available afterwards at the same link)


Chad Tilbury, GCFA, has spent over twenty years conducting computer crime investigations ranging from hacking to espionage to multi-million dollar fraud cases. He is a senior instructor and co-author of FOR500 Windows Forensic Analysis and FOR508 Advanced Incident Response, Threat Hunting, and Digital Forensics at the SANS Institute. Find him on Twitter @chadtilbury.
