
Advances in File Carving

Rob Zirnstein, President
Forensic Innovations, Inc.


7/14/2011

www.ForensicInnovations.com

Our Data is GONE!


All of your servers have crashed!
Your customers' data is lost!
You backed up last week, but important business transactions have taken place since.
70% of companies with devastating data loss go out of business.
All it took was one employee writing a simple SQL database script after you fired them.

We Didn't Find The Evidence!


What do you do when you've searched through all of the evidence and come up empty?
When you know a suspect is hiding something, where do you look first?
TrueCrypt Volumes & Unallocated Space
Even good people shred data when faced with an investigation.
The tools are easy to find.
www.TrueCrypt.org

How They Hide the Evidence


Deleting a file
Sends the file to the Windows Recycle Bin
Empty or bypass the Recycle Bin

Undelete tools depend on the deleted directory entry


That can be deleted or overwritten too
Then there's no undeleting possible

Store files in a TrueCrypt Volume


Undetectable as a file (except for my tools)
Looks like random data in unallocated space (except for my tools)

How To Get The Files Back


File Carving
Definition: General term for extracting data (files) out of undifferentiated blocks (raw data), like "carving" a sculpture out of soapstone.
http://www.forensicswiki.org/wiki/File_Carving

The sectors containing the files are orphaned
Some of them may get overwritten
If the files were fragmented, the sectors are like many jigsaw puzzles thrown into a trash bag.
If some sectors were stored consecutively, then it's like puzzle pieces that weren't pulled apart before getting trashed.

File Carving Assumptions


No Files are Fragmented!?!
All Files are stored in consecutive sectors

Sector Size = 512 bytes


May be detected through disk structure

Cluster Size = 512 to 16,384 bytes


May be detected through disk structure

File Slack may be ignored
RAM slack is ignored

Or incorrectly bundled in with File Slack
Isn't it always zeroed out?
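
The two kinds of slack can be separated with simple arithmetic. The sketch below is only an illustration. It assumes "RAM slack" means the unused bytes between the end of the file and the end of its last sector, and "file slack" means the unused whole sectors left in the last cluster (sometimes called drive slack). The function name and the default of 8 sectors per cluster are assumptions, not taken from any tool.

```python
def slack_sizes(file_size, sector_size=512, sectors_per_cluster=8):
    """Return (ram_slack, file_slack) in bytes for a file of file_size bytes."""
    cluster_size = sector_size * sectors_per_cluster
    # RAM slack: bytes from the end of the file to the end of its last sector.
    ram_slack = (-file_size) % sector_size
    # Space allocated to the file, rounded up to a whole number of clusters.
    allocated = -(-file_size // cluster_size) * cluster_size
    # File slack: the untouched whole sectors left in the last cluster.
    file_slack = allocated - file_size - ram_slack
    return ram_slack, file_slack


print(slack_sizes(1000))
```

For a 1,000-byte file on 4,096-byte clusters this gives 24 bytes of RAM slack and 3,072 bytes of file slack.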

File Carving Techniques


Block Based Carving
Statistical Carving
Header/Footer Carving
Header/Maximum File Size Carving
Header/Embedded Length Carving
File Structure Based Carving
Semantic Carving
Carving with Validation
Fragment Recovery Carving
Repackaging Carving
SmartCarving
Hash Carving
Fuzzy Hash Carving

http://www.forensicswiki.org/wiki/File_Carving

Block Based Carving


Analyze the sectors on a block-by-block basis to determine whether they belong together in the same file.
Assumes that each sector can only be part of a single file.

Statistical Carving
Use statistics or content characteristics to identify each sector.
Entropy measurement
Filter out blocks that clearly aren't part of a desired file type.
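
The entropy measurement can be sketched as follows. This is a minimal illustration, not how any particular tool does it: it computes the Shannon entropy of each 512-byte sector and keeps the ones above a cut-off. The function names and the 7.0 bits-per-byte threshold are assumptions chosen for the example. Compressed or encrypted data tends to score high, while text and zero-filled sectors score low.

```python
import math
from collections import Counter

SECTOR_SIZE = 512  # assumed sector size


def shannon_entropy(block):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(block).values())


def high_entropy_sectors(data, threshold=7.0):
    """Yield (sector_index, entropy) for sectors at or above the threshold."""
    for offset in range(0, len(data), SECTOR_SIZE):
        h = shannon_entropy(data[offset:offset + SECTOR_SIZE])
        if h >= threshold:
            yield offset // SECTOR_SIZE, h
```

A real filter would likely combine entropy with other content characteristics, since entropy alone cannot tell one compressed format from another.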

Header/Footer Carving
Search for file header signature(s).
Search for the matching file footer signatures.
Capture the sectors in between.
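
The three steps above can be sketched for JPEG, whose files begin with the bytes FF D8 FF and end with FF D9. This is a naive illustration under the no-fragmentation assumption: the function name is hypothetical, headers are only tested at sector boundaries, and the first footer found after a header is accepted even though embedded thumbnails can contain an earlier FF D9.

```python
JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"


def carve_header_footer(image, header=JPEG_HEADER, footer=JPEG_FOOTER,
                        sector_size=512):
    """Return (start, end) byte ranges of candidate files in a raw image."""
    hits = []
    for start in range(0, len(image), sector_size):
        # Files start on sector boundaries, so only test each sector's start.
        if not image.startswith(header, start):
            continue
        # Take the first footer that follows the header.
        end = image.find(footer, start + len(header))
        if end != -1:
            hits.append((start, end + len(footer)))
    return hits
```

Each returned range can be written out with `image[start:end]`.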

Header/Maximum File Size Carving


Search for file header signature(s).
Consult a list of maximum file lengths for each header type.
Capture the sectors from the header up to that maximum length.
Many file types tolerate the additional unrelated data that may get appended to the recovered file.
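
A minimal sketch of this approach is shown below. The size limits in `MAX_SIZES` are made-up values for illustration only; real tools ship their own per-type tables. The function name is also hypothetical.

```python
# Hypothetical per-type limits, for illustration only.
MAX_SIZES = {
    b"\xff\xd8\xff": 10 * 1024 * 1024,  # JPEG
    b"%PDF-": 50 * 1024 * 1024,         # PDF
}


def carve_header_max_size(image, max_sizes=MAX_SIZES, sector_size=512):
    """Return (start, end) ranges: each header plus up to its maximum length."""
    hits = []
    for start in range(0, len(image), sector_size):
        for header, limit in max_sizes.items():
            if image.startswith(header, start):
                # Capture up to the limit, or to the end of the image.
                hits.append((start, min(start + limit, len(image))))
                break
    return hits
```

The carved range usually contains trailing data that does not belong to the file, which is why this method relies on formats whose viewers ignore it.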

Header/Embedded Length Carving


Search for file header signature(s).
Read the file length from one of the fields in the header.
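
BMP is a convenient example of an embedded length: the file starts with the bytes "BM", followed by the total file size as a 4-byte little-endian integer. The sketch below assumes unfragmented files that start on sector boundaries. The function name and the sanity check on the length are assumptions for illustration.

```python
import struct


def carve_bmp(image, sector_size=512):
    """Return (start, end) ranges of BMP candidates using the header length."""
    hits = []
    for start in range(0, len(image) - 5, sector_size):
        if image[start:start + 2] != b"BM":
            continue
        # Bytes 2..5 of a BMP hold the total file size, little-endian.
        (length,) = struct.unpack_from("<I", image, start + 2)
        # Discard lengths shorter than the 14-byte BMP file header
        # or running past the end of the image.
        if 14 <= length <= len(image) - start:
            hits.append((start, start + length))
    return hits
```

Because "BM" is only two bytes long, false positives are likely, so the length check here is a weak filter rather than a validation step.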

File Structure Based Carving


Once a sector's file type is identified
Match to other sectors that contain similar data structures.
Use knowledge of the file type's data structures to search for structure parts expected to exist in later sectors.
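
PNG illustrates how known data structures can guide carving. A PNG file is an 8-byte signature followed by chunks, each made of a 4-byte big-endian length, a 4-byte type, the data and a 4-byte CRC, ending with an IEND chunk. The sketch below walks that chain to find where the file ends. It assumes the file is not fragmented and does not verify the CRCs; the function name is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_length(image, start):
    """Walk PNG chunks from start; return the file length or None."""
    if not image.startswith(PNG_SIGNATURE, start):
        return None
    pos = start + len(PNG_SIGNATURE)
    while pos + 8 <= len(image):
        length, chunk_type = struct.unpack_from(">I4s", image, pos)
        # Chunk types are four ASCII letters; anything else breaks the chain.
        if not chunk_type.isalpha():
            return None
        pos += 12 + length  # length field + type + data + CRC
        if pos > len(image):
            return None
        if chunk_type == b"IEND":
            return pos - start
    return None
```

A `None` result part-way through a file is a hint that the chain was interrupted, for example by a fragment boundary or an overwritten sector.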

Semantic Carving
Identify the language used in a sector.
Identify the language used in each of the following sectors.
Collect the sectors that are written in the same language.

Carving with Validation


Use a file interpreter or viewer to load each recovered file.
If the interpreter encounters invalid data, assume that is the point where the carving method failed.

Use on completed files.
Use on each added sector.
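
The "use on each added sector" variant can be sketched with any strict parser. As a stand-in for a real file interpreter or viewer, the example below uses Python's incremental UTF-8 decoder to validate a text file sector by sector and reports the first sector it rejects. The function name is hypothetical, and a real carver would plug in a format-specific validator instead.

```python
import codecs


def first_invalid_sector(sectors):
    """Return the index of the first sector the validator rejects, or None."""
    # Strict incremental decoder: it keeps state between calls, so a
    # multi-byte character split across two sectors is still accepted.
    decoder = codecs.getincrementaldecoder("utf-8")()
    for index, sector in enumerate(sectors):
        try:
            decoder.decode(sector)
        except UnicodeDecodeError:
            # Assume this is the point where the carving method failed.
            return index
    return None
```

The sectors before the reported index form the validated part of the file; the carver can then try a different candidate sector at that position.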

Fragment Recovery Carving


Find two or more fragments that belong to the same file.
Filter out the sectors between the fragments that don't belong.

Repackaging Carving
Used on partially recovered files.
Rebuild the parts of the file that could not be recovered.
The result should be a file that can be opened with its native application or a standard viewer.

SmartCarving
Use knowledge of the file system's typical fragmentation effects.
Preprocess the source sectors.
Decompress, decrypt or translate the data

Collate the identified blocks.


Sort by file type

Reassemble the blocks in sequences that match their file type.

Hash Carving
Calculate a hash value for each sector
MD5, SHA-1

Compare the hash value to a list of known sector hash values


This list can be of known Good and/or known Bad files.
Filter out known Good files. (ex: Installed applications)
Recover known Bad files. (ex: known illicit material)
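
Sector hashing and lookup can be sketched in a few lines. The example below uses SHA-1 from Python's standard library and a plain set as the list of known sector hashes. The function names are hypothetical, and how the known list is built (from known Good or known Bad files) is left to the caller.

```python
import hashlib


def sector_hashes(data, sector_size=512):
    """Return the SHA-1 hex digest of every sector in data, in order."""
    return [hashlib.sha1(data[offset:offset + sector_size]).hexdigest()
            for offset in range(0, len(data), sector_size)]


def match_known_sectors(image, known_hashes, sector_size=512):
    """Return indexes of image sectors whose hash is in known_hashes."""
    return [index
            for index, digest in enumerate(sector_hashes(image, sector_size))
            if digest in known_hashes]
```

A typical use would be `known = set(sector_hashes(known_file))` followed by `match_known_sectors(image, known)`. Very common sectors, such as zero-filled ones, would need to be excluded from the known list to avoid meaningless matches.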

Fuzzy Hash Carving


Calculate a fuzzy hash value for each sector.
Compare the fuzzy hash values of sectors to determine which sectors are similar in content.
Combine similar sectors into recovered files.
Match raw data sectors together for object types that have no identifiable signatures or that extend beyond a single sector.
Recover file types not previously encountered.

Tools Today (1)


Adroit Photo Recovery/Forensics
combination of SmartCarving, header carving, structure based validation and validation of the entire file to determine if each new sector belongs
Repackaging Carving is also available
http://www.forensicswiki.org/wiki/File_Carving:SmartCarving
Supports JPEG, RAW camera images, PNG, BMP and GIF files

DataLifter
header-footer carving; Supports 25 file types

EnCase
header-footer carving; Supports ~250 file types

Foremost
file structure based carving for avi, bmp, doc, gif, html, jpg, mov, pdf, png, rar, wav and zip files.
header-footer carving for art, asf, chm, cookie, cpp, dat, dbx, fws, idx, java, lnk, mail, mbx, mp3, mpg, ost, pgd, pgp, ppt, pst, ra, rdp, rpm, tif, txt, wma, wmv, wpc and xls files.

Forensic Toolkit (FTK)


internal techniques unknown; Supports abl, aol, asd, bmp, doc, dot, emf, gif, html, jpg, mpp, one, pdf, png, ppt, pub, puz, vsd, vss, vst, xla, xls and xlt files.

http://www.forensicswiki.org/w/images/b/b9/Kloet_2007.pdf

Tools Today (2)


HstEx / NetAnalysis
internal techniques unknown; Supports browser history formats

NFI Defraser
Fragment recovery carving & carving with validation; Supports MPEG, 3GPP, QuickTime & AVI files

PhotoRec
combination of file structure based carving and header-footer carving of 80 file formats

PyFlag
appears to use a simple text search method, ignoring sector boundaries; Supports server log file formats

Recover My Files
internal techniques unknown; Supports 200 file types

Revit
SmartCarving; Supported file types list not available

http://www.forensicswiki.org/w/images/b/b9/Kloet_2007.pdf

Tools Today (3)


Scalpel
combination of header-footer and header-maximum file size carving; Supports art, avi, dat, dbx, doc, fws, gif, htm, idx, java, jpg, mail, max, mbx, mov, mpg, ost, pdf, pgd, pgp, pins, png, pst, ra, rpm, tif, txt, wav, wpc and zip files.

X-Ways
header-footer carving; unknown support list

http://www.forensicswiki.org/wiki/Tools:Data_Recovery#Carving

Tool Problems
Few tools handle file fragmentation
The tools that handle fragmentation support very few file types
Most tools cannot detect false positives
Most tools hard code file type support
Only 1 tool claims to rebuild partial files
It only supports 5 file types (image files)

Performance is a problem
Most tools utilize inefficient databases and scripting languages

Future Tools
Carver 2.0
Open Source, in the early specification stages

File Harvester
Combination of multiple methods:
Block Based Carving
Statistical Carving
Header/Footer Carving
Header/Embedded Length Carving
File Structure Based Carving
Fragment Recovery Carving
Repackaging Carving (Phase 3)
SmartCarving
Fuzzy Hash Carving (secret sauce)

Thank you
Contact
Rob Zirnstein
Rob.Zirnstein@ForensicInnovations.com
www.ForensicInnovations.com
(317) 430-6891
