Saturday 25 February 2012

(Monkey) Carvings of Unknown File Types with Scalpel / Foremost on SIFT

Thierry13 recently requested we look into file carving - specifically, how do we carve non-standard / unknown file types? For the scalpel and foremost carving utilities (both on SIFT), it's monkey's play!

FYI, there's another file carving utility on SIFT called photorec, but this won't handle new unknown files, only certain image/movie/document/archive files.

Anyhoo, in order to configure scalpel / foremost, you must first have an idea of:
- the file extension (if required eg .doc),
- the maximum size of a potential file,
- the file header signature (and optional file footer signature).

You enter these parameters into a "scalpel.conf" or "foremost.conf" file before running the respective executable.
As scalpel was derived from foremost, the .conf files look very similar. You can usually cut/paste the same rule for both.
By default the executable will look for its .conf file in the current working directory but you can also tell it which .conf file to use.

Here's how I prepared for testing all of the above:
I plugged in a freshly formatted 512 MB USB memory stick (in Windows XP).
I downloaded (for free):
- WinHex 
- FTK Imager

We will now launch WinHex to create our mystery file (eg "cheeky-file.c4n6") and then copy it to the USB stick.

1. In WinHex, go to "File" ... "New" ... and select 1024 bytes in the resultant popup. Press "OK".

2. Press Ctrl-L to fill the file with data. At the subsequent popup, use the default "Simple pseudo-random numbers" radio button and hit "OK".

3. Add in our file header and footer data. You can simply click at the beginning of the file and type C4N6 like the screenshot below. Note: To edit, I clicked in the rightmost (text) column, next to the corresponding hex digits.
Editing the Mystery File Header Signature

4. Similarly, click 6 bytes/characters from the end of the file and type MONKEY.

Editing the Mystery File Footer Signature

5. Now go to "File" ... "Save" and call the file something like "cheeky-file.c4n6". Save it to BOTH the USB stick AND the local hard disk.
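If you don't have WinHex handy, steps 1 to 5 can be approximated with a few lines of Python. This is only a sketch: "os.urandom" stands in for WinHex's "Simple pseudo-random numbers" fill, and the function names are made up for illustration.

```python
import os

HEADER = b"C4N6"    # typed over the first 4 bytes in step 3
FOOTER = b"MONKEY"  # typed over the last 6 bytes in step 4


def build_mystery_file(size=1024):
    """Return `size` random bytes with the header and footer overwritten in place."""
    if size < len(HEADER) + len(FOOTER):
        raise ValueError("size too small to hold header and footer")
    data = bytearray(os.urandom(size))   # stand-in for the WinHex random fill
    data[:len(HEADER)] = HEADER          # overwrite the start of the file
    data[-len(FOOTER):] = FOOTER         # overwrite the end of the file
    return bytes(data)


def write_mystery_file(path, size=1024):
    """Write the mystery file to `path` and return the bytes written."""
    data = build_mystery_file(size)
    with open(path, "wb") as fh:
        fh.write(data)
    return data
```

Calling write_mystery_file("cheeky-file.c4n6") gives a 1024 byte file of the same shape as the WinHex one. The random content (and therefore the MD5 hash) will differ from the values shown in this post.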

6. Using Windows File Explorer, we delete the mystery file from the USB stick (eg Shift-Delete).

7. Now we take a physical image of the USB stick using FTK Imager.
In FTK Imager go to "File" ... "Create Disk Image". Select "Physical Drive" then "Next". Select the USB drive in the drop-down menu and click "Finish".
Click the "Add" button then choose the Raw(dd) Destination Image Type. Press "Next". Press "Next" again to skip entering the case details.
Enter in a save location/filename (eg "usb512") for the image and press "Finish". Now click "Start".
It was at this point I got a couple of failures. After I unplugged another USB device (curse you Madden Game Controller!), I was then able to save the whole image.
There should now be a file (eg "usb512.001") with a corresponding audit log ("usb512.001.txt") containing the MD5 hash(es).

8. Now launch the SANS SIFT VM (v2.12).

9. Using Windows File Explorer, copy the image (eg "usb512.001") to the SIFT VM "/cases/" directory.
Also copy the new mystery file (eg "cheeky-file.c4n6") to "/cases/" so we can MD5 hash compare it with any subsequently carved results.

10. In a new SIFT terminal window, we should check the MD5 hash of the USB image by typing "md5sum /cases/usb512.001".
The terminal window should look something like:

sansforensics@SIFT-Workstation:~$ md5sum /cases/usb512.001
85cc5e5ef0b44c314da7dfc9954236f6  /cases/usb512.001

We can then compare this to the "usb512.001.txt" audit file generated previously by FTK Imager (i.e. "MD5 checksum:    85cc5e5ef0b44c314da7dfc9954236f6").

Cool bananas! Our image has not changed after being copied over.
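The eyeball comparison above can also be scripted. Here's a minimal Python sketch that hashes the image in chunks and pulls the MD5 out of the audit log text. It assumes the log contains a line starting with "MD5 checksum:" as quoted above (other FTK Imager versions may word it differently), and the function names are mine.

```python
import hashlib
import re


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() keeps calling fh.read() until it returns b"" (end of file)
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_from_audit_log(log_text):
    """Return the first MD5 found after 'MD5 checksum:' (assumed label), or None."""
    match = re.search(r"MD5 checksum:\s*([0-9a-fA-F]{32})", log_text)
    return match.group(1).lower() if match else None


def image_matches_log(image_path, log_path):
    """True if the image's MD5 equals the one recorded in the audit log."""
    with open(log_path, "r", errors="replace") as fh:
        expected = md5_from_audit_log(fh.read())
    return expected is not None and md5_of_file(image_path) == expected
```

For the files in this post, that would be image_matches_log("/cases/usb512.001", "usb512.001.txt"), adjusting the log path to wherever the audit file lives.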

11. Now we set up local editable copies of the "scalpel.conf" / "foremost.conf" files. Assuming the current directory is "/home/sansforensics/", type "cp /usr/local/etc/foremost.conf ." and "cp /usr/local/etc/scalpel.conf ."
Note: I found these existing .conf files on SIFT by typing "sudo find / -name scalpel.conf -print". There were two entries returned, so I picked the largest and most recent one. Ditto for the "foremost.conf" file. By default both .conf files have all their rules commented out. Which brings us to ...

12. Using gedit (eg "gedit foremost.conf &" to launch gedit in the background), add the following lines to the "foremost.conf" file (screenshot):

#Cheeky4n6Monkey Test file
    c4n6    y    2048    C4N6    MONKEY

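This line is saying "If you find a header equal to C4N6 (case sensitive) and a footer equal to MONKEY (case sensitive) within 2048 bytes, retrieve the data and label it with the .c4n6 extension". Reading the columns left to right: the extension, the case sensitivity flag ("y"), the maximum file size in bytes, the header and the (optional) footer.
Note: The # sign means that line is a comment. Also note, there's a TAB between columns.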
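Because the TABs are easy to get wrong in an editor, here's a small Python sketch that builds a rule line from the same five columns. The helper name "conf_rule" is made up, and it simply refuses fields containing whitespace rather than guessing at how to escape them.

```python
def conf_rule(extension, header, footer="", max_size=2048, case_sensitive=True):
    """Return one TAB-separated rule: extension, case flag, max size, header, footer."""
    fields = [extension, "y" if case_sensitive else "n", str(max_size), header]
    if footer:
        fields.append(footer)  # the footer column is optional
    for field in fields:
        # Columns are whitespace-delimited, so embedded whitespace would shift them
        if any(ch.isspace() for ch in field):
            raise ValueError("whitespace not allowed in field: %r" % field)
    # Leading TAB mirrors the indented rule shown above
    return "\t" + "\t".join(fields)


def append_rule(conf_path, comment, rule):
    """Append a commented rule to an existing .conf file."""
    with open(conf_path, "a") as fh:
        fh.write("#" + comment + "\n" + rule + "\n")
```

For example, append_rule("foremost.conf", "Cheeky4n6Monkey Test file", conf_rule("c4n6", "C4N6", "MONKEY")) would add the same two lines as above.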

Editing "foremost.conf"

13. Similarly, edit the "scalpel.conf" file.

Editing "scalpel.conf"

14. Once we save the two .conf files, we can now run the carvers. For foremost the command line will look like:

sansforensics@SIFT-Workstation:~$ foremost -c foremost.conf -o usb512-foremost -i /cases/usb512.001
Processing: /cases/usb512.001

We are telling foremost to use the "foremost.conf" file in the current directory (not strictly required, as that is where it looks by default) to carve the "/cases/usb512.001" file and store the results in the "usb512-foremost" sub-directory of the current directory.

Looking in the output directory yields:

sansforensics@SIFT-Workstation:~$ ls usb512-foremost/
audit.txt  c4n6
sansforensics@SIFT-Workstation:~$ ls usb512-foremost/c4n6/
00000640.c4n6

Hooray! It looks like foremost found our mystery file and stored it in its own file-type specific directory ("c4n6").
To verify this, let's compare MD5 hashes of the foremost recovered file and the file we copied over earlier ("/cases/cheeky-file.c4n6").

sansforensics@SIFT-Workstation:~$ md5sum usb512-foremost/c4n6/00000640.c4n6
94b4265826825763fbf8c661fa04ac1c  usb512-foremost/c4n6/00000640.c4n6
sansforensics@SIFT-Workstation:~$ md5sum /cases/cheeky-file.c4n6
94b4265826825763fbf8c661fa04ac1c  /cases/cheeky-file.c4n6

And we have a MATCH!
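To see why a header, a footer and a maximum size are enough, here's a rough Python sketch of the header/footer idea behind the rule. It is NOT how foremost or scalpel are actually implemented: it reads the whole image into memory and just skips any header that has no footer within range. The function name "carve" is my own.

```python
def carve(data, header, footer, max_size):
    """Yield (offset, carved_bytes) for each header with a footer within max_size bytes."""
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            return  # no more headers in the image
        # The footer must end no later than max_size bytes after the header starts
        window_end = min(start + max_size, len(data))
        foot = data.find(footer, start + len(header), window_end)
        if foot != -1:
            end = foot + len(footer)
            yield start, data[start:end]
            pos = end           # carry on searching after the carved file
        else:
            pos = start + 1     # no footer in range, so skip this header


def carve_image(path, header=b"C4N6", footer=b"MONKEY", max_size=2048):
    """Read an image file and return a list of (offset, carved_bytes) tuples."""
    with open(path, "rb") as fh:
        return list(carve(fh.read(), header, footer, max_size))
```

Reading a 490 MB image in one go is OK for a test stick but won't scale to real disks, which is one good reason to stick with the proper carvers.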

Similarly, for scalpel, the command line will look something like:

sansforensics@SIFT-Workstation:~$ scalpel -c scalpel.conf -o usb512-scalpel /cases/usb512.001
Scalpel version 2.0
Written by Golden G. Richard III and Lodovico Marziale.
Multi-core CPU threading model enabled.
Initializing thread group data structures.
Creating threads...
Thread creation completed.

Opening target "/cases/usb512.001"

Image file pass 1/2.
/cases/usb512.001: 100.0% |*****************************|  490.0 MB    00:00 ETA
Allocating work queues...
Work queues allocation complete. Building work queues...
Work queues built.  Workload:
c4n6 with header "C4N6" and footer "MONKEY" --> 1 files
Carving files from image.
Image file pass 2/2.
/cases/usb512.001: 100.0% |*****************************|  490.0 MB    00:00 ETA
Processing of image file complete. Cleaning up...
Scalpel is done, files carved = 1, elapsed  = 7 secs.

Note: The scalpel arguments take a slightly different format - there's no "-i" flag before the source file.
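If you end up scripting both carvers, that difference is easy to trip over. Here's a tiny Python sketch that builds the two argument lists exactly as used in this post (the function names and defaults are just for illustration):

```python
def foremost_cmd(image, conf="foremost.conf", outdir="usb512-foremost"):
    """Argument list for foremost: the image is passed with -i."""
    return ["foremost", "-c", conf, "-o", outdir, "-i", image]


def scalpel_cmd(image, conf="scalpel.conf", outdir="usb512-scalpel"):
    """Argument list for scalpel: the image is a bare trailing argument."""
    return ["scalpel", "-c", conf, "-o", outdir, image]
```

On a box with the tools installed, either list can be handed to subprocess.run(cmd, check=True).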

Now looking in the output directory yields:

sansforensics@SIFT-Workstation:~$ ls usb512-scalpel/
audit.txt  c4n6-0-0/ 
sansforensics@SIFT-Workstation:~$ ls usb512-scalpel/c4n6-0-0/
00000000.c4n6

And comparing MD5 hashes yields:

sansforensics@SIFT-Workstation:~$ md5sum usb512-scalpel/c4n6-0-0/00000000.c4n6
94b4265826825763fbf8c661fa04ac1c  usb512-scalpel/c4n6-0-0/00000000.c4n6
sansforensics@SIFT-Workstation:~$ md5sum /cases/cheeky-file.c4n6
94b4265826825763fbf8c661fa04ac1c  /cases/cheeky-file.c4n6

Another Match!

So that was pretty monkey proof eh? We have made up our own mystery file, deleted it, imaged the USB stick with FTK Imager and then recovered the file using foremost and scalpel. For more information on these file carvers you can type "man scalpel" and "man foremost" at the SIFT terminal window.

Thierry13 also asked how we identify file headers. My copout answer would be to use WinHex to look at a sample file first and tailor the .conf accordingly. If anyone has any alternative ideas, please leave a comment!
Also FYI, Gary Kessler keeps an online reference table of file header signatures.
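If WinHex isn't available, the same "look at a sample file first" approach can be sketched in Python by dumping the first and last few bytes of a sample. This is only a helper for eyeballing candidate signatures (the function names are my own) and it won't tell you which of those bytes are actually constant across files of that type.

```python
def peek_signature(path, count=16):
    """Return (head, tail): the first and last `count` bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(count)
        fh.seek(0, 2)                    # whence=2 jumps to the end of the file
        size = fh.tell()
        fh.seek(max(size - count, 0))    # back up `count` bytes (or to the start)
        tail = fh.read(count)
    return head, tail


def printable(raw):
    """Render bytes as hex pairs plus an ASCII column, like a hex editor row."""
    hex_part = " ".join("%02x" % b for b in raw)
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)
    return hex_part + "  " + text_part
```

Running it over a handful of sample files and comparing the rows is a cheap way to spot bytes that stay the same, which are the ones worth putting in the .conf rule.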

Wow, this is my 20th post - with all this techno-babble, I have forgotten the humour component. Post number 21 will hopefully be less technical, more humour.