Friday 29 July 2016

A Timestamp Seeking Monkey Dives Into Android Gallery Imgcache

Are you sure?! Those waters look pretty turdy ...
UPDATE 4AUG2016: Added video thumbnail imgcache findings and modified version of script for binary timestamps.

Did you know that an Android device can cache images previously viewed with the stock Gallery3D app?
These cached images can occur in multiple locations throughout the cache files. Their apparent purpose is to speed up Gallery loading times.
If a user views an image and then deletes the original picture, an analyst may still be able to recover a copy of the viewed image from the cache. Admittedly, the cached images will not be as high a quality as the original, but they can still be surprisingly detailed. And if the pictures no longer exist elsewhere on the filesystem - "That'll do monkey, that'll do ..."

The WeAre4n6 blog has already documented their observations about Android imgcache here.
So why are we re-visiting this?
We were asked to see if there was any additional timestamp or ordering information in the cached pictures. If a device camera picture only exists in the Gallery cache, it won't have the typical YYYYMMDD_HHMMSS.JPG filename. Instead, it will be embedded in a cache file with a proprietary structure and will need to be carved out. These embedded cached JPGs do not have any embedded metadata (eg EXIF).
An unnamed commercial phone forensics tool will carve the cached pictures out but it currently does not extract any timestamp information.

Smells like an opportunity for some monkey style R&D eh?
Or was that just Papa Monkey's flatulence striking again? An all banana diet can be so bittersweet :D

Special Thanks to:
- Terry Olson for posting this question about the Gallery3D imgcache on the Forensic Focus Forum and then kindly sharing a research document detailing some imgcache structures.
- Jason Eddy and Jeremy Dupuis who Terry acknowledged as the source of the research document.
- LSB and Rob (@TheHexNinja) for their help and advice in researching the imgcache.
- Cindy Murphy (@CindyMurph) for sharing her recollections of a case involving imgcache and listening to this monkey crap on.
- JoAnn Gibb for her suggestions and also listening to this monkey crap on.

Our main test devices were a Samsung Galaxy Note 4 (SM-910G) and a Galaxy Note 4 Edge (SM-915G) both running Android 5.1.1.

Our initial focus was the following cache file:
userdata:/media/0/Android/data/com.sec.android.gallery3d/cache/imgcache.0

After an image is viewed fullscreen in the Gallery app, imgcache.0 appears to be populated with the viewed picture plus six (sometimes fewer) other images. It is suspected the other cached pictures are chosen based on the display order of the parent gallery and will be taken from before/after the viewed image. If a picture is found in this cache file, it is likely that the user would have seen it (either from the parent gallery view or when they viewed it fullscreen).
From our testing, this file contains the largest sized cached images. From the filesystem last modified times and file sizes, it is suspected that when the imgcache.0 file reaches a certain size, it gets renamed to imgcache.1 and newly viewed images then start populating imgcache.0. Due to time constraints, we did not test for this rollover behaviour. By default, the initial imgcache.0 and imgcache.1 files appear to be 4 bytes long.

Also in the directory were mini.0 and micro.0 cache files which contained smaller cached images. As with imgcache.0, these files also had corresponding .1 files.

mini.0 contains the smallest sized, square clipped, thumbnail versions of the cached images. They appear to be similar to the images displayed from the Gallery preview list that is shown when the user long presses on a fullscreen viewed Gallery image.

micro.0 contains non-clipped images which are smaller versions of the images in imgcache.0 but larger in size than the images in mini.0. These appear to be populated when the user views a gallery of pictures. Launching the Gallery app can be enough to populate this cache (likely depends on the default Gallery app view setting).

imgcache.0 has been observed to contain a different number of images to mini.0 or micro.0. It is suspected this is due to how the images were viewed/previewed from within the Gallery app.

Other files were observed in the cache directory but their purpose remains unknown. eg imgcache.idx, micro.idx, mini.idx were all comprised mainly of zeroed data.

UPDATE 4AUG2016:
A device video was also created/saved on the test device and displayed via the Gallery app. A corresponding video thumbnail was consequently cached in the imgcache.0, mini.0 and micro.0 files. These video cache records were written in a slightly different format to the picture cache records.

The imgcache structure

Based on the supplied research document and test device observations, here's the record structure we observed for each Galaxy Note 4 “imgcache.0” picture record:
  • Record Size (4 Byte LE Integer) = Size in bytes from start of this field until just after the end of the JPG
  • Item Path String (UTF16-LE String) = eg /local/image/item/
  • Index Number (UTF16-LE String) =  eg 44
  • + separator (UTF16-LE String) = eg +
  • Unix Timestamp Seconds (UTF16-LE String) = eg 1469075274
  • + separator (UTF16-LE String) = eg +
  • Unknown Number String (UTF16-LE String) = eg 1
  • Cached JPG (Binary) = starts with 0xFFD8 ... ends with 0xFFD9
The cached JPG is a smaller version of the original picture.
The Unix Timestamp Seconds is referenced to UTC and should be adjusted for local time. We can use a program like DCode to translate it into a human readable format (eg 1469075274 = Thu, 21 July 2016 04:27:54 UTC).
The Index Number seems to increase for each new picture added to the cache and may help determine the order in which the picture was viewed.
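
If DCode is not handy, the same conversion can be sketched in a few lines of Python 3. This is only an illustration and not part of the imgcache-parse.py script; the function name unix_seconds_to_utc is made up for this example.

```python
from datetime import datetime, timezone

def unix_seconds_to_utc(ts_string):
    """Convert a decimal Unix seconds string (as seen in the item path) to a UTC string."""
    # The cache stores the timestamp as text, so convert it to an integer first
    dt = datetime.fromtimestamp(int(ts_string), tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S UTC")

print(unix_seconds_to_utc("1469075274"))  # 2016-07-21T04:27:54 UTC
```

The result still needs to be adjusted for the device's local time zone.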

There are typically 19 bytes between each imgcache.0 record. However, the first record in imgcache.0 usually has 20 bytes before the first record’s 4 byte Record Size.
The record structure shown above was also observed to be re-used in the “micro” and “mini” cache files.
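
To make the record layout above a little more concrete, here is a minimal Python 3 sketch that parses one Note 4 style picture record from a byte buffer. It is not the imgcache-parse.py code. The parse_record name, the returned dictionary keys and the idea of locating the end of the path string by searching for the first 0xFFD8 are all assumptions made for this illustration, and the sample record is synthetic (a fake 14 byte "JPG").

```python
import struct

def parse_record(buf, offset):
    """Parse one record that starts at the 4 byte LE Record Size field."""
    (record_size,) = struct.unpack_from("<I", buf, offset)
    record = buf[offset:offset + record_size]
    # Assumption: the first 0xFFD8 after the size field marks the start of the JPG
    jpg_start = record.find(b"\xff\xd8", 4)
    if jpg_start == -1:
        raise ValueError("no JPG header found in record")
    path = record[4:jpg_start].decode("utf-16-le")
    item_path, timestamp, unknown = path.split("+")
    index = item_path.rpartition("/")[2]
    return {
        "path": path,
        "index": index,
        "timestamp": int(timestamp),
        "unknown": unknown,
        "jpg_offset": offset + jpg_start,
        "jpg": record[jpg_start:],
    }

# Synthetic test data: 20 leading bytes, one record, then a 19 byte gap
path_bytes = "/local/image/item/44+1469075274+1".encode("utf-16-le")
fake_jpg = b"\xff\xd8" + b"\x00" * 10 + b"\xff\xd9"
size = 4 + len(path_bytes) + len(fake_jpg)
buf = b"\x00" * 20 + struct.pack("<I", size) + path_bytes + fake_jpg + b"\x00" * 19

rec = parse_record(buf, 20)
print(rec["index"], rec["timestamp"], hex(rec["jpg_offset"]))
```

For other devices, the size handling and timestamp format would need to be adjusted to suit.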

UPDATE 4AUG2016:
Here's the record structure we observed for each Galaxy Note 4 “imgcache.0” video thumbnail record:

  • Record Size (4 Byte LE Integer) = Size in bytes from start of this field until just after the end of the JPG
  • Item Path String (UTF16-LE String) = eg /local/video/item/
  • Index Number (UTF16-LE String) = eg 44
  • + separator (UTF16-LE String) = eg +
  • Unix Timestamp Milliseconds (UTF16-LE String) = eg 1469075274000
  • + separator (UTF16-LE String) = eg +
  • Unknown Number String (UTF16-LE String) = eg 1
  • Cached JPG (Binary) = starts with 0xFFD8 ... ends with 0xFFD9

The Unix Timestamp Milliseconds is referenced to UTC and should be adjusted for local time. We can use a program like DCode to translate it into a human readable format (eg 1469075274000 = Thu, 21 July 2016 04:27:54 UTC).
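
The millisecond variant only needs the last three digits split off before converting. A rough Python 3 sketch follows (the unix_millis_to_utc name is made up for this example and is not taken from the script):

```python
from datetime import datetime, timezone

def unix_millis_to_utc(ms_string):
    """Convert a decimal Unix milliseconds string to a UTC string with milliseconds."""
    # Split whole seconds from the millisecond remainder
    seconds, millis = divmod(int(ms_string), 1000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return "%s.%03d UTC" % (dt.strftime("%Y-%m-%dT%H:%M:%S"), millis)

print(unix_millis_to_utc("1469075274000"))  # 2016-07-21T04:27:54.000 UTC
```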

The item path string format did not appear to vary for a picture/video saved to the SD card versus internal phone memory.

The Samsung Note 4 file format documented above was NOT identical to the format seen on other sample test devices including a Moto G (XT1033), a Samsung Galaxy Core Prime (SM-G360G) and a Samsung J1 (SM-J100Y).
The Moto G’s Gallery app cache record size did not include itself (ie it was 4 bytes smaller) and the Galaxy Core Prime / J1’s Gallery app cache record did not use a UTF16-LE timestamp string. Instead, it used an 8 byte LE integer representing the Unix timestamp in milliseconds (for BOTH picture and video imgcache records). This was written between the end of the path string and the start of the cached JPG’s 0xFFD8.
These differences imply that a scripted solution will probably require modifications on a per device/per app basis.
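
As a rough illustration of those two differences, the Python 3 sketch below decodes an 8 byte LE millisecond timestamp and works out where a record ends depending on whether the Record Size field counts itself. The function names and the size_includes_itself flag are hypothetical, and treating the integer as unsigned is an assumption. Refer to the comments in imgcache-parse-mod.py for the structure that script actually handles.

```python
import struct

def read_binary_ms_timestamp(buf, offset):
    """Read an 8 byte little endian integer (assumed unsigned) holding Unix milliseconds."""
    (millis,) = struct.unpack_from("<Q", buf, offset)
    return millis

def record_end(size_field_offset, record_size, size_includes_itself=True):
    """Return the offset just past the end of the cached JPG.

    Note 4 style: the size counts from the start of the size field.
    Moto G style: the size excludes the 4 byte size field itself.
    """
    if size_includes_itself:
        return size_field_offset + record_size
    return size_field_offset + 4 + record_size

# Synthetic example: pack a known timestamp and read it back
sample = struct.pack("<Q", 1469075274000)
print(read_binary_ms_timestamp(sample, 0))  # 1469075274000
print(record_end(20, 84, True), record_end(20, 80, False))  # 104 104
```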

UPDATE 4AUG2016:
As a result of this testing, a second script (imgcache-parse-mod.py) was written to parse Galaxy S4 (GT-i9505)/ Galaxy Core Prime / J1 imgcache files which appear to share the same imgcache record structures. Please refer to the initial comments section of the imgcache-parse-mod.py script for a full description of that imgcache structure. This modified script will take the same input arguments as the original imgcache-parse.py script described in the next section.

Scripting

A Python 2 script (imgcache-parse.py) was written to extract JPGs from “imgcache”, “micro” and “mini” cache files to the same directory as the script.

UPDATE 4AUG2016:
The script searches the given cache file (eg imgcache.0) for the UTF16LE encoded "/local/image/item/" and/or “/local/video/item/” strings, finds the record size and then extracts the record's embedded JPG to a separate file. The script also outputs an HTML table containing the extracted JPGs and various metadata.
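
The search step can be sketched along these lines in Python 3. This is a simplified illustration and not a copy of the script's code: the find_markers name and the synthetic buffer are made up for the example.

```python
IMAGE_MARKER = "/local/image/item/".encode("utf-16-le")
VIDEO_MARKER = "/local/video/item/".encode("utf-16-le")

def find_markers(buf, marker):
    """Return the offset of every occurrence of marker in buf."""
    offsets = []
    pos = buf.find(marker)
    while pos != -1:
        offsets.append(pos)
        # Resume searching just past the current hit
        pos = buf.find(marker, pos + len(marker))
    return offsets

# Synthetic buffer with two picture paths and one video path
buf = (b"\x00" * 24 + IMAGE_MARKER + b"\x00" * 10 +
       VIDEO_MARKER + b"\x00" * 5 + IMAGE_MARKER)

print(find_markers(buf, IMAGE_MARKER))  # [24, 111]
print(find_markers(buf, VIDEO_MARKER))  # [70]
```

On the Note 4 layout described earlier, the 4 byte Record Size field sits immediately before each path offset found this way.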

An example HTML output table looks like:

[Image: Example HTML output table for picture imgcache records]

[Image: Example HTML output table entry for a video imgcache record]


The extracted JPG filename is constructed as follows:

[Source-Cache-Filename]_pic_[Hex-Offset-of-JPG]_[Unix-Timestamp-sec]_[Human-Timestamp].jpg
OR
[Source-Cache-Filename]_vid_[Hex-Offset-of-JPG]_[Unix-Timestamp-ms]_[Human-Timestamp].jpg
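
For illustration, a filename in this format could be assembled as follows in Python 3. This is a sketch only: build_jpg_name and its parameters are hypothetical names and the actual script may construct the name differently.

```python
from datetime import datetime, timezone

def build_jpg_name(cache_name, kind, jpg_offset, timestamp):
    """Build an output filename; kind is "pic" (seconds) or "vid" (milliseconds)."""
    # Video timestamps are in milliseconds, so drop the last 3 digits for the human part
    seconds = timestamp // 1000 if kind == "vid" else timestamp
    human = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H-%M-%S")
    return "%s_%s_0X%X_%d_%s.jpg" % (cache_name, kind, jpg_offset, timestamp, human)

print(build_jpg_name("imgcache.0", "pic", 0x5A, 1469075274))
print(build_jpg_name("imgcache.0", "vid", 0x33270, 1469051577796))
```

The human readable part uses dashes rather than colons so the name is valid on Windows filesystems.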

The script also calculates the MD5 hash for each JPG (allowing for easier detection of duplicate images) and prints the filesize and the complete item path string.
Each HTML table record entry is printed in the same order as it appears in the input cache file. That is, the top row represents the first cache record and the bottom row represents the last cache record.
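
If you want to list the duplicates rather than eyeball the MD5 column, something like this Python 3 sketch would group extracted JPGs by hash. The group_by_md5 name and the sample names/bytes are invented for the example; in practice the data would be read from the extracted JPG files.

```python
import hashlib
from collections import defaultdict

def group_by_md5(named_jpgs):
    """Map each MD5 hex digest to the names sharing it, keeping duplicates only."""
    groups = defaultdict(list)
    for name, data in named_jpgs:
        groups[hashlib.md5(data).hexdigest()].append(name)
    return {digest: names for digest, names in groups.items() if len(names) > 1}

# Synthetic stand-ins for extracted JPG contents
samples = [
    ("pic_0X5A.jpg", b"\xff\xd8AAAA\xff\xd9"),
    ("pic_0X71DB.jpg", b"\xff\xd8BBBB\xff\xd9"),
    ("pic_0X19870.jpg", b"\xff\xd8AAAA\xff\xd9"),
]
dupes = group_by_md5(samples)
for digest, names in dupes.items():
    print(digest, names)
```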

The script was validated with Android 5.1.1 and the Gallery3d app v2.0.8131802.
You can download it from my GitHub site here.

Here is the help for the script:
C:\Python27\python.exe imgcache-parse.py
Running imgcache-parse.py v2016-08-02

Usage:  imgcache-parse.py -f inputfile -o outputfile

Options:
  -h, --help   show this help message and exit
  -f FILENAME  imgcache file to be searched
  -o HTMLFILE  HTML table File
  -p           Parse cached picture only (do not use in conjunction with -v)
  -v           Parse cached video thumbnails only (do not use in conjunction with -p)

Here is an example of how to run the script (from Windows command line with the Python 2.7 default install). This will process/extract BOTH pictures and video cache records (default):

C:\Python27\python.exe imgcache-parse.py -f imgcache.0 -o opimg0.html
Running imgcache-parse.py v2016-08-02

Paths found = 14

/local/image/item/44+1469075274+1 from offset = 0X18
imgcache.0_pic_0X5A_1469075274_2016-07-21T04-27-54.jpg
JPG output size(bytes) = 28968 from offset = 0X5A

/local/image/item/43+1469073536+1 from offset = 0X7199
imgcache.0_pic_0X71DB_1469073536_2016-07-21T03-58-56.jpg
JPG output size(bytes) = 75324 from offset = 0X71DB

/local/image/item/41+1469054648+1 from offset = 0X1982E
imgcache.0_pic_0X19870_1469054648_2016-07-20T22-44-08.jpg
JPG output size(bytes) = 33245 from offset = 0X19870

/local/image/item/40+1469051675+1 from offset = 0X21A64
imgcache.0_pic_0X21AA6_1469051675_2016-07-20T21-54-35.jpg
JPG output size(bytes) = 40744 from offset = 0X21AA6

/local/image/item/39+1469051662+1 from offset = 0X2B9E5
imgcache.0_pic_0X2BA27_1469051662_2016-07-20T21-54-22.jpg
JPG output size(bytes) = 30698 from offset = 0X2BA27

/local/video/item/38+1469051577796+1 from offset = 0X33228
imgcache.0_vid_0X33270_1469051577796_2016-07-20T21-52-57.jpg
JPG output size(bytes) = 34931 from offset = 0X33270

/local/image/item/37+1469051566+1 from offset = 0X3BAFA
imgcache.0_pic_0X3BB3C_1469051566_2016-07-20T21-52-46.jpg
JPG output size(bytes) = 28460 from offset = 0X3BB3C

/local/image/item/27+1390351440+1 from offset = 0X42A7F
imgcache.0_pic_0X42AC1_1390351440_2014-01-22T00-44-00.jpg
JPG output size(bytes) = 97542 from offset = 0X42AC1

/local/image/item/28+1390351440+1 from offset = 0X5A7DE
imgcache.0_pic_0X5A820_1390351440_2014-01-22T00-44-00.jpg
JPG output size(bytes) = 122922 from offset = 0X5A820

/local/image/item/29+1390351440+1 from offset = 0X78861
imgcache.0_pic_0X788A3_1390351440_2014-01-22T00-44-00.jpg
JPG output size(bytes) = 127713 from offset = 0X788A3

/local/image/item/30+1390351440+1 from offset = 0X97B9B
imgcache.0_pic_0X97BDD_1390351440_2014-01-22T00-44-00.jpg
JPG output size(bytes) = 97100 from offset = 0X97BDD

/local/image/item/31+1390351440+1 from offset = 0XAF740
imgcache.0_pic_0XAF782_1390351440_2014-01-22T00-44-00.jpg
JPG output size(bytes) = 66576 from offset = 0XAF782

/local/image/item/32+1390351440+1 from offset = 0XBFBA9
imgcache.0_pic_0XBFBEB_1390351440_2014-01-22T00-44-00.jpg
JPG output size(bytes) = 34746 from offset = 0XBFBEB

/local/image/item/33+1390351440+1 from offset = 0XC83BC
imgcache.0_pic_0XC83FE_1390351440_2014-01-22T00-44-00.jpg
JPG output size(bytes) = 26865 from offset = 0XC83FE

Processed 14 cached pictures. Exiting ...

The above example output also printed the HTML table we saw previously.
Some further command line examples:
C:\Python27\python.exe imgcache-parse.py -f imgcache.0 -o output.html -p
(will parse/output picture cache items ONLY)

C:\Python27\python.exe imgcache-parse.py -f imgcache.0 -o output.html -v
(will parse/output video thumbnail cache items ONLY)

Testing

While testing the Gallery app, we viewed device camera pictures, a screenshot and a picture saved from an Internet browser. Cached copies of these pictures were subsequently observed in the “imgcache.0”, “mini.0” and “micro.0” cache files.
From our testing, the Unix timestamp represents when the picture was taken/saved rather than the time it was browsed in the Gallery app.
This was tested for by taking camera picture 1 on the device, waiting one minute, then taking picture 2. We then waited another minute before viewing picture 1 in the Gallery app, waiting one minute and then viewing picture 2.
Running the imgcache-parse.py script and viewing the resultant output HTML table confirmed that the timestamp strings reflect the original picture’s created time and not the Gallery viewed time. The HTML table also displayed the order of the imgcache.0 file - picture 1 was written first, then picture 2.
We then cleared the Gallery app cache and viewed picture 2 in the Gallery app followed by picture 1.
Running the imgcache-parse.py script again and viewing the resultant output HTML table displayed the order of the imgcache.0 file. Picture 2 was written first, then picture 1.

UPDATE 4AUG2016:
A device video was also created (20160802_155401.mp4), uploaded to Dropbox (via app v2.4.4.8) and then downloaded and viewed in the Gallery app. The imgcache.0 record timestamp for the created video (1470117241703 = 05:54:01) differed from the imgcache.0 timestamp for the downloaded video (1470117253000 = 05:54:13). This difference of approximately 12 seconds (11.297 seconds at millisecond precision) was slightly longer than the 11 second video duration.
It is suspected that the created video’s imgcache timestamp represents when the original video was first being written and the downloaded video’s imgcache timestamp represents when the original video was finalised to the filesystem.
The video thumbnails displayed in the Gallery app and imgcache for each video were also different. The downloaded video thumbnail appeared to be from approximately 1 second into the video. The created video thumbnail seemed to be the first frame of the video. The MD5 hashes of both video files were identical.

As per LSB's helpful suggestion, rather than take a full image of the test phone for each acquisition of cache files, we plugged our test device into a PC and used Windows Explorer to browse to the Phone\Android\data\com.sec.android.gallery3d\cache folder and copy the cache files to our PC. This saved a significant amount of imaging time. To minimize any synchronization issues, the phone should be unplugged/re-plugged between file copies.


Final Thoughts

Depending on the device, it may be possible to determine the created timestamp of a picture viewed and cached from the Android Gallery app. The Gallery cache may also include pictures which are no longer available elsewhere on the device.
A Python script (imgcache-parse.py) was created to extract various metadata and the cached images from a Samsung Note 4 Gallery app’s (imgcache, micro and mini) cache files.
UPDATE 4AUG2016: A modified version of this script (imgcache-parse-mod.py) was also created to handle binary timestamps as observed on Galaxy S4 / J1 / Core Prime sample devices.

It is STRONGLY recommended that analysts validate their own device/app version combinations before running these scripts. Your mileage will vary!
For example, take a picture using the device camera and validate its YYYYMMDD_HHMMSS.JPG filename/metadata against the timestamp in the item path (if it's there).
For case data, look for device images with date/time information in them (eg pictures of newspapers, receipts etc. or device screenshots) to increase the confidence level in extracted timestamps.
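
One quick way to do the filename check is to compare the local time in the camera filename with the UTC timestamp from the cache record. The Python 3 sketch below uses the test video from the Testing section. The filename_offset_from_utc name is made up for this example, and the result is only meaningful if the filename really was generated from the device's local clock.

```python
from datetime import datetime, timezone

def filename_offset_from_utc(filename, unix_seconds):
    """Return (local time in filename) minus (UTC time of the cache timestamp)."""
    stem = filename.rsplit(".", 1)[0]
    local_naive = datetime.strptime(stem, "%Y%m%d_%H%M%S")
    utc_naive = datetime.fromtimestamp(unix_seconds, tz=timezone.utc).replace(tzinfo=None)
    return local_naive - utc_naive

# The video record timestamp is in milliseconds, so reduce it to whole seconds
delta = filename_offset_from_utc("20160802_155401.mp4", 1470117241703 // 1000)
print(delta)  # 10:00:00
```

A difference that matches the device's time zone offset (10 hours here) is consistent with the cache timestamp being the creation time. Anything else deserves a closer look.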

The Gallery app was not present in various Android 6.0 test devices that we looked at. It may have been usurped by the Google Photos app. However, we have seen the Gallery app on Android 5 and Android 4 devices which would still make up the majority of devices currently out there.

Monkey doesn't have the time/inclination but further areas of research could be:
- Decompiling the Gallery .apk and inspecting the Java code.
- Rollover functionality of the cache files (eg confirm how imgcache.1 gets populated).
- Why there can be multiple copies of the same image (with same MD5) appearing at multiple offsets within the same imgcache file.
- Determining how the cache record index number is being calculated.
- Determining the purpose of the “imgcache.idx”, “micro.idx” and “mini.idx” files.

Anyhoo, it would be great to hear from you in the comments section (or via email) if you end up using these scripts for an actual case. Or if you have any further observations to add (don't forget to state your Android version and device please).

Sorry, but for mental health reasons I will NOT recover your dick pics for you. ie Requests for personal image recovery will be ignored. If you Google for "JPG file carver", you should find some programs that can help you recover/re-live those glorious tumescent moments.

Can you tell how working in forensics has affected my world view? ;)