Sunday, 27 February 2022

Monkey Attempts To Digest Some Google Takeout (DetectedActivitys)

 

Careful What You Eat, Monkey!

One of Monkey's co-workers (Troy) was able to provide investigators with a location of interest by looking at the device owner's Google Takeout "Location History.json".

Specifically, Troy looked at the DetectedActivity classification strings which Google (Play Services) uses to estimate whether the device was still, in a vehicle, on foot, etc. A transition from IN_VEHICLE to STILL indicated the vehicle had stopped at a location (rather than just passing through).

Being part of Google Play services means this DetectedActivity data is likely only available for Google Accounts involving Android devices. We have yet to see any DetectedActivity in iOS device related Google Takeouts.


Special Thanks to: 

- Troy for sharing his findings 

- Rasmus Riis Kristensen, Mike Lacombe, Heather Mahalik and Lee Crognale for checking their Google Takeout data and script testing.

- Josh Hickman for providing more test Android Takeout data and additional feedback regarding the new Takeout format & velocity / heading / altitude fields.


The main DetectedActivity categories are defined as:

IN_VEHICLE The device is in a vehicle, such as a car.

ON_BICYCLE The device is on a bicycle.

ON_FOOT The device is on a user who is walking or running.

RUNNING The device is on a user who is running.

STILL The device is still (not moving).

TILTING The device angle relative to gravity changed significantly.

UNKNOWN Unable to detect the current activity.

WALKING The device is on a user who is walking.

For further details see here.

Note: We have also observed undocumented activity types such as IN_ROAD_VEHICLE, IN_FOUR_WHEELER_VEHICLE, IN_CAR and IN_RAIL_VEHICLE.

While some forensic tools already display Google Takeout information, highlighting/filtering by DetectedActivity was either difficult or not possible with those tools.

For Troy's Takeout (and in Josh Hickman's Android 12 Google Takeout), there's a "Location History" folder and, in the root of that folder, the "Location History.json" file.
There are also some sub-folders, but they are subject to account owner editing. Apparently the "Location History.json" is raw and unaffected by user-deleted locations/trips.
This blog post by Ross Donnelly has some more details.

Initially, Monkey wrote a prototype Python3 script for the "Location History.json" but then noticed that recent test Takeouts (done in Jan-Feb 2022) no longer had this JSON file.

UPDATE 28FEB2022: The prototype "gLocationHistoryActivity.py" script for "Location History.json" is now available from GitHub here. Note: This has been tested with small 15-30 MB files only and does not use the "ijson" library.

Instead, Google seems to have replaced it with a file called "Records.json". The newer file is also JSON formatted but has some slight differences (e.g. timestamp format, extra fields).
Monkey could not find any documentation on this file format.
Checking in with other DF folks confirmed their Takeouts also had "Records.json" instead of "Location History.json".

It is currently unknown if this change was implemented at the Google server end or whether the Android/Google Play version on the device affects which .json file gets created.

UPDATE 28FEB2022: Josh Hickman performed another Takeout on his test Android 12 data and got a "Records.json" instead of a "Location History.json".
So it appears the change was implemented at the Google server end.

So ass-uming any Takeouts will now export to "Records.json", Monkey wrote a script to process the "Records.json" file, whose structure can be described as follows: 

  • 1 locations list which can store many location element records.
  • Each location element has the following fields:

    •     "source" (usually "UNKNOWN" but have also seen "CELL" here)
    •     "deviceTag" (device identifier)
    •     "platformType" (usually "ANDROID")
    •     "formFactor" (eg "PHONE")
    •     "serverTimestamp" (eg "2022-02-04T04:40:17.685Z", not always present, can be same for multiple location elements)
    •     "deviceTimestamp" (eg "2022-02-04T04:40:16.214Z", not always present, can be same for multiple location elements)
    •     "timestamp" (eg "2022-02-02T00:55:06.311Z", changes with each element record, to avoid confusion Monkey is calling this the "element timestamp")
    •     "latitudeE7" (in degrees scaled by 10 000 000)
    •     "longitudeE7" (in degrees scaled by 10 000 000)
    •     "altitude" (not always present, units are in metres per Josh Hickman)
    •     "heading" (not always present, units are degrees clockwise from True North per Josh Hickman eg 90 = East, 180 = South)
    •     "velocity" (not always present, units are metres per second per Josh Hickman)
    •     "accuracy" (units unknown, suspected to be in m)
    •     "verticalAccuracy" (not always present, units unknown, suspected to be in m)
  • Each location element may/may not have an "activity" list.
  • Each activity list has an activity "timestamp" and can store 1 or more "subactivitys". 
    • Note: "subactivity" is a monkey term used to label any child activitys which are listed in the activity list (see example below).
  • Each "subactivity" can have multiple type and confidence (percentage) pairs (eg type = ON_FOOT, confidence = 3 and type = STILL, confidence = 89).


Here's an example of a parent activity declaration in a "Records.json":

"activity": [{
      "activity": [{
        "type": "STILL",
        "confidence": 89
      }, {
        "type": "ON_FOOT",
        "confidence": 3
      }, {
        "type": "WALKING",
        "confidence": 3
      }, {
        "type": "UNKNOWN",
        "confidence": 3
      }, {
        "type": "IN_VEHICLE",
        "confidence": 2
      }, {
        "type": "ON_BICYCLE",
        "confidence": 2
      }, {
        "type": "IN_RAIL_VEHICLE",
        "confidence": 2
      }, {
        "type": "IN_ROAD_VEHICLE",
        "confidence": 1
      }, {
        "type": "IN_FOUR_WHEELER_VEHICLE",
        "confidence": 1
      }, {
        "type": "IN_CAR",
        "confidence": 1
      }],
      "timestamp": "2022-02-04T05:52:43.071Z"
    }]

In the above example, we can see 1 parent activity list [highlighted in red] which has an activity timestamp ("2022-02-04T05:52:43.071Z" in yellow) and 1 subactivity child [in blue]. 
The subactivity has multiple type/confidence fields [in green] which classify what activity Google thinks the device is doing (eg 89% STILL).
We have observed that more than 1 subactivity can be listed per parent activity.

Scripting

Initially, Monkey used the standard Python json library to load the whole "Records.json" into memory. While this worked for .json files in the MB size range, it caused a memory error when a large 1.3 GB dataset (~10 years of data) was processed. 
After Rasmus helpfully pointed Monkey to some articles on iteratively parsing large JSON files, Monkey used the third-party "ijson" library to successfully load the 1.3 GB "Records.json" file.
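For reference, here is a minimal sketch of that iterative approach (it assumes the top-level "locations" list described earlier; the actual gRecordsActivity_ijson_date.py script also handles date filtering and TSV/KML output):

import ijson

with open("Records.json", "rb") as f:
    # stream each location element instead of loading the whole file into memory
    for location in ijson.items(f, "locations.item"):
        if "activity" not in location:
            continue
        for act in location["activity"]:          # one or more (sub)activity lists
            types = [(a["type"], a["confidence"]) for a in act["activity"]]
            print(location["timestamp"], act["timestamp"], types)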

The revised script (gRecordsActivity_ijson_date.py) searches through all location elements for elements with an "activity". It then processes/stores each DetectedActivity per element timestamp into a Python dictionary. There can be more than one activity per element timestamp.
The DetectedActivity data for each element timestamp date is then output to a Tab Separated Value (TSV) file and a Keyhole Markup Language (KML) file for analysis.

Here is how to use the script (gRecordsActivity_ijson_date.py) which is available from GitHub here.

First step - install the ijson library via pip. On Ubuntu 20.04 LTS, you can use this command:
pip3 install ijson

You can now run the script by pointing it to the Takeout Records.json and an output directory.

Here is the usage help:
python3 gRecordsActivity_ijson_date.py -h
usage:  gRecordsActivity_ijson_date.py [-i input_file -o output_dir -a start_isodate -b end_isodate]

Extracts/parses "Detected Activity" data from Google Takeout "Records.json" (large files) and outputs TSV and KML files to given output dir

optional arguments:
  -h, --help  show this help message and exit
  -i INPUT    Input Records filename
  -o OUTPUT   Output KML/TSV directory
  -a START    Filter FROM (inclusive) Start ISO date (YYYY-MM-DD)
  -b END      Filter BEFORE (inclusive) End ISO date (YYYY-MM-DD)


Here is an abbreviated example command output for some test data:

python3 gRecordsActivity_ijson_date.py -i Records-4.json -o records4_output
Running gRecordsActivity_ijson_date.py v2022-02-26


ACTIVITY => 
[{'activity': [{'type': 'STILL', 'confidence': 99}, {'type': 'UNKNOWN', 'confidence': 1}], 'timestamp': '2022-02-04T03:49:57.246Z'}]
Element timestamp: 2022-02-04T03:49:55.263Z
Element serverTimestamp: 2022-02-04T04:22:27.214Z
Element deviceTimestamp: 2022-02-04T04:22:25.715Z
No. (sub)Activitys => 1
(sub)Activity #1 timestamp = 2022-02-04T03:49:57.246Z
No. (sub)Activity types: 2

ACTIVITY => 
[{'activity': [{'type': 'STILL', 'confidence': 99}, {'type': 'UNKNOWN', 'confidence': 1}], 'timestamp': '2022-02-04T03:51:01.973Z'}]
Element timestamp: 2022-02-04T03:50:48.614Z
Element serverTimestamp: 2022-02-04T04:22:27.214Z
Element deviceTimestamp: 2022-02-04T04:22:25.715Z
No. (sub)Activitys => 1
(sub)Activity #1 timestamp = 2022-02-04T03:51:01.973Z
No. (sub)Activity types: 2

[Output Redacted for brevity ...]

ACTIVITY => 
[{'activity': [{'type': 'TILTING', 'confidence': 100}], 'timestamp': '2022-02-04T04:24:24.209Z'}, {'activity': [{'type': 'STILL', 'confidence': 99}, {'type': 'UNKNOWN', 'confidence': 1}], 'timestamp': '2022-02-04T04:26:35.525Z'}, {'activity': [{'type': 'UNKNOWN', 'confidence': 40}, {'type': 'IN_VEHICLE', 'confidence': 10}, {'type': 'ON_BICYCLE', 'confidence': 10}, {'type': 'ON_FOOT', 'confidence': 10}, {'type': 'WALKING', 'confidence': 10}, {'type': 'RUNNING', 'confidence': 10}, {'type': 'STILL', 'confidence': 10}, {'type': 'IN_ROAD_VEHICLE', 'confidence': 10}, {'type': 'IN_RAIL_VEHICLE', 'confidence': 10}, {'type': 'IN_FOUR_WHEELER_VEHICLE', 'confidence': 10}, {'type': 'IN_CAR', 'confidence': 10}], 'timestamp': '2022-02-04T04:30:11.757Z'}]
Element timestamp: 2022-02-04T04:30:36.434Z
Element serverTimestamp: 2022-02-04T04:40:17.685Z
Element deviceTimestamp: 2022-02-04T04:40:16.214Z
No. (sub)Activitys => 3
(sub)Activity #1 timestamp = 2022-02-04T04:24:24.209Z
No. (sub)Activity types: 1
(sub)Activity #2 timestamp = 2022-02-04T04:26:35.525Z
No. (sub)Activity types: 2
(sub)Activity #3 timestamp = 2022-02-04T04:30:11.757Z
No. (sub)Activity types: 11

[Output Redacted for brevity ...]

ACTIVITY => 
[{'activity': [{'type': 'STILL', 'confidence': 99}, {'type': 'UNKNOWN', 'confidence': 1}], 'timestamp': '2022-02-04T10:00:45.110Z'}]
Element timestamp: 2022-02-04T10:00:32.756Z
Element serverTimestamp: 2022-02-04T10:45:20.367Z
Element deviceTimestamp: 2022-02-04T10:45:18.854Z
No. (sub)Activitys => 1
(sub)Activity #1 timestamp = 2022-02-04T10:00:45.110Z
No. (sub)Activity types: 2


Total no. of elements with at least one Activity = 36
No. of elements with multiple Activitys = 2

Processing Activitys ... Number of days = 1
Processing 2022-02-04 = 39 entries

Processed/Wrote 39 Total Activity entries to: records4_output

Exiting ...

This resulted in the following files being created in the records4_output directory:
2022-02-04.kml
2022-02-04.tsv

While the previous command will extract all dates, the script can also accept date ranges in ISO format.
The date range filter arguments are:
-a start isodate_string (eg 2021-01-31)
-b end isodate_string

The date args can be used singly or together,
e.g. everything up to and including date xxx would use "-b xxx", and everything from date zzz onwards would use "-a zzz".
To specify a date range, use both args, i.e. "-a zzz -b xxx".

For example:
python3 gRecordsActivity_ijson_date.py -i Records.json -o outputdir -a 2022-01-01 -b 2022-02-03

will extract dates between 2022-01-01 and 2022-02-03 (inclusive) into the outputdir given.
This can be helpful if there are years of data in the Takeout but only certain days are of interest.

Once you have the output directory containing the per-day .KML and .TSV files, you can load the TSV for the day of interest into a spreadsheet app (e.g. Excel, OpenOffice Calc) and search for DetectedActivity transitions etc.

Example TSV Output

Note1: Rows are sorted by "element_timestamp". In the picture, there are 3 selected rows with the same "element_timestamp" value but 3 distinct "activity_timestamp" values. This is an example of an element with multiple (sub)activitys. Each subactivity has its own timestamp.
Note2: The "detected_activity" column can be used to more readily detect transitions between DetectedActivitys.
Note3: The "num_subactivity_types" column shows the number of DetectedActivitys listed for each (sub)activity.

Once you have found a DetectedActivity entry of interest, you can open the corresponding .KML file in Google Earth Desktop to plot the location and view selected attributes.

Redacted Example KML output viewed in Google Earth Desktop

Note1: The map has been redacted to protect Monkey's jungle gym. 
Note2: The label of each point is comprised of the "element timestamp" and the number of (sub)activity types. eg "2022-02-04T04:30:36.434Z, num_subactivity_types = 11"
Note3: The "accuracy" field can indicate how much a location plot should be trusted.
Note4: Selecting the checkbox next to the parent ISO date folder name (eg "2022-02-04") will plot all of the points in the folder. All points are not plotted on screen by default to improve Google Earth Desktop performance.

Final Thoughts

A script has been written to parse the Google Takeout "Records.json" for DetectedActivitys.
By using this script, an analyst can minimize the clutter / maximize Google Earth performance and still visualize DetectedActivity locations and transitions (eg IN_VEHICLE to STILL).
Hopefully, this script will prove useful when analyzing a user's pattern of life / looking for anomalies.

Friday, 7 January 2022

Mike & the Monkey Dumpster Dive Into Samsung Gallery3d App Trash

 

Monkey assists Mike with another dive into the Samsung Gallery3d App

It all started with a post by Michael Lacombe (iacismikel at gmail.com) on the Physical and RAW Mobile Forensics Google Group in early November 2021.

The post involved a case where a Samsung mobile phone owner claimed that specific images were received but they were immediately deleted after being accessed. Mike was asked if it was possible to determine this. Not knowing the immediate answer to that question, he began to analyze the Samsung Android 9 device.

Mike found some Samsung Gallery3d app deletion artifacts visible in the local.db SQLite database. Specifically, he saw various timestamped log entries in the "log" table which were associated with encoded strings. After a helpful hint about the strings being base64 encoded from forum member "Tony", Mike set off further down the rabbit hole.

Along the way, he found this previous Cheeky4n6monkey post from 2016. Comparing that information to his current case data, he saw that things had changed considerably over the years, but it was enough of a nudge to dig a little deeper. Mike asked if this monkey wanted to tag along and so the adventure began...

Here are some things we have learned on our journey... (mostly Mike, I was just the script monkey)


There are always new things to research

The Samsung Gallery3d app has been around for years and, according to Google Play, it was last updated in 2019 with version 5.4.11.0.
Opening the AndroidManifest.xml file from a test device's Gallery3d Android Package (APK) in Android Studio shows:

android:versionCode="1020000021"
android:versionName="10.2.00.21" 

According to the Android Developer documentation, versionName is displayed to the user, whereas versionCode is a positive integer which increases with each release and can be used to prevent downgrades to earlier versions.

This app is updated frequently. When searching for test data, we found that nearly every device we looked at contained a different version of the app, which, in turn, contained different information stored within the application folder and the database itself. 

Digging into app artifacts can lead to additional information that is not currently being parsed

As far as we could ascertain, there were no commercial or non-commercial forensic tools which process the Samsung Gallery3d app database for deletion artifacts.

Some open source tools that we used to analyze the data and the APK include:
For Data Analysis:
- DB Browser for SQLite for viewing/exporting SQLite databases
- Cyberchef to base64 decode strings
- Base64 Decode and Encode website to base64 decode strings
- Epochconverter to confirm timestamp types
- Android Studio
For APK reversing:
- dex2jar to convert an APK's classes.dex to Java .jar
- JD-GUI to view source code from a .jar file
- JADX to view source code directly from APK file
We also wrote our own Python3 scripts to assist with batch conversion of base64 encoded strings and output to Tab Separated Value (TSV) format.
These scripts are available here.


Some observations for the Samsung Gallery3d app

This is a stock app installed on Samsung devices. It has library dependencies that are part of the Samsung Android framework. Consequently, there doesn’t appear to be an easy way (if at all) to install the application on a non-Samsung device. 

The Samsung Gallery3d app is located on the user data partition at: 

/data/com.sec.android.gallery3d
Files that are sent to the trash from within the app are located at:
/media/0/Android/data/com.sec.android.gallery3d
Due to the differences between application versions, and because the research was driven by Mike's case, we decided to focus this blog post on that application version (10.2.00.21).
Within the /data/com.sec.android.gallery3d directory, there was a cache directory and a databases directory.

Cache Directory

There are multiple Cache sub-directories contained within data/com.sec.android.gallery3d/cache/

In this instance, the /0 folder contained larger thumbnail images, ranging in width from 225-512 pixels and in height from 256-656 pixels, while the /1 folder had smaller thumbnails ranging in width from 51-175 pixels and in height from 63-177 pixels. There were also /2, /3 and /4 folders. /2 and /3 were empty and /4 had a single thumbnail that was 320x320 in size.

There doesn’t seem to be anything useful here beyond the thumbnails themselves. The names of the thumbnails seem to be generated using a hash algorithm.

Databases Directory

Contained within /data/com.sec.android.gallery3d/databases/ is the local.db SQLite database.

This database contains various information including:

- Albums in the gallery ("album" table)

- A log that records various actions associated with the app ("log" table), e.g. move to trash, empty trash

- Items that are currently in the Trash bin ("trash" table)

In later versions, we noticed another table called "filesystem_monitor". This contained timestamps, app package names (e.g. com.sec.android.gallery3d) and base64 encoded file paths. However, as this table was not present in Mike's case data and we are not sure what triggers these records, it requires further research.

Table Observations

"album" Table

Here is the "album" table schema:

CREATE TABLE album (
_id INTEGER PRIMARY KEY AUTOINCREMENT, 
__bucketID INTEGER UNIQUE NOT NULL, 
__absPath TEXT, 
__Title TEXT, 
folder_id INTEGER, 
folder_name TEXT, 
default_cover_path TEXT, 
cover_path TEXT, 
cover_rect TEXT, 
album_order INTEGER, 
album_count INTEGER, 
__ishide INTEGER, 
__sefFileType INTEGER  DEFAULT 0, 
__isDrm INTEGER  DEFAULT 0, 
__dateModified INTEGER  DEFAULT 0
)

Here are some screenshots of an example "album" table:


"album" Table Screenshot 1

"album" Table Screenshot 2

Here are some selected "album" table fields of interest:

Field Name

Description

__bucketID

This is generated via calling the Java hashcode algorithm on the full path of the album ("__absPath"). See the java-hashcode.py script for a proof of concept.

Example value: -1313584517


__absPath

The path of the album. 

Example: /storage/emulated/0/DCIM/Screenshots


default_cover_path

The image associated with the corresponding album. 

 Example: /storage/emulated/0/DCIM/Screenshots/Screenshot_20200530-054103_One UI Home.jpg


album_count

The current number of files stored within the album. 

 Example: 14



Because the file paths stay the same, __bucketID values have been found to be consistent across devices. This can help to show whether there are/were custom albums that were created, as well as application-specific albums such as Facebook, Snapchat, etc. Recovering deleted records here can show deleted albums and names of deleted images that were once used as album covers. Cover path information can show potential file names of interest, with many of them normally containing timestamp information in the file name. This can potentially assist with tying usage of a particular app to a specific time.

No extraction script was written for the "album" table as DB Browser for SQLite can be used directly to copy/paste the album data.
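For reference, the Java hashcode calculation behind "__bucketID" can be sketched in a few lines of Python (a minimal sketch of what java-hashcode.py does; per the worked java-hashcode.py example later in this post, the path appears to be lowercased before hashing):

def java_string_hashcode(s):
    # Java's String.hashCode(): h = 31*h + char for each character,
    # wrapped to a signed 32-bit integer
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

# e.g. java_string_hashcode("/storage/emulated/0/dcim/screenshots")
# should reproduce the __bucketID seen above (-1313584517)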

"log" Table

Here is the "log" table schema:

CREATE TABLE log (
_id INTEGER PRIMARY KEY AUTOINCREMENT, 
__category INTEGER NOT NULL, 
__timestamp TEXT, 
__log TEXT
)

Here is a screenshot of an example "log" table:

"log" Table Screenshot

Here are some selected "log" table fields of interest:

Field Name

Description

__timestamp

Timestamp text string (formatted YYYY-MM-DD HH:MM:SS in Local Time) when a particular log entry occurred. 

Example: 2020-01-09 16:17:14


__log

Proprietary formatted text string which lists the "action" performed (see next table) and the base64 encoded paths of relevant files. 

Example:

[MOVE_TO_TRASH_SINGLE][1][0][location://timeline?position=6&mediaItem=data%3A%2F%2FmediaItem%2F-1566891466&from_expand=false][oKHi/x4pePL+KXj3N0b3Lil49h4pePZ2XimIUvZW3imIV14pePbOKYhWHil4904pePZeKYhWTimIUv4pePMC9E4piFQ0nimIVNL+KYhUZh4pePY+KYhWXil49ib2/imIVr4pePL+KXj0ZCX+KYhUnil49N4pePR1/imIUx4piFNeKYhTfimIU4NOKYhTnimIUw4piFNzTimIU1N+KXjzPimIUy4piFLuKXj2rimIVwZw==ST1puy1]


 

 

Some observed log "actions" include:

Log Action

Description

MOUNTED

It is unknown when this is triggered. It records how many files are currently in the trash.

Example: 

 [MOUNTED][10][0][0][/storage/emulated/0]


MOVE_TO_TRASH_SINGLE

This occurs when the user moves a single file to the trash from the timeline or gallery view.

Example: 

[MOVE_TO_TRASH_SINGLE][1][0][location://timeline?position=0&mediaItem=data%3A%2F%2FmediaItem%2F2000598506&from_expand=false][2sx0p44piFL3N04pePb+KXj3Lil49h4pePZ+KYhWUv4pePZeKXj23imIV14piFbOKXj2Hil4904pePZWTimIUv4pePMOKYhS/imIVE4piFQ0lNL+KXj1Nj4pePcmXil49l4pePbnPil49ob+KYhXTimIVzL+KXj1Pil49j4pePcuKXj2Xil49lbnPil49o4pePb+KYhXRf4piFMjDil48y4piFMDDil482MeKYhTgt4piFMOKYhTjil48xMOKXjzPil4854piFX+KXj09uZeKYhSBV4pePSeKYhSDimIVIb23il49lLuKYhWril49w4pePZw==7nsjIZn]


MOVE_TO_TRASH_MULTIPLE

This occurs when the user moves more than one file to the trash from the timeline or gallery view. Each trash entry is enclosed in brackets.

Example: 

[MOVE_TO_TRASH_MULTIPLE][2][0][location://timeline][QxMwh4pePL+KXj3N04pePb+KXj3Lil49hZ+KYhWXimIUv4piFZW3il491bOKXj2HimIV04piFZeKXj2Qv4piFMOKXjy/il49EQ0nimIVNL+KYhVPil49j4piFcuKYhWXimIVl4piFbuKYhXPimIVo4piFb3Til49z4piFL1Nj4piFcmVlbnPimIVo4pePb+KYhXTil49fMjAy4pePMDDimIUxMTPil48tMeKYhTbimIUxOTDimIUz4pePX03imIVF4piFR+KYhUEu4pePauKYhXBnmcIJudJ][rYxa4pePL3Pil4904pePb+KYhXLimIVhZ+KXj2XimIUvZeKYhW114piFbOKYhWHil4904pePZeKYhWTimIUv4pePMOKYhS/il49EQ+KXj0nimIVN4pePL+KXj1Pil49j4pePcuKYhWXimIVlbuKXj3Pil49ob+KXj3Rz4pePL+KYhVNj4piFcuKYhWXil49lbuKYhXPimIVo4pePb+KXj3TimIVf4pePMuKXjzDil48y4pePMOKYhTDimIUxMTMt4piFMTYx4pePOOKYhTXimIU3X01F4pePR0Eu4piFanBnrO4E+di]


EMPTY_SINGLE

This occurs when the trash is manually emptied and a single file is in the trash at that time.

Example: 

[EMPTY_SINGLE][1][0][location://trash][MywhK4pePL+KXj3N04pePb3LimIVh4piFZ+KXj2Xil48v4piFZeKXj23imIV1bOKYhWF04pePZeKYhWTil48v4piFMOKXjy/imIVBbuKYhWTil49y4pePb2lk4piFL+KXj2TimIVhdOKYhWHil48vY2/imIVt4piFLuKXj3Nl4piFY+KYhS7il49hbuKYhWTil49yb+KYhWnimIVk4piFLmdh4pePbOKXj2zimIVl4piFcuKXj3kzZC9m4piFaWzimIVl4piFc+KYhS/il48u4pePVOKYhXLimIVhc+KYhWjimIUv4piFLeKXjzfimIU0NTPil48w4piFNTXimIUzNOKYhTYzMOKYhTDil48x4piFNzfil480OeKYhTg=Oac5vx1]


EMPTY_MULTIPLE

This occurs when the trash is manually emptied and contains more than one file. Each empty entry is enclosed in brackets.

Example: 

[EMPTY_MULTIPLE][5][0][location://trash][zFqL+KXj3PimIV04pePb3LimIVhZ2Xil48v4pePZeKXj23imIV14piFbGHimIV04piFZWTil48v4piFMOKYhS/imIVB4piFbuKXj2Til49y4pePb2nil49kL+KXj2Til49hdOKXj2Hil48vY2/imIVt4pePLuKXj3Nl4pePYy7il49h4piFbuKXj2Ry4piFb2nimIVk4piFLmfil49h4piFbOKXj2zimIVlcuKYhXnimIUz4piFZC/imIVm4pePaeKYhWxlc+KXjy8uVOKYhXLimIVh4piFc+KYhWgvLeKYhTXimIU14pePNeKXjzA0MuKYhTDil482MuKXjzjimIUyNuKYhTg34piFMOKXjzPil48z4piFMTk=SHqNRK9][8ez12A4piFL+KXj3Pil4904pePb+KYhXJhZ+KXj2XimIUv4piFZW114pePbGHil4904piFZWTil48v4piFMOKXjy/imIVB4piFbuKXj2Ry4piFb2lk4piFL+KYhWTil49h4piFdGHil48vY+KXj2/il49t4piFLuKYhXNl4pePYy7il49h4piFbmTimIVyb+KXj2lkLmdh4piFbGxl4pePcuKYhXnil48z4piFZOKYhS9m4pePaeKYhWxlc+KYhS/imIUu4pePVOKXj3LimIVh4piFc+KYhWgvLeKYhTLimIUy4pePOeKXjzkzNTDimIU14pePNOKYhTE44pePNuKXjznil48x4pePMeKYhTbil481NuKYhTc=/Rncev0][ni1xK4piFL+KYhXPimIV04piFb3LimIVh4piFZ2XimIUv4pePZW3il4914piFbOKYhWF04pePZeKYhWTimIUv4pePMOKYhS/imIVB4piFbmTil49yb+KXj2nil49k4pePL+KYhWTimIVh4piFdGEv4pePY+KYhW9tLnPil49l4piFYy7il49h4pePbmTil49y4piFb2lkLuKXj2dh4piFbOKYhWxlcuKXj3nil48z4piFZOKYhS/imIVm4pePaeKYhWzimIVl4pePc+KXjy8u4piFVOKXj3LimIVh4pePc+KXj2jil48vLeKYhTQ44piFMeKXjzPil48wOeKYhTI1OOKXjzTimIU0NuKXjzLimIU2MuKYhTHimIU2OQ==ulivJQR][6Zkyd34pePL+KXj3N04pePb3LimIVh4piFZ2XimIUvZeKYhW3imIV14pePbGHimIV0ZeKYhWTil48v4pePMC/il49B4piFbuKXj2Ry4piFb+KXj2lk4piFL+KXj2Rh4piFdOKYhWHil48v4pePY2/il49t4pePLuKYhXPimIVl4piFY+KYhS7imIVhbmRyb+KXj2lkLmfimIVhbOKXj2xl4piFcnnimIUzZOKXjy/il49m4pePaeKYhWzil49lcy/imIUu4piFVHJh4pePc+KXj2jil48vNDjil485NDQ5NOKYhTjil4824piFNjfimIU3MOKXjzYw4pePMOKXjzDil4814piFMQ==ATnf6Un][byd4piFL3PimIV04pePb3Jh4pePZ+KYhWXimIUvZW3imIV1bOKXj2Hil490ZWQvMOKXjy/imIVBbmTimIVyb+KYhWnil49kL2Til49hdOKYhWHil48vY+KXj2/il49tLuKXj3NlYy7imIVh4pePbuKYhWTimIVy4pePb+KXj2nimIVk4piFLuKXj2fimIVh4pePbOKXj2zimIVl4piFcuKYhXnimIUz4piFZOKXjy9m4piFaWzimIVlcy/imIUu4piFVOKYhXLil49h4piFc2jil48vNuKYhTfimIU34piFOOKXjzHil4844pePMOKXjzk4OeKYhTEz4piFNeKXjzfil4854pePM+KXjzXimIU34pePNQ==nSCzmep]


EMPTY_EXPIRED

This occurs when a file is auto-deleted after staying in the trash for a predetermined amount of time as described in the settings for the app.

Example: 

[EMPTY_EXPIRED][22][22][2020-05-14 00:00:00]



Other operations not observed in our data but declared in the source code (see TrashHelper class, DeleteType enum):

DELETE_MULTIPLE

DELETE_SINGE 


Here is an example of how to manually decode the base64 encoded string from a "__log" field:
The original value is:
[MOVE_TO_TRASH_SINGLE][1][0][location://timeline?position=9&mediaItem=data%3A%2F%2FmediaItem%2F-575841975&from_expand=false][eTgcy4piFL3Pil4904piFb+KXj3LimIVh4pePZ2Uv4pePZeKXj2114pePbOKYhWHimIV0ZeKXj2TimIUvMOKYhS9EQ+KYhUlN4piFL1PimIVj4piFcmXil49l4pePbuKXj3Pil49ob3TimIVzL1PimIVj4piFcmXil49lbnNo4pePb3TimIVf4piFMjAxOTHimIUy4pePM+KXjzEtMeKXjznimIUyMeKXjzXil4844piFX+KXj1Nu4pePYeKYhXBjaGF0LmrimIVw4pePZw==bakWlla]

We copy the base64 string enclosed by the [ ] (highlighted in Yellow):
eTgcy4piFL3Pil4904piFb+KXj3LimIVh4pePZ2Uv4pePZeKXj2114pePbOKYhWHimIV0ZeKXj2TimIUvMOKYhS9EQ+KYhUlN4piFL1PimIVj4piFcmXil49l4pePbuKXj3Pil49ob3TimIVzL1PimIVj4piFcmXil49lbnNo4pePb3TimIVf4piFMjAxOTHimIUy4pePM+KXjzEtMeKXjznimIUyMeKXjzXil4844piFX+KXj1Nu4pePYeKYhXBjaGF0LmrimIVw4pePZw==bakWlla

Adjusting to the correct length for decoding requires:
  1. Removing the last 7 characters, i.e. "bakWlla" (highlighted above in Red)
  2. Removing 3 to 6 characters from the start of the string until the length is a multiple of 4, i.e. removing "eTgcy" (highlighted above in Green)
We then:
  3. Base64 decode the string
  4. Remove padding characters such as Black Star and Black Circle

For our example above, we adjust the base64 string to:
4piFL3Pil4904piFb+KXj3LimIVh4pePZ2Uv4pePZeKXj2114pePbOKYhWHimIV0ZeKXj2TimIUvMOKYhS9EQ+KYhUlN4piFL1PimIVj4piFcmXil49l4pePbuKXj3Pil49ob3TimIVzL1PimIVj4piFcmXil49lbnNo4pePb3TimIVf4piFMjAxOTHimIUy4pePM+KXjzEtMeKXjznimIUyMeKXjzXil4844piFX+KXj1Nu4pePYeKYhXBjaGF0LmrimIVw4pePZw==

which decodes via CyberChef or base64decode.org to:
★/s●t★o●r★a●ge/●e●mu●l★a★te●d★/0★/DC★IM★/S★c★re●e●n●s●hot★s/S★c★re●ensh●ot★_★20191★2●3●1-1●9★21●5●8★_●Sn●a★pchat.j★p●g

We can then manually remove the following randomly added padding characters:
Unicode Code Point U+2605 = "Black Star"
Unicode Code Point U+25CF = "Black Circle"
Unicode Code Point U+25C6 = "Black Diamond"

Here is what the output from Cyberchef looks like:

Base64 Decode using Cyberchef


Cyberchef has a handy feature of showing the number of characters in the input string ("length"). This can be used when determining how many characters to remove to get an input length that is a multiple of 4.
Here is the base64decode.org output:



Base64 Decode using base64decode.org

The log table's "__log" field format varies according to APK version. We have only looked at versions v10.0.21.5, v10.2.00.21 (main focus) and v11.5.05.1.

Consequently, two versions of a "log" table parsing script were written: samsung_gallery3d_log_parser_v10.py and samsung_gallery3d_log_parser_v11.py.


"trash" Table

Here is the "trash" table schema:

CREATE TABLE trash (
__absPath TEXT UNIQUE NOT NULL, 
__Title TEXT, 
__absID INTEGER, 
__mediaType INTEGER, 
__width INTEGER, 
__height INTEGER, 
__orientation INTEGER, 
__originPath TEXT, 
__originTitle TEXT, 
__deleteTime INTEGER, 
__storageType INTEGER, 
__burstGroupID INTEGER, 
__bestImage INTEGER, 
__cloudServerId TEXT, 
__cloudTP TEXT, 
__restoreExtra TEXT, 
__volumeName TEXT, 
__volumeValid INTEGER, 
__expiredPeriod INTEGER
)

Here are some screenshots of an example "trash" table:

"trash" Table Screenshot 1

"trash" Table Screenshot 2

There are only 10 entries stored in this example table. The entries in this table correspond to live files in the .Trash directory. All other files located in .Trash are overwritten files with "__Title" file names but no date/time information.
Here are some selected trash table fields of interest:

Field Name

Description

__absPath

Current path and filename of the deleted file. 

Example:

/storage/emulated/0/Android/data/com.sec.android.gallery3d/files/.Trash/135138193438761664


__Title

Current filename  of the deleted file. 

Example:

135138193438761664


__originPath

Original path and filename. 

Example:

/storage/emulated/0/Download/unnamed.jpg


__originTitle

Original filename. 

Example:

unnamed.jpg


__deleteTime

UNIX ms time in UTC

Example: 

1592678711438


__restoreExtra

JSON formatted string containing various metadata such as: "__dateTaken" (in UNIX ms time in Local Time), "__latitude", "__longitude". 

Example: {"__is360Video":false,"__isDrm":false,"__isFavourite":false,"__cloudOriginalSize":0,"__cloudRevision":-1,"__fileDuration":0,"__recordingMode":0,"__sefFileSubType":0,"__sefFileType":-1,"__cloudTimestamp":1592678711350,"__dateTaken":1592669230000,"__size":98526,"__latitude":0,"__longitude":0,"__capturedAPP":"","__capturedURL":"","__cloudServerPath":"","__hash":"","__mimeType":"image\/jpeg","__resolution":"","__recordingType":0,"__isHdr10Video":false}



The "__Title" value (as seen in "__absPath") is derived by calling a proprietary Crc::getCrc64Long function on the "__originPath" value. Note: This value is generated via a different method to the album table's "__bucketID" field.
One script was written to parse the "trash" table: samsung_gallery3d_trash_parser_v10.py
There are other tables in local.db but due to time constraints and available test data, we concentrated on the "log" and "trash" tables.
On some later app versions, we noticed a "filesystem_monitor" table which listed fields such as:
package, date_event_occurred (suspected ms since 1JAN1970), __data (base64 encoded filename), event_type (meaning currently unknown). This table requires further research.


Scripting

Some initial Python 3 scripts were written for parsing the "log" and "trash" tables. 
No extraction script was written for the "album" table as DB Browser for SQLite can be used directly to copy/paste the album data.
Due to the different "__log" field formats observed, two versions were written for the "log" table: samsung_gallery3d_log_parser_v10.py and samsung_gallery3d_log_parser_v11.py
Both of these scripts extract various fields from the "log" table and base64 decode any encoded path names that we have observed in our data. The v11 version was written to handle the differently formatted "__log" field values.
Here is the help text for samsung_gallery3d_log_parser_v10.py (main focus of research):


python3 samsung_gallery3d_log_parser_v10.py -h
usage:  samsung_gallery3d_log_parser_v10.py [-d inputfile -o outputfile]

Extracts/parses data from com.sec.android.gallery3d's (v10) local.db's log table to output TSV file

optional arguments:
  -h, --help   show this help message and exit
  -d DATABASE  SQLite DB filename i.e. local.db
  -o OUTPUT    Output file name for Tab-Separated-Value report

Here is a usage example (Note: a "__log" field may contain multiple base64 encoded file paths. The script should find/extract all of them):


python3 samsung_gallery3d_log_parser_v10.py -d s767vl-local.db -o s767vl-log-output.tsv
Running samsung_gallery3d_log_parser_v10.py 2021-11-20

_id = 1
Found valid path = /storage/emulated/0/DCIM/Facebook/FB_IMG_1578490745732.jpg
 for: 4pePL+KXj3N0b3Lil49h4pePZ2XimIUvZW3imIV14pePbOKYhWHil4904pePZeKYhWTimIUv4pePMC9E4piFQ0nimIVNL+KYhUZh4pePY+KYhWXil49ib2/imIVr4pePL+KXj0ZCX+KYhUnil49N4pePR1/imIUx4piFNeKYhTfimIU4NOKYhTnimIUw4piFNzTimIU1N+KXjzPimIUy4piFLuKXj2rimIVwZw==
_id = 2
Found valid path = /storage/emulated/0/DCIM/Screenshots/Screenshot_20200106-112041_Instagram.jpg
 for: 4pePL+KXj3N04pePb+KYhXJh4pePZ+KYhWXil48v4piFZeKXj23il4914piFbOKYhWHil4904piFZeKYhWTimIUv4pePMC/il49EQ0nimIVN4piFL+KXj1Nj4piFcuKXj2XimIVl4pePbuKXj3Pil49o4piFb+KYhXTil49z4pePL+KXj1Nj4pePcmXimIVl4pePbuKXj3Pil49ob3Til49f4pePMuKXjzAy4piFMOKXjzDil48x4pePMOKXjzYt4piFMeKYhTEy4piFMOKXjzQx4pePX+KYhUlu4pePc3TimIVhZ+KXj3Jh4piFbeKXjy5qcGc=
_id = 3
Found valid path = /storage/emulated/0/DCIM/Screenshots/Screenshot_20191231-192158_Snapchat.jpg
 for: 4piFL3Pil4904piFb+KXj3LimIVh4pePZ2Uv4pePZeKXj2114pePbOKYhWHimIV0ZeKXj2TimIUvMOKYhS9EQ+KYhUlN4piFL1PimIVj4piFcmXil49l4pePbuKXj3Pil49ob3TimIVzL1PimIVj4piFcmXil49lbnNo4pePb3TimIVf4piFMjAxOTHimIUy4pePM+KXjzEtMeKXjznimIUyMeKXjzXil4844piFX+KXj1Nu4pePYeKYhXBjaGF0LmrimIVw4pePZw==
_id = 4
Found valid path = /storage/emulated/0/DCIM/Screenshots/Screenshot_20191231-191604_Snapchat.jpg
 for: 4piFL3Pil490b+KXj3Lil49hZ+KXj2XimIUv4piFZeKXj23imIV14pePbGF04pePZeKXj2QvMOKYhS9E4pePQ0nil49N4pePL+KXj1Pil49j4piFcuKYhWXimIVl4pePbuKXj3PimIVo4piFb+KXj3TimIVz4pePL+KXj1Nj4piFcuKYhWXimIVlbuKXj3Pil49ob+KYhXTimIVf4piFMjDimIUx4piFOTHil48yMzHimIUt4piFMeKYhTkx4pePNuKXjzDimIU04pePX+KYhVPimIVu4piFYeKXj3DimIVj4pePaOKYhWHil490LmpwZw==
_id = 5
_id = 6
Found valid path = /storage/emulated/0/Download/39dc29a626fe74cef450f2dc931e134809ea3228.jpeg.jpg
 for: 4piFL3N0b+KYhXLil49hZ+KXj2Uv4pePZeKYhW114piFbGF0ZWTil48v4pePMC/il49Eb3fil49u4pePbG/il49h4piFZOKXjy/il48z4pePOeKYhWRj4piFMjlh4pePNuKYhTLil482ZuKYhWU34piFNGPil49l4piFZuKXjzTil4814piFMGYy4pePZOKYhWPil485MzFl4pePMTM04piFODDil485ZWHil48z4pePMuKXjzLimIU44piFLuKXj2ril49w4pePZeKXj2cu4piFauKYhXDimIVn
Found valid path = /storage/emulated/0/Download/26349f2e382fbacceb2ee3951bc8b30ff7369234.jpeg.jpg
 for: 4pePL+KYhXN04pePb+KXj3Lil49h4piFZ2Uv4piFZW11bGF04pePZeKYhWQv4piFMC9E4piFb+KXj3fil49ubOKXj2/imIVh4piFZOKYhS/imIUy4piFNuKXjzPil480OWbimIUy4piFZeKYhTPil4844piFMuKXj2bimIVi4pePYWNjZWLimIUy4piFZWXil48zOTXil48x4pePYmM4YuKYhTPil48w4pePZmY34piFMzbil485MuKYhTPimIU04pePLuKXj2ril49wZeKXj2fil48uauKYhXDil49n
Found valid path = /storage/emulated/0/Download/user1010953_863e8addc06c.jpg
 for: L+KXj3PimIV0b+KYhXJhZ+KYhWUv4pePZeKYhW3imIV1bOKYhWHil4904piFZWTil48v4pePMOKXjy9Eb3fimIVu4pePbOKYhW/il49h4pePZOKXjy914pePc+KYhWXimIVyMeKXjzAx4piFMOKXjznimIU1M+KXj1/imIU4NuKXjzNl4pePOGHil49kZGPimIUw4piFNuKYhWPimIUuauKYhXDil49n
Found valid path = /storage/emulated/0/Download/11b5a77b2141eb3ec139240fb4cf7d1288d8bbfe.png
 for: 4piFL+KYhXN0b3Lil49hZ2Xil48vZW3il4914piFbOKYhWF04piFZeKYhWTimIUv4pePMOKYhS/il49E4pePb+KXj3dubOKXj29hZOKXjy8xMeKXj2Lil481YeKXjzc34pePYuKYhTIxNOKXjzHimIVl4piFYuKXjzPil49l4pePY+KYhTHimIUz4piFOeKXjzLimIU04pePMGbil49i4piFNOKYhWNm4pePN+KYhWQx4piFMuKYhTg44piFZDhi4piFYuKYhWZlLuKXj3DimIVuZw==
Found valid path = /storage/emulated/0/Download/7cd62fdba485f143f88370ed838778e2ae222c26.jpeg.jpg
 for: L+KYhXPil4904pePb3Jh4piFZ+KYhWXil48vZeKYhW3il491bOKYhWHimIV0ZeKYhWTil48v4pePMC9E4pePb3fil49ubOKXj2/il49h4pePZOKXjy/imIU3Y+KXj2Q24piFMuKXj2Zk4piFYuKXj2Hil4804piFOOKXjzVm4piFMeKXjzTil48z4pePZjjil4844pePM+KXjzfil48wZeKXj2Q44piFM+KXjzjil483NzjimIVlMmHil49l4piFMjIy4pePYzI2LuKYhWpwZWfimIUuanDil49n
Found valid path = /storage/emulated/0/Download/8a0f38ee75d695af81ab0a7ca8b2d1f6efcf5630.jpeg.jpg
 for: 4pePL+KYhXPimIV04piFb3LimIVh4piFZ+KYhWUvZeKXj211bOKYhWHil4904piFZeKXj2TimIUvMOKYhS/il49Eb3fimIVubOKYhW9hZC844piFYeKYhTBm4pePM+KXjzhlZeKYhTfil4814piFZDbimIU5NWFm4piFOOKYhTHil49h4pePYuKYhTDimIVh4piFN+KYhWNhOOKYhWLimIUy4pePZOKXjzHil49m4piFNuKYhWXil49m4piFY+KYhWbil4814piFNuKXjzMw4piFLuKXj2rimIVw4piFZeKXj2fil48uauKYhXBn
Found valid path = /storage/emulated/0/Download/15510fd162dfe3bbdd3795cee6776c8db408577c.jpeg.jpg
 for: L+KXj3Pil4904piFb+KYhXLimIVh4piFZ2XimIUv4pePZeKXj23imIV14piFbGHimIV04pePZWQvMOKXjy9E4piFb3fil49u4pePbOKXj2/il49h4piFZOKYhS/il48x4pePNeKYhTXil48x4pePMOKXj2ZkMeKYhTbimIUy4pePZOKYhWZl4pePM+KXj2Lil49i4pePZGTil48z4piFN+KXjznimIU1Y+KXj2XimIVl4pePNuKYhTfimIU34pePNmPil4844pePZGLimIU04piFMOKXjzjil4814piFN+KXjzdj4piFLuKYhWpw4pePZeKYhWfimIUu4pePanDimIVn
Found valid path = /storage/emulated/0/Download/5b4565b2076c02e09eea562ae70c36bc419e2ed1.jpeg.jpg
 for: 4piFL+KXj3N04piFb3Jh4piFZ2XimIUvZeKYhW114piFbOKXj2F0ZeKYhWTil48v4pePMOKXjy9E4pePb+KXj3fimIVu4piFbG9hZOKYhS/il4814pePYjTil4814pePNuKYhTXil49i4pePMuKYhTDimIU34pePNmPimIUw4piFMuKYhWUw4piFOeKYhWXimIVl4piFYeKXjzXimIU24pePMuKXj2Hil49l4piFN+KXjzDimIVjM+KYhTbil49i4piFY+KYhTTil48xOeKXj2Xil48y4piFZeKYhWQx4piFLmrimIVwZeKYhWfil48u4piFauKXj3Dil49n
Found valid path = /storage/emulated/0/Download/6f559e2f1df8ca033a6f7190416d9351b1bcc942.jpeg.jpg
 for: L+KYhXPil4904piFb+KXj3LimIVh4pePZ+KXj2XimIUv4piFZW114piFbOKXj2HimIV04piFZWTimIUv4pePMC9E4pePb+KYhXdu4piFbOKXj29h4piFZOKYhS82ZjU14pePOWXimIUy4pePZuKXjzHimIVkZuKXjzjil49jYTAz4pePM+KXj2HimIU24pePZuKXjzfil48x4pePOTDimIU04piFMTbimIVkOeKYhTM14piFMWLil48xYmPimIVj4pePOeKXjzTimIUy4piFLmril49wZWcu4piFanBn
Found valid path = /storage/emulated/0/Download/c854ab3ee061d52038320a605694c002d117648d.jpeg.jpg
 for: L+KYhXN04pePb3LimIVh4pePZ2UvZeKXj23imIV14pePbGHil4904pePZWTimIUvMC/imIVEb+KYhXfil49u4piFbG/imIVhZOKXjy/imIVj4piFOOKXjzU04piFYeKYhWLil48z4piFZWXimIUw4piFNjFkNeKYhTIwM+KXjzgzMjBh4piFNuKYhTA1NuKXjznil4804pePYzDimIUw4piFMuKYhWTimIUx4piFMeKYhTc24piFNOKYhTjimIVk4pePLuKXj2pw4pePZeKYhWfimIUu4pePauKXj3Bn

[redacted for brevity]

Found valid path = /storage/emulated/0/Download/mj03kueepo441.jpg
 for: 4piFL+KYhXN0b3JhZ+KYhWXimIUv4piFZW3imIV14piFbOKXj2HimIV04piFZeKXj2Til48vMC/il49E4piFb+KYhXdu4piFbOKYhW/imIVh4pePZOKYhS/imIVtajDimIUz4piFa+KYhXXil49lZeKXj3Dil49vNOKYhTTimIUxLuKYhWrimIVwZw==
Found valid path = /storage/emulated/0/Download/Ab3UCPAD5XSzfTI0rqFuknyufbEe9PWkKnJOicjJhFg.jpg
 for: 4pePL3N0b+KXj3Lil49hZ+KYhWXil48v4piFZW3il4914piFbOKXj2HimIV04pePZWTimIUvMOKXjy9E4piFb3fil49ubOKYhW9h4pePZOKYhS/il49B4piFYuKYhTNV4piFQ+KXj1Dil49BRDXimIVY4piFU3pm4piFVOKXj0nil48w4piFcuKYhXHil49G4piFdeKXj2vimIVueeKXj3XimIVmYuKXj0XimIVl4pePOeKYhVDil49Xa+KXj0tu4pePSk/imIVpY+KXj2rimIVKaOKYhUbimIVnLuKYhWrimIVw4piFZw==

Processed/Wrote 364 entries to: s767vl-log-output.tsv

Exiting ...

Here is a screenshot of the output TSV (s767vl-log-output.tsv) imported into a LibreOffice Calc spreadsheet:


samsung_gallery3d_log_parser_v10.py TSV Output

Note: If you have issues with data not appearing correctly in MS Excel / LibreOffice Calc, please ensure the Import column type is set to TEXT.


A Python 3 script was also written to parse the "trash" table: samsung_gallery3d_trash_parser_v10.py

Here is the help text for samsung_gallery3d_trash_parser_v10.py:

python3 samsung_gallery3d_trash_parser_v10.py -h
usage:  samsung_gallery3d_trash_parser_v10.py [-d inputfile -o outputfile]
 
Extracts/parses data from com.sec.android.gallery3d's (v10) local.db's trash
table to output TSV file
 
optional arguments:
  -h, --help   show this help message and exit
  -d DATABASE  SQLite DB filename i.e. local.db
  -o OUTPUT    Output file name for Tab-Separated-Value report
 

Here is a usage example:

python3 samsung_gallery3d_trash_parser_v10.py -d s767vl-local.db -o s767vl-trash-output.tsv
Running samsung_gallery3d_trash_parser_v10.py v2021-11-12

Processed/Wrote 14 entries to: s767vl-trash-output.tsv
Exiting ...

Here is a screenshot of the output TSV (s767vl-trash-output.tsv) imported into a LibreOffice Calc spreadsheet:

samsung_gallery3d_trash_parser.py TSV Output

Note: If you have issues with data not appearing correctly in MS Excel / LibreOffice Calc, please ensure the Import column type is set to TEXT.

Some additional scripts were written and included in the GitHub repo.

The java-hashcode.py script was written to convert a given path to a "__bucketID" value as seen in the "album" table.

Here is the help for the java-hashcode script:

python3 java-hashcode.py -h
usage:  java-hashcode.py [-l | -u] -i inputfile
 
Read input strings/paths from a text file (one per line) and prints out the
equivalent Java hashcode
 
optional arguments:
  -h, --help    show this help message and exit
  -i INPUTFILE  Input text filename
  -l            (Optional) Converts input string to lower case before hashing
  -u            (Optional) Converts input string to UPPER case before hashing


Here is an example of how to calculate a bucketID for the following paths - "/storage/emulated/0/DCIM/Screenshots" and "/storage/emulated/0/Download".

We start by writing the 2 paths (one per line) to a text file called "inputhash.txt"

Example inputhash.txt for java-hashcode.py

Next, we call the java-hashcode.py script with "inputhash.txt" set as the input file.

Note the usage of the "-l" (lowercase L) argument, which converts the path to lowercase before calling the hashcode function.

Here is the command line example:

python3 java-hashcode.py -i inputhash.txt -l
Running java-hashcode.py 2021-12-23
 
/storage/emulated/0/dcim/screenshots = -1313584517
/storage/emulated/0/download = 540528482
 
Processed 2 lines - Exiting ...

We can see the hashcode values for those paths match the values recorded in the "album" table:

The /storage/emulated/0/dcim/screenshots path converts to a bucketID=  -1313584517

The /storage/emulated/0/download path converts to a bucketID = 540528482


"album" Table bucketID Example

Other scripts (requiring further testing) include:

samsung_gallery3d_log_parser_v11.py

samsung_gallery3d_filesysmon_parser_v11.py

These scripts were written for parsing test data from app version 11.5.05.1 (which differed from our targeted app version 10.2.00.21). The version 11 scripts have been included in the GitHub repo but will not be described further in this post.

Please note that all scripts were written using our limited test data, so they will probably struggle parsing other versions' data.


Script-wise, the most interesting part was automating the encoded path decoding.

Using tools such as dex2jar, JD-GUI and JADX, we were able to find the code responsible for the path encoding (see the Logger.class::getEncodedString method) and wrote a corresponding Python function to base64 decode the encoded path string.

Depending on your APK, using JD-GUI might require first extracting the classes.dex from the APK, then running dex2jar on the classes.dex before viewing the .jar file in JD-GUI. However, in our case, Mike was able to run dex2jar on his APK directly and then use JD-GUI to view the Java code.

JADX can open/reverse an APK without the dex2jar step.

Some method/variable names were not translated in JD-GUI and some were not translated in JADX, so it's probably worth trying both JD-GUI and JADX.

As mentioned previously, the path decode process is:

  1. Remove the last 7 base64 encoded chars
  2. Remove 3-6 characters at the start of the encoded string until a valid base64 length (a multiple of 4 characters)
  3. Perform base64 decode
  4. Remove any special padding chars eg Black Star, Black Circle
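Here is a minimal Python sketch of that decode process, based on the steps above and the three padding characters observed earlier (the released scripts may handle additional cases):

import base64

PADDING_CHARS = "\u2605\u25cf\u25c6"   # Black Star, Black Circle, Black Diamond

def decode_log_path(encoded):
    s = encoded[:-7]                    # 1. remove the last 7 characters
    for n in range(3, 7):               # 2. remove 3-6 leading chars until the length is a multiple of 4
        if (len(s) - n) % 4 == 0:
            s = s[n:]
            break
    decoded = base64.b64decode(s).decode("utf-8")                   # 3. base64 decode
    return "".join(c for c in decoded if c not in PADDING_CHARS)    # 4. strip padding chars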

So the basic process for the "log" table script (samsung_gallery3d_log_parser_v10.py) was:

  • Query the database, e.g. "SELECT _id, __category, __timestamp, __log FROM log ORDER BY __timestamp ASC;"
  • For each row:
    • Extract the "__log" and "__timestamp" fields
    • Decode the base64 encoded path(s) (there can be more than one path per log item)
    • Print extracted fields and decoded path field(s) to TSV
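For illustration, here is a heavily simplified sketch of that loop, reusing decode_log_path() from the sketch above (the bracketed-token heuristic is an assumption for illustration, not the released script's actual logic):

import csv
import re
import sqlite3

con = sqlite3.connect("local.db")
rows = con.execute(
    "SELECT _id, __category, __timestamp, __log FROM log ORDER BY __timestamp ASC;")

with open("log-output.tsv", "w", newline="") as out:
    writer = csv.writer(out, delimiter="\t")
    writer.writerow(["_id", "__category", "__timestamp", "decoded_paths"])
    for _id, category, timestamp, log in rows:
        # treat long bracketed tokens made only of base64 characters as encoded paths
        tokens = re.findall(r"\[([A-Za-z0-9+/=]{24,})\]", log or "")
        paths = [decode_log_path(t) for t in tokens]
        writer.writerow([_id, category, timestamp, "; ".join(paths)])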

The process for the "trash" table script (samsung_gallery3d_trash_parser_v10.py) was:

        Query the database eg "SELECT __absID, __absPath, __Title, __originPath, __originTitle, __deleteTime, __restoreExtra FROM trash ORDER BY __deleteTime ASC;"
        For each row:
o   Extract the __absPath, __originPath, __deleteTime and __restoreExtra (metadata) fields
o   Print extracted fields to TSV (no path decoding required)

Summary

All this research led to a deeper understanding of reverse engineering Android apps, new and unique hashcode algorithms and different encoding techniques. Looking further into app databases that may/may not be parsed by existing tools can still lead to new information, folders of interest, log files, etc. You may discover new data that was introduced in newer versions of Android or the particular app.

For Mike's case, using the research and scripts from this post showed that the user was in the habit of taking screenshots or downloading images, then deleting them and emptying the trash a short time later. The web browser was used to access the images in question, but deleting web history was also a frequent process. The recovered names of the screenshots showed that the user had used the web browser at specific times. Unfortunately, these dates and times didn't match the times in question, but they did lead to other times to investigate that weren't found in other parsed data.
Researching this post with Mike allowed Monkey to learn more about the Samsung Gallery app, gain further experience with reversing an Android APK and keep his Python skills fresh. Like any language, fluency deteriorates with lack of use.

Various Python 3 scripts were written to assist with parsing the "log" and "trash" tables from the Samsung Gallery3d app (v10.2.00.21). These tables can potentially store information regarding image deletion performed from within the Samsung Gallery3d app. e.g. timestamps and original file paths.

This post also demonstrated how collaborative research can lead to increased output/new tools, for example combining Mike's testing observations with Monkey's scripting. The opportunity to work with someone else with different knowledge, skills, experience and a fresh perspective is invaluable. Utilizing this experience can be just as good as, if not better than, attending a training class or a webinar. 

Special Thanks to Mike for sharing his research and co-authoring this post - hopefully, we can collaborate again in the future :)