Monkey assists Mike with another dive into the Samsung Gallery3d App |
It all started with a post by Michael Lacombe.
The post involved a case where a Samsung mobile phone owner claimed that specific images were received but were deleted immediately after being accessed. Mike was asked if it was possible to determine this. Not knowing the immediate answer to that question, he began to analyze the Samsung Android 9 device.
Along the way, he found this previous Cheeky4n6monkey post from 2016. Comparing that information to his current case data, he saw that things had changed considerably over the years, but it was enough of a nudge to dig a little deeper. Mike asked if this monkey wanted to tag along and so the adventure began...
Here are some things we have learned on our journey... (mostly Mike, I was just the script monkey)
There are always new things to research
The Samsung Gallery3d app has been around for years and according to GooglePlay, it was last updated in 2019 with version 5.4.11.0
Opening the AndroidManifest.xml file from a test device's Gallery3d Android Package (APK) in Android Studio shows:
android:versionCode="1020000021" android:versionName="10.2.00.21"
According to the Android Developer documentation, versionName is displayed to the user whereas versionCode is a positive integer which increases with each release and can be used to prevent downgrades to earlier versions.
This app is updated frequently. When searching for test data, we found that nearly every device we looked at contained a different version of the app, which in turn, contained different information stored within the application folder and the database itself.
As far as we could ascertain, there were no commercial or non-commercial forensic tools which process the Samsung Gallery3d app database for deletion artifacts.
Some open source tools that we used to analyze the data and the APK include:
For Data Analysis:
- DB Browser for SQLite for viewing/exporting SQLite databases
- Cyberchef to base64 decode strings
- Base64 Decode and Encode website to base64 decode strings
- Epochconverter to confirm timestamp types
- Android Studio
For APK reversing:
- dex2jar to convert an APK's classes.dex to Java .jar
- JD-GUI to view source code from a .jar file
- JADX to view source code directly from APK file
We also wrote our own Python 3 scripts to assist with batch conversion of base64 encoded strings and output to Tab Separated Value (TSV) format.
These scripts are available here
Some observations for the Samsung Gallery3d app
This is a stock app installed on Samsung devices. It has library dependencies that are part of the Samsung Android framework. Consequently, there doesn’t appear to be an easy way (if at all) to install the application on a non-Samsung device.
The Samsung Gallery3d app is located on the user data partition at:
/data/com.sec.android.gallery3d
Files that are sent to the trash from within the app are located at
/media/0/Android/data/com.sec.android.gallery3d
Because each version of the application differs and the research was driven by Mike’s case, we decided to focus this blog post on the application version from that case (10.2.00.21).
Within the /data/com.sec.android.gallery3d directory, there was a cache directory and a databases directory.
Cache Directory
There are multiple cache sub-directories contained within
/data/com.sec.android.gallery3d/cache/
In this instance, the /0 folder contained larger thumbnail images, ranging in width from 225 to 512 pixels and in height from 256 to 656 pixels, while the /1 folder had smaller thumbnails ranging in width from 51 to 175 pixels and in height from 63 to 177 pixels. There were also /2, /3 and /4 folders. /2 and /3 were empty and /4 had a single thumbnail that was 320x320 pixels in size.
There doesn’t seem to be anything useful here beyond the thumbnails themselves. The names of the thumbnails seem to be generated using a hash algorithm.
Databases Directory
Contained within /data/com.sec.android.gallery3d/databases/ is the local.db SQLite database.
This database contains various information including:
- Albums in the gallery ("album" table)
- A log that records various actions associated with the app, e.g. move to trash, empty trash ("log" table)
- Items that are currently in the Trash bin ("trash" table)
In later versions, we noticed another table called "filesystem_monitor". This contained timestamps, app package names (e.g. com.sec.android.gallery3d) and base64 encoded file paths. However, as this table was not present in Mike's case data and we are not sure what triggers these records, it requires further research.
Table Observations
"album" Table
Here is the "album" table schema:
CREATE TABLE album (
_id INTEGER PRIMARY KEY AUTOINCREMENT,
__bucketID INTEGER UNIQUE NOT NULL,
__absPath TEXT,
__Title TEXT,
folder_id INTEGER,
folder_name TEXT,
default_cover_path TEXT,
cover_path TEXT,
cover_rect TEXT,
album_order INTEGER,
album_count INTEGER,
__ishide INTEGER,
__sefFileType INTEGER DEFAULT 0,
__isDrm INTEGER DEFAULT 0,
__dateModified INTEGER DEFAULT 0
)
Here are some screenshots of an example "album" table:
"album" Table Screenshot 1 |
"album" Table Screenshot 2 |
Here are some selected "album" table fields of interest:
Field Name | Description |
__bucketID | Generated by calling a Java hashcode function on the album path (see the java-hashcode.py script in the Scripting section). Example value: -1313584517 |
__absPath | The path of the album. Example: /storage/emulated/0/DCIM/Screenshots |
default_cover_path | The image associated with the corresponding album. Example: /storage/emulated/0/DCIM/Screenshots/Screenshot_20200530-054103_One UI Home.jpg |
album_count | The current number of files stored within the album. Example: 14 |
Because the file paths stay the same, __bucketID values have been found to be consistent across devices. This can help to show whether custom albums are or were present, as well as application-specific albums such as Facebook, Snapchat, etc. Recovering deleted records here can show deleted albums and names of deleted images that were once used as album covers. Cover path information can reveal file names of interest, many of which contain timestamp information in the file name. This can potentially assist with tying usage of a particular app to a specific time.
No extraction script was written for the "album" table as DB Browser for SQLite can be used directly to copy/paste the album data.
"log" Table
Here is the "log" table schema:
CREATE TABLE log (
_id INTEGER PRIMARY KEY AUTOINCREMENT,
__category INTEGER NOT NULL,
__timestamp TEXT,
__log TEXT
)
Here is a screenshot of an example "log" table:
"log" Table Screenshot |
Here are some selected "log" table fields of interest:
Field Name | Description |
__timestamp | Timestamp text string (formatted YYYY-MM-DD HH:MM:SS in Local Time) when a particular log entry occurred. Example: 2020-01-09 16:17:14 |
__log | Proprietary formatted text string which lists the "action" performed (see next table) and the base64 encoded paths of relevant files. Example: [MOVE_TO_TRASH_SINGLE][1][0][location://timeline?position=6&mediaItem=data%3A%2F%2FmediaItem%2F-1566891466&from_expand=false][oKHi/x4pePL+KXj3N0b3Lil49h4pePZ2XimIUvZW3imIV14pePbOKYhWHil4904pePZeKYhWTimIUv4pePMC9E4piFQ0nimIVNL+KYhUZh4pePY+KYhWXil49ib2/imIVr4pePL+KXj0ZCX+KYhUnil49N4pePR1/imIUx4piFNeKYhTfimIU4NOKYhTnimIUw4piFNzTimIU1N+KXjzPimIUy4piFLuKXj2rimIVwZw==ST1puy1] |
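As a rough illustration (this is not one of our parser scripts), the "log" table can be read with Python's built-in sqlite3 module. The sketch below assumes the "log" schema shown above. The function names read_log_rows and make_demo_db are made up for this example, as are the __category value of 0 and the shortened __log value in the demo row. The database is opened read-only so a working copy of the evidence is not modified.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def read_log_rows(db_path):
    """Return (_id, __category, __timestamp, __log) tuples ordered by _id."""
    # Open via a file: URI with mode=ro so the database cannot be written to.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        query = "SELECT _id, __category, __timestamp, __log FROM log ORDER BY _id"
        return conn.execute(query).fetchall()
    finally:
        conn.close()


def make_demo_db(db_path):
    """Create a tiny stand-in local.db (demo data only, not from a real device)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE log (_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "__category INTEGER NOT NULL, __timestamp TEXT, __log TEXT)"
    )
    conn.execute(
        "INSERT INTO log (__category, __timestamp, __log) VALUES (?, ?, ?)",
        (0, "2020-01-09 16:17:14", "[MOVE_TO_TRASH_SINGLE][1][0]"),
    )
    conn.commit()
    conn.close()


demo_db = os.path.join(tempfile.mkdtemp(), "local.db")
make_demo_db(demo_db)
rows = read_log_rows(demo_db)
for row in rows:
    print(row)
```

For real data, point read_log_rows at an exported copy of local.db rather than the demo file.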
Some observed log "actions" include:
Log Action | Description |
MOUNTED | Unknown when this is triggered. It tells how many files are currently in the trash. |
MOVE_TO_TRASH_SINGLE | This occurs when the user moves a single file to the trash from the timeline or gallery view. |
MOVE_TO_TRASH_MULTIPLE | This occurs when the user moves more than one file to the trash from the timeline or gallery view. |
EMPTY_SINGLE | This occurs when the trash is manually emptied and a single file is in the trash at that time. |
EMPTY_MULTIPLE | This occurs when the trash is manually emptied and contains more than one file. |
EMPTY_EXPIRED | This occurs when a file is auto-deleted after staying in the trash for a predetermined amount of time as described in the settings for the app. |
Other operations not observed in our data but declared in the source code (see TrashHelper class, DeleteType enum):
DELETE_MULTIPLE
DELETE_SINGE
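Since each part of a v10 "__log" value is wrapped in square brackets, a first pass at separating the action from the other parts could look like the sketch below. This only reflects the format seen in our example values. The name split_log_fields is hypothetical, the last field is a placeholder standing in for a real encoded path, and no meaning is assigned to the numeric fields.

```python
import re

# One bracketed field: "[" followed by anything except brackets, then "]".
BRACKET_FIELD = re.compile(r"\[([^\[\]]*)\]")


def split_log_fields(log_value):
    """Return the contents of every [ ... ] group in a __log string, in order."""
    return BRACKET_FIELD.findall(log_value)


example = (
    "[MOVE_TO_TRASH_SINGLE][1][0]"
    "[location://timeline?position=6&mediaItem=data%3A%2F%2FmediaItem%2F-1566891466"
    "&from_expand=false]"
    "[ENCODED_PATH_PLACEHOLDER]"  # placeholder, not a real encoded path
)

fields = split_log_fields(example)
action = fields[0]
print("Action:", action)
for field in fields[1:]:
    print("Field:", field)
```

Base64 text never contains square brackets, so the encoded paths survive this split intact and can be decoded separately.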
Here is an example of how to manually decode the base64 encoded string from a "__log" field:
The original value is:
[MOVE_TO_TRASH_SINGLE][1][0][location://timeline?position=9&mediaItem=data%3A%2F%2FmediaItem%2F-575841975&from_expand=false][eTgcy4piFL3Pil4904piFb+KXj3LimIVh4pePZ2Uv4pePZeKXj2114pePbOKYhWHimIV0ZeKXj2TimIUvMOKYhS9EQ+KYhUlN4piFL1PimIVj4piFcmXil49l4pePbuKXj3Pil49ob3TimIVzL1PimIVj4piFcmXil49lbnNo4pePb3TimIVf4piFMjAxOTHimIUy4pePM+KXjzEtMeKXjznimIUyMeKXjzXil4844piFX+KXj1Nu4pePYeKYhXBjaGF0LmrimIVw4pePZw==bakWlla]
We copy the base64 string enclosed by the [ ] (highlighted in Yellow):
eTgcy4piFL3Pil4904piFb+KXj3LimIVh4pePZ2Uv4pePZeKXj2114pePbOKYhWHimIV0ZeKXj2TimIUvMOKYhS9EQ+KYhUlN4piFL1PimIVj4piFcmXil49l4pePbuKXj3Pil49ob3TimIVzL1PimIVj4piFcmXil49lbnNo4pePb3TimIVf4piFMjAxOTHimIUy4pePM+KXjzEtMeKXjznimIUyMeKXjzXil4844piFX+KXj1Nu4pePYeKYhXBjaGF0LmrimIVw4pePZw==bakWlla
Adjusting to the correct length for decoding requires:
● Removing the last 7 characters i.e. "bakWlla" (highlighted above in Red)
● Removing 3 to 6 characters from the start of the string until the remaining length is a multiple of 4, i.e. removing the 5 characters "eTgcy" (highlighted above in Green)
We then:
● Base64 decode the string
● Remove padding characters such as Black Star and Black Circle
For our example above, we adjust the base64 string to:
4piFL3Pil4904piFb+KXj3LimIVh4pePZ2Uv4pePZeKXj2114pePbOKYhWHimIV0ZeKXj2TimIUvMOKYhS9EQ+KYhUlN4piFL1PimIVj4piFcmXil49l4pePbuKXj3Pil49ob3TimIVzL1PimIVj4piFcmXil49lbnNo4pePb3TimIVf4piFMjAxOTHimIUy4pePM+KXjzEtMeKXjznimIUyMeKXjzXil4844piFX+KXj1Nu4pePYeKYhXBjaGF0LmrimIVw4pePZw==
which decodes via CyberChef or base64decode.org to:
★/s●t★o●r★a●ge/●e●mu●l★a★te●d★/0★/DC★IM★/S★c★re●e●n●s●hot★s/S★c★re●ensh●ot★_★20191★2●3●1-1●9★21●5●8★_●Sn●a★pchat.j★p●g
We can then manually remove the following randomly added padding characters:
Unicode Code Point U+2605 = "Black Star"
Unicode Code Point U+25CF = "Black Circle"
Unicode Code Point U+25C6 = "Black Diamond"
Here is what the output from Cyberchef looks like:
Base64 Decode using Cyberchef |
Cyberchef has a handy feature of showing the number of characters in the input string ("length"). This can be used when determining how many characters to remove to get an input length that is a multiple of 4.
Here is the base64decode.org output:
Base64 Decode using base64decode.org |
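The manual steps above can be sketched in Python as follows. This is a minimal sketch based only on the steps described in this post, not the code from samsung_gallery3d_log_parser_v10.py. The name decode_gallery_path is hypothetical, and the sample value is synthetic: it is built by reversing the steps with made-up prefix and suffix characters, so it is not output from the app itself.

```python
import base64

# Padding characters observed in decoded paths:
# U+2605 Black Star, U+25CF Black Circle, U+25C6 Black Diamond
PADDING_CHARS = "\u2605\u25cf\u25c6"


def decode_gallery_path(encoded):
    """Decode one encoded path string taken from a v10 __log field."""
    trimmed = encoded[:-7]  # drop the 7 trailing characters
    # Drop 3 to 6 leading characters so the remainder is a multiple of 4 long.
    for skip in range(3, 7):
        if (len(trimmed) - skip) % 4 == 0:
            trimmed = trimmed[skip:]
            break
    # validate=True raises an error if non-base64 characters are left over.
    decoded = base64.b64decode(trimmed, validate=True).decode("utf-8")
    # Strip the randomly inserted padding characters.
    return decoded.translate({ord(ch): None for ch in PADDING_CHARS})


# Build a synthetic test value (assumed layout: prefix + base64 + 7 char suffix).
noisy_path = "\u2605/storage\u25cf/emulated/0/Down\u2605load/unnamed.jpg"
b64_text = base64.b64encode(noisy_path.encode("utf-8")).decode("ascii")
sample = "abcde" + b64_text + "XyZ1234"

print(decode_gallery_path(sample))
```

Because 3, 4, 5 and 6 cover every possible remainder when dividing by 4, exactly one of the four prefix lengths will give a valid base64 length.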
The log table's "__log" field format varies according to APK version. We have only looked at versions v10.0.21.5, v10.2.00.21 (main focus) and v11.5.05.1.
Consequently, two versions of a "log" table parsing script were written: samsung_gallery3d_log_parser_v10.py and samsung_gallery3d_log_parser_v11.py
"trash" Table
Here is the "trash" table schema:
Here are some screenshots of an example "trash" table:
"trash" Table Screenshot 1 |
"trash" Table Screenshot 2 |
There are only 10 entries stored in this example table. The entries in this table correspond with live files in the .Trash directory. All other files located in .Trash are overwritten files with “__Title” file names but no date/time information.
Here are some selected trash table fields of interest:
Field Name | Description |
__absPath | Current path and filename of the deleted file. Example: /storage/emulated/0/Android/data/com.sec.android.gallery3d/files/.Trash/135138193438761664 |
__originPath | Original path and filename. Example: /storage/emulated/0/Download/unnamed.jpg |
__originTitle | Original filename. Example: unnamed.jpg |
__deleteTime | UNIX ms time. Example: 1592678711438 |
__restorExtra | JSON formatted and contains various metadata such as "__dateTaken". Example: {"__is360Video":false,"__isDrm":false,"__isFavourite":false,"__cloudOriginalSize":0,"__cloudRevision":-1,"__fileDuration":0,"__recordingMode":0,"__sefFileSubType":0,"__sefFileType":-1,"__cloudTimestamp":1592678711350,"__dateTaken":1592669230000,"__size":98526,"__latitude":0,"__longitude":0,"__capturedAPP":"","__capturedURL":"","__cloudServerPath":"","__hash":"","__mimeType":"image\/jpeg","__resolution":"","__recordingType":0,"__isHdr10Video":false} |
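To show how these two fields might be handled in a script, here is a small sketch that converts a "__deleteTime" value to a readable UTC timestamp and pulls values out of the "__restorExtra" JSON. It is not taken from samsung_gallery3d_trash_parser_v10.py. The name ms_to_utc is hypothetical, the JSON is a shortened copy of the example above, and treating "__dateTaken" as UNIX milliseconds is our assumption based on how the value looks.

```python
import json
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(milliseconds):
    """Convert UNIX time in milliseconds to a timezone-aware UTC datetime."""
    # timedelta keeps the arithmetic exact (no float rounding).
    return UNIX_EPOCH + timedelta(milliseconds=milliseconds)


delete_time = 1592678711438  # example __deleteTime value from the table above

# Shortened copy of the example __restorExtra value.
restore_extra = r'{"__dateTaken":1592669230000,"__size":98526,"__mimeType":"image\/jpeg"}'
extra = json.loads(restore_extra)

print("Deleted (UTC):", ms_to_utc(delete_time).isoformat())
# Assumption: __dateTaken uses the same millisecond format as __deleteTime.
print("Taken (UTC):  ", ms_to_utc(extra["__dateTaken"]).isoformat())
print("Size (bytes): ", extra["__size"])
print("MIME type:    ", extra["__mimeType"])
```

Note that the JSON escape "\/" is decoded to a plain "/" by json.loads, so the MIME type comes out as image/jpeg.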
The "__Title" value (as seen in "__absPath") is derived by calling a proprietary Crc::getCrc64Long function on the "__originPath" value. Note: This value is generated via a different method to the album table's "__bucketID" field.
One script was written to parse the "trash" table: samsung_gallery3d_trash_parser_v10.py
There are other tables in local.db but due to time constraints and available test data, we concentrated on the "log" and "trash" tables.
On some later app versions, we noticed a "filesystem_monitor" table which listed fields such as:
package, date_event_occurred (suspected ms since 1JAN1970), __data (base64 encoded filename), event_type (meaning currently unknown). This table requires further research.
Scripting
Some initial Python 3 scripts were written for parsing the "log" and "trash" tables.
No extraction script was written for the "album" table as DB Browser for SQLite can be used directly to copy/paste the album data.
Due to the different "__log" field formats observed, two versions were written for the "log" table: samsung_gallery3d_log_parser_v10.py and samsung_gallery3d_log_parser_v11.py.
Both of these scripts extract various fields from the "log" table and base64 decode any encoded path names that we have observed in our data. The v11 version was written to handle the differently formatted "__log" field values.
Here is the help text for samsung_gallery3d_log_parser_v10.py (main focus of research):
Here is a usage example (Note: a "__log" field may contain multiple base64 encoded file paths. The script should find/extract all of them):
Here is a screenshot of the output TSV (s767vl-log-output.tsv) imported into a LibreOffice Calc spreadsheet:
samsung_gallery3d_log_parser_v10.py TSV Output |
Note: If you have issues with data not appearing correctly in MS Excel / LibreOffice Calc, please ensure the Import column type is set to TEXT.
A Python 3 script was also written to parse the "trash" table: samsung_gallery3d_trash_parser_v10.py
Here is the help text for samsung_gallery3d_trash_parser_v10.py:
usage: samsung_gallery3d_trash_parser_v10.py [-d inputfile -o outputfile]
Extracts/parses data from com.sec.android.gallery3d's (v10) local.db's trash
table to output TSV file
optional arguments:
-h, --help show this help message and exit
-d DATABASE SQLite DB filename i.e. local.db
-o OUTPUT Output file name for Tab-Separated-Value report
Here is a usage example:
Running samsung_gallery3d_trash_parser_v10.py v2021-11-12
Processed/Wrote 14 entries to: s767vl-trash-output.tsv
Exiting ...
Here is a screenshot of the output TSV (s767vl-trash-output.tsv) imported into a LibreOffice Calc spreadsheet:
samsung_gallery3d_trash_parser.py TSV Output |
Note: If you have issues with data not appearing correctly in MS Excel / LibreOffice Calc, please ensure the Import column type is set to TEXT.
Some additional scripts were written and included in the GitHub repo.
The java-hashcode.py script was written to convert a given path to a "__bucketID" value as seen in the "album" table.
Here is the help for the java-hashcode script:
usage: java-hashcode.py [-l | -u] -i inputfile
equivalent Java hashcode
-h, --help show this help message and exit
-i INPUTFILE Input text filename
-l (Optional) Converts input string to lower case before hashing
-u (Optional) Converts input string to UPPER case before hashing
Here is an example of how to calculate a bucketID for the following paths - "/storage/emulated/0/DCIM/Screenshots" and "/storage/emulated/0/Download".
We start by writing the 2 paths (one per line) to a text file called "inputhash.txt"
Example inputhash.txt for java-hashcode.py |
Next, we call the java-hashcode.py script with "inputhash.txt" set as the input file.
Note the usage of the "-l" (lowercase L) argument, which converts each path to lowercase before calling the hashcode function.
Here is the command line example:
Running java-hashcode.py 2021-12-23
/storage/emulated/0/dcim/screenshots = -1313584517
/storage/emulated/0/download = 540528482
Processed 2 lines - Exiting ...
We can see the hashcode values for those paths match the values recorded in the "album" table:
The /storage/emulated/0/dcim/screenshots path converts to a bucketID = -1313584517
The /storage/emulated/0/download path converts to a bucketID = 540528482
"album" Table bucketID Example |
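For readers who want to check a bucketID without the script, here is a minimal sketch of the standard Java String.hashCode calculation in Python. It is not a copy of java-hashcode.py. The name java_string_hashcode is hypothetical, and lowercasing the path first mirrors the "-l" option used above.

```python
def java_string_hashcode(text):
    """Return the signed 32-bit value Java's String.hashCode() gives for text."""
    data = text.encode("utf-16-be")  # Java strings are sequences of UTF-16 code units
    result = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        # Keep only the low 32 bits to imitate Java int overflow.
        result = (31 * result + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int.
    return result - 0x100000000 if result >= 0x80000000 else result


for path in ("/storage/emulated/0/DCIM/Screenshots", "/storage/emulated/0/Download"):
    lowered = path.lower()
    print(lowered, "=", java_string_hashcode(lowered))
```

The two printed values should match the bucketID values shown for these paths in the "album" table. If they differ for a path on another device, the path stored in "__absPath" is worth checking first, e.g. for a trailing slash or different case handling.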
Other scripts (requiring further testing) include:
● samsung_gallery3d_log_parser_v11.py
● samsung_gallery3d_filesysmon_parser_v11.py
These scripts were written for parsing test data from app version 11.5.05.1 (which differed from our targeted app version 10.2.00.21). The version 11 scripts have been included in the GitHub repo but will not be described further in this post.
Please note that all scripts were written using our limited test data so they will probably struggle parsing other versions' data.
Script wise, the most interesting part was automating the encoded path decoding.
Using tools such as dex2jar, JD-GUI and JADX we were able to find the code responsible for the path encoding (see Logger.class::getEncodedString method) and wrote a corresponding Python function to base64 decode the encoded path string.
Depending on your APK, using JD-GUI might require first extracting the classes.dex from the APK, then running dex2jar on the classes.dex before viewing the .jar file in JD-GUI. However in our case, Mike was able to run dex2jar on his APK directly and then use JD-GUI to view the Java code.
JADX can open/reverse an APK without the dex2jar step.
Some method/variable names were not translated in JD-GUI and some were not translated in JADX, so it’s probably worth trying both JD-GUI and JADX.
As mentioned previously, the path decode process is:
- Remove the last 7 base64 encoded chars
- Remove 3-6 characters at the start of the encoded string until a valid base64 length is reached (a multiple of 4 characters)
- Perform base64 decode
- Remove any special padding chars, e.g. Black Star, Black Circle
So the basic process for the "log" table script (samsung_gallery3d_log_parser_v10.py) was:
The process for the "trash" table script (samsung_gallery3d_trash_parser_v10.py) was:
Summary
All this research led to a deeper understanding of reverse engineering Android apps, new and unique hashcode algorithms and different encoding techniques. Looking further into app databases that may/may not be parsed by existing tools can still lead to new information, folders of interest, log files, etc. You may discover new data that was introduced in newer versions of Android or the particular app.
For Mike's case, using the research and scripts from this post showed that the user was in the habit of taking screenshots or downloading images and then deleting them and emptying the trash a short time later. The web browser was used to access the images in question but deleting web history was also a frequent process. The recovered names of the screenshots showed that the user had used the web browser at specific times. Unfortunately these dates and times didn’t match the times in question but it did lead to other times to investigate that weren’t found in other parsed data.
Researching this post with Mike allowed Monkey to learn more about the Samsung Gallery app, gain further experience with reversing an Android APK and keep his Python skills fresh. Like any language, fluency deteriorates with lack of use.
Various Python 3 scripts were written to assist with parsing the "log" and "trash" tables from the Samsung Gallery3d app (v10.2.00.21). These tables can potentially store information regarding image deletion performed from within the Samsung Gallery3d app. e.g. timestamps and original file paths.
This post also demonstrated how collaborative research can lead to increased output/new tools, for example by combining Mike's testing observations with Monkey's scripting. The opportunity to work with someone else with different knowledge, skills, experience and a fresh perspective is invaluable. Utilizing this experience can be just as good as, if not better than, attending a training class or a webinar.
Special Thanks to Mike for sharing his research and co-authoring this post - hopefully, we can collaborate again in the future :)