Thursday 31 July 2014

Squirrelling Away Plists

Just grabbin some acorns ...

Plists are Apple's way of retaining configuration information. They're scattered throughout OS X and iOS like acorns and come in 2 main types - XML and binary. Due to their scattered nature and the potential for containing juicy artefacts, monkey thought a script to read plists and extract their data into an SQLite database might prove useful. The idea being analysts run the script against a directory of plists and then browse the resultant table for any interesting squirrels. Analysts could also execute the same queries against different sets of data to find common items of interest (eg email addresses, filenames, usernames).
Similar in concept to SquirrelGripper which extracted exiftool data to a DB, the tool will only be as good as the data fields extracted and the analyst's queries. At the very least, it allows analysts to view the contents of multiple plists at the same time. Plus we get to try out Python 3.4's newly revised native "plistlib" which now parses BOTH binary and XML plists. Exciting times!

Not having easy access to an OS X or iOS system, monkey is going to have to improvise a bit for this post and also rely upon the kindness of plist donaters. Special Thanks to Sarah Edwards (@iamevltwin) and Mari DeGrazia (@maridegrazia) for sharing some sample plists used for testing.

XML based plists are text files which can be read using a text editor. Binary plists follow a different file format and typically require a dedicated reader (eg plist Editor Pro) or conversion to XML to make them human readable.
Both types of plist support the following data types:

CFString = Used to store text strings. In XML, these fields are denoted by the <string> tag.
CFNumber = Used to store numbers. In XML, the <real> tag is used for floats (eg 1.0) and the <integer> tag is used for whole numbers (eg 1).
CFDate = Used to store dates. In XML, the <date> tag is used to mark ISO formatted dates (eg 2013-11-17T20:10:06Z).
CFBoolean = Used to store true/false values. In XML, these correspond to <true/> or <false/> tags.
CFData = Used to store binary data. In XML, the <data> tag marks base64 encoded binary data.
CFArray = Used to group a list of values. In XML, the <array> tag is used to mark the grouping.
CFDictionary = Used to store sets of data values keyed by name. Typically data is grouped into dictionaries as a <key> element followed by its value element. The <key> fields use name strings. The value elements are typically one of the following - <string>, <real>, <integer>, <date>, <true/>, <false/>, <data>. The order of key declaration is not significant. In XML, the <dict> tag is used to mark the dictionary boundaries.
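To see the type-to-tag mapping above in action, here is a minimal Python sketch (assuming Python 3.4+ and its native "plistlib"). The record and all of its values are made up for illustration only; they are not taken from the test plists used later in this post.

```python
import datetime
import plistlib

# Hypothetical record: one value for each plist data type listed above.
record = {
    "Name": "Rocky",                                                  # <string>
    "Weight": 2.5,                                                    # <real>
    "Year Of Birth": 1959,                                            # <integer>
    "Info Expiry Date": datetime.datetime(2013, 11, 17, 20, 10, 6),   # <date>
    "Flight Capable": True,                                           # <true/>
    "DNA": b"rocky-dna",                                              # <data> (base64 in XML)
    "Aliases": ["Rocky the Flying Squirrel"],                         # <array>
}

# The top level dict becomes the <dict> element of the XML plist.
xml_text = plistlib.dumps(record, fmt=plistlib.FMT_XML).decode("utf-8")
print(xml_text)
```

Printing "xml_text" shows each Python value wrapped in the corresponding XML tag, with the bytes value base64 encoded inside its <data> tag.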

To show how it all fits together, let's take a look at an XML plist example featuring everyone's favourite TV squirrel ...

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">
    <string>Rocket J. Squirrel</string>
        <string>Rocky the Flying Squirrel</string>
    <key>City of Birth</key>
    <string>Frostbite Falls</string>
    <key>Year Of Birth</key>
    <key>Flight Capable</key>

Note: The DNA <data> field "cm9ja3ktZG5hCg==" is the base64 encoding of "rocky-dna" followed by a newline character.
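A <data> value can be checked by hand with Python's standard "base64" module. This short sketch decodes the DNA value quoted above; the decoded bytes are the ASCII text "rocky-dna" plus a trailing newline.

```python
import base64

encoded = "cm9ja3ktZG5hCg=="          # DNA <data> value from the XML plist
decoded = base64.b64decode(encoded)   # raw bytes stored in the field
print(decoded)                        # b'rocky-dna\n'
```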

We can cut and paste the above XML into plist Editor Pro and save it as a binary plist.
We can also open a new text file and paste the above XML into it to create an XML plist.
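If you don't have plist Editor Pro handy, a rough alternative (not the method used for this post) is to let Python 3.4+ "plistlib" write both flavours. The sketch below assumes a made-up two field record and writes it to a temporary directory; the filenames are only examples.

```python
import os
import plistlib
import tempfile

record = {"Name": "Rocky", "DNA": b"rocky-dna"}   # hypothetical test values

out_dir = tempfile.mkdtemp()
xml_path = os.path.join(out_dir, "rocky-xml.plist")
bin_path = os.path.join(out_dir, "rocky.plist")

# Write the same data as an XML plist and as a binary plist.
with open(xml_path, "wb") as fp:
    plistlib.dump(record, fp, fmt=plistlib.FMT_XML)
with open(bin_path, "wb") as fp:
    plistlib.dump(record, fp, fmt=plistlib.FMT_BINARY)

# Binary plists start with the magic bytes "bplist00", XML plists with "<?xml".
with open(bin_path, "rb") as fp:
    bin_header = fp.read(8)
with open(xml_path, "rb") as fp:
    xml_header = fp.read(5)

# plistlib.load() works out the format itself, so both files load the same way.
with open(xml_path, "rb") as fp:
    from_xml = plistlib.load(fp)
with open(bin_path, "rb") as fp:
    from_bin = plistlib.load(fp)
```

Both loads return the same Python dictionary, which is what makes a single extraction script for both plist types practical.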

Further Resources

The Mac Developer Library documentation describes Plists here and the Apple manual page describes XML plists here.

Michael Harrington has a great working example / explanation of the binary plist file format here and here.

Setting Up

Using the binary capable "plistlib" requires Python v3.4+. So if you don't have it installed, you're gonna be disappointed. Note: Ubuntu 14.04 has Python 3.4 already installed so if you're already running that, you don't have to worry about all this setup stuff.

To install Python 3.4 on Ubuntu 12.04 LTS (eg like SANS SIFT 3), there's a couple of methods.
I used this guide from James Nicholson to install the 3.4.0beta source onto my development VM.
FYI 3.4.1 is currently the latest stable release and should be able to be installed in a similar manner.

There's also this method that uses an Ubuntu Personal Package Archive from Felix Krull.
But Felix makes no guarantees, so I thought it'd be better to install from source.

Alternatively, you can install Python 3.4.1 on Windows (or for OS X) from here.

Not having a Mac or iPhone, monkey created his own binary and XML plist files. First, we define/save the new binary plist file using plist Editor Pro (v2.1 for Windows), then we copy/paste the XML into a new text file on our Ubuntu development VM and save it. This way we can have both binary and XML versions of our plist information. Note: Binary plists created by plist Editor Pro in Windows were read OK by the script in Ubuntu. However, Windows created XML plists proved troublesome (possibly due to Windows carriage returns/linefeeds?) - hence the cut and paste from the XML in plist Editor Pro to the Ubuntu text editor for saving.

For squirrels and giggles, we'll continue to base our test data on characters from the Rocky and Bullwinkle Show. For those that aren't familiar with the squirrel and moose, commiserations and see here.

The Script

For each file in the specified input directory (or just for an individual file), the script calls the "plistlib.load()" function.
This does the heavy lifting and returns the "unpacked root object" (usually a dictionary).
The script then calls a recursive "print_object" function (modified/re-used from here) to go into each/any sub-object of the root object and store the filename, plist path and plist value in the "rowdata" list variable.
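The recursive walk described above can be sketched roughly as follows. This is not the script's actual "print_object" code: the function name "flatten", the "/key" and "[index]" path notation and the hex/ISO text conversions are all assumptions made for illustration.

```python
import binascii
import datetime
import plistlib

def flatten(obj, path, filename, rows):
    """Append a (filename, path, value) tuple for every leaf value under obj."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            flatten(value, path + "/" + str(key), filename, rows)
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            flatten(value, "{}[{}]".format(path, index), filename, rows)
    else:
        if isinstance(obj, bytes):
            text = binascii.hexlify(obj).decode("ascii")   # <data> shown as hex
        elif isinstance(obj, datetime.datetime):
            text = obj.isoformat()                          # ISO text sorts well
        else:
            text = str(obj)
        rows.append((filename, path, text))

# Small in-memory plist standing in for a file read via plistlib.load().
sample = plistlib.dumps({"Name": "Rocky", "Aliases": ["A", "B"], "DNA": b"\x01\x02"})
rowdata = []
flatten(plistlib.loads(sample), "", "sample.plist", rowdata)
for row in sorted(rowdata):
    print(row)
```

Each row carries the source filename, so values from many plists can later share one table.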

Once all plist objects have been processed, the script creates a new database using the specified output filename and SQL "replaces" the extracted "rowdata" into a "plists" table. We use SQL "replace" instead of SQL "insert" so we don't get "insert" errors when running the script multiple times using the same source data and target database file. Although to be prudent, it's just as easy to define a different output database name each time ... meh.
The "plists" table schema looks like:

CREATE TABLE plists(filename TEXT NOT NULL, name TEXT NOT NULL, value TEXT NOT NULL, PRIMARY KEY (filename, name, value) )

Note: The "plists" table uses the combination of filename + name + value as a Primary Key. This should make it impossible to have duplicate entries.
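Here is a small sketch of why SQL "replace" plus the three column Primary Key keeps the table free of duplicates. It uses Python's native "sqlite3" module with an in-memory database (the real script writes to the output file instead), and the two rows and their "/..." name format are illustrative only.

```python
import sqlite3

con = sqlite3.connect(":memory:")   # stand-in for the output database file
con.execute(
    "CREATE TABLE IF NOT EXISTS plists("
    "filename TEXT NOT NULL, name TEXT NOT NULL, value TEXT NOT NULL, "
    "PRIMARY KEY (filename, name, value))"
)

rowdata = [
    ("rocky.plist", "/Name", "Rocket J. Squirrel"),
    ("rocky.plist", "/City of Birth", "Frostbite Falls"),
]

# Simulate running the extraction twice against the same source data.
for _ in range(2):
    con.executemany("REPLACE INTO plists VALUES (?, ?, ?)", rowdata)
    con.commit()

row_count = con.execute("SELECT COUNT(*) FROM plists").fetchone()[0]
print(row_count)
```

With a plain "INSERT" the second pass would raise an integrity error on the Primary Key; with "REPLACE" the existing row is simply overwritten, so the count stays at 2.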

See comments in code for further details.


To run the script we just point it at a directory or individual plist and give it a filename for the output SQLite database.
Here we are using the python3.4 beta exe from my Ubuntu development VM's locally installed directory ...

cheeky@ubuntu:~/python3.4b/bin$ ./python3.4 /home/cheeky/
Running v2014-07-24

Usage: -f plist -d database

  -h, --help   show this help message and exit
  -f FILENAME  XML/Binary Plist file or directory containing Plists
  -d DBASE     SQLite database to extract Plist data to

Here's how the test data was stored ...

cheeky@ubuntu:~/python3.4b/bin$ tree /home/cheeky/test-plists/
+-- bin-plists
¦   +-- boris.plist
¦   +-- bullwinkle.plist
¦   +-- natasha.plist
¦   +-- rocky.plist
+-- Red-Herring.txt
+-- xml-plists
    +-- boris-xml.plist
    +-- bullwinkle-xml.plist
    +-- natasha-xml.plist
    +-- rocky-xml.plist

2 directories, 9 files

Note: "Red-Herring.txt" is a text file included to show how non-plist files are handled by the script.

Now we can try our extraction script with our test data ...

cheeky@ubuntu:~/python3.4b/bin$ ./python3.4 /home/cheeky/ -f /home/cheeky/test-plists/ -d /home/cheeky/bullwinkles.sqlite
Running v2014-07-24

*** WARNING /home/cheeky/test-plists/Red-Herring.txt is not a valid Plist!
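One way to produce a warning like the one above is to wrap "plistlib.load()" in a try/except. This is a sketch only and not necessarily how the script does it: the helper name "try_load" is made up, and it assumes that unrecognised files raise "plistlib.InvalidFileException" while malformed XML raises an expat error.

```python
import os
import plistlib
import tempfile
from xml.parsers.expat import ExpatError

def try_load(path):
    """Return the plist root object, or None if the file is not a valid plist."""
    try:
        with open(path, "rb") as fp:
            return plistlib.load(fp)
    except (plistlib.InvalidFileException, ExpatError, ValueError):
        print("*** WARNING " + path + " is not a valid Plist!")
        return None

# Build one valid plist and one plain text file in a temporary directory.
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "rocky.plist")
bad_path = os.path.join(tmp_dir, "Red-Herring.txt")
with open(good_path, "wb") as fp:
    plistlib.dump({"Name": "Rocky"}, fp)
with open(bad_path, "w") as fp:
    fp.write("Nothing to see here\n")

good_root = try_load(good_path)
bad_root = try_load(bad_path)
```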


Here is a screenshot of the resultant "bullwinkles.sqlite" database ...

Test Data Output Database

Note: The XML plist DNA <data> fields shown have been extracted and base64 *decoded* automatically by the "plistlib" library. Our test data binary plists store the raw ASCII values we entered and the XML plists store the base64 encoded values. Being text based, I can understand why XML encodes binary data as base64 (so it's printable). But binary plists don't have the printable requirement so there's no base64 encoding/decoding step and the raw binary values are written directly to the binary plist file.

By having the raw hexadecimal values from the <data> fields in the DB, we can cut and paste these <data> fields into a hex editor to see if there's any printable characters ...

Binary rocky.plist's DNA data value

From the previous 2 pictures, we can see that the "DNA" value from our binary "rocky.plist" is actually UTF-8/ASCII for "rocky-dna".

One nifty feature of plist Editor Pro is that from the "List view" tab, you can double click on a binary value represented by a "..." and it opens the data in a hex editor window. This binary inspection would be handy when looking at proprietary encoded data fields (eg MS Office FileAlias values). Or we could just run our script as above and cut and paste any <data> fields to a hex editor ...

From our results above, we can also see that the "Red-Herring.txt" file was correctly ignored by the script and that a total of 66 fields were extracted from our binary and XML plists (as expected).

Now we have a database, we can start SQL querying it for values ...
As the "name" and "value" columns are currently defined as text types, limited sorting functionality is available.

Here are a few simple queries for our test data scenario. Because our test plists are not actual OS X / iOS plists, you'll have to use your imagination/your own test data to come up with other queries that you might find useful/practical. More info on forming SQLite queries is available here.

Find distinct "Aliases"
SELECT DISTINCT value FROM plists
WHERE name LIKE '%Alias%';

Find all the values from the "rocky-xml.plist"
SELECT * FROM plists
WHERE filename LIKE '%rocky-xml.plist';

Find/sort records based on "Weight" value
SELECT * FROM plists
WHERE name LIKE '%Weight' ORDER BY value;

Note: Sort is performed textually as the value column is TEXT.
So the results will be ordered like 125.0, 125.0, 2.5, 2.5, 53.5, 53.5, 65.7, 65.7.
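If a numeric sort is wanted without changing the table schema, one possible workaround (not part of the script) is to cast the text to a number in the query. The sketch below uses an in-memory database loaded with the four distinct weight values listed above; the filenames are placeholders.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE plists(filename TEXT NOT NULL, name TEXT NOT NULL, "
    "value TEXT NOT NULL, PRIMARY KEY (filename, name, value))"
)
weights = ["125.0", "2.5", "53.5", "65.7"]
con.executemany(
    "INSERT INTO plists VALUES (?, 'Weight', ?)",
    [("file{}.plist".format(i), w) for i, w in enumerate(weights)],
)

# Textual sort: compares character by character, so "125.0" comes before "2.5".
text_order = [r[0] for r in con.execute(
    "SELECT value FROM plists WHERE name LIKE '%Weight' ORDER BY value")]

# Numeric sort: CAST converts the stored text to a floating point value first.
numeric_order = [r[0] for r in con.execute(
    "SELECT value FROM plists WHERE name LIKE '%Weight' ORDER BY CAST(value AS REAL)")]

print(text_order)
print(numeric_order)
```

The same "ORDER BY CAST(value AS REAL)" clause can be pasted into any SQLite browser query.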

Find/sort records by "Info Expiry Date" value
SELECT * FROM plists
WHERE name LIKE '%Info Expiry Date' ORDER BY value;

Note: This works as expected as the date text is an ISO formatted text string.

The script has been developed/tested on Ubuntu 12.04 LTS (64bit) with Python 3.4.0beta.
It was also tested (not shown) with OS X MS Office binary plists, a Time Machine binary backup plist and a cups.printer XML plist.

Additionally, the script has been run with the bullwinkle test data on Win7 Pro (64 bit) with Python 3.4.1 and on a Win 8.1 Enterprise Evaluation (64 bit) VM with Python 3.4.1

Final Words

The idea was to write a script that grabs as much plist data as it can and leave it to the analyst to formulate their own queries for finding the data they consider important.
The script also allowed monkey to sharpen his knowledge on how plists are structured and granted some valuable Python wrestling time (no, not like that!).
By re-using a bunch of existing Python libraries/code, the script didn't take much time (or lines of code) to put together.
The native Python "plistlib" also allows us to execute on any system installed with Python 3.4 (OS X, Windows, Linux) without having to install any 3rd party libraries/packages.
I have not been able to run/test it on a complete OS X system (or on iOS plist files) but in theory it *should* work (wink, wink). I am kinda curious to see how many plists/directories it can process and how long it takes. The bullwinkle test data took less than a second to execute.

Depending on what artefacts you're looking for, you can use the script as an application artefact browsing tool or by using the same queries on data from different sources, you could use it to detect known keywords/values (eg IP theft email addresses, app configuration). Or perhaps you have a bunch of application directories from an iOS device that you're curious about. Rather than having to inspect each plist individually, you can run this script once and snoop away.

The sorting could be made more comprehensive if each data type was extracted to its own table (ie floats in one table, ints in another). However, given that sorting by time currently works already, that additional functionality might not be much use?

If anyone uses this script, I'd appreciate hearing any feedback either via the comments section below or via email. It'd be nice to know if this script is actually of some use and not just another theoretical tool LOL.

Thursday 17 July 2014

Android Has Some Words With Monkey

Be advised ... Here thar be Squirrels!

The recent NIST Mobile Forensics Webcast and SANS FOR585 poster got monkey thinking about using the Android emulator for application artefact research. By using an emulator, we don't need to "root" an Android device in order to access artefacts from the protected data storage area (eg "/data/data/"). As an added bonus, the emulator comes as part of the FREE Android Software Development Kit (SDK). Hopefully, this post will help encourage further forensic research/scripts for Android based apps.
So now we just need a target app to investigate ... "Words With Friends" (WWF) is a popular scrabble type game with chat functionality. For this post, we'll be focusing on using an Android emulator to retrieve in-game chat artefacts and then create a script to parse them. It's a fairly long post so you might want to take that potty break now before we begin ;)

0. Installation / Setup

On an Ubuntu 12.04 LTS (32 bit) Virtual Machine (using VMware Player), I installed the following:
- Android SDK bundle including the "eclipse" IDE
- dex2jar tool to convert .dex byte code into the .jar Java archive format
- JD-GUI Java decompiler to display the source code from a .jar file

Android has an official SDK install guide here which also continues on here.
Installation was as simple as unzipping the downloaded archives and launching the relevant executable.
Lazy monkey just unzipped the archives to his home directory (ie "/home/cheeky/").
Here's a quick guide:
- Download the 32 bit linux ADT bundle (includes both the eclipse IDE and Android SDK tools)
- Double click the zip file and use the Ubuntu Archive Manager to unzip the bundle to "/home/*username*" (eg unzips to "/home/cheeky/adt-bundle-linux-x86-20140702/")
- Use the Nautilus File Explorer GUI to navigate to the eclipse sub-directory (eg "/home/*username*/eclipse/")
- Double click on "eclipse" icon to launch it (or you could launch it from the command line via "/home/*username*/eclipse/eclipse")
- Go to the "Window" ... "Android SDK Manager" drop menu item and launch it. Some packages are installed by default but if you want to run an emulator with a specific/previous version of Android you need to download/install that specific SDK Platform (eg 4.2.2 SDK platform) and a corresponding Hardware System Image (eg ARM for a Nexus 7 tablet).
- Unzip the downloaded dex2jar zip file contents to "/home/*username*" (eg "/home/cheeky/dex2jar-")
- Unzip the downloaded jd-gui zip file to "/home/*username*" (eg "/home/cheeky/"). Note: we only need to extract the "jd-gui" exe.

Also installed was the Bless Hex Editor (via Ubuntu Software Center) and the Firefox SQLite Manager extension (via the Firefox Add-ons Manager).

To make things a bit easier, I also set up a soft link (ie alias) so we can just type "adb" without the preceding path info to launch the Android Debug Bridge.
cheeky@ubuntu:~$ sudo ln -s /home/cheeky/adt-bundle-linux-x86-20140702/sdk/platform-tools/adb /usr/bin/adb

1. Getting the .apk app install file

Android .apk install files are zip archives. You can download them from the GooglePlay store by using a Chrome plugin or via the apk-downloader website. For this experiment however, I wanted to test the specific version from my Nexus 7 tablet (WWF 7.1.4), so I decided to use the Android Debug Bridge (adb) method.
Excellent adb instructions are available from the official Android Dev site here.

To prepare my Nexus 7 (1st gen c.2012) for the .apk file transfer, I attached it to my PC via USB cable. I then enabled the tablet's "Developer options" from the "Settings" menu by tapping the "About tablet" ... "Build number" several times. Next, I went into "Developer options" and enabled the "USB Debugging" and "Stay awake" options.

From our Ubuntu VM we can now check for connected devices/emulators ...
cheeky@ubuntu:~$ adb devices
List of devices attached
*serialnumber_of_device*    device


Note: I have redacted the serial number of my Nexus 7. Just imagine a 16 digit hex value in place of *serialnumber_of_device* ...

So now we know that the adb has recognized our physical device, let's try connecting to it ...
cheeky@ubuntu:~$ adb -s *serialnumber_of_device* shell

For squirrels and giggles, let's try to list the files in the protected "/data/data/" directory ...
127|shell@grouper:/ $ ls /data/data
opendir failed, Permission denied
1|shell@grouper:/ $

We also can't do a directory listing of "/data/app/" (where the .apk install files are located) ...
shell@grouper:/ $ ls /data/app
opendir failed, Permission denied
1|shell@grouper:/ $

Fortunately, we CAN list the installed 3rd party packages and associated .apk file by typing "pm list packages -f -3"
1|shell@grouper:/ $ pm list packages -f -3
shell@grouper:/ $

At this point, we type "exit" to logout from the physical device.
From the output of the "pm list packages -f -3" command, we know that the WWF .apk file is "/data/app/com.zynga.words-1.apk".
So we can use the "adb pull" command to copy it to our local Ubuntu VM.
cheeky@ubuntu:~$ adb -s *serialnumber_of_device* pull /data/app/com.zynga.words-1.apk wwf.apk
1252 KB/s (23648401 bytes in 18.442s)

The above command copies "/data/app/com.zynga.words-1.apk" to the current directory and names it as "wwf.apk".
I didn't feel like typing the whole long .apk filename each time (lazy monkey!) so just called it "wwf.apk".
Anyhow, a copy of the WWF apk is now stored as "/home/cheeky/wwf.apk".

2. Create/Launch emulator

Now we create an Android emulator and fire it up ...
The Android website has some detailed instructions about creating/running the emulator here, here and here.
Here's the quick version ...
- Assuming you still have the eclipse IDE open, go to the "Window" ... "Android Virtual Device (AVD) Manager" menu item and create a new device similar to the following:

Test emulator device specs
Note: Be sure to tick the "Use Host GPU" checkbox to improve emulator speed. It also helps to ensure your VM has plenty of RAM.
- Start the device emulator by selecting the "testtab" device AVD and clicking "start". Alternatively, you can launch the AVD Manager GUI from the command line instead of via eclipse ...
cheeky@ubuntu:~$ /home/cheeky/adt-bundle-linux-x86-20140702/sdk/tools/android avd

The emulator can take a minute to boot but eventually you should see something like this:

Emulator at startup

Now we can see if our emulator is recognized by typing "adb devices". Note: Our physical Nexus 7 device has been disconnected from the PC so it doesn't appear now.
cheeky@ubuntu:~$ adb devices
List of devices attached
emulator-5554    device


Update 2014-07-26:
To research Google product artefacts such as GoogleMaps and Hangouts, you can use an emulator with a Google APIs target set (eg "Google APIs - API Level 19") instead of an Android target as shown previously.

According to the official documentation, Google Play services can only be installed on an emulator with an AVD that runs a Google APIs platform based on Android 4.2.2 or higher. To be able to use a Google API in the emulator, you must also first install the target Google API system image (eg "Google APIs - API Level 19") from the SDK Manager.

3. Installing an .apk on the emulator

To install the "wwf.apk" we previously pulled off our Nexus 7 device, we type the following:
cheeky@ubuntu:~$ adb -s emulator-5554 install wwf.apk
2929 KB/s (23648401 bytes in 7.883s)
    pkg: /data/local/tmp/wwf.apk

There should now be a WWF icon in the emulator's "App" screen which we can launch by clicking on it.

WWF is now installed!

We should now see a login screen for WWF where we can provide an email address and start playing games / chatting with others (ie create lots of juicy artefacts!).
By default, the emulator retains data between emulator launches so you shouldn't lose much/any app data if/when the emulator closes (eg after a crash).

4. Capture artefact data (via adb pull and DDMS)

For squirrels and giggles, let's connect to the emulator and see what our access privileges are (remembering that they were limited on the Nexus 7 device) ...
cheeky@ubuntu:~$ adb -s emulator-5554 shell
root@android:/ #

Now let's try viewing the "/data/data/" directory ...
root@android:/ # ls /data/data
root@android:/ #

Note: We are logged into our emulator as "root" so that's why we can now see the contents of "/data/data/" :)
Let's double-check that WWF was installed OK ...

root@android:/ # pm list packages -f -3
root@android:/ #

Now we can search for WWF chat artefacts in "/data/data/" ...
Note: Because the Nexus 7 doesn't have a removable SD card, we don't have to worry about checking "/mnt/sdcard/" for app artefacts. But just keep it in mind for other devices which may support app data storage on the SD card.

Let's do a file listing of the WWF directory ...
root@android:/ # ls /data/data/com.zynga.words/                               
root@android:/ #

Looking closer at the "databases" directory ...
root@android:/ # ls /data/data/com.zynga.words/databases/                     
root@android:/ #

Hmmm ... it looks like this could contain some interesting info.

Update 2014-07-26:
By using the "lsof" command on the emulator whilst our target app is running, we can see a list of currently open files.
This can then be used to locate application artefact files (eg databases). For example, we can type:
lsof | grep com.zynga.words
which should also lead us to various files open in "/data/data/com.zynga.words/databases/".

We type "exit" to logout for now.
Now we can "pull" these files of interest from the emulator for further analysis. For example, I'm now going to skip ahead and pull the file where I eventually found the WWF chat artefacts ...
cheeky@ubuntu:~$ adb -s emulator-5554 pull /data/data/com.zynga.words/databases/WordsFramework
1111 KB/s (159744 bytes in 0.140s)
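Once a file has been pulled, a quick first look can be scripted. The sketch below assumes the pulled file is an SQLite database and simply lists its tables with Python's native "sqlite3" module. Because the real "WordsFramework" schema isn't shown here, the demo builds a stand-in file with a made-up table name ("chat_messages_demo").

```python
import os
import sqlite3
import tempfile

def list_tables(db_path):
    """Return the names of all tables in an SQLite database file."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]

# Stand-in for the pulled file; the table name is hypothetical.
db_path = os.path.join(tempfile.mkdtemp(), "WordsFramework")
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE chat_messages_demo(id INTEGER PRIMARY KEY, message TEXT)")
con.commit()
con.close()

tables = list_tables(db_path)
print(tables)
```

For a real pulled file, only the "list_tables" call is needed; work on a copy so the original evidence file is not modified.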

Alternatively, we can use the eclipse IDE and the Dalvik Debug Monitor Server (DDMS) tool to "pull" files. If our emulator is running (first, you better go catch it!), DDMS should connect to it automagically.
To launch the DDMS tool, use the following eclipse drop down menu - "Window" ... "Open Perspective" ... "DDMS"
The DDMS tool allows us to do a bunch of cool things such as:
- pull/push files to the emulator
- dump process heap memory
- spoof phone calls (logs connections only / not capable of voice transmission/reception)
- send SMS text messages to the emulator
- set the GPS location of the phone
More information on DDMS is available here.

OK you should see the emulator on the LHS under "Devices" and the "File Explorer" tab on the RHS.

Dalvik Debug Monitor Server running in eclipse

Under the "File Explorer" Tab, browse to "/data/data/com.zynga.words/databases"
Then select the "WordsFramework" file and click the floppy disk icon to "pull" the file onto your Ubuntu box.

For squirrels and giggles, we can also dump the WWF process heap memory and later search it for interesting strings.
To do this, on the LHS, select the "com.zynga.words" process, toggle the "Update Heap" button (making DDMS continuously monitor the heap) and then click on the "Dump HPROF file" button (looks like a cylinder with a red arrow pointing down).

Dumping the process heap memory via DDMS

Next select the "Leak Suspects Report" and wait ... FYI we're not interested in the report as much as the accompanying HPROF dump.
After a while, the title bar changes to reflect that the .hprof is stored with a ridiculously long numeric filename in "/tmp/".

Obtaining the HPROF file

We can now run "strings" against this file and review ...
For example, I checked the validity of the word "twerk" using the emulator's WWF "Word Check" and then dumped the process heap. I then ran the "strings" command and piped the output to a separate text file ("8058l.txt") for easier analysis/searching.
cheeky@ubuntu:~$ strings --encoding=l /tmp/android1924397721207258058.hprof > 8058l.txt
From the text file output, I was able to observe a bunch of little endian UTF-16 strings (possibly a dictionary update) contained in memory. Opening the .hprof file separately in a hex editor also confirms this.
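For anyone without the "strings" utility, the little endian UTF-16 search can be approximated in Python. This is a simplified sketch, not a full replacement: it only finds runs of printable ASCII characters stored as 16 bit little endian values, and the function name and test bytes are made up.

```python
import re

def utf16le_strings(data, min_chars=4):
    """Return runs of at least min_chars printable ASCII chars stored as UTF-16LE."""
    # Each character is one printable byte (0x20-0x7e) followed by a zero byte.
    pattern = re.compile(
        (r"(?:[\x20-\x7e]\x00){%d,}" % min_chars).encode("ascii")
    )
    return [match.group().decode("utf-16-le") for match in pattern.finditer(data)]

# Made-up test bytes: binary junk, the word "twerk" in UTF-16LE, more junk,
# then a run that is too short to be reported.
blob = (b"\x01\x02" + "twerk".encode("utf-16-le")
        + b"\xff\xff" + "ab".encode("utf-16-le"))
found = utf16le_strings(blob)
print(found)
```

To run it against a real dump, read the .hprof file in binary mode and pass the bytes to "utf16le_strings".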

Viewing the HPROF in a Hex Editor shows something squirrelly ...

Entering some selected words from this list into the WWF "Word Check" confirmed that they are acceptable words.
I think we just found us a squirrel! Who's a good monkey eh? Purr ...

DDMS also has a "Network Statistics" tab view available but this was not working with my emulator. It apparently can be hit and miss according to this Google Android Issue Tracker notice.
Anyway, there's more than one way to get network stats ...
First, we login to the emulator (without WWF running) and take a baseline list of network connections via "netstat"
cheeky@ubuntu:~$ adb -s emulator-5554 shell
root@android:/ #
root@android:/ # netstat
Proto Recv-Q Send-Q Local Address          Foreign Address        State
 tcp       0      0*              LISTEN
 tcp       0      0 *              LISTEN
 tcp       0      0         ESTABLISHED
tcp6       0      1 ::ffff: ::ffff: CLOSE_WAIT
root@android:/ # 

According to our emulator's "Settings" ... "About Phone" ... "Status" our emulator's IP address is which we can see under the "Local Address" columns.
Now we launch WWF and then take another "netstat" ...
root@android:/ # netstat                                                      
Proto Recv-Q Send-Q Local Address          Foreign Address        State
 tcp       0      0*              LISTEN
 tcp       0      0 *              LISTEN
 tcp       0      0      ESTABLISHED
 tcp       0      0     ESTABLISHED
 tcp       0      0     ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp     256      0         ESTABLISHED
 udp       0      0*              CLOSE
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      1 ::ffff: ::ffff: CLOSE_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
root@android:/ # 

Lastly, we start an in-game chat, send a message and take a "netstat" ...
root@android:/ # netstat                                                      
Proto Recv-Q Send-Q Local Address          Foreign Address        State
 tcp       0      0*              LISTEN
 tcp       0      0 *              LISTEN
 tcp       0      0      ESTABLISHED
 tcp       0      0     ESTABLISHED
 tcp       0      0     ESTABLISHED
 tcp       0      0       TIME_WAIT
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0      TIME_WAIT
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp       0      0         ESTABLISHED
 tcp       0      0        ESTABLISHED
 tcp       0      0      ESTABLISHED
 tcp     103      0         ESTABLISHED
 tcp       0      0      ESTABLISHED
 udp       0      0*              CLOSE
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      1 ::ffff: ::ffff: CLOSE_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
tcp6       0      0 ::ffff: ::ffff: TIME_WAIT
root@android:/ #

Obviously ESTABLISHED means there's a connection between our local host and a remote host.
TIME_WAIT means there WAS a connection but it is in the process of being closed. Thus you can also sometimes see previous TCP connections.
These netstat listings are snapshots at a particular time so you have to be quick and/or take several samples.
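Comparing several snapshots by eye gets tedious, so here is a sketch of diffing two saved "netstat" outputs against a baseline. The addresses below are placeholders from the reserved documentation ranges, not the ones observed on the emulator, and the parser assumes the six column layout shown in the listings above.

```python
def connections(netstat_output):
    """Parse netstat text into a set of (proto, foreign_address, state) tuples."""
    found = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows have exactly 6 columns; the header row splits into more.
        if len(fields) == 6 and fields[0].startswith(("tcp", "udp")):
            proto, _recv_q, _send_q, _local, foreign, state = fields
            found.add((proto, foreign, state))
    return found

# Hypothetical snapshots (placeholder addresses only).
baseline = """\
Proto Recv-Q Send-Q Local Address          Foreign Address        State
 tcp       0      0 127.0.0.1:5037         0.0.0.0:*              LISTEN
"""
after_launch = """\
Proto Recv-Q Send-Q Local Address          Foreign Address        State
 tcp       0      0 127.0.0.1:5037         0.0.0.0:*              LISTEN
 tcp       0      0 203.0.113.5:40001      192.0.2.10:443         ESTABLISHED
 tcp       0      0 203.0.113.5:40002      198.51.100.7:80        TIME_WAIT
"""

new_connections = connections(after_launch) - connections(baseline)
for entry in sorted(new_connections):
    print(entry)
```

The set difference leaves only the connections that appeared after the baseline was taken.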

Now we can do a "whois" lookup on these IP addresses and find out a bit more info. Admittedly, these connections could be from ANY process on the emulator but as WWF is the only installed app, it's likely that it's the process responsible for these communications.
Without getting into too much detail, some of these remote IP addresses resolved back to companies such as Google, / Twitter, Yahoo, Amazon, Facebook, Softlayer Technologies and Akamai.

You could also fire up Wireshark or tcpdump and capture packets for analysis but depending on your jurisdiction this may be illegal (ie intercepting communications without consent). Anyhow, monkey won't be doing that - his delicate simian features wouldn't survive prison!

To use the DDMS Call / SMS / GPS functionality ...
Select "Window" ... "Show View" ... "Other" ... and then start typing "Emulator Control".
This should bring up a new tab where you can send SMS / simulate a voice call / set the emulator's GPS location.
Making voice call logs and sending SMS worked well with my emulator. However, I have not been able to verify that the GPS functionality works. Launching Googlemaps in the emulator's web browser and clicking on the crosshairs icon results in Googlemaps reporting "Your location could not be determined" ... *shrug*
FYI A couple of times, using the emulator control functionality also made the emulator laggy / caused a crash.

DDMS also allows users to grab screenshots from the emulator via the camera button.
However, we can also do screenshots from our emulator window via the Alt-PrintScreen button combo in a VMware environment.

5. Viewing the source code

- Open the .apk file using Ubuntu Archive Manager (or similar unzip app)
- Extract "classes.dex" (to minimize confusion I extracted it as "words-classes.dex" to "/home/cheeky/")
- Run "d2j-dex2jar.jar" from the install directory (eg "/home/cheeky/dex2jar-")
cheeky@ubuntu:~/dex2jar-$ ./ /home/cheeky/words-classes.dex
dex2jar /home/cheeky/words-classes.dex -> words-classes-dex2jar.jar

There will now be a "words-classes-dex2jar.jar" in the current directory (eg "/home/cheeky/dex2jar-")
- Run JD-GUI (either from command line or by double-clicking the extracted exe) and open the "words-classes-dex2jar.jar" file.
Be aware that not all of the code may be easily understandable due to obfuscation (eg class "a" has non-descriptive method/function names such as "a", "b" etc)
Anyhoo, now you can have fun basking in all things Java ... smells wonderful huh? ;)

6. Other .apk activities

We can use the "aapt" SDK tool to determine the Android permissions for the "wwf.apk" we pulled earlier ...
Note: the "aapt" tool is installed by default with the bundle and located in the latest android version sub-directory (android-4.4W in our case).
cheeky@ubuntu:~$ /home/cheeky/adt-bundle-linux-x86-20140702/sdk/build-tools/android-4.4W/aapt dump permissions wwf.apk
package: com.zynga.words
uses-permission: android.permission.INTERNET
uses-permission: android.permission.ACCESS_WIFI_STATE
uses-permission: android.permission.ACCESS_NETWORK_STATE
uses-permission: android.permission.READ_PHONE_STATE
uses-permission: android.permission.READ_CONTACTS
uses-permission: android.permission.SEND_SMS
uses-permission: android.permission.WRITE_EXTERNAL_STORAGE
uses-permission: android.permission.VIBRATE
uses-permission: android.permission.RECEIVE_BOOT_COMPLETED
uses-permission: android.permission.GET_ACCOUNTS
uses-permission: android.permission.AUTHENTICATE_ACCOUNTS
uses-permission: android.permission.USE_CREDENTIALS
uses-permission: android.permission.NFC
uses-permission: android.permission.ACCESS_FINE_LOCATION
permission: com.zynga.words.permission.C2D_MESSAGE
uses-permission: com.zynga.words.permission.C2D_MESSAGE

For a listing of possible Android permissions, see here.
For more details on the "aapt" tool, see here.
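
If you want to compare permissions across several apps, the "aapt" output can be split up with a few lines of Python. This is only a rough sketch: it assumes the output looks exactly like the listing above (other aapt versions may format lines differently), and both the function name and the "WATCHLIST" set are made up for illustration. Which permissions are actually interesting is the analyst's call.

```python
# Hypothetical watchlist; adjust to whatever matters for your case.
WATCHLIST = {
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
    "android.permission.ACCESS_FINE_LOCATION",
}

def parse_aapt_permissions(text):
    """Split 'aapt dump permissions' text into package, requested and declared permissions."""
    result = {"package": None, "uses": [], "declares": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines or anything without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "package":
            result["package"] = value
        elif key == "uses-permission":
            result["uses"].append(value)
        elif key == "permission":
            result["declares"].append(value)
    return result

def flag_permissions(parsed, watchlist=WATCHLIST):
    """Return the requested permissions that also appear in the watchlist."""
    return sorted(set(parsed["uses"]) & set(watchlist))
```

Saving the aapt output to a text file, reading it in and passing the text to "parse_aapt_permissions" gives a dictionary that is easy to load into a table or diff against another app.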

It can also be handy to explore what other files are included in an .apk.
For example, the "res" directory holds resources such as pics, sounds.
See here and here for further details on the apk archive structure and the apk building process.
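
Because an .apk is a zip archive, Python's standard "zipfile" module can list or extract its contents without a GUI archive manager. The sketch below is a minimal example; the function names are hypothetical and it has not been run against the actual "wwf.apk".

```python
import zipfile

def list_apk(apk_path):
    """Return (name, uncompressed size) for every entry in the .apk."""
    with zipfile.ZipFile(apk_path) as apk:
        return [(info.filename, info.file_size) for info in apk.infolist()]

def extract_entry(apk_path, entry, out_path):
    """Copy a single entry (eg "classes.dex") out of the .apk to out_path."""
    with zipfile.ZipFile(apk_path) as apk, open(out_path, "wb") as out:
        out.write(apk.read(entry))
```

For example, calling extract_entry("wwf.apk", "classes.dex", "words-classes.dex") should produce the same file that was extracted by hand in the previous section.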

Opening the "wwf.apk" file using Ubuntu Archive Manager, we also note that under the "res/raw/" directory there exists the "dict" file - that sounds pretty squirrelly eh?
Unfortunately, opening it in a Hex editor shows that it's encoded somehow :'(
So no free WWF word list for you! Hey, it was worth a shot ...

7. Creating a Chat Extraction script

OK we're both losing the will to go on, so I'll start finishing up by mentioning the WWF chat artefacts and describing the accompanying extraction script.

Where are the chat artefacts stored?
Under "/data/data/com.zynga.words/databases/" there is a "WordsFramework" file.

I discovered this by "pulling" all the files from the "databases" directory and looking at them using the Firefox SQLite Manager or the Bless hex editor.
We can also use the Linux "file" command to figure out what type of file it is.
cheeky@ubuntu:~$ file WordsFramework
wwf/WordsFramework: SQLite 3.x database, user version 220
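
The same check can be done from Python, which is handy when sorting through a whole directory of pulled files. An SQLite 3 file starts with the 16 byte string "SQLite format 3" plus a null byte, and the user version is stored as a 4 byte big-endian integer at offset 60 of the 100 byte header. The sketch below relies only on those two facts; the function name is hypothetical.

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of an SQLite 3 file

def sqlite_user_version(path):
    """Return the user version if path looks like an SQLite 3 file, else None."""
    with open(path, "rb") as f:
        header = f.read(100)  # the database header is 100 bytes long
    if len(header) < 100 or not header.startswith(SQLITE_MAGIC):
        return None
    # user version: signed 32-bit big-endian integer at header offset 60
    return struct.unpack(">i", header[60:64])[0]
```

Running it over each file pulled from the "databases" directory should pick out the SQLite files, and for "WordsFramework" it should agree with the user version that "file" reported.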

Opening it up in Firefox SQLite Manager, we can see that there are 2 tables of interest - "users" and "chat_messages".
Here's a diagram showing how the 2 tables go together ... well, it's been abbreviated down to what monkey considers the important fields anyway ...

WWF chat schema

From our previous adventures in Python SQLite (eg Facebook Messenger post), we know how to query/extract this kind of stuff. Here's the query that gets us the chat artefacts ...
SELECT chat.chat_message_id, chat.game_id, chat.created_at,, chat.message, chat.user_id, users.email_address, users.phone_number, users.facebook_id, users.facebook_name, users.zynga_account_id
FROM chat_messages as chat, users
WHERE users.user_id = chat.user_id ORDER BY chat.created_at;

The script ("") opens the specified "WordsFramework" file, runs the above query and prints out the chat messages in chronological order.
If there are multiple game conversations going on, the analyst can (manually) use the "game_id" to filter out conversations from the TSV output.
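
For readers who just want the general shape of such a script, here is a minimal sketch of the query-then-TSV approach. It is not the script from Github: the function name and the optional "game_id" filter are my own additions, it is written for Python 3 rather than 2.7, and it uses only the column names that appear in the query above.

```python
import csv
import sqlite3

# Column names as they appear in the query above; nothing else is assumed about the schema.
COLUMNS = [
    "chat.chat_message_id", "chat.game_id", "chat.created_at", "chat.message",
    "chat.user_id", "users.email_address", "users.phone_number",
    "users.facebook_id", "users.facebook_name", "users.zynga_account_id",
]
QUERY = ("SELECT " + ", ".join(COLUMNS) +
         " FROM chat_messages AS chat, users"
         " WHERE users.user_id = chat.user_id ORDER BY chat.created_at;")

def export_chats(db_path, tsv_path, game_id=None):
    """Write chat rows to tsv_path (optionally for one game_id); return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(QUERY).fetchall()
    finally:
        conn.close()
    if game_id is not None:
        rows = [r for r in rows if r[1] == game_id]  # index 1 is chat.game_id
    with open(tsv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow([c.split(".")[1] for c in COLUMNS])  # header without table prefix
        writer.writerows(rows)
    return len(rows)
```

Passing a "game_id" does the filtering step automatically instead of leaving it to the spreadsheet. As always, work on a copy of the database rather than the original.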

It has been developed/tested for Python 2.7.3 on a 32-bit Ubuntu 12.04 LTS VM.
It can be downloaded from Github here.

Making the script executable (via "sudo chmod a+x") and running it with no arguments shows the help text.
cheeky@ubuntu:~/wwf$ ./
Running wwf-chat-parser v2014-07-11
Usage: -d wordsframework_db -o chat_output.tsv

  -h, --help    show this help message and exit
  -d FRAMEWKDB  WordsFramework database input file
  -o OUTPUTTSV  Chat output in Tab Separated format

Here's what the command line output looks like when using the "WordsFramework" file from our emulated chat ...
cheeky@ubuntu:~/wwf$ ./ -d /home/cheeky/WordsFramework -o wwf-output.tsv
Running wwf-chat-parser v2014-07-11

Extracted 9 chat records


With that many columns returned by the query, outputting to the command line just looked too crappy/confusing.
So instead, the script writes its output to a Tab Separated (TSV) file.
Here's what the TSV output file would look like if imported into a LibreOffice Calc spreadsheet ...

TSV output of ""

Some comments/observations:
- Data is fictional and is included just to illustrate which fields are extracted. Any id numbers and names have been changed to protect the Simians.
- The "created_at" times appear to be referenced to GMT (Don't trust me ... verify it for yourself).
- The "zynga_account" and "email_address" fields are only populated for the device owner (ie "emulator-monkey"). The opponent's corresponding details aren't populated.
- The "phone_number" field does not appear to be populated at all (FYI the emulator's "Settings" ... "About phone" ... "My phone number" is 1-555-521-5554).
- The highlighted rows 5-6 are from a chat between "emulator-monkey" and "3rd-party-monkey" and show how it is possible to use the "game_id" value to determine conversation threads.
- Monkey does not use Facebook so the Facebook ID and name are included for completeness but not tested. It is apparently possible to use your Facebook login to log in to WWF.
- This script relies on allocated chats (ie chats from active games). There might be chat strings still present in the "WordsFramework" database from previous completed/expired games but I haven't had time to research this area. Running "strings" on the WordsFramework file and/or opening it in a Hex editor might help in that regard.
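
On the last point, if the "strings" utility isn't handy, a rough Python equivalent is easy to put together. The sketch below only looks for runs of printable ASCII (4 or more characters, which is also the default minimum for GNU "strings"), so it will miss text stored in other encodings such as UTF-16. The function name is hypothetical and this has not been tested for recovering old WWF chats.

```python
import re

def printable_strings(path, min_len=4):
    """Yield (offset, text) for each run of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    with open(path, "rb") as f:
        data = f.read()  # reads the whole file; fine for a small database
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")
```

The offsets make it easier to jump to the same spot in a Hex editor and look at the surrounding bytes.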

8. Resources

Here are some resources that I found useful while researching for this post ...
The Official Android SDK documentation (eg how to install the SDK, run the emulator, install the app etc.)

Cindy Murphy's (@CindyMurph) webcast/slides on reversing Android malware

Pau Oliva Fora's (@pof) RSA presentation on reverse engineering Android Apps

Thomas Cannon's (@thomas_cannon) blog post on reverse engineering Android
and his "Gaining access to Android" presentation from DEFCON 20

Lee Reiber's (@Celldet) Forensic Focus webcast on malware detection

9. Final Words

We have been able to use the increased privileges of the Android emulator to uncover and harvest Android application artefacts.
This can be used by researchers to develop forensic extraction scripts without requiring actual rooted physical devices (or expensive commercial forensic tools).

Hopefully, this post will help to address the mobile device "app gap" that currently exists between commercial forensic tools and the sheer number of apps available on the GooglePlay market.
But if not, at least we got to chase some squirrels and learn some new things about WWF chats.

Some initial research shows that Microsoft's Windows Phone emulator cannot currently be used in a similar manner because the MS emulator does not allow you to load Windows marketplace apps onto it. It seems like it's mainly for testing apps that you write yourself. It has trust issues apparently.
As for iOS, monkey doesn't have access to OS X or any Apple devices (shocking!) so can't say whether the iOS emulator supports loading apps from the store and/or is "jailbroken" by default.

OK due to time/space constraints and a tired monkey ("What do you mean Red Bull doesn't come in Banana flavour?!"), we're gonna stop here ... There are probably a few squirrels that escaped but at this point, something is better than nothing eh?