Sunday 5 October 2014

Windows Phone 8.0 SMS, Call History and Contacts Scripts


Apparently, you can't trust any old monkey with your Windows Phone ...

Following on from our previous Windows Phone post and after some excellent testing feedback, it's time to release some Windows Phone 8.0 scripts for extracting SMS, Call History and Contacts. How much would you expect to pay for these marvellous feats of monkey code? 3 bananas? 2 bananas? How about for FREE :)
But wait .. there's more! As an added bonus we'll throw in a Facebook message JSON extraction script.

Special Thanks to Cindy Murphy (@cindymurph) and the Madison, WI Police Department (MPD) for the initial test data and encouragement.
Thanks also to Brian McGarry (Garda) and JoAnn Gibb (Ohio Attorney General's Office) for providing further testing data/feedback.

The scripts are available from my GitHub page and have been developed/tested on Windows 7 running Python 2.7 against data from Nokia Lumia 520s running Windows Phone 8.0.

UPDATE (12/7/15):
Have now updated the "wp8-sms.py", "wp8-callhistory.py" and "wp8-contacts.py" scripts to read large files in chunks. This has resulted in a quicker processing time for large files (ie whole image files). Updated code is now available from my GitHub page. See this post for more details.

SMS Script

The wp8-sms.py script initially searches a given store.vol for "SMS" strings and stores the associated time and phone number information for each corresponding "SMS" record. Next it searches for "SMStext" strings and extracts the FILETIME2, the sent/received text and any associated phone numbers. If a phone number is not found in the "SMStext" record (ie sent SMS), the script uses the FILETIME2 value to look up the corresponding "SMS" record's phone number field. For ease of display and documentation, the script outputs this data sorted by FILETIME2 in Tab Separated Values (TSV) format.
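The phone number lookup and sort described above can be sketched roughly as follows. This is a minimal illustration and not the actual wp8-sms.py code: it assumes the "SMS" and "SMStext" hits have already been parsed into simple Python structures, and the names resolve_phone_numbers, to_tsv, sms_records and smstext_records are made up for this example.

```python
def resolve_phone_numbers(sms_records, smstext_records):
    """Fill in missing phone numbers via FILETIME2, then sort by FILETIME2.

    sms_records: list of (filetime2, phone) tuples from "SMS" hits
                 (hypothetical layout).
    smstext_records: list of dicts with "filetime2", "phone" and "text" keys
                     (hypothetical layout); "phone" may be None or empty.
    """
    # Index the "SMS" records by their FILETIME2 value for quick lookup.
    phone_by_time = {filetime2: phone for filetime2, phone in sms_records}

    rows = []
    for rec in smstext_records:
        # Fall back to the matching "SMS" record when no number was found.
        phone = rec["phone"] or phone_by_time.get(rec["filetime2"], "")
        rows.append((rec["filetime2"], phone, rec["text"]))

    rows.sort(key=lambda row: row[0])
    return rows


def to_tsv(rows):
    """Join each row's fields with tabs, one row per line."""
    return "\n".join("\t".join(str(field) for field in row) for row in rows)
```

Keying the lookup on FILETIME2 only works if the "SMS" and "SMStext" records for the same message carry the same timestamp value, which is the behaviour described above.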

This script has also been used to parse the pagefile.sys and various store.vol .log files for SMS records which were not present in the store.vol.

Usage:
python wp8-sms.py -f store.vol -o output-sms.tsv

Output format:
Text_Offset    UTC_Time2    Direction    Phone_No    Text
0xabcd    2014-10-01T19:34:57    Sent    1115551234    This is a sent SMS 
0xabc1    2014-10-01T19:37:07    Recvd    1115574321    Here is a received SMS

UPDATE (7/7/15): We have run the "wp8-sms.py" script on a complete 7 GB .bin image from a Windows Phone 8 device.
It processed 6000+ SMS hits in 290 seconds.
The system was a Xeon 6 core 3.5 GHz (circa 2011) with 12 GB RAM and a 160 GB SSD (which contained the .bin image). The OS was Windows 7 x64 and the version of Python used was 2.7.5.
According to Python's cProfile monitoring module, most of the time (~250 seconds) was spent in the "read" call (line 270). In order to reduce the read time, the script could read the .bin file in smaller chunks using multiple threads.
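One way the chunked reading idea could look is sketched below (single threaded, to keep it simple). It is an illustration under assumptions rather than the script's actual code: find_offsets_chunked and the 64 MB default chunk size are invented here, and it is written for Python 3. The key detail is carrying over the last len(pattern) - 1 bytes of each chunk so that a hit straddling a chunk boundary isn't missed.

```python
import io


def find_offsets_chunked(stream, pattern, chunk_size=64 * 1024 * 1024):
    """Return the file offsets of every occurrence of pattern in a binary stream."""
    if not pattern:
        raise ValueError("pattern must not be empty")

    overlap = len(pattern) - 1
    offsets = []
    tail = b""   # bytes carried over from the previous chunk
    base = 0     # file offset of the first byte of tail
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        pos = buf.find(pattern)
        while pos != -1:
            offsets.append(base + pos)
            pos = buf.find(pattern, pos + 1)
        # Keep just enough bytes to catch a pattern split across two chunks.
        tail = buf[-overlap:] if overlap else b""
        base += len(buf) - len(tail)
    return offsets


if __name__ == "__main__":
    # Tiny in-memory demo with a deliberately small chunk size.
    demo = io.BytesIO(b"xxSMSxxxxSMSx")
    print(find_offsets_chunked(demo, b"SMS", chunk_size=4))
```

For a real image, the stream would simply be a file opened in binary mode, eg open("image.bin", "rb"). Because the carried-over tail is always shorter than the pattern, no hit gets reported twice.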



Call History Script

The wp8-callhistory.py script searches a given Phone file for the GUID "{B1776703-738E-437D-B891-44555CEB6669}" which occurs at the end of each call history record. It then works backwards to read the Phone/Name/ID/FILETIME/Flag fields for that record. Finally, it outputs the extracted records sorted by Start_Time in Tab Separated Values (TSV) format.
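For anyone unfamiliar with FILETIME values, here is a rough sketch of the conversion to the ISO style timestamps shown in the output. A Windows FILETIME is a 64 bit count of 100 nanosecond intervals since 1601-01-01 UTC. The sketch assumes the field is stored as 8 little-endian bytes; filetime_to_iso is a name made up for this example and isn't taken from the script.

```python
import struct
from datetime import datetime, timedelta

# FILETIME counts 100 ns ticks from this date (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_iso(raw8):
    """Convert 8 little-endian FILETIME bytes to a UTC 'YYYY-MM-DDTHH:MM:SS' string."""
    (ticks,) = struct.unpack("<Q", raw8)
    # 10 ticks of 100 ns make one microsecond; sub-microsecond precision is dropped.
    moment = FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")
```

Sorting on the raw tick count gives the same order as sorting on the formatted string, so either can be used as the sort key.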

Usage:
python wp8-callhistory.py -f Phone -o output-callhistory.tsv

Output format:
GUID_Offset    Flag    Start_Time    Stop_Time    ID    Phone_1    Name_1    Name_2    Phone_2
0x3c5ee    0    2014-10-01T03:06:04    2014-10-01T03:06:37    4321555111    (111) 555-1234    BananaMan    BananaMan    (111) 555-1234
0x3c123    1    2014-10-01T03:16:04    2014-10-01T03:18:07    4321555111    (111) 555-1234    BananaMan    BananaMan    (111) 555-1234

Note 1: Flag value: 0 = Outgoing, 1 = Incoming, 2 = Missed
Note 2: ID appears to be the reverse of Phone_1 and Phone_2.
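
The two notes above can be checked with a few lines of Python. This is just a sanity-check sketch: FLAG_NAMES, digits_only and id_matches_reversed_phone are names invented for this example, and the "reverse" relationship is only what was observed in the test data.

```python
# Flag meanings as listed in Note 1.
FLAG_NAMES = {0: "Outgoing", 1: "Incoming", 2: "Missed"}


def digits_only(phone):
    """Strip formatting such as brackets, spaces and dashes from a phone string."""
    return "".join(ch for ch in phone if ch.isdigit())


def id_matches_reversed_phone(call_id, phone):
    """True when the ID field equals the phone number's digits in reverse order."""
    return call_id == digits_only(phone)[::-1]


if __name__ == "__main__":
    print(FLAG_NAMES.get(1, "Unknown"))
    print(id_matches_reversed_phone("4321555111", "(111) 555-1234"))
```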

Contacts Script

The wp8-contacts.py script searches a given store.vol for instances of the hex code [01 04 00 00 00 82 00 E0 00 74 C5 B7 10 1A 82 E0 08] which occurs at the end of each contact record. It then tries reading the previous Unicode string fields in reverse order. The last field should contain the Name but can also hold Email for an MPD Hotmail entry. The 3rd last field should contain the Phone number but can also hold Name for MPD Hotmail/other Garda type entries. The contact records are then sorted by the last field (Name) and output in Tab Separated Values (TSV) format.
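As a rough illustration of the first step, the marker search could look something like the sketch below (Python 3). Only locating the end-of-record marker is shown; walking backwards through the Unicode string fields depends on record layout details not covered here. CONTACT_MARKER and find_marker_offsets are names made up for this example and not taken from wp8-contacts.py.

```python
# The 17 byte end-of-record marker quoted above.
CONTACT_MARKER = bytes.fromhex("01 04 00 00 00 82 00 E0 00 74 C5 B7 10 1A 82 E0 08")


def find_marker_offsets(data, marker=CONTACT_MARKER):
    """Return the offset of every occurrence of marker in a bytes object."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets
```

Each returned offset marks where a contact record ends, so the fields of interest sit immediately before it.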

Usage:
python wp8-contacts.py -f store.vol -o output-contacts.tsv

Output format:
Offset    Last_Field(Name)    Third_Last_Field(Phone)
0x711a0    BananaMan    (111) 555-1234
0x727bd    PooFlinger    (111) 555-4321

Facebook Messages Script

The wp8-fb-msg.py script parses selected Facebook JSON fields from ASCII & Unicode file dumps. It should also handle escaped (ie backslashed) fields. It was suggested by Brian McGarry after he observed various JSON encoded messages in a Windows Phone 8.0 pagefile.sys.

So while it's intended to be used against pagefile.sys, it can also be used against any file containing these JSON encoded messages (there's probably an input file size limit though). 
The script extracts the author_fbid, author_name, message and timestamp_src fields and outputs the records sorted by timestamp_src in Tab Separated Values (TSV) format. It also prints the timestamp in a human readable format.

Here's a simple JSON encoded Facebook message example (in reality there's a LOT more fields than this):
{[{"author_fbid":123456789,"author_name":"Monkey", "message":"Where's my Bananas?!", "timestamp":1392430316355}]}
For more information on JSON and Facebook messages see this somewhat related previous post.

Usage:
python wp8-fb-msg.py -f pagefile.sys -o output-facebook.tsv -u

Note: the -u flag specifies to search for Unicode/UTF16LE encoded messages. The default (ie no -u flag) is to search for ASCII/UTF8 encoded messages.
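
To show what the two search modes amount to, here is a small sketch of how a field name could be encoded before searching a binary dump. It is not the script's actual code: encode_search_key and find_key_offsets are hypothetical helpers, and the escaped (backslashed) variants are left out.

```python
def encode_search_key(key, use_unicode=False):
    """Encode a field name as UTF16LE (like -u) or as ASCII/UTF8 (the default)."""
    return key.encode("utf-16-le" if use_unicode else "utf-8")


def find_key_offsets(data, key, use_unicode=False):
    """Return every offset in data where the encoded field name occurs."""
    needle = encode_search_key(key, use_unicode)
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

In UTF16LE each ASCII character is followed by a zero byte, which is why a plain ASCII search finds nothing in Unicode encoded messages and a second pass with -u can be worthwhile.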

Output format:
author_fbid_Offset    author_fbid    author_name    message    timestamp_src    timestamp_str
0xae    123456789    "Monkey"    "Where's my Bananas?!"    1392430316355    2014-02-15T02:11:56
0x1e    123456780    "BananaMan"    "Chill out Monkey boy. Magilla Gorilla says they're on the way."    1392430323543    2014-02-15T02:12:03
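
The timestamp_str column in the sample output is consistent with treating timestamp_src as milliseconds since the Unix epoch (1970-01-01 UTC). A minimal sketch of that conversion is below; fb_timestamp_to_str is a name invented for this example and the script itself may do it differently.

```python
from datetime import datetime, timedelta

UNIX_EPOCH = datetime(1970, 1, 1)


def fb_timestamp_to_str(timestamp_ms):
    """Convert a millisecond Unix timestamp to a UTC 'YYYY-MM-DDTHH:MM:SS' string."""
    moment = UNIX_EPOCH + timedelta(milliseconds=timestamp_ms)
    # Fractions of a second are dropped by the format string.
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


if __name__ == "__main__":
    print(fb_timestamp_to_str(1392430316355))  # prints 2014-02-15T02:11:56
```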

Final Thoughts

These scripts have been tested mostly against datasets from JTAG'd Nokia Lumia 520s. We can't guarantee they will work for other phones or for Windows Phone 8.1 but it's a good starting point considering the currently limited open source alternatives.
Anyhoo, it is suspected that other Windows Phone data will only require minor tweaks to the existing code rather than a complete rewrite. I'm pretty sure we're in the ballpark *famous last words* :)
As Windows Phones are a market minority and extracting the data out of them typically requires JTAG'ing, these scripts are aimed at a very small audience. Having said that, if they do help you out, it'd be great to hear about it in the comments section ...