Thursday 22 March 2012

Perl Parsing an SQLite Database File Header


Previously on Cheeky4n6Monkey ... we used the Perl DBI package to retrieve the contents of Firefox's (v3.5.17 and v11.0) "places.sqlite" and "downloads.sqlite". A secondary discovery was that depending on the version of the sqlite3 command line exe, one might be able to use it to read Firefox 11 ".sqlite" files ... or not. Just to clarify, sqlite3 v3.6.16 cannot read Firefox 11 files but sqlite3 (v3.7.10) can.
Anyway, rather than waste time trying different versions of sqlite3 on various ".sqlite" files, I thought a Perl script to parse the ".sqlite" file header might be handy.
Even if it's not particularly unique/useful (there's already an SQLite Manager Firefox plugin), at least we get more practice with Perl. Also, as far as I know, SQLite Manager does not show the SQLite Version nor does it show a meaningful Schema Format string.

The relevant SQLite file structure is detailed here. Rather than extract all of the fields, I thought I would initially concentrate on 6 fields:
- SQLite Version eg 3007007
- SQLite Schema Format eg SQLite v3.0.0
- Page Size (in bytes)
- Number of Pages
- Number of Pages on Freelist
- Text Encoding used eg UTF-8

The last 4 fields can be verified/viewed using SQLite Manager and the first 2 fields can be verified/viewed using the SIFT's Bless Hex Editor.

So lets get stuck into the coding!


Code

# CODE BEGINS AFTER THIS LINE
#!/usr/bin/perl -w

# Perl script to parse selected SQLite Database header fields
# Based on the Database Header section of http://www.sqlite.org/fileformat2.html

use strict;

use Getopt::Long;
use Encode;

my $version = "sqlite-parser.pl v2012-03-21";
my $help = 0;
my $filename = "";

GetOptions('help|h' => \$help,
    'file=s' => \$filename);

if ($help || $filename eq "")
{
    print("\nHelp for $version\n\n");
    print("Perl script to parse selected SQLite header fields\n");
    print("\nUsage: sqlite-parser.pl [-h|help] [-file filename]\n");
    print("-h|help .......... Help (print this information). Does not run anything else.\n");
    print("-file filename ... sqlite filename to be parsed.\n");
    print("\nExample: sqlite-parser.pl -file /cases/firefox/places.sqlite\n");
    exit;
}

print "\nRunning $version\n\n";

# Try read-only opening the SQLite file to extract various header information
my $rawsqliteheader;

open(my $sqlitefile, "<".$filename) || die("Unable to open $filename for header parsing\n");
#binmode($sqlitefile);
sysread ($sqlitefile, $rawsqliteheader, 100) || die("Unable to read $filename for header parsing\n");;
close($sqlitefile);

# First check that we have a valid header 1st 16 bytes should read "SQLite format 3\000" in UTF8
# or alternatively "53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00" in hex
my $rawsqlitestring = substr($rawsqliteheader, 0, 16);
my $sqlitestring = decode("UTF8", $rawsqlitestring);
if ($sqlitestring eq "SQLite format 3\000")
{
    print "SQLite String Parsed OK - Continuing Processing of $filename ...\n";
}
else
{
    print "$filename does NOT have a Valid SQLite String in Header! Bailing out ...\n\n";
    exit;
}


# Extract the database page size in bytes
# Should/Must be a power of two between 512 and 32768 inclusive, or the value 1 representing a page size of 65536
my $rawdbpagesize = substr($rawsqliteheader, 16, 2);
my $dbpagesize = unpack("n", $rawdbpagesize); # Use "n" for 2 byte int
if ($dbpagesize eq 1)
{
    print("Database Page Size (bytes) = 65536\n");
}
else
{
    print("Database Page Size (bytes) = $dbpagesize\n");
}


# Extract the size of the database file in pages.
my $rawnumpages = substr($rawsqliteheader, 28, 4);
my $numpages = unpack("N", $rawnumpages); # use "N" for 4 byte int
if ($numpages ne 0)
{
    # Must check that changecounter = validversionfor
    # as validversionfor stores the current changecounter value after the SQLite version number was written
    # (eg at creation)
    my $rawchangecounter = substr($rawsqliteheader, 24, 4);
    my $changecounter = unpack("N", $rawchangecounter);

    my $rawvalidversionfor = substr($rawsqliteheader, 92, 4);
    my $validversionfor = unpack("N", $rawvalidversionfor);

#    print "changecounter = $changecounter\n";
#    print "validversionfor = $validversionfor\n";

    if ($changecounter eq $validversionfor)
    {
        print("Valid Number of Pages = $numpages\n");
    }
    else
    {
        print("Invalid Number of Pages! (mismatched changecounter value)\n");
    }
}
else
{
    print("Invalid Number of Pages! (zero)\n");
}


# Extract the total number of freelist pages.
my $rawnumfreelistpages = substr($rawsqliteheader, 36, 4);
my $numfreelistpages = unpack("N", $rawnumfreelistpages); # use "N" for 4 byte int
print("Total Number of Freelist Pages = $numfreelistpages\n");


# Extract the schema format number. Supported schema formats are 1, 2, 3, and 4.
my $rawschemaformatnum = substr($rawsqliteheader, 44, 4);
my $schemaformatnum = unpack("N", $rawschemaformatnum); # use "N" for 4 byte int
#print("Schema Format Number = $schemaformatnum\n");
if ($schemaformatnum == 1)
{
    print("Schema Format = SQLite v3.0.0\n");
}
elsif ($schemaformatnum == 2)
{
    print("Schema Format = SQLite v3.1.3 (2005)\n");
}
elsif ($schemaformatnum == 3)
{
    print("Schema Format = SQLite v3.1.4 (2005)\n");
}
elsif ($schemaformatnum == 4)
{
    print("Schema Format = SQLite v3.3.0 (2006) or higher\n");
}
else
{
    print("Invalid Schema Format!\n");
}


# Extract the database text encoding. A value of 1 means UTF-8. A value of 2 means UTF-16le. A value of 3 means UTF-16be.
my $rawtextencode = substr($rawsqliteheader, 56, 4);
my $textencode = unpack("N", $rawtextencode); # use "N" for 4 byte int

#print("Text Encoding = $textencode\n");
if ($textencode == 1)
{
    print("Text Encoding = UTF-8\n");
}
elsif ($textencode == 2)
{
    print("Text Encoding = UTF-16le\n");
}
elsif ($textencode == 3)
{
    print("Text Encoding = UTF-16be\n");
}
else
{
    print("Invalid Text Encoding!\n");
}


# Extract the SQLite Version number as a 4 byte Big Endian Integer at bytes 96-100
# The version number will be in the form (X*1000000 + Y*1000 + Z)
# where X is the major version number (3 for SQLite3), Y is the minor version number and Z is the release number
# eg 3007004 for 3.7.4
my $rawsqliteversion = substr($rawsqliteheader, 96, 4);
my $sqlversion = unpack("N", $rawsqliteversion);
print("SQLite Version is: $sqlversion\n\n");

# CODE ENDS HERE


Code Summary

As ".sqlite" files contain a UTF-8 encoded header string we need to use the "decode" function from the Encode Perl package. So the first thing to note is the "use Encode" statement.
Next, we have the familiar GetOptions and Help sections. Yawn!
And now the journey really begins ...
We call a read-only "open" on the user specified filename and then "sysread" in the first 100 bytes into the "$rawsqliteheader" variable.
Now that we have a copy of what should be the file header, we can now "close" the user specified file.

The first thing we do is look for the UTF-8 encoded "SQLite format 3\000" string. We use "substr" to copy the first 16 bytes into "$rawsqlitestring" from our 100 byte "$rawsqliteheader" buffer.
Next we call "decode" to convert "$rawsqlitestring" into a Perl usable string format and store the result in "$sqlitestring". See here for more information on encoding/decoding with Perl.
Finally, we test the value of "$sqlitestring" and "exit" if it does not match "SQLite format 3\000".
Pretty straight forward eh?

Similarly, for the Database Page Size field - we call "substr" to copy our bytes (in this case only 2) but instead of calling "decode" we use "unpack" for interpreting numbers. There's a bit of a hack going on with this particular field - if it's value is 1, the actual size used is supposed to be 65536 bytes. So we include an "if" statement to handle that.

I could go on but the other parsing sections do pretty much the same thing (except they call "substr" to extract 4 bytes from various offsets).


Testing

Here's the output of our "sqlite-parse.pl" script with a bogus fabricated file we prepared for an earlier post:

sansforensics@SIFT-Workstation:~$ sqlite-parser.pl -file /cases/cheeky-file.c4n6

Running sqlite-parser.pl v2012-03-21

/cases/cheeky-file.c4n6 does NOT have a Valid SQLite String in Header! Bailing out ...

sansforensics@SIFT-Workstation:~$


OK so that shows that our's is a discerning script ... we don't accept any old riff-raff claiming to be SQLite!
Now lets try it with a Firefox 11.0 "places.sqlite" file:

sansforensics@SIFT-Workstation:~$ sqlite-parser.pl -file /cases/firefox11/places.sqlite

Running sqlite-parser.pl v2012-03-21

SQLite String Parsed OK - Continuing Processing of /cases/firefox11/places.sqlite ...
Database Page Size (bytes) = 4096
Valid Number of Pages = 145
Total Number of Freelist Pages = 5
Schema Format = SQLite v3.1.3 (2005)
Text Encoding = UTF-8
SQLite Version is: 3007007

sansforensics@SIFT-Workstation:~$


That looks OK but we should validate our results using the SIFT's SQLite Manager Firefox plugin.

"SQLite Manager" Firefox Plugin Validating Our Script Output

As you can see, the SQLite Manager confirms our scripts "Database Page Size" (4096), "Valid Number of Pages" (145), "Total Number of Freelist Pages" (5) and "Text Encoding" (UTF-8) values.


Now we will use the SIFT's Bless Hex Editor to validate our script's results for the "SQLite Version"  and "Schema Format". You can access Bless from the Applications ... Programming ... Bless Hex Editor menu.

Bless Hex Editor Validating Our Script's outputted SQLite Version (3007007)

Bless Hex Editor Validating Our Script's outputted Schema Format (2)

OK, the Bless screenshots confirm our script's "SQLite Version" (3007007) and "Schema Format" (2, which corresponds to "SQLite v3.1.3"). Dare I say, we've been blessed? heheh

For extra giggity-giggity-giggles, lets now use our script with a Firefox 3.5.17 "places.sqlite" file:

sansforensics@SIFT-Workstation:~$ sqlite-parser.pl -file ffexamples/places.sqlite


Running sqlite-parser.pl v2012-03-21


SQLite String Parsed OK - Continuing Processing of ffexamples/places.sqlite ...
Database Page Size (bytes) = 4096
Valid Number of Pages = 203
Total Number of Freelist Pages = 0
Schema Format = SQLite v3.0.0
Text Encoding = UTF-8
SQLite Version is: 3007004


sansforensics@SIFT-Workstation:~$

We can see that Firefox 3.5.17 and Firefox 11.0 use different schemas and SQLite versions for their respective "places.sqlite" files. For Firefox 11.0, Schema Format = SQLite v3.1.3 (2005), SQLite Version is: 3007007. For Firefox 3.5.17, Schema Format = SQLite v3.0.0, SQLite Version is: 3007004.
This information would have been handy before we started trying to open Firefox 11.0 database files with a superseded copy of sqlite3 eh?

Anyway, so ends another post. Please let me know if you think/find it useful or if you would like any more header fields printed.Cheers!