The Ext2/3 Filesystems

The Ext2/3 filesystems are (unsurprisingly) of the FFS/UFS family. Important data is kept in duplicate in superblocks scattered around the filesystem, so that if the main superblock is ever inaccessible or corrupted, emergency copies are still available to parse the filesystem. (Back in the day, we also thought that such redundant copies could make for faster reads since back then we actually had some idea of disk geometries, but of course nowadays we don't know anything about the physical layout of spinning drives, and such considerations are not even relevant to SSDs.)

Before everything else

It is possible that boot code exists in the first 1024 bytes of the filesystem; this space is not used for anything else. Certainly this could provide a space for malware or other undesirables to use for storage, but it is likely that such malevolent code is operating as root, which means it effectively owns the system.

The superblock(s)

After the 1024 reserved bytes at the front of the filesystem there is the master superblock (the one for block group 0), and copies of the superblock are also placed at the front of (some) other block groups.

The superblock is 1k in size. A current version of the superblock's definition from /usr/include/ext2fs/ext2_fs.h (in the kernel, it is at linux/include/linux/ext2_fs.h) is here.

Parsing the superblock

Traditionally, parsing structures defined in /usr/include/* has been done along these lines.
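
As a rough illustration of the same idea in Python rather than C, the sketch below unpacks a handful of the leading superblock fields with the struct module. The field names follow ext2_fs.h; the offsets assume the classic little-endian on-disk layout, and the function names parse_superblock and read_superblock are only illustrative, not part of any library.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the superblock follows the 1024 reserved bytes
SUPERBLOCK_SIZE = 1024
EXT2_MAGIC = 0xEF53


def parse_superblock(raw):
    """Unpack a few leading fields from a raw 1024-byte superblock."""
    if len(raw) < SUPERBLOCK_SIZE:
        raise ValueError("expected at least 1024 bytes of superblock data")
    # Eleven little-endian 32-bit fields occupy offsets 0 through 43.
    (inodes_count, blocks_count, r_blocks_count, free_blocks_count,
     free_inodes_count, first_data_block, log_block_size, log_frag_size,
     blocks_per_group, frags_per_group,
     inodes_per_group) = struct.unpack_from("<11I", raw, 0)
    # The 16-bit magic number sits at offset 56.
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("bad magic 0x%04x, not an ext2/3 superblock" % magic)
    return {
        "s_inodes_count": inodes_count,
        "s_blocks_count": blocks_count,
        "s_free_blocks_count": free_blocks_count,
        "s_free_inodes_count": free_inodes_count,
        "s_first_data_block": first_data_block,
        "s_blocks_per_group": blocks_per_group,
        "s_inodes_per_group": inodes_per_group,
        # The block size is stored as a shift count relative to 1024.
        "block_size": 1024 << log_block_size,
    }


def read_superblock(path):
    """Read the primary superblock from a filesystem image or device."""
    with open(path, "rb") as image:
        image.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(SUPERBLOCK_SIZE))
```

Calling read_superblock on an image file (for instance one made with dd and mkfs.ext2) should return a dictionary of these fields; anything that fails the magic check raises ValueError.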

Additionally, Theodore Ts'o created an entire library (libext2fs, part of e2fsprogs) to handle everything Ext*; take a look at this site for the whole tool set.

Parsing the superblock, continued

To get the filesystem's size, we multiply the block size by the number of blocks. As FSFA points out on page 400, it's possible that volume slack might exist; certainly, it's not been too hard as a system administrator to find partitions that are larger than the resident filesystem (usually because of LVM).
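
The arithmetic can be sketched as follows. This assumes the block size is derived from the superblock's s_log_block_size field as 1024 shifted left by that value; the function names and the partition size in the usage note are made up for illustration.

```python
def block_size(s_log_block_size):
    """Block size in bytes: 1024 shifted left by the stored log value."""
    return 1024 << s_log_block_size


def filesystem_size(s_blocks_count, s_log_block_size):
    """Filesystem size in bytes: block size times number of blocks."""
    return s_blocks_count * block_size(s_log_block_size)


def volume_slack(partition_bytes, s_blocks_count, s_log_block_size):
    """Bytes of the partition that lie beyond the end of the filesystem."""
    return partition_bytes - filesystem_size(s_blocks_count, s_log_block_size)
```

For example, 262144 blocks with s_log_block_size of 2 gives a 1 GiB filesystem (1073741824 bytes); on a hypothetical partition of 1100000000 bytes, volume_slack reports 26258176 bytes that the filesystem never touches.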

It's most likely that superblocks will be scattered at intervals; when you run mkfs.ext2, you can see that duplicate superblocks are created by default at large intervals.
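
A minimal sketch of where those copies land, assuming the sparse_super feature is enabled (in which case backups go in group 1 and in groups whose numbers are powers of 3, 5, and 7; without that feature every group carries a copy). The function names are illustrative only.

```python
def backup_groups(group_count):
    """Block groups (other than group 0) holding a superblock backup,
    assuming the sparse_super layout."""
    groups = set()
    if group_count > 1:
        groups.add(1)
    for base in (3, 5, 7):
        group = base
        while group < group_count:
            groups.add(group)
            group *= base
    return sorted(groups)


def backup_superblock_blocks(group_count, first_data_block, blocks_per_group):
    """Block numbers at which the backup superblocks start."""
    return [first_data_block + group * blocks_per_group
            for group in backup_groups(group_count)]
```

With 10 groups of 8192 one-kilobyte blocks and a first data block of 1, this yields blocks 8193, 24577, 40961, 57345, and 73729, which is the kind of list mkfs.ext2 prints when it creates a filesystem.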

Also, note that UUIDs are created by default these days and attached to filesystems from the beginning.
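
If you want to pull the UUID out of a raw superblock yourself, a sketch along these lines should do; it assumes the 16-byte s_uuid field sits at offset 104 of the superblock, and the function name is hypothetical.

```python
import uuid

UUID_OFFSET = 104   # assumed offset of s_uuid within the superblock
UUID_LENGTH = 16


def superblock_uuid(raw):
    """Return the filesystem UUID stored in a raw 1024-byte superblock."""
    field = raw[UUID_OFFSET:UUID_OFFSET + UUID_LENGTH]
    if len(field) != UUID_LENGTH:
        raise ValueError("superblock data too short to contain a UUID")
    return uuid.UUID(bytes=bytes(field))
```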

Block groups

Block groups contain a superblock (optional), group descriptor table, block bitmap, inode bitmap, inode tables, and (of course) file content.

The block group descriptor structure gives some of the most important structural information: it records where the group's block bitmap, inode bitmap, and inode table live, and it keeps the number of free blocks and free inodes along with a number of other bits of state, including some checksum information (those are the crc32c bits). The bitmaps are the heart of the action. The block bitmap shows which blocks have been allocated in a block group; the inode bitmap shows which inodes have been allocated.
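
To make this concrete, here is a small sketch that unpacks the leading fields of a classic 32-byte ext2 group descriptor and tests a single bit in a bitmap. It assumes the little-endian layout from ext2_fs.h and that bit 0 of byte 0 corresponds to the first block (or inode) of the group; the function names are illustrative.

```python
import struct

GROUP_DESC_SIZE = 32   # size of a classic ext2 group descriptor


def parse_group_descriptor(raw):
    """Unpack the leading fields of one raw group descriptor."""
    if len(raw) < GROUP_DESC_SIZE:
        raise ValueError("expected a 32-byte group descriptor")
    # Three 32-bit block addresses followed by three 16-bit counters.
    (block_bitmap, inode_bitmap, inode_table,
     free_blocks, free_inodes, used_dirs) = struct.unpack_from("<3I3H", raw, 0)
    return {
        "bg_block_bitmap": block_bitmap,
        "bg_inode_bitmap": inode_bitmap,
        "bg_inode_table": inode_table,
        "bg_free_blocks_count": free_blocks,
        "bg_free_inodes_count": free_inodes,
        "bg_used_dirs_count": used_dirs,
    }


def is_allocated(bitmap, index):
    """True if the bit for the given block or inode index is set."""
    return bool(bitmap[index // 8] & (1 << (index % 8)))
```

Given the bitmap block read from the address in bg_block_bitmap, is_allocated(bitmap, n) tells you whether the n-th block of that group is marked as in use.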


Following FSFA page 409, finding the first block in a given group is simple enough:
first block of group = first data block + (group number * blocks per group)

Since we now know where a group's first block is, and subsequent blocks are consecutive, we can just subtract that first block from a block's address to find the block's position within its group.
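
A short sketch of that calculation and its inverse; the parameter names mirror the superblock fields s_first_data_block and s_blocks_per_group, and the function names are made up for this example.

```python
def first_block_of_group(group, first_data_block, blocks_per_group):
    """Address of the first block in the given block group."""
    return first_data_block + group * blocks_per_group


def locate_block(block, first_data_block, blocks_per_group):
    """Return (group, offset) for a block address: the group that holds
    the block and the block's position within that group."""
    group = (block - first_data_block) // blocks_per_group
    start = first_block_of_group(group, first_data_block, blocks_per_group)
    return group, block - start
```

With a first data block of 1 and 8192 blocks per group, group 2 starts at block 16385, and block 16390 is at offset 5 within group 2; that offset is also the block's bit position in the group's block bitmap.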

The inode

The inode ("index node") is where we find the metadata about a file, except possibly for extended "xattr" data, which can be allocated another block in addition to the rest of the inode, thus affording quite a bit more space than Windows does for its attributes.

Inodes, continued

You can see the inode structure in /usr/include/ext2fs/ext2_fs.h; the relevant bits are here. As you would expect, the data typical to a call to stat(2) are here: the file mode, the uid, the size in bytes, the access time, the inode change time, the modification time, the gid, the links count, and the blocks count. Other data that are not typical in stat(2) would be a deletion time (obviously!), checksums, and the pointers to the blocks. Not so obviously, the xattr information is not clearly explicated here, though if you think about the history of extended attributes, it's clear that a johnny-come-lately is going to have to be shoehorned in somewhere.
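
For a feel of how those fields sit on disk, the sketch below unpacks the stat-like fields and the 15 block pointers from a raw 128-byte inode. It assumes the classic little-endian ext2 inode layout (the stat-like fields in the first 32 bytes, the pointer array at offset 40); parse_inode is an illustrative name.

```python
import struct

INODE_SIZE = 128   # classic ext2 inode size


def parse_inode(raw):
    """Unpack the stat-like fields and block pointers of a raw inode."""
    if len(raw) < INODE_SIZE:
        raise ValueError("expected at least 128 bytes of inode data")
    (mode, uid, size, atime, ctime, mtime, dtime,
     gid, links_count, blocks) = struct.unpack_from("<HHIIIIIHHI", raw, 0)
    # Fifteen 32-bit block pointers start at offset 40.
    pointers = list(struct.unpack_from("<15I", raw, 40))
    return {
        "i_mode": mode,
        "i_uid": uid,
        "i_size": size,
        "i_atime": atime,
        "i_ctime": ctime,
        "i_mtime": mtime,
        "i_dtime": dtime,
        "i_gid": gid,
        "i_links_count": links_count,
        "i_blocks": blocks,            # counted in 512-byte units
        "direct": pointers[:12],
        "single_indirect": pointers[12],
        "double_indirect": pointers[13],
        "triple_indirect": pointers[14],
    }
```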

Inodes, continued

The block pointers are an interesting concept. The first 12 are direct pointers; if a file needs no more than 12 blocks, then it is very efficient to use these direct pointers to access the file's data. Larger files have to use the indirect pointer, which points to a block of pointers to the data blocks. Files too large for the direct and indirect pointers together have to use the doubly indirect pointer, which points to a block of pointers to other pointer blocks (and beyond that there is a triply indirect pointer). Traversing two levels of indirection is clearly not as efficient as using the direct or even the singly indirect pointers, but generally only a small percentage of files are large enough to require it.
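
To see how far each level reaches, here is a small worked calculation. It assumes 4-byte block pointers and 12 direct pointers, and the function name is made up for the example.

```python
def pointer_reach(block_size, pointer_size=4, direct=12):
    """Largest file size (in bytes) addressable at each pointer level."""
    per_block = block_size // pointer_size      # pointers in one block
    direct_bytes = direct * block_size
    single_bytes = per_block * block_size
    double_bytes = per_block * per_block * block_size
    return {
        "direct": direct_bytes,
        "single_indirect": direct_bytes + single_bytes,
        "double_indirect": direct_bytes + single_bytes + double_bytes,
    }
```

With 1024-byte blocks, the direct pointers cover 12288 bytes, adding the indirect pointer raises that to 274432 bytes, and the doubly indirect pointer takes it to 67383296 bytes (a little over 64 MiB). With 4096-byte blocks the direct pointers alone cover 49152 bytes.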

Inodes and sparsity

There is a reserved value for blocks of all zeros; rather than actually allocating such a block, the filesystem is allowed to use a block pointer value of 0 to indicate that a block is all zeros. Sparse files need very special care: very few forensics tools are built with an adequate understanding of sparsity, and they can easily introduce errors ranging from bad imaging to misinterpreted database files. If you do find sparse files in an investigation, please take extra care that you understand exactly what is going on.
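
Two rough checks, sketched under stated assumptions: the first looks for zero-valued direct pointers inside the range the file's size says should be populated, and the second is only a heuristic on a mounted filesystem, comparing the allocated space reported by stat(2) (st_blocks, in 512-byte units) against the file's apparent size. All function names are illustrative.

```python
import os

DIRECT_POINTERS = 12


def hole_indices(direct_pointers, file_size, block_size):
    """Indices of direct pointers that are 0 (holes) even though the
    file size says the corresponding block lies within the file."""
    blocks_needed = (file_size + block_size - 1) // block_size
    covered = min(blocks_needed, DIRECT_POINTERS)
    return [i for i in range(covered) if direct_pointers[i] == 0]


def looks_sparse(st_size, st_blocks):
    """Heuristic: fewer bytes allocated than the apparent file size."""
    return st_blocks * 512 < st_size


def path_looks_sparse(path):
    """Apply the heuristic to a file on a mounted filesystem."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

A file that reports a size of one megabyte but only eight 512-byte units allocated is almost certainly sparse; the heuristic says nothing about where the holes are, which is what the pointer check (or a tool that understands holes) is for.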

Understanding file times

For digital forensics, it's critical to understand the meaning of the four times in an inode.

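atime → the last time that the file's contents were read, e.g., by read(2); mount options such as noatime and relatime can suppress or defer this update.
mtime → the last time that a write(2) was applied to a file, or an mmap(2)-mapped file was changed.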
ctime → the last time that inode information was updated for a file.
dtime → if an inode previously referred to a file that has been deleted, then ext2 might have set the dtime field.
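
On a live, mounted filesystem the first three of these can be read through stat(2); dtime is not exposed that way and has to be read from the on-disk inode. The sketch below assumes a Unix-like system, where st_ctime is the inode change time, and file_times is just an illustrative name.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_times(path):
    """Return atime, mtime, and ctime for a path as UTC datetimes."""
    st = os.lstat(path)   # lstat: report on a symlink itself, not its target
    return {
        "atime": datetime.fromtimestamp(st.st_atime, tz=timezone.utc),
        "mtime": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
        "ctime": datetime.fromtimestamp(st.st_ctime, tz=timezone.utc),
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        for name, when in file_times(tmp.name).items():
            print(name, when.isoformat())
```

A quick experiment is to call os.chmod on a file and compare the results before and after: the mtime should stay put, since only inode information changed.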


While the Unix paradigm for a file-like activity is a "bytestream" (think about the open(2)/read(2)/close(2) model, which works so well for, e.g., both files and TCP data), the situation is a bit more muddled when thinking about naming.

A filesystem is a key/value store; the keys are the file paths, and the values are (generally) the contents, though it is interesting to consider where metadata such as xattrs fit into this model.

But a filepath is not a unique identifier for a file in a Unix filesystem.

There are three things that come into play here: mounting, which must always be at a spot on the single Unix file tree; hard links, which are true aliases for a file; and soft links, which are essentially small files whose file type (recorded in the inode's mode) tells the kernel to interpret the contents as a file path.
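
The difference between the two kinds of link is easy to see on a live system. The sketch below assumes a Unix-like system and a filesystem that supports both hard and symbolic links; the file names and the function describe_names are invented for the demonstration.

```python
import os
import stat
import tempfile


def describe_names(directory):
    """Create a file, a hard link, and a soft link, then compare inodes."""
    target = os.path.join(directory, "original.txt")
    hard = os.path.join(directory, "hardlink.txt")
    soft = os.path.join(directory, "softlink.txt")
    with open(target, "w") as handle:
        handle.write("hello\n")
    os.link(target, hard)      # a second name for the same inode
    os.symlink(target, soft)   # a separate inode that stores a path
    t, h, s = os.lstat(target), os.lstat(hard), os.lstat(soft)
    return {
        "hard_link_shares_inode": t.st_ino == h.st_ino,
        "link_count": t.st_nlink,
        "soft_link_is_symlink": stat.S_ISLNK(s.st_mode),
        "soft_link_has_own_inode": s.st_ino != t.st_ino,
        "soft_link_contents": os.readlink(soft),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for key, value in describe_names(scratch).items():
            print(key, value)
```

The hard link and the original name report the same inode number with a link count of 2, while the soft link has its own inode whose contents are simply the path it points to.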


FSFA, chapter 14