A fairly modern definition: A filesystem is a collection of files that we can treat as a unit.

An older definition: a contiguous area of disk space that we can use to store files, usually in a hierarchical fashion.

Examples of filesystems

Where do these filesystems exist?

The primary place is still in a contiguous area of disk space on a single disk drive.

Other places include memory ("RAM" disks), RAID devices, and "virtual" filesystems such as those created by tools like sshfs using FUSE. Filesystems on flash memory devices are also prevalent. Boot environments sometimes maintain filesystems of their own.

Even tape devices can be used to store and read filesystems, though generally not as live, writable filesystems.

Technology of disk-based systems

The steady march of technology has been pronounced in the area of spinning hard disk technology.

Spinning drives are generally connected with SATA; you also see quite a few SAS disk drives in some environments. Older standards like the original SCSI and PATA have largely fallen by the wayside.

SAS drives are often found on high performance systems. SAS is based on the original SCSI standard, which was originally designed to be placed in long daisy chains.

Unfortunately, a confusing welter of standards was created in the SCSI world, and you have to be very careful about plugging random drives into random setups. The original type was "single-ended" and used a largish 50 pin connector (the infamous "Centronics" connector). Then came the 68 pin and 80 pin connectors, and the idea of "differential" signaling rather than just high/low signals. Fortunately, we have converged on the SAS standard. Indeed, there are converters available from ordinary SATA drives to the SAS standard.


While [P]ATA/IDE drives are pretty much unavailable new (I bought some in the summer of 2012 and had very limited choices of drives), they are still found in older machines. To my surprise, the recent default Linux kernels do not support some of the older [P]ATA standards unless you specifically compile that support in. These drives inherited an old ribbon cable format, and actually use parallel wires to deliver data. There are many types of "gotchas" when working with these drives.


SATA drives are now the strongest sellers, and are increasing rapidly as a percentage of the installed base of drives. Generally, cables from a new SATA generation will work with older drives (not a feature commonly found with SCSI drives).

Direct and BIOS access to hard drives

While most of the Linux/Unix and modern Microsoft world uses direct access to hard drives, the Microsoft world long ago relied on BIOS access via INT 0x13. The practical concern about this is that BIOS methods were often not reliable, particularly when trying to acquire disk sizes. Modern drives (post-1994 or so) all use LBA, and CHS has been a figment of the imagination for many, many years.

HPA issues

One thing to be aware of when acquiring a disk image is that more modern drives can have an HPA (Host Protected Area). This is a section of the disk that has been reserved for other purposes, and standard tools often don't see it. The most common reason for having an HPA is to give a computer maker somewhere to store recovery programs and data. The HPA is located at the logical end of the drive (the highest LBA numbers), unless there is a DCO.

The way to detect an HPA is to use two ATA commands, READ_NATIVE_MAX_ADDRESS and IDENTIFY_DEVICE. Both of these return a maximum sector count, but the former always returns the device limit, whereas the latter returns only the number of sectors that are currently available. If the two values differ, an HPA is present. On Linux this check is available with hdparm -N, and it usually works.
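
The comparison of the two sector counts can be sketched in Python. This is only a sketch, not a forensic tool: it assumes that hdparm -N prints a line of the form "max sectors = <visible>/<native>, HPA is enabled", which may differ between hdparm versions. The function name hpa_sectors and the sample text are hypothetical.

```python
import re

def hpa_sectors(hdparm_output):
    """Return (visible, native, hidden) sector counts parsed from text in
    the assumed 'max sectors = visible/native' format, or None if no
    such line is found."""
    match = re.search(r"max sectors\s*=\s*(\d+)/(\d+)", hdparm_output)
    if match is None:
        return None
    visible = int(match.group(1))   # sectors currently available
    native = int(match.group(2))    # the device limit
    return visible, native, native - visible

# Illustrative text only, not output captured from a real drive.
sample = "/dev/sdX:\n max sectors   = 1000000/1002048, HPA is enabled\n"
visible, native, hidden = hpa_sectors(sample)
print(hidden, "sectors hidden =", hidden * 512, "bytes")
```

A non-zero difference only says that some sectors are hidden; it does not say what is stored in them.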

DCO (device configuration overlay)

Another thing to be aware of is that newer drives can also have a DCO. This was designed to make physically different drives appear to be the same size by the expedient of wasting space. A DCO can be detected with READ_NATIVE_MAX_ADDRESS, IDENTIFY_DEVICE, and DEVICE_CONFIGURATION_IDENTIFY. On a Linux system you can try hdparm --dco-identify, but this isn't very reliable since manufacturers are rather sensitive about this issue.

A good reference on the HPA/DCO issue is Hidden Disk Areas: HPA and DCO (PDF format).

Write blocking

A very important subject in the "dead" analysis world is that of write blocking. Generally, you want to do hardware level write blocking. Fortunately, in the last few years, USB write blockers that simplify the process have been coming into the market. Relatively inexpensive products such as WiebeTech's and Digital Intelligence's blockers are now available. For this class, we have a Tableau TK8 and a Disc Jockey Pro.

Images and hashes

Acquisition of an image can be as simple as using the "dd" command on a drive. First, you would want to check for an HPA or DCO; then, using a write-blocker, copy the image to somewhere.

"Somewhere?" That's a good question. The old standby of writing to optical media, which is inherently less modifiable than most other technologies, has unfortunately been up-ended in many cases by the very rapid growth in the size of other media. You can walk into Costco and buy a 3 terabyte drive for $139.00, but the largest generally available optical technology is still stuck in the 100/128 gigabyte range (for instance, see this optical writer --- the media is still expensive, such as here or here.)

Wherever you end up placing your image, you will want to make multiple "integrity" hashes of it. Those hashes need to be recorded somewhere, and some people, such as Brian Carrier, recommend jotting them down in your notebook. Programs such as md5sum and sha1sum are still adequate for simple error checking, but MD5 in particular is not trustworthy for anything other than that: MD5 is vulnerable to collision attacks. Here's a good, practical page on the dangers of MD5: MD5 Collision.
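
Several hashes of one image can be computed in a single pass over the file. The following is a minimal sketch using Python's standard hashlib module. The function name hash_image is ours, and including SHA-256 alongside MD5 and SHA-1 is our assumption rather than something these notes prescribe.

```python
import hashlib

def hash_image(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1024 * 1024):
    """Read the file once, feeding each chunk to every hash object, and
    return a dict mapping algorithm name to hex digest."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)
            if not chunk:          # empty read means end of file
                break
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Calling hash_image on the image file (the path is whatever you chose for "somewhere") returns the digests to record. Reading in chunks keeps memory use flat even for multi-terabyte images.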


For historical reasons, we have used the idea of a "partition", a logically contiguous (and maybe even physically so) area of drive space. While abstraction schemes such as LVM and even modern filesystems such as ZFS have been moving away from this idea, the old disk partition is still the most common means of storing filesystems.

MBR (DOS) partitions

The most common partitioning scheme is the MBR (or DOS) scheme. The MBR is physically in the first LBA sector (well, the first 512 bytes is more accurate now that 4k sector drives are coming into the market...)

Layout of the MBR

$ od -x -Ad /tmp/firstblock 
0000000 48eb d090 00bc fb7c 0750 1f50 befc 7c1b
0000016 1bbf 5006 b957 01e5 a4f3 bdcb 07be 04b1
0000032 6e38 7c00 7509 8313 10c5 f4e2 18cd f58b
0000048 c683 4910 1974 2c38 f674 b5a0 b407 0203
0000064 0080 8000 0841 0006 0800 90fa f690 80c2
0000080 0275 80b2 59ea 007c 3100 8ec0 8ed8 bcd0
0000096 2000 a0fb 7c40 ff3c 0274 c288 f652 80c2
0000112 5474 41b4 aabb cd55 5a13 7252 8149 55fb
0000128 75aa a043 7c41 c084 0575 e183 7401 6637
0000144 4c8b be10 7c05 44c6 01ff 8b66 441e c77c
0000160 1004 c700 0244 0001 8966 085c 44c7 0006
0000176 6670 c031 4489 6604 4489 b40c cd42 7213
0000192 bb05 7000 7deb 08b4 13cd 0a73 c2f6 0f80
0000208 f084 e900 008d 05be c67c ff44 6600 c031
0000224 f088 6640 4489 3104 88d2 c1ca 02e2 e888
0000240 f488 8940 0844 c031 d088 e8c0 6602 0489
0000256 a166 7c44 3166 66d2 34f7 5488 660a d231
0000272 f766 0474 5488 890b 0c44 443b 7d08 8a3c
0000288 0d54 e2c0 8a06 0a4c c1fe d108 6c8a 5a0c
0000304 748a bb0b 7000 c38e db31 01b8 cd02 7213
0000320 8c2a 8ec3 4806 607c b91e 0100 db8e f631
0000336 ff31 f3fc 1fa5 ff61 4226 be7c 7d7f 40e8
0000352 eb00 be0e 7d84 38e8 eb00 be06 7d8e 30e8
0000368 be00 7d93 2ae8 eb00 47fe 5552 2042 4700
0000384 6f65 006d 6148 6472 4420 7369 006b 6552
0000400 6461 2000 7245 6f72 0072 01bb b400 cd0e
0000416 ac10 003c f475 00c3 0000 0000 0000 0000
0000432 0000 0000 0000 0000 738c d0f4 0000 0180
0000448 0001 fe83 1e3f 003f 0000 9920 0007 0000
0000464 1f01 fe05 ffff 995f 0007 f4db 1d12 0000
0000480 0000 0000 0000 0000 0000 0000 0000 0000
0000496 0000 0000 0000 0000 0000 0000 0000 aa55

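Important note: The final two bytes are the "magic numbers" identifying this as an MBR. On disk they are 0x55 followed by 0xaa; od -x shows them as aa55 because it prints 16-bit words in the machine's (little-endian) byte order.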
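
A signature check is easy to script. The sketch below uses a made-up, otherwise empty sector rather than the dump above, and the helper name has_mbr_signature is hypothetical.

```python
import struct

def has_mbr_signature(sector):
    """True if bytes 510 and 511 of the sector are 0x55 0xAA."""
    return len(sector) >= 512 and sector[510:512] == b"\x55\xaa"

# Hypothetical 512-byte sector: all zeros except the signature.
sector = bytes(510) + b"\x55\xaa"
print(has_mbr_signature(sector))      # True

# Reading the same two bytes as one little-endian 16-bit word gives the
# value that od -x displays.
(word,) = struct.unpack("<H", sector[510:512])
print(format(word, "04x"))            # aa55
```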

Layout of the MBR, execution

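The first 446 bytes of the MBR are "boot code"; i.e., they are executable code (in this example, bytes 440-443 also hold the disk identifier that fdisk reports). For a Linux MBR, they look like this when disassembled. Note that udcli has decoded the bytes as 32-bit code here (hence registers such as eax), while the BIOS runs the MBR in 16-bit real mode, so many of the instructions shown are misdecoded; a 16-bit decode (udcli -16) is more faithful. The listing also runs on through the embedded strings, the partition table, and the signature as if they were instructions: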
$ udcli /tmp/firstblock 
0000000000000000 eb48             jmp 0x4a                
0000000000000002 90               nop                     
0000000000000003 d0bc007cfb5007   sar byte [eax+eax+0x750fb7c], 1
000000000000000a 50               push eax                
000000000000000b 1f               pop ds                  
000000000000000c fc               cld                     
000000000000000d be1b7cbf1b       mov esi, 0x1bbf7c1b     
0000000000000012 06               push es                 
0000000000000013 50               push eax                
0000000000000014 57               push edi                
0000000000000015 b9e501f3a4       mov ecx, 0xa4f301e5     
000000000000001a cb               retf                    
000000000000001b bdbe07b104       mov ebp, 0x4b107be      
0000000000000020 386e00           cmp [esi+0x0], ch       
0000000000000023 7c09             jl 0x2e                 
0000000000000025 7513             jnz 0x3a                
0000000000000027 83c510           add ebp, 0x10           
000000000000002a e2f4             loop 0x20               
000000000000002c cd18             int 0x18                
000000000000002e 8bf5             mov esi, ebp            
0000000000000030 83c610           add esi, 0x10           
0000000000000033 49               dec ecx                 
0000000000000034 7419             jz 0x4f                 
0000000000000036 382c74           cmp [esp+esi*2], ch     
0000000000000039 f6a0b507b403     mul byte [eax+0x3b407b5]
000000000000003f 028000008041     add al, [eax+0x41800000]
0000000000000045 0806             or [esi], al            
0000000000000047 0000             add [eax], al           
0000000000000049 08fa             or dl, bh               
000000000000004b 90               nop                     
000000000000004c 90               nop                     
000000000000004d f6c280           test dl, 0x80           
0000000000000050 7502             jnz 0x54                
0000000000000052 b280             mov dl, 0x80            
0000000000000054 ea597c000031c0   jmp dword 0xc031:0x7c59 
000000000000005b 8ed8             mov ds, eax             
000000000000005d 8ed0             mov ss, eax             
000000000000005f bc0020fba0       mov esp, 0xa0fb2000     
0000000000000064 40               inc eax                 
0000000000000065 7c3c             jl 0xa3                 
0000000000000067 ff740288         push dword [edx+eax-0x78]
000000000000006b c252f6           ret 0xf652              
000000000000006e c28074           ret 0x7480              
0000000000000071 54               push esp                
0000000000000072 b441             mov ah, 0x41            
0000000000000074 bbaa55cd13       mov ebx, 0x13cd55aa     
0000000000000079 5a               pop edx                 
000000000000007a 52               push edx                
000000000000007b 7249             jb 0xc6                 
000000000000007d 81fb55aa7543     cmp ebx, 0x4375aa55     
0000000000000083 a0417c84c0       mov al, [0xc0847c41]    
0000000000000088 7505             jnz 0x8f                
000000000000008a 83e101           and ecx, 0x1            
000000000000008d 7437             jz 0xc6                 
000000000000008f 668b4c10be       mov cx, [eax+edx-0x42]  
0000000000000094 057cc644ff       add eax, 0xff44c67c     
0000000000000099 01668b           add [esi-0x75], esp     
000000000000009c 1e               push ds                 
000000000000009d 44               inc esp                 
000000000000009e 7cc7             jl 0x67                 
00000000000000a0 0410             add al, 0x10            
00000000000000a2 00c7             add bh, al              
00000000000000a4 44               inc esp                 
00000000000000a5 0201             add al, [ecx]           
00000000000000a7 006689           add [esi-0x77], ah      
00000000000000aa 5c               pop esp                 
00000000000000ab 08c7             or bh, al               
00000000000000ad 44               inc esp                 
00000000000000ae 06               push es                 
00000000000000af 007066           add [eax+0x66], dh      
00000000000000b2 31c0             xor eax, eax            
00000000000000b4 89440466         mov [esp+eax+0x66], eax 
00000000000000b8 89440cb4         mov [esp+ecx-0x4c], eax 
00000000000000bc 42               inc edx                 
00000000000000bd cd13             int 0x13                
00000000000000bf 7205             jb 0xc6                 
00000000000000c1 bb0070eb7d       mov ebx, 0x7deb7000     
00000000000000c6 b408             mov ah, 0x8             
00000000000000c8 cd13             int 0x13                
00000000000000ca 730a             jae 0xd6                
00000000000000cc f6c280           test dl, 0x80           
00000000000000cf 0f84f000e98d     jz dword 0xffffffff8de901c5
00000000000000d5 00be057cc644     add [esi+0x44c67c05], bh
00000000000000db ff00             inc dword [eax]         
00000000000000dd 6631c0           xor ax, ax              
00000000000000e0 88f0             mov al, dh              
00000000000000e2 40               inc eax                 
00000000000000e3 6689440431       mov [esp+eax+0x31], ax  
00000000000000e8 d288cac1e202     ror byte [eax+0x2e2c1ca], cl
00000000000000ee 88e8             mov al, ch              
00000000000000f0 88f4             mov ah, dh              
00000000000000f2 40               inc eax                 
00000000000000f3 89440831         mov [eax+ecx+0x31], eax 
00000000000000f7 c088d0c0e80266   ror byte [eax+0x2e8c0d0], 0x66
00000000000000fe 890466           mov [esi], eax          
0000000000000101 a1447c6631       mov eax, [0x31667c44]   
0000000000000106 d266f7           shl [esi-0x9], cl       
0000000000000109 3488             xor al, 0x88            
000000000000010b 54               push esp                
000000000000010c 0a6631           or ah, [esi+0x31]       
000000000000010f d266f7           shl [esi-0x9], cl       
0000000000000112 7404             jz 0x118                
0000000000000114 88540b89         mov [ebx+ecx-0x77], dl  
0000000000000118 44               inc esp                 
0000000000000119 0c3b             or al, 0x3b             
000000000000011b 44               inc esp                 
000000000000011c 087d3c           or [ebp+0x3c], bh       
000000000000011f 8a540dc0         mov dl, [ebp+ecx-0x40]  
0000000000000123 e206             loop 0x12b              
0000000000000125 8a4c0afe         mov cl, [edx+ecx-0x2]   
0000000000000129 c108d1           ror dword [eax], 0xd1   
000000000000012c 8a6c0c5a         mov ch, [esp+ecx+0x5a]  
0000000000000130 8a740bbb         mov dh, [ebx+ecx-0x45]  
0000000000000134 00708e           add [eax-0x72], dh      
0000000000000137 c3               ret                     
0000000000000138 31db             xor ebx, ebx            
000000000000013a b80102cd13       mov eax, 0x13cd0201     
000000000000013f 722a             jb 0x16b                
0000000000000141 8cc3             mov ebx, es             
0000000000000143 8e06             mov es, [esi]           
0000000000000145 48               dec eax                 
0000000000000146 7c60             jl 0x1a8                
0000000000000148 1e               push ds                 
0000000000000149 b900018edb       mov ecx, 0xdb8e0100     
000000000000014e 31f6             xor esi, esi            
0000000000000150 31ff             xor edi, edi            
0000000000000152 fc               cld                     
0000000000000153 f3a5             rep movsd               
0000000000000155 1f               pop ds                  
0000000000000156 61               popad                   
0000000000000157 ff26             jmp dword near [esi]    
0000000000000159 42               inc edx                 
000000000000015a 7cbe             jl 0x11a                
000000000000015c 7f7d             jg 0x1db                
000000000000015e e84000eb0e       call 0xeeb01a3          
0000000000000163 be847de838       mov esi, 0x38e87d84     
0000000000000168 00eb             add bl, ch              
000000000000016a 06               push es                 
000000000000016b be8e7de830       mov esi, 0x30e87d8e     
0000000000000170 00be937de82a     add [esi+0x2ae87d93], bh
0000000000000176 00eb             add bl, ch              
0000000000000178 fe4752           inc byte [edi+0x52]     
000000000000017b 55               push ebp                
000000000000017c 42               inc edx                 
000000000000017d 2000             and [eax], al           
000000000000017f 47               inc edi                 
0000000000000180 656f             outsd                   
0000000000000182 6d               insd                    
0000000000000183 004861           add [eax+0x61], cl      
0000000000000186 7264             jb 0x1ec                
0000000000000188 20446973         and [ecx+ebp*2+0x73], al
000000000000018c 6b0052           imul eax, [eax], 0x52   
000000000000018f 6561             popad                   
0000000000000191 640020           add [fs:eax], ah        
0000000000000194 45               inc ebp                 
0000000000000195 7272             jb 0x209                
0000000000000197 6f               outsd                   
0000000000000198 7200             jb 0x19a                
000000000000019a bb0100b40e       mov ebx, 0xeb40001      
000000000000019f cd10             int 0x10                
00000000000001a1 ac               lodsb                   
00000000000001a2 3c00             cmp al, 0x0             
00000000000001a4 75f4             jnz 0x19a               
00000000000001a6 c3               ret                     
00000000000001a7 0000             add [eax], al           
00000000000001a9 0000             add [eax], al           
00000000000001ab 0000             add [eax], al           
00000000000001ad 0000             add [eax], al           
00000000000001af 0000             add [eax], al           
00000000000001b1 0000             add [eax], al           
00000000000001b3 0000             add [eax], al           
00000000000001b5 0000             add [eax], al           
00000000000001b7 008c73f4d00000   add [ebx+esi*2+0xd0f4], cl
00000000000001be 800101           add byte [ecx], 0x1     
00000000000001c1 0083fe3f1e3f     add [ebx+0x3f1e3ffe], al
00000000000001c7 0000             add [eax], al           
00000000000001c9 0020             add [eax], ah           
00000000000001cb 99               cdq                     
00000000000001cc 07               pop es                  
00000000000001cd 0000             add [eax], al           
00000000000001cf 0001             add [ecx], al           
00000000000001d1 1f               pop ds                  
00000000000001d2 05feffff5f       add eax, 0x5ffffffe     
00000000000001d7 99               cdq                     
00000000000001d8 07               pop es                  
00000000000001d9 00db             add bl, bl              
00000000000001db f4               hlt                     
00000000000001dc 121d00000000     adc bl, [0x0]           
00000000000001e2 0000             add [eax], al           
00000000000001e4 0000             add [eax], al           
00000000000001e6 0000             add [eax], al           
00000000000001e8 0000             add [eax], al           
00000000000001ea 0000             add [eax], al           
00000000000001ec 0000             add [eax], al           
00000000000001ee 0000             add [eax], al           
00000000000001f0 0000             add [eax], al           
00000000000001f2 0000             add [eax], al           
00000000000001f4 0000             add [eax], al           
00000000000001f6 0000             add [eax], al           
00000000000001f8 0000             add [eax], al           
00000000000001fa 0000             add [eax], al           
00000000000001fc 0000             add [eax], al           
00000000000001fe 55               push ebp                
00000000000001ff aa               stosb                   

Bytes 446 through 461 are the partition table entry #1 (primary partition):
0000446 80      byte 0 is the boot flag

0000447 01      bytes 1-3 are the beginning CHS
0000448 01 
0000449 00

0000450 83      byte 4 is partition type (Linux, in this case)

0000451 fe      bytes 5-7 are the ending CHS 
0000452 3f
0000453 1e

0000454 3f      bytes 8-11 are the starting LBA address
0000455 00
0000456 00       Note that the first partition "always" starts
0000457 00       at 0x3f = sector 63... 

0000458 20      bytes 12-15 are the size in sectors
0000459 99
0000460 07       0x00079920 = 497,952 512-byte sectors = 248,976 1k blocks
0000461 00
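
As a cross-check on the byte-by-byte reading, the 16 bytes of entry #1 can be unpacked with Python's struct module. This is a minimal sketch: the variable names are ours, and the bytes are copied from offsets 446-461 of the od dump (with each 16-bit word swapped back into disk order).

```python
import struct

# Partition table entry #1, in on-disk byte order.
entry = bytes.fromhex("80 01 01 00 83 fe 3f 1e 3f 00 00 00 20 99 07 00")

# "<" = little-endian; B = 1 byte, 3s = 3 raw bytes, I = 4-byte unsigned.
(boot_flag, chs_start, part_type,
 chs_end, lba_start, num_sectors) = struct.unpack("<B3sB3sII", entry)

print("bootable:       ", boot_flag == 0x80)         # True
print("partition type: ", hex(part_type))            # 0x83
print("starting LBA:   ", lba_start)                 # 63
print("size in sectors:", num_sectors)               # 497952
print("size in 1k blocks:", num_sectors * 512 // 1024)  # 248976
```

The last figure matches the Blocks column that fdisk reports for this partition.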


MBR, as viewed from fdisk

[root@sophie ~]# fdisk -l

Disk /dev/sda: 250.0 GB, 250000000000 bytes
255 heads, 63 sectors/track, 30394 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xd0f4738c

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          31      248976   83  Linux
/dev/sda2              32       30394   243890797+   5  Extended
/dev/sda5              32       30394   243890766   8e  Linux LVM

Extended partitions

Unfortunately, extended partitions are far more complicated.

Let's look at the second partition:
0000462 00    Bootable flag

0000463 00    Starting CHS
0000464 01 
0000465 1f

0000466 05    Partition type (05 means "extended")

0000467 fe    Ending CHS
0000468 ff 
0000469 ff

0000470 5f    Starting LBA
0000471 99
0000472 07 
0000473 00

0000474 db    Size in sectors
0000475 f4
0000476 12       0x1d12f4db = 487,781,595 512-byte sectors 
0000477 1d                  = 243,890,797.5 1k blocks (fdisk's "243890797+")
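
The three CHS bytes pack a 10-bit cylinder, an 8-bit head, and a 6-bit sector. The sketch below decodes the two CHS fields of this second entry and converts the starting one to an LBA, assuming the 255 heads and 63 sectors/track geometry that fdisk reports for this disk. The function names are hypothetical.

```python
def decode_chs(raw):
    """Split a 3-byte CHS field into (cylinder, head, sector)."""
    head = raw[0]
    sector = raw[1] & 0x3F                       # low 6 bits of byte 1
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]   # high 2 bits + byte 2
    return cylinder, head, sector

def chs_to_lba(cylinder, head, sector, heads=255, sectors_per_track=63):
    """Standard CHS-to-LBA conversion; sectors are numbered from 1."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)

start = decode_chs(bytes.fromhex("00011f"))
end = decode_chs(bytes.fromhex("feffff"))
print(start, chs_to_lba(*start))   # (31, 0, 1) 498015
print(end)                         # (1023, 254, 63)
```

The starting CHS converts to 498,015, which is the starting LBA 0x0007995f stored in the entry. The ending CHS is pinned at the largest value the field can hold (cylinder 1023), since the partition ends far beyond what CHS can address; only the LBA fields are meaningful there.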


Upshot for forensics

The most important thing to note is that sectors 1-62 are often not used by the operating system: 62 512-byte sectors are 31k, which is plenty of room for bad stuff!
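
One simple way to look at that gap is to list which of its sectors contain anything other than zeros. The following is a minimal sketch run against a made-up in-memory image; the function name nonzero_gap_sectors and the planted bytes are hypothetical.

```python
SECTOR_SIZE = 512

def nonzero_gap_sectors(image, first=1, last=62):
    """Return the sector numbers in [first, last] that contain any
    non-zero byte. `image` is the raw bytes from the start of a disk."""
    suspicious = []
    for number in range(first, last + 1):
        sector = image[number * SECTOR_SIZE:(number + 1) * SECTOR_SIZE]
        if any(sector):
            suspicious.append(number)
    return suspicious

# Hypothetical image: 63 empty sectors with a few bytes planted in sector 40.
image = bytearray(63 * SECTOR_SIZE)
image[40 * SECTOR_SIZE:40 * SECTOR_SIZE + 4] = b"evil"
print(nonzero_gap_sectors(bytes(image)))   # [40]
print(62 * SECTOR_SIZE)                    # 31744 bytes = 31k
```

Non-zero data here is not automatically malicious: boot loaders such as GRUB legitimately keep part of their code in this area, so a hit is a reason to look closer rather than a verdict.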

Another thing to note is that malware can play a *lot* of MBR tricks, if it's clever enough. Hiding sectors by making partition table entries unparseable by ordinary software is an uncommon but possible one.


MBR is limited to 2 terabytes with 512-byte sectors, and 16 terabytes with 4K sectors, because the starting LBA and the size in sectors are each stored in 32-bit fields. With newer drives already available in 3 and 4 terabytes, and aggregation schemes such as RAID creating enormous logical volumes, a new scheme called GPT was necessary.
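
The size limit comes from the 4-byte sector fields in each partition table entry. A quick check of the arithmetic, as a sketch (the function name mbr_limit_bytes is ours):

```python
MAX_SECTORS = 2 ** 32   # largest count a 32-bit sector field can express

def mbr_limit_bytes(sector_size):
    """Largest size addressable with a 32-bit sector count."""
    return MAX_SECTORS * sector_size

for sector_size in (512, 4096):
    tib = mbr_limit_bytes(sector_size) // 2 ** 40
    print(sector_size, "byte sectors:", tib, "TiB")
# 512 byte sectors: 2 TiB
# 4096 byte sectors: 16 TiB
```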

Abstracting the partition concept

Probably the most common abstraction is that of RAID (Redundant Array of Inexpensive Disks). Here we take a number of disk devices, aggregate them into a common RAID pool, and then partition that aggregation.

How can we do this? There are two main ways: hardware RAID, with some sort of RAID controller doing the work and presenting logical drives to the system, and software RAID, where the logical drives are maintained by the operating system.

Forensics and RAID

RAID, and particularly hardware-based RAID-5 and its variants, makes forensics on just the drives very difficult. If you have a hardware RAID situation, the best recourse BY FAR is to keep those drives connected to that RAID controller!