Methods for compressing data. The Huffman greedy algorithm
uses character frequencies.
Code words are not prefixes of other code words. Can
show that optimal
compression can always be achieved with a prefix code, so only consider
them.
a b c = 0 101 100 = 0101100 so
|
aaa |
b |
e |
| 0001011101 = |
000 |
101 |
1101 |
binary tee whose leaves are the code words is a convenient tool to aid
coding.

An optimal code for a file is always represented as a full binary
tree. Full means every non-leaf child has two children. Note: Fixed-length
code tree doesn't. C: Alphabet ® |C| leaves
|C| - 1 interval.
where f(c) is the frequency of c and dt is the level down the
tree (bits required) of c.
Cost of the T tree. Huffman came up with a greedy
algorithm. Q:
priority queue keyed on f implemented via a heap. Take two least frequent
objects and merge them together.
Huffman (C)
n ¬ |C|
Q ¬ C
for i ¬ 1 to n -
1
do z ¬
Allocate-Node ()
x left[z] ¬ Extract-Min
(Q) // least frequent
y right[z]
¬ Extract-Min
(Q) // next least
f[z] ¬ f[x]
+ f[y] // update frequency
Insert ( Q, z
)
return Extract-Min (Q)
merge into z
Why?
Want a full binary tree
Want least frequent objects on the bottom
What happens if x and y are replaced in Q with their
merged values?
Running Time: Q into heap = O (n), n = |C|, n
times through
loop, min extract O ( lg n ) so O ( n lg n )

Lemma: Let x and y be in C and have the lowest
frequencies. Then $
optimal prefix code where x and y have the same length and their
prefix codes differ in the last bit. Why:
-
Must be on the bottom (least frequent)
-
Full tree, so they must be siblings, and so
differ in one bit.
See the book for full proof.
Lemma: Let T be a full binary tree representing an
optimal prefix code
over C. x, y are sibling leaves in T with f (x), f (y) and z is their
parent. Then T' = T - { x, y } with f (z) = f (x) + f (y) is also a
representation of an optimal prefix code.
Proof: B (T) in terms of B (T') " c Î
C - { x, Y } dT (c) = dT' (c) so
f (c) dT (c) = f (c) dT'
(c)
dT (x) = dT (y) = 1 +
dT' (z)
f (x) dT (x) + f (y) dT (y) =
( f (x) + f (y) ) (
1 + dT' (z) )
= f (z) dT' (z) + f (x) + f (y)
B (T) = B (T') + f (x) + f (y)
if T' is non-optimal then T", optimal and can
get B (T) = B
(T") + f (x) + f (y) contradicting T's optimality. T' optimal for C'
= C - { x, y }