Consider computing two tables of hashes, one of ``good messages'' and one of ``bad messages.'' What is the probability that the second table will have a value that also appears in the first?
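This problem is actually easier than the original one. Intuitively, there will be a collision long before the two tables are full. Write $N$ for the number of possible hash values, and assume that we perform $t$ hash computations for the first table (where $t \ll N$). Some of these values will collide with each other (because of the birthday problem, above), but not too many (because $t \ll N$), so we can assume that the number of different hash values in the first table is also approximately $t$.

For each value we compute in the second table, there is a probability $1 - t/N$ that it does not coincide with any value in the first table. So the probability that no value in the second table coincides with any value in the first table is $(1 - t/N)^{t}$ (the second table is also constructed from $t$ hash computations).

The probability of a collision between the tables is therefore:

\begin{align}
P &= 1 - \Pr[\text{no collision between the tables}] \tag{7} \\
  &= 1 - \left(1 - \frac{t}{N}\right)^{t} \tag{8} \\
  &= 1 - \exp\!\left(t \ln\left(1 - \frac{t}{N}\right)\right) \tag{9}
\end{align}

Well, $\ln(1 - x) \approx -x$, and so $(1 - t/N)^{t} \approx e^{-t^{2}/N}$, if $t \ll N$.

Substituting this approximation into the collision probability above, we get that $P \approx 1 - e^{-t^{2}/N}$. Asking for a collision probability of $1/2$ gives $t^{2}/N \approx \ln 2$, or $t \approx \sqrt{N \ln 2} \approx 0.83\sqrt{N}$, so again roughly $\sqrt{N}$ is the answer. Since each table has size $t$, we compute $2t$ hashes, twice as many as before.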
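The estimate can be checked numerically. The following is a minimal sketch rather than part of the original derivation: it assumes hash values are uniformly random integers below `n_values` (the number of possible hash values), that each table holds `t` values, and that a collision probability of 1/2 is the target. All function names are hypothetical and chosen for illustration only.

```python
import math
import random


def collision_probability(t, n_values):
    """Probability 1 - (1 - t/N)^t that some value in the second table
    matches a value in the first, assuming the first table holds about
    t distinct values out of n_values possibilities."""
    return 1.0 - (1.0 - t / n_values) ** t


def approx_collision_probability(t, n_values):
    """The same probability using the approximation (1 - x)^t ~ exp(-t*x)."""
    return 1.0 - math.exp(-t * t / n_values)


def table_size_for_half(n_values):
    """Table size t at which the approximate probability reaches 1/2,
    i.e. the solution of t^2 / N = ln 2."""
    return math.sqrt(n_values * math.log(2))


def simulate(t, n_values, trials, seed=0):
    """Estimate the collision probability by drawing both tables at random
    (an idealized stand-in for a real hash function)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        first = {rng.randrange(n_values) for _ in range(t)}
        if any(rng.randrange(n_values) in first for _ in range(t)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n_values = 2 ** 16            # a toy 16-bit hash, chosen so the simulation is fast
    t = round(table_size_for_half(n_values))
    print("table size t:", t)
    print("formula:     ", collision_probability(t, n_values))
    print("approximation:", approx_collision_probability(t, n_values))
    print("simulation:  ", simulate(t, n_values, trials=2000))
```

For a toy 16-bit hash the three figures should agree closely, which suggests the approximation is adequate whenever the table size is much smaller than the number of possible hash values.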
Take your math with coffee...