>> Okay, we're going to do part 2
of our talk on algorithms today.
And where we left off last time
was issues of proof, in particular
with respect to the binary search code.
So issues of proof in an algorithm are:
are we certain that the algorithm
terminates, or does it run forever?
Are we certain the algorithm delivers
the outcomes that we are asserting?
And correctness usually needs some
sort of support methodology for proving
that algorithms do what
they claim to do.
Quite often in the early going
that support methodology is some
form of mathematical induction.
Another thing is we want to be able
to analyze performance of algorithms.
How fast are they?
How much memory do they require?
And we need some basis
for comparing speeds
and space utilization
of different algorithms.
Two algorithms may
accomplish the same task
but with different degrees
of efficiency.
And we need a way to measure
that and talk about it.
And this is going to be a common
theme from now through the rest
of your time studying computer science.
So it's not like you're
expected to learn it all today.
So for correctness and loop termination,
what you typically want to do is look
at the state of the loop when you enter
the loop and the state exiting the loop.
Loop invariants are statements that
are true in each iteration of the loop
and they're very analogous to induction
steps in mathematical induction.
I'm just going to give
some examples here.
Sequential search.
We said before that sequential search
is so intuitively clear
that it obviously works.
It's a little bit silly to talk
about a form of proof that it works,
but we'll give it a try here.
So, before entering the
loop, t has not been found.
That's a statement which
we'll assert to be true.
Now a loop invariant is that the
current item has not been tested.
That invariant would be true
when you enter the loop body.
And at this juncture, after the
test, the loop invariant would be
that the current item is not t.
So we did not find t. All right,
if we go to the next item
then entering the loop body,
that item has not been tested.
And so that loop invariant is true.
Entering the next iteration of the
loop, if we get to here it means
that we did not execute
a return statement
and therefore t was not
equal to the item.
And so that loop invariant is true
for every iteration of the loop.
So those were examples of loop
invariants that can help build a sort
of framework for proving
correctness of an algorithm.
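To make those invariants concrete, here is a small C++ sketch of sequential search with the invariants written in as comments. The function name, the use of std::vector, and the variable names are my own choices, not from the slides.

```cpp
#include <vector>

// Sequential search: returns true if t occurs somewhere in v.
template <typename T>
bool sequential_search(const std::vector<T>& v, const T& t)
{
    // Before entering the loop: t has not been found.
    for (std::size_t i = 0; i < v.size(); ++i)
    {
        // Invariant on entering the loop body: the current item v[i]
        // has not been tested yet.
        if (v[i] == t)
            return true;   // found t, so we stop immediately
        // Invariant after the test: v[i] is not t, so t is still not found.
    }
    // Every item was tested and none of them equaled t.
    return false;
}
```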
Binary search, as we guessed last week,
is certainly a more complicated problem
to prove correct.
So remember this thing,
this binary search,
this happens to be the lower
bound version of binary search
that I'm talking about here.
And what I've done is repeat
that algorithmic code but insert
in red some loop invariants
to go with it.
Now notice that we have 1,
2, really 3 lines of code
in the body of the algorithm,
I'm sorry, in the body of the while
loop, but we have 7 loop invariants.
When you have more loop
invariants than you do lines
of code you know something
complicated is going on, right.
So what I want to do is just
discuss what these are with you
and then show you in the
narrative, where you can read
in much more detail how these
things lead you to a proof
of termination and a proof of
correctness for the lower bound algorithm.
And remember, the location of
each loop invariant is important.
Invariants 1, 2 and 3 are intended
to be true upon entering
the while loop body.
Four, five, six and seven
are intended to be true
after executing the loop body.
So while low is less than high; well,
invariant 1 says low is less than high,
okay, and that's going to be true
because otherwise we wouldn't
have entered the loop.
Invariant 2: v at the low minus 1 index is less than
t, assuming that index is valid.
So if low is zero, then low minus
1 is not even a valid index,
so that's vacuously true.
On the other hand, if low is a positive
index value, which could happen
because of having
executed the loop a few times,
we need to prove that on
entering the loop body v
of low minus 1 is less than t.
[ pause ]
Finally, invariant 3: if high is a valid index,
we need to show that t is less
than or equal to the vector
value at the high index.
Now you might note here that 6 and 7
are identical to 2 and 3,
but they hold after the change
has been made in the loop body.
So either the value of low has been changed
or the value of high has been changed.
And then you want to claim that
those four things, 4 through 7, are true.
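Here is a C++ sketch of the lower bound binary search with those invariants written in as comments, roughly where the red annotations sit on the slide. The index convention (high starts one past the end) and the wording of invariants 4 and 5, which the recording does not spell out, are my own reading; 1, 2, 3, 6 and 7 follow the slide.

```cpp
#include <vector>

// Lower bound binary search on a sorted vector v:
// returns the smallest index i such that t <= v[i],
// or v.size() if every element is less than t.
template <typename T>
std::size_t lower_bound_search(const std::vector<T>& v, const T& t)
{
    std::size_t low = 0;
    std::size_t high = v.size();
    while (low < high)
    {
        // (1) low < high, because we just passed the loop test.
        // (2) if low > 0, then v[low - 1] < t.
        // (3) if high < v.size(), then t <= v[high].
        std::size_t mid = low + (high - low) / 2;
        if (v[mid] < t)
            low = mid + 1;   // everything at or below mid is < t
        else
            high = mid;      // t <= v[mid], so the answer is at or below mid
        // (4), (5) my reading: low <= high, and the range [low, high) is
        //          strictly smaller than before, which gives termination.
        // (6) if low > 0, then v[low - 1] < t.
        // (7) if high < v.size(), then t <= v[high].
    }
    return low;              // low == high: the lower bound position
}
```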
[ pause ]
So let me show you in the
narrative; oops, this is a repeat
of that slide picture there.
What we have is assertion
1: low is less than high.
Then assertion 2 and assertion 3;
assertion 3 is a tricky one.
It's the one that says t
is less than or equal to v
of high if the index is valid.
[ pause ]
So, assertion 4 depends on
evaluating two cases.
If v of mid is less than t,
then low has been redefined;
otherwise, high has been redefined.
And in each of those two cases
you have to show 4, 5, 6 and 7 are true.
Once you divide it into cases,
they reduce to straightforward
substitution and algebra.
But it's logically a
little bit complicated just
to keep track of all of those cases.
So this is the second case, where v of
mid is bigger than or equal to t.
In other words, v of mid less than t
is one case in which the proof
has to be verified,
and v of mid bigger than or equal to t
is the other case
in which things have to be verified.
In the first case
low has been redefined,
and in the second case high
has been redefined.
Both need to be verified.
So it's not deep.
It's all just pretty straightforward
computation algebraically computing
things and reasoning about inequalities.
But the fact that it divides
into two cases makes it logically
a little more complicated.
The fact that you've got four
different things to prove in each
of those two cases means you've really
got eight things to prove and so forth.
So it's non-trivial to prove it.
So everyone should go to that
sometime, write it down in a notebook,
and make sure you can follow that proof
that I've outlined there
in the narrative.
[ pause ]
Now this is where we start kind
of a different parallel topic.
And I'm going to review this notion
of computational complexity here.
And what we're after is comparing the
growth rate of two different functions.
The domains of these functions
are assumed to include all
or most non-negative integers;
by most, I mean that after a certain
point they include all of them.
So all integers bigger
than 5 for example.
Typically it includes all
integers bigger than zero or 1.
And these integers, in the
algorithm setting, represent the size
of some object that you're operating on.
The values of these functions should
also be non-negative real numbers.
And typically, in our algorithm applications,
these values are going to
represent the elapsed time
of the computation,
if you're estimating speed,
or the memory footprint size,
if you're estimating the amount of
memory required to make the computation.
And of course the speed
is a non-negative number.
The size of memory is
a non-negative number.
So the other thing is we'd like for this
to be independent of initial values.
We'd like for it to be
independent of constant multipliers
and, in general, of
lower order effects.
I realize that's vague.
We'll get some examples of what that
means but the basic idea is you would
like to be able to evaluate
the algorithm
without saying what computer language
is used to implement the algorithm,
without saying what compiler is
used to create executable code,
and without specifying what
hardware you're going to run it on.
Now all of those things introduce
variables, variable amounts of time
and so forth that aren't relevant
to the basic algorithm itself.
And so you would like for
your measure of complexity
to be independent of those things.
And that's really the motivation
behind computational complexity.
All right, our notation is
slightly different from what you see
in your Discrete Mathematics book,
and also slightly different
from what you see in your textbook.
What I have done is use less than or
equal to, greater than or equal to
and equal to where in those other
sources they always use equal to.
And it's only a notational difference.
But I think it helps get the points
across about these three different
notions of asymptotics here.
The first one and the
one you see most often
in the literature especially the
popular computing literature is Big O.
So we have a function g and we
have a function f. We say that g is
less than or equal to Big O of f
if there exist constants c and n naught,
where c represents a constant in
the value set for the functions
and n naught represents a constant in
the domain set of the functions.
And what this says is that g of n, the
actual value of g is less than or equal
to f of n, the actual value of f,
multiplied by this constant as long
as we are sufficiently far out
in the numbers past n naught.
So another way to say that would be g
is asymptotically bounded above by f;
f is above g asymptotically
if this holds.
And this is why I like to use the
less than or equal to symbol here
to remind you that this
doesn't mean equality.
Now Big Omega is just
the opposite of that.
Again we've got g and f and we're
using bigger than or equal to.
And there's a constant c which
is in the range of values.
And an n naught which is
in the domain value set.
And we want c times f to be less than or
equal to g for all sufficiently large n.
So in other words, a constant multiple
of f is underneath g
for all sufficiently large n.
In other words, that says g dominates f.
The first one says g is subordinate to f;
the second one says g is
superordinate to f. Big Theta is really both.
So g is Theta of f means there are two
constants c1 and c2 and an n naught,
such that the first constant times
f is less than or equal to g,
and the second constant times
f is bigger than or equal to g,
for all sufficiently large n. So,
in other words, g is trapped
between two different
constant multiples of f
which kind of makes a band of values.
And g is trapped in between
there forever.
So that's what Big Theta means.
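Written out symbolically, using the less-than-or-equal, greater-than-or-equal and equals notation from the lecture (the rendering is mine), the three definitions are:

```latex
% Big O: g is asymptotically bounded above by f
g \le O(f) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ g(n) \le c\, f(n) \ \text{for all}\ n \ge n_0

% Big Omega: g asymptotically dominates f
g \ge \Omega(f) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ c\, f(n) \le g(n) \ \text{for all}\ n \ge n_0

% Big Theta: g is trapped between two constant multiples of f
g = \Theta(f) \iff \exists\, c_1, c_2 > 0,\ n_0 \ \text{such that}\ c_1 f(n) \le g(n) \le c_2 f(n) \ \text{for all}\ n \ge n_0
```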
There's good stuff in
the narrative about this.
I guess maybe I'll click
on the narrative
and just show you a couple of things.
[ pause ]
I'm maybe not dexterous
enough to do that.
So I'll, well I can just see them
through this little [inaudible].
Yes, so the theorem says that if f is
Big O of g and g is Big O of h,
then f is Big O of h.
So Big O is transitive as a relation.
If f is Big Omega of g and g is
Big Omega of h, then f is Big Omega
of h. It's transitive as a relation.
And finally the same is true for
Theta since those are all transitive.
There's an antisymmetry
that reminds you
of the inequalities for
just normal numbers.
f is Big O of g if and only if g is Big Omega
of f. That's kind of like saying if
a is less than or equal to b, then b is bigger
than or equal to a. And
there's a symmetry.
And that's for equality, which
is Theta: f is Theta of g if and only
if g is Theta of f. And
finally there's reflexivity,
which says that everything
is related to itself.
So f is Big O of f, f is Big Omega of f
and f is Big Theta of
f. No surprise there.
So there's other little factoids in
there that you should read about.
Dichotomy.
Dichotomy says, okay, for numbers that
would say if a is less than or equal to b
and b is less than or equal to
a, then a equals b, right.
Works the same for these asymptotics.
If f is less or equal to Big O of g
and g is less than or equal to Big O
of f then f is Big Theta of g.
[ pause ]
Okay so that's the asymptotics
of just functions, right,
kind of a theory that
you've seen before.
How does that relate
again to algorithms?
Well, what you do is, okay,
you've got some algorithm;
let's say sequential search.
And you want n to be some
measure of the size of the input.
So for sequential search that would be,
let's say the size of the search space.
Going back to the bookshelf,
how many books are on the shelf?
That's n. Find some atomic
notion of computational activity.
That sounds like physics almost.
But what you want is something to
count that represents the amount
of work being done by the algorithm.
And ideally you want that
to be simple to identify,
which will make your
subsequent analysis easier.
Now once you've done that then f of n
is the number of those atomic things
that get done during the run of
the algorithm on input of size n.
So you now have a function whose
asymptotics you can try to figure out.
And the complexity of the algorithm
is, by definition,
the complexity of that function.
So let's check out an example here.
So this is a straight
loop; it goes from zero to n,
and three things happen inside the loop body.
That loop body gets
executed n times, right,
so 3n atomic computations occur.
And so you can conclude that
the complexity is Theta of 3n.
But notice that Theta of 3n
is the same as Theta of n
because 3 is just constant, right.
So by just using a different constant,
3n and n have the same asymptotics
which tells you you didn't really need
to count all three of the atomics here.
You could have gotten by with just
one and you get the same result,
namely Theta of n. That's why it's
good to be careful and judicious
in choosing the atomics
that you want to count.
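Here is a sketch of the kind of loop being described; the three statements in the body are just placeholders standing in for whatever three atomic computations you choose to count.

```cpp
#include <vector>

// One straight loop, three atomic computations per iteration, n iterations:
// 3n atomics in total, and Theta(3n) is the same as Theta(n).
int simple_loop(const std::vector<int>& a)
{
    int sum = 0, count = 0, last = 0;
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        sum += a[i];   // atomic 1
        count += 1;    // atomic 2
        last = a[i];   // atomic 3
    }
    return sum + count + last;
}
```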
Now this is a slightly more complicated
situation because we have a loop
that runs from zero to
n with some atomics.
But the problem is there's a
conditional break in this loop.
So if some condition holds true then
we don't continue executing any more.
So this loop might run n times or
it might run less than n times.
So because of that we cannot conclude
that the complexity of this is Theta
of n. But we can conclude that
it is less than or equal to Big O
of n. The fact is, there may be
an early bailout, which would make it run
in time less than n.
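A sketch of that situation, with a made-up early-exit condition just to illustrate the point:

```cpp
#include <vector>

// Same kind of loop, but with a conditional break: it might run n times,
// or it might bail out early, so we can only say the complexity is O(n).
int loop_with_break(const std::vector<int>& a, int limit)
{
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        sum += a[i];       // atomic work each iteration
        if (sum > limit)   // hypothetical condition for stopping early
            break;         // early bailout: fewer than n iterations
    }
    return sum;
}
```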
Another example is how these things
stack up with each other.
So suppose we have one loop followed
by another loop sequentially.
The first loop runs n times
with no early bailout.
The second loop runs n
times with no early bailout.
And we count atomics here and there.
Well, the first loop is going to do
3n atomic computations,
and the second is going
to do 5n atomic computations.
So it looks like
it's 3n plus 5n, which is 8n, right.
And because there's no early bailout we
can conclude that it's Theta of that.
But really 8n and n are the same.
And so it's still just Theta of n.
So even though we've got two
loops executing in sequence,
the asymptotic complexity
hasn't changed.
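A sketch of two loops in sequence; the bodies are placeholders, chosen only so the first does three atomics per iteration and the second does five:

```cpp
#include <vector>

// First loop: 3n atomics. Second loop: 5n atomics. Total 8n,
// and Theta(8n) is still just Theta(n).
void two_loops(std::vector<int>& a)
{
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        a[i] += 1;   // atomic 1
        a[i] *= 2;   // atomic 2
        a[i] -= 3;   // atomic 3
    }
    for (std::size_t i = 0; i < n; ++i)
    {
        a[i] += 1;   // atomic 1
        a[i] += 2;   // atomic 2
        a[i] += 3;   // atomic 3
        a[i] += 4;   // atomic 4
        a[i] += 5;   // atomic 5
    }
}
```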
[ pause ]
This is a different situation
where you have a loop and inside
of that loop you have another loop.
So here we have a for
loop that runs n times,
and inside of that we have a
for loop that runs n times.
And let's say there's two atomics
outside the inner for loop
and three atomics inside
the inner for loop.
Well, the inner for loop does 3n atomic computations.
So one execution of the body
of the outer loop is 2 plus 3n.
The outer loop runs n times,
and so we have 2 plus 3n, times
n, atomics in the execution
of the entire nested loop structure.
And you fiddle that around and you
see that it uses 2n plus 3n squared.
And the fact is 2n plus 3n squared
has the same asymptotic run time
as just plain n squared.
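A sketch of the nested case; again the bodies are placeholders, arranged so there are two atomics outside the inner loop and three inside it:

```cpp
#include <vector>

// Each outer iteration does 2 atomics plus a full inner loop of 3n atomics,
// so it costs 2 + 3n. The outer loop runs n times: n(2 + 3n) = 2n + 3n^2,
// which has the same asymptotics as n^2.
long nested_loops(const std::vector<int>& a)
{
    const std::size_t n = a.size();
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        total += a[i];           // atomic 1, outside the inner loop
        total += 1;              // atomic 2, outside the inner loop
        for (std::size_t j = 0; j < n; ++j)
        {
            total += a[j];                       // atomic 1, inside
            total -= 1;                          // atomic 2, inside
            total += static_cast<long>(i * j);   // atomic 3, inside
        }
    }
    return total;
}
```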
There are exercises to help you
recall why that is true; some
of them are in the narrative,
and there are going to be more in your textbook.
So we can conclude that
this complexity is less than
or equal to Big O of n squared.
And by the way we can also conclude that
the complexity is equal to Big Theta
of n squared because there's no early
bailing out in any of these loops.
Those are both true statements,
I want to emphasize.
So it's true that its
complexity is Big O of n squared.
It's also true that its
complexity is Big Theta of n squared.
And by the way we could also
say Big Omega of n squared.
And those all three would be true.
The strongest statement of
course is the one about Theta.
It tells you more than the other two.
So let's take a look
at sequential search.
But I'm going to simplify it this
time where there's no early return.
So here is kind of a simplified
version of sequential search.
T is our type; item is
the first thing in the set,
on the bookshelf; and found starts as false.
While there are items left in L, we check
to see if t is equal to the item.
If it is, we set found to true.
And then we go to the next item.
And we do this for every
item on the bookshelf
and return found after the loop is over.
If found ever gets set to true,
it never gets set to false again.
And so if the item is
there true gets returned.
If the item is never found
then found never gets set
to be true and false gets returned.
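A C++ sketch of that simplified version; I'm using a std::vector for the container L, and the function name is mine.

```cpp
#include <vector>

// Sequential search with no early return: the loop always runs exactly n
// times, with one == comparison per iteration, so the complexity is Theta(n).
template <typename T>
bool sequential_search_full(const std::vector<T>& L, const T& t)
{
    bool found = false;
    for (std::size_t i = 0; i < L.size(); ++i)
    {
        if (L[i] == t)      // the atomic computation we are counting
            found = true;   // once true, it never goes back to false
    }
    return found;
}
```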
So this is performing the same
function as the sequential search we had
with the early return, but it's
a little simpler to analyze.
And it's simpler to analyze because
of this loop: how many times does this loop run?
It runs exactly as many times as
we have books on the bookshelf, or items
in the set L.
The notion of input size
over here is the size
of the container or the bookshelf.
The atomic computation we're going
to count is the comparison here.
We'll just count the number
of times we have a call
to the double equals operator.
That happens exactly once in
each trip through the loop body.
And so it's going to be one compare
times the number of iterations
which is 1 times n. No early bailouts
so the complexity of this is Theta
of n. Now let's go back to
a real sequential search
which has early bailout.
Just to remind you, the distinction
here is that instead of keeping track
of whether we found it or not
all the way through the loop,
we return true right out
of the depths of that loop
as soon as we find the item.
So functionally this
is going to be the same:
it's going to return
true if the item is in the set,
false if the item is not in the set.
But it runs slightly more efficiently,
because once we find it we don't have
to keep looking for it; that is
the basic principle here.
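And a sketch of the early-return version for comparison, again with names of my own choosing:

```cpp
#include <vector>

// Sequential search with early return: same result, but it stops as soon as
// it finds t, so the iteration count is at most n and the complexity is O(n).
template <typename T>
bool sequential_search_early(const std::vector<T>& L, const T& t)
{
    for (std::size_t i = 0; i < L.size(); ++i)
    {
        if (L[i] == t)      // the atomic computation we are counting
            return true;    // early bailout, straight out of the loop
    }
    return false;
}
```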
So the analysis goes similarly,
except that the number of iterations
of the loop body is not necessarily n.
The best we can say is that it's less
than or equal to n, because
of the potential early return.
And therefore we can conclude that
the algorithm complexity is less than
or equal to Big O of n.
But we cannot conclude
that the algorithm complexity
is equal to Theta of n.
So this is a weaker conclusion
about the algorithm.
Which is a little hard to get your arms
around because we've actually
improved the run time of the algorithm.
So if we say Theta of n, it means
that it always runs n times, right.
If we say Big O of n, it means
that it can't exceed n
but it might use less
time to complete its work.
[ pause ]
Which brings us to binary search.
And again we will use the
lower bound algorithm.
So our notion of input size
is the size of the bookshelf,
the number of items in the container.
The comparison will be
our atomic computation;
this time, though, it's the less-than
operator, not the equals operator.
That gets called exactly one time
in each execution of the loop body.
Oh, by the way, I said I'm counting
the less-than operator,
but of course what I mean
is the less-than operator
on the type t for the search space.
So I'm not counting an
integer less-than operator,
only a type t less-than operator.
The integer less-than comparison
is more of a control structure
for the loop, and
I'm not counting that.
Not that it would hurt anything,
because there's exactly one of those
and exactly one type t compare for
each run of the loop body.
So anyway, our f of n is
one compare times the number
of iterations of the loop body.
Now I'm going to just assert that
this loop body runs log n times;
we'll talk about why
that's true in a second.
And so we are going to
claim Theta of log n as
the asymptotic run time
of this algorithm.
And therefore the algorithm
complexity is Theta of log n.
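As a quick sketch of why the iteration count is logarithmic (my own summary; the full argument is in the narrative): each pass cuts the current search range roughly in half, so

```latex
\text{after } k \text{ iterations:}\quad \text{high} - \text{low} \;\le\; \frac{n}{2^{k}},
\qquad\text{and}\qquad
\frac{n}{2^{k}} < 1 \iff k > \log_2 n
```

which means the loop body runs at most about the ceiling of log base 2 of n times, and that is Theta of log n.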
And this is what we want in this case.
We do not want to try to improve
this and get Big O of log n by trying
to detect whether or not we found
it at some intermediate stage.
And I'm just going to take a
little digression to explain why.
We were sampling
only at the midpoint
of the current search range.
So the question is, how
many samplings do we make
when there are only log
n iterations of this?
We only sample log
n values in the range.
Say n is 1,000; we're
talking log base 2 here,
so log n is about 10.
We only sample 10 items,
and there are 1,000 items on the bookshelf.
The odds are we didn't sample the
one we were looking for, right,
because 10 in 1,000 is 1 in 100,
a 1% chance that we sampled
the one we were looking for.
So if we build in a test for whether we
found it, that test is going to fail
99 times out of 100.
So it's going to be essentially
wasted computation, and so why do it?
So we don't go down that
road for binary search.
For sequential search we do, because
every single item gets looked
at one at a time.
So as soon as you find it you
save yourself a lot of work,
and you know you will
find it if it's there.
In binary search you know that you
almost certainly will not find it
serendipitously when
you do calculate v of mid.
And so it doesn't make sense to
ask the question, did I find it?
The probability of success
on that is very low.
So this is something that will come
up again, probably several times.
But a key to helping you
understand the subtleties between O of n
and Theta of n, or Big O and Big Theta,
is to just remember, hold onto the desk:
our optimal sequential
search algorithm is Big O
of n, and our optimal binary
search algorithm is Big Theta
of log n. All right,
that completes the...
[ Pause ]