Dynamic Programming

We can introduce dynamic programming where we left off in Divide & Conquer, illustrating with the Fibonacci recursion. A Fibonacci calculator can be implemented just by translating the recurrence into a recursive function:

unsigned long fibrec(unsigned n)
{
  if (n == 0) return 0;
  if (n == 1) return 1;
  return (fibrec(n - 1) + fibrec(n - 2));
}

It is a simple matter to code up a driver calling this function, so we can submit and go home, right? If we do that, when our developer colleagues try to use it they find it extremely slow and return the code to us asking for some optimization.

Exercise 1. Code a driver for fibrec and experiment.

The problem, it turns out, is that fibrec(n) requires fibrec(n-1) and fibrec(n-2), while fibrec(n-1) requires fibrec(n-2) and fibrec(n-3) - so already at the first layer of recursive calls fibrec(n-2) is calculated (recursively) twice. Continuing, we see the following tree of calls to fibrec (showing only the argument):

                                           (n)
                     (n-1)                                     (n-2)
           (n-2)             (n-3)                   (n-3)             (n-4)
      (n-3)     (n-4)   (n-4)     (n-5)         (n-4)     (n-5)   (n-5)     (n-6)
       / \       / \     / \       / \           / \       / \     / \       / \
      .   .     .   .   .   .     .   .         .   .     .   .   .   .     .   .

The recursive call order is a pre-order traversal of the call tree, so, for example, when fibrec(n-3) requires fibrec(n-4) that value is not on the call stack and hence there is no alternative but to calculate the value. Each node of this (essentially complete) binary tree of calls must be evaluated to get fibrec(n), with the recursion stopping only when a base case is reached at the bottom of the tree. The call tree has on the order of 2^n nodes, and therefore:

Proposition. The runtime of fibrec(n) is Θ(2^n).

Exponential runtime is not acceptable - the runtime doubles when the input increases by 1.

1 Memoization

The culprit in the horrible runtime is the repeated re-calculation of values. A possible workaround is the memo pad: when a value is calculated, write it on a memo pad, and when that value is needed again, look it up instead of re-calculating it. Here is sample code for such memoization of fibrec:

// 1. Create a non-recursive executor function that creates a memo pad
typedef fsu::HashTable<unsigned, unsigned long> AA; // memo type = associative array
unsigned long fib_memo_AA(unsigned n);           // executor
unsigned long fib_memo_AA(unsigned n, AA& memo); // recursive call

unsigned long fib_memo_AA(unsigned n)
{
  unsigned b = 2;
  if (n > 2) b = n;
  AA memo(b);                 // <-- create memo pad
  memo[0] = 0;                // <-- supply base case values (optional)
  memo[1] = 1;                // <-- supply base case values (optional)
  return fib_memo_AA(n,memo); // <-- call recursive version, passing memo by reference
}

// 2. Implement the old "recursive" function with lookup added, passing memo by reference
unsigned long fib_memo_AA(unsigned n, AA& memo)
{
  unsigned long f;
  if (memo.Retrieve(n,f)) // memo lookup, includes base cases for recursion
  {
    return f;             // if successful lookup we're done, otherwise ...
  }
  f = fib_memo_AA(n-1,memo) + fib_memo_AA(n-2,memo); // ... calculate ...
  memo[n] = f;            // ... and store for future lookup ...
  return f;               // ... before returning value
}

Lemma. If n > 1, the call fib_memo_AA(n) triggers 2n - 1 calls to the recursive version fib_memo_AA(*,memo): n - 1 calls result in calculation and store, and n calls result in memo lookup.

Proof. The memo eventually stores all n + 1 values f_0, f_1, f_2, ... , f_n. Because the first two are stored by the executor function, the remaining n - 1 are stored by calls to the recursive function. All other calls must result in memo lookup. We count the recursive calls using mathematical induction. For the base case n = 2: Let r = 0 count the recursive calls. The executor calls fib_memo_AA(2,memo) (r = 1); the lookup fails, so fib_memo_AA(1,memo) is called (r = 2) and succeeds by lookup, and then fib_memo_AA(0,memo) is called (r = 3) and also succeeds by lookup.
This gives r = 3 = 2×2 - 1 as required. For the inductive step, suppose the result is true for all numbers less than n. Then the call fib_memo_AA(n,memo) itself results in a calculate-and-store (one call), its first recursive call fib_memo_AA(n-1,memo) triggers 2(n-1) - 1 = 2n - 3 calls by the inductive hypothesis, and after those calls complete the value for argument n-2 is already in the memo, so the second recursive call fib_memo_AA(n-2,memo) results in a single memo lookup.
This yields a total of 2 additional recursive calls, or 2n - 3 + 2 = 2n - 1. ∎

Proposition. The runtime of fib_memo_AA(n) is Θ(n).

Proof. By the Lemma, there are 2n - 1 recursive calls, n - 1 calculate/store operations and n table retrieve operations. Thus the cost is

  T(n) = (2n - 1)×(cost of a call) + (n - 1)×(cost of a store) + n×(cost of a retrieve).
The cost of each table operation is O(1), due to the way the table parameters (number of buckets) are set up in the executor. Therefore

  T(n) = (2n - 1)×O(1) + (n - 1)×O(1) + n×O(1) = Θ(n),
as required. ∎

Thus, for example, the runtime to calculate Fib(30) drops from more than 1,000,000,000 to less than 100 - a reduction by a factor of more than 10,000,000. The team will be pleased!

1.1 Time/Space tradeoff

The exponential runtime of Fib(n) has been reduced from Θ(2^n) to Θ(n) at the cost of an additional +Θ(n) space usage. (This is on top of the space required for the runtime stack to push the recursive calls.) Clearly any use of the memo technique on another algorithm will have a similar effect. If the runtime is reduced in degree this is almost always a reasonable tradeoff. (In some special circumstances this extra space requirement can also be reduced, as we shall see for Fib.)

1.2 Bottom-Up Re-Write

It is often the case that after memoization of an algorithm an iterative bottom-up version can be constructed with three nice properties:

1. The recursion - and hence the runtime stack overhead - is eliminated.
2. The memo structure is simplified, often to a plain vector or a few variables.
3. The runtime and runspace analysis becomes completely transparent.
Continuing with the Fibonacci example, we can replace the somewhat elaborate hash table implementation of memo with a simple vector, even in the top-down version:

// 1. Create a non-recursive executor function that creates a memo pad
unsigned long fib_memo_Vector(unsigned n)
{
  fsu::Vector<unsigned long> memo(1+n,0); // all elements initialized to 0
  if (n > 0) memo[1] = 1;
  return fib_memo_Vector(n,memo);
}

// 2. Implement the old "recursive" function with lookup added, passing memo by reference
unsigned long fib_memo_Vector(unsigned n, fsu::Vector<unsigned long>& memo)
{
  // base cases already in memo
  if (n < 2) return memo[n]; // handle separately because memo[0] == 0
  unsigned long f;
  if (memo[n] != 0) // any updated element is > 0, except memo[0]
  {
    f = memo[n];
  }
  else
  {
    f = fib_memo_Vector(n-1,memo) + fib_memo_Vector(n-2,memo);
    memo[n] = f;
  }
  return f;
}

The bottom-up version just builds the array in a loop:

unsigned long fib_bu_Vector(unsigned n)
{
  if (n == 0) return 0;
  if (n == 1) return 1;
  // can use an ordinary array for the ladder of values f_0 .. f_n:
  fsu::Vector<unsigned long> memo(1+n, 0);
  memo[1] = 1;
  for (unsigned i = 2; i <= n; ++i)
    memo[i] = memo[i-1] + memo[i-2];
  return memo[n];
}

We could export the vector and have a very fast calculator of f_0, f_1, f_2, ... , f_n. Note also that the runtime stack space is reduced to +Θ(1) and the runtime analysis is completely straightforward: the loop body executes n - 1 times and does O(1) work per iteration, so the runtime is Θ(n) (with the +Θ(n) space of the vector).
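For experimenting along the lines of Exercise 1, a minimal driver might look like the sketch below. This is an illustration only: the file name and the timing via clock() are illustrative choices, not part of the course library.

// fib_drive.cpp - hypothetical driver sketch for comparing the calculators
#include <iostream>
#include <cstdlib>
#include <ctime>

// prototypes for the calculators defined earlier in these notes
unsigned long fibrec(unsigned n);
unsigned long fib_bu_Vector(unsigned n);

int main(int argc, char* argv[])
{
  if (argc < 2)
  {
    std::cout << "usage: " << argv[0] << " n\n";
    return EXIT_FAILURE;
  }
  unsigned n = static_cast<unsigned>(atoi(argv[1]));

  clock_t t0 = clock();
  unsigned long slow = fibrec(n);
  clock_t t1 = clock();
  unsigned long fast = fib_bu_Vector(n);
  clock_t t2 = clock();

  std::cout << "fibrec(" << n << ")        = " << slow << "  [" << (t1 - t0) << " ticks]\n";
  std::cout << "fib_bu_Vector(" << n << ") = " << fast << "  [" << (t2 - t1) << " ticks]\n";
  return EXIT_SUCCESS;
}

Running such a driver for increasing n makes the contrast vivid: the recursive version becomes noticeably slow well before n reaches 50, while the bottom-up version remains essentially instantaneous.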
We will leave Fibo alone for now and return at the end of this chapter.
2 Knapsack Problem

The Knapsack Problem, technically the 0-1 Knapsack Problem, is a classic demonstration of the use of dynamic programming. It is also an important kind of problem in many domains: it maximizes the value obtained from available resources when there is a limit on how much can be consumed. Unfortunately, most explanations of this problem offer up a cheesy analogy of a burglar in an art gallery with a constrained carry capacity. I found one video that omits this dubious tradition and another that, once you get past a bit of profanity and the cheese, is also a good down-to-earth explanation. These are:

The "0-1" in the Knapsack Problem simply refers to the assumption that resources are discrete and indivisible - you have to take a resource or leave it, you can't take less than the whole. (There is a "Fractional Knapsack Problem" where the resources are divisible - for example, money, energy, fuel, and yes for the burglar analogy, gold dust. That version is easier to solve, and the greedy approach works in that case. See Greedy notes.)

Knapsack variables and notation:

  R = {r_1, r_2, ... , r_n}  - the set of available resources (items)
  n                          - the number of resources
  v_i                        - the value of resource r_i
  w_i                        - the weight of resource r_i
  W                          - the maximum total weight the knapsack can hold
  S ⊆ R                      - a selection of resources
We assume that weights are non-negative integers and values are non-negative real numbers. The weight (and value) of a subset S ⊆ R is defined as the sum of the weights (values) of the items in the subset:

  w(S) = Σ { w_i : r_i ∈ S }    and    v(S) = Σ { v_i : r_i ∈ S }
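These definitions translate directly into code. The helpers below are a sketch (hypothetical names, not part of the course library); they use the same 1-based item indexing as the Knapsack code in this section, with a std::vector<bool> of size n+1 marking which resources belong to S.

#include <vector>   // std::vector<bool> marks the selection

// weight of a selection: w(S) = sum of w_i over the selected resources
size_t WeightOf(size_t n, const std::vector<bool>& inS, const fsu::Vector<size_t>& w)
{
  size_t total = 0;
  for (size_t i = 1; i <= n; ++i)
    if (inS[i]) total += w[i];
  return total;
}

// value of a selection: v(S) = sum of v_i over the selected resources
double ValueOf(size_t n, const std::vector<bool>& inS, const fsu::Vector<double>& v)
{
  double total = 0.0;
  for (size_t i = 1; i <= n; ++i)
    if (inS[i]) total += v[i];
  return total;
}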
Knapsack Goal: Find a subset S of R such that v(S) is maximized subject to the constraint w(S) ≤ W. In other words we want to find:

  1. the optimal value V = max { v(S) : S ⊆ R and w(S) ≤ W }, and
  2. a subset S ⊆ R with w(S) ≤ W and v(S) = V.
For the present we will work on finding the optimal value V. Once we have algorithms for that, we will return to examine what is needed to produce the solution for goal 2. We can devise a recursive solution to Knapsack (which we call KS for the rest of this section) as follows: Let KS(n,W) denote the optimal value for resources r_1, ... , r_n and weight limit W, and assume we can solve the smaller problems KS(k,W') for k < n (and any W' ≤ W). Then we should be able to construct a solution to KS(n,W) in two cases:

  Case 1: r_n is not used in an optimal solution. Then KS(n,W) = KS(n-1,W).
  Case 2: r_n is used in an optimal solution (possible only when w_n ≤ W). Then KS(n,W) = v_n + KS(n-1, W - w_n).

Taking the better of the two cases: KS(n,W) = max( KS(n-1,W) , v_n + KS(n-1,W-w_n) ) whenever w_n ≤ W, and KS(n,W) = KS(n-1,W) otherwise.
The base case is KS(0,W) = 0 (and, considering W also a variable, KS(n,0) = 0). In other words, if there are no resources there is nothing to put in the knapsack, and if the max weight is zero the knapsack will not hold anything. This recursive solution can be coded as follows (using vectors to input the data):

double KS_rec(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w)
{
  double result;
  if (n == 0 || W == 0)
    result = 0;
  else if (w[n] > W)
    result = KS_rec(n-1,W,v,w);
  else
    result = fsu::Max(KS_rec(n-1,W,v,w), v[n] + KS_rec(n-1,W-w[n],v,w));
  return result;
}

This recursive function can be used with a simple driver and it does solve the problem. However it suffers from the same defect as the recursive Fibonacci calculator - many sub-problems get solved over and over, making the worst case runtime exponential in n.

Exercise 2. Code up a driver and test KS_rec. As we introduce improved versions, add them to the driver, ultimately testing them all.

2.1 Knapsack Memoized - HashTable

Following the process illustrated for Fibo in Section 1, we can memoize the Knapsack algorithm. Because it is not immediately clear how much memo will be needed, the most straightforward path to memoization is to use an associative array with an initial guess for memo size and an upgrade subroutine when expansion is needed:

// memoized recursive solution (AA version)
typedef fsu::HashTable<fsu::String,double> AA;
double KS_memo_AA(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w);            // executor
double KS_memo_AA(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, AA& memo);  // recursive call

double KS_memo_AA(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w)
{
  size_t b = fsu::Max(n,W); // <-- WAG
  AA memo(b);
  double rval = KS_memo_AA(n, W, v, w, memo);
  // memo.Analysis(std::cout); // <-- illustrating possible check on table efficiency
  return rval;
}

double KS_memo_AA(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, AA& memo)
{
  static float expansion_factor = 1.5;
  fsu::String key = fsu::ToDec(n) + "." + fsu::ToDec(W);
  double result;
  if (memo.Retrieve(key,result))
    return result;
  if (n == 0 || W == 0)
    result = 0;
  else if (w[n] > W)
    result = KS_memo_AA(n-1,W,v,w,memo);
  else
    result = fsu::Max(KS_memo_AA(n-1,W,v,w,memo), v[n] + KS_memo_AA(n-1,W-w[n],v,w,memo));
  memo[key] = result;
  if (memo.Size() > expansion_factor * memo.NumBuckets())
    memo.Rehash(expansion_factor * memo.Size());
  return result;
} // KS_memo_AA

To use a pair of integers as a hashtable key we convert them each to a digit string and concatenate with a dot between. This simple expedient can likely be improved.

2.2 Knapsack Memoized - Matrix

Associative arrays work as memo pads and (correctly parametrized) do not add to the asymptotic runtime. However if one can predict the memory needs exactly in advance, then a vector or matrix structure will be much simpler and have considerably smaller overhead associated with storage and retrieval. In the case of Knapsack, we predictably might need memory for all sub-problems KS(i,j) for 0 ≤ i ≤ n and 0 ≤ j ≤ W.
This storage requirement is met by an (n+1)×(W+1) matrix:

// memoized recursive solution (Matrix version)
typedef fsu::Matrix<double> Matrix;
double KS_memo_Matrix(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w);                // executor
double KS_memo_Matrix(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, Matrix& memo);  // recursive call

double KS_memo_Matrix(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w)
{
  fsu::Matrix<double> memo(n+1,W+1,0.0);
  double rval = KS_memo_Matrix(n, W, v, w, memo);
  return rval;
}

double KS_memo_Matrix(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, Matrix& memo)
{
  double result = memo[n][W];
  if ( result != 0 ) return result;
  if (n == 0 || W == 0)
    result = 0;
  else if (w[n] > W)
    result = KS_memo_Matrix(n-1,W,v,w,memo);
  else
    result = fsu::Max(KS_memo_Matrix(n-1,W,v,w,memo), v[n] + KS_memo_Matrix(n-1,W-w[n],v,w,memo));
  memo[n][W] = result;
  return result;
} // KS_memo_Matrix

Note that the memory requirement for KS_memo_Matrix is explicitly +Θ(nW) - a hefty amount when both n and W are large.

2.3 Knapsack Bottom-Up

The bottom-up ("brute force") algorithm can be easily derived from KS_memo_Matrix - just calculate the matrix left-to-right, top-to-bottom, with a pair of nested loops. This has all the advantages of simplicity and transparent analysis we obtained earlier for Fibo. However, note that it does commit us to calculating solutions to all sub-problems, which may be wasteful. The prototype is the same as for the executors above (a sketch of one possible implementation is given at the end of Section 2.4 below):

// bottom-up
double KS_bu(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w);

Proposition. KS_bu has runtime Θ(n×W) and runspace +Θ(n×W).

Proof. The memory used is an (n+1)×(W+1) matrix and the execution is essentially a traversal of that matrix. ∎

2.4 Storage requirement

The amount of storage a given Knapsack problem actually requires is highly data dependent. In the matrix versions, we have allocated a substantial amount of storage based on the worst case of requiring all possible sub-problem solutions. However, that seems to be rarely the case in practice. In many situations the amount of storage required is "sparse" compared to n×W. Note that the associative array version of memoization illustrated above begins by assuming only +O(b) space, where b is the larger of n and W, and checks for the need for expansion during the computation. That version may be the most efficient. Faster hashing of keys would speed it up. The hashing example used in the code above is likely not optimal.

Having excessive storage allocated also affects runtime. For example, the loop dimensions in KS_bu are determined by the matrix dimensions used for storage. While definitive results are not available, some randomly generated sample Knapsack problems yield:

  W = 500

   n    size of memo    (n+1)×(W+1) = worst-case storage
  ---   ------------    ---------------------------------
   10         702                5,511
   20       1,973               10,521
   30       3,990               15,531
   40       6,118               20,541
   50       5,831               25,551

Another short series of experiments with more items of smaller average weight yields:

  W = 500

   n    size of memo    (n+1)×(W+1) = worst-case storage
  ---   ------------    ---------------------------------
  100      43,255               50,601
  200      90,242              100,701
  300     136,118              150,801
  400     178,983              200,901
  500     226,371              251,001

These memory use measurements are much closer to the worst-case. Having many items with smaller average weight makes the search more difficult. In fact, it may be that the ratio of actual memory used to the worst-case maximum is a useful measure of the complexity of the particular knapsack problem. More investigation is needed.
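Returning to the bottom-up algorithm of Section 2.3: here is a sketch of how KS_bu might be implemented. It simply fills the same (n+1)×(W+1) matrix used by KS_memo_Matrix with a pair of nested loops, following the recurrence of Section 2; it is an illustration, not the course library's listing.

// bottom-up sketch: fill the (n+1) x (W+1) matrix of sub-problem values row by row
double KS_bu(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w)
{
  fsu::Matrix<double> m(n+1, W+1, 0.0);  // row 0 and column 0 stay 0 (base cases)
  for (size_t i = 1; i <= n; ++i)        // resources considered: r_1 .. r_i
  {
    for (size_t j = 0; j <= W; ++j)      // weight limits: 0 .. W
    {
      if (w[i] > j)                      // resource i cannot fit
        m[i][j] = m[i-1][j];
      else                               // better of "leave r_i" and "take r_i"
        m[i][j] = fsu::Max(m[i-1][j], v[i] + m[i-1][j - w[i]]);
    }
  }
  return m[n][W];
}

As claimed in the Proposition, the nested loops touch each matrix entry exactly once and do O(1) work per entry, so both the runtime and the extra space are Θ(n×W).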
2.5 Recovering the resource solution

Recovery of the actual solution to a Knapsack problem is necessary to make practical use of the optimization - it is all well and good to know what the maximum value is, but that is of little practical value without a recipe for actually using the resources optimally. We have a set of items, or resources, which we have treated as an ordered set

  R = { r_1 , r_2 , ... , r_n }
and we now have several algorithms for calculating the optimal value

  V = max { v(S) : S ⊆ R and w(S) ≤ W }.
What is missing is a specification of which resources from R should be selected to achieve the maximum value V. We will use a "bit set" (actually an fsu::BitVector object) to code this information. Discovering the "optimal packing" S that realizes the maximum value is done after the maximum is computed, by examining the history of selection during the run of the algorithm. One way to understand the process is with a recursive calculation on the matrix M that is computed by KS_bu:

void Reconstruct_rec (size_t i, size_t j, fsu::BitVector& bv)
{
  // assumes all bits in bv are unset
  if (i == 0)
  {
    return;
  }
  if (the solution to the (i,j) subproblem used resource i) // in matrix terms: M[i][j] != M[i-1][j]
  {
    bv.Set(i);
    Reconstruct_rec(i-1, j-w[i], bv);
  }
  else
  {
    bv.Unset(i);
    Reconstruct_rec(i-1, j, bv);
  }
}

used as a subroutine for KS_bu with modified prototype

double KS_bu(size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, fsu::BitVector& bv)

The result is a bit encoding for the selection of resources used to create the maximum value. Note however that the recursive subroutine suffers from the usual malady, hence:

Exercise 3. Devise iterative (non-recursive) subroutines for KS_bu, KS_memo_Matrix, and KS_memo_AA that return the optimal packing solution found by the original algorithm encoded in the BitVector bv. Be sure the subroutines have linear runtime O(n + W) and +O(1) space usage. Use these upgraded prototypes:

double KS_memo_AA     (size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, fsu::BitVector& bv)
double KS_memo_Matrix (size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, fsu::BitVector& bv)
double KS_bu          (size_t n, size_t W, const fsu::Vector<double>& v, const fsu::Vector<size_t>& w, fsu::BitVector& bv)

Note that only the executor requires the extra BitVector argument and that the subroutines run in the executor. See the course LIB/tests for a driver mainKS.cpp for these three implementations and a random knapsack data generator ranKS.cpp that is handy for testing. LIB/notes_support contains sample executable knapsack_i.x.

Exercise 4. Locate the directory LIB/notes_support in your course library and copy/execute these demonstration programs:

fib_demo_i.x  # demonstrates various versions of Fibonacci calculator
knapsack_i.x  # demonstrates all 3 dynamic programming solutions of Knapsack
ranKS_i.x     # random knapsack problem generator

As usual, the "_i" indicates Linux/Intel executable. Entering the command without arguments prints some documentation to screen.

3 When does Dynamic Programming work? (Optimal Substructure + Non-Proliferation)

Given a problem with a simple recursive solution, when is that likely not acceptable? In other words, when is a divide and conquer approach not likely to work out in practice? One way this happens is for the divide step to produce sub-problems that have deeper sub-problems in common. When the same subproblem crops up in many of the recursive calls, time is wasted re-computing a solution to that subproblem repeatedly. We saw this occur with Fibo and Knapsack. We found ways to use memoized recursion and ultimately bottom-up computation to solve these problems relatively efficiently. A complementary question is: what properties of a problem make it suitable for a dynamic programming solution? Again looking at our examples, it is clear that we rely on two properties. The first is named for the developer of Dynamic Programming:

  Optimal Substructure (Bellman): an optimal solution to the problem contains within it optimal solutions to its subproblems.
In the case of Fibo, we have the fact that the solution F(n) is obtained from solutions to two smaller problems: F(n-1) and F(n-2). In the case of Knapsack (KS), we have the formula

  Result = max( KS(n-1,W) , v_n + KS(n-1,W-w_n) )   (when w_n ≤ W)
and we rely on the optimal substructure property to prove that "Result" is indeed an optimal solution KS(n,W). The second important feature of problems that are susceptible to dynamic programming is reasonable growth of the number of subproblems:

  Non-Proliferation of Subproblems: the number of distinct subproblems that must be solved is small - typically polynomial in the size of the input.
In the case of Fibo we had subproblems F(0), F(1), ..., F(n-1) required to solve (calculate) F(n): the subproblem growth is O(n^d) with d = 1. For Knapsack the subproblems are held in an n×W matrix, so the subproblem growth is O(nW), which is again linear if W is held constant and quadratic (d = 2) if W is not allowed to grow faster than n. Given a problem of size n, the number S(n) of distinct sub-problems will ultimately determine both the time and space complexity of a dynamic programming solution.
4 Other Applications for DP

In this section we set up a few important problems whose solution by DP is fairly straightforward. The first two are explored in detail in [Cormen]. The second two are explored in the notes on Strings.

4.1 Parenthesization

Suppose we have four matrices to multiply:

  A × B × C × D
and suppose that A has dimension 1×n, B has dimension n×n, C has dimension n×n, and D has dimension n×1. (Thus A is a row vector, B and C are square n×n matrices, and D is a column vector.) Because matrix multiplication is associative, we have several choices of how to carry out the product:

  ((A × B) × C) × D
  (A × (B × C)) × D
  (A × B) × (C × D)
  A × ((B × C) × D)
  A × (B × (C × D))
The cost of these various recipes is not the same! To see why, note that:

  multiplying a p×q matrix by a q×r matrix produces a p×r matrix and requires about p×q×r scalar operations (p×r entries, each an inner product of length q).
Calculating the number of operations for various parenthesizations yields these examples:

  A × ((B × C) × D) : n×n×n + n×n×1 + 1×n×1 = n^3 + n^2 + n operations
  (A × B) × (C × D) : 1×n×n + n×n×1 + 1×n×1 = 2n^2 + n operations
clearly much less costly for large n. An optimal parenthesization for a matrix product

  A_1 × A_2 × ... × A_n
is a choice of parentheses that minimizes the cost of the computation, as measured by the number of individual arithmetic operations required. This problem has a DP solution with O(n^3) runtime. (See [Cormen] for details.)

4.2 Optimal Binary Search Trees

Suppose you want an index, based on Binary Search Tree technology, for a fixed set of keys, and that you know from historical data the frequency of lookup for each key. Goal: construct the index so that the average search time is minimized. Suppose the keys are

  k_1, k_2, ... , k_n
and that the set has been sorted, so that

  k_1 < k_2 < ... < k_n.
Let p_i be the probability that key k_i appears in a search. (Then Σ_i p_i = 1.) Finally we assume that no other searches are made - user errors being trapped by a RegEx before the search. The cost of a search can be measured by the number of tree operations required to find the key. To keep the problem simple (once a solution is found, the problem and solution are easily upgraded) we assume:

  1. Every search is successful - it terminates at the node containing the key sought.
  2. Each step down the search path costs three elementary operations (a failed equality test, an order comparison, and following a child pointer), and the final equality test at the node containing the key costs one more.
Based on the above, the cost of a (successful) search for a key k_i is 1 + 3d(i), where d(i) is the depth of k_i in the search tree. Therefore the expected cost of a search is given by

  E = Σ_i p_i × (1 + 3 d(i)) = 3 × Σ_i p_i × (d(i) + 1) - 2.
The problem is to structure the BST so that E is minimized. Clearly we can concentrate on minimizing the sum itself:

  S = Σ_i p_i × (d(i) + 1)
A critical ingredient of the DP solution is showing that if the subtree rooted at k_i is part of an optimal solution for the original problem then the subtree is an optimal solution to the subproblem for its subrange of keys. A recursive algorithm can then be devised that solves the optimization problem, and that algorithm can then be taken straight to a bottom-up solution. In pursuit of the latter, use the following notation: for indices 1 ≤ i ≤ j ≤ n, let S(i,j) denote the value of this sum for a binary search tree built on the keys k_i, k_{i+1}, ... , k_j (with depth counted from that subtree's own root), and take S(i,j) = 0 when j < i.
If the subproblem has root k_r (i ≤ r ≤ j) then we can write:

  S(i,j) = p_r + [ S(i,r-1) + (p_i + ... + p_{r-1}) ] + [ S(r+1,j) + (p_{r+1} + ... + p_j) ]
(The root k_r has cost 1 and depth 0, so it contributes p_r to the total S. The sums S(i,r-1) and S(r+1,j) are from the search trees for their respective subproblems, assuming we count the search from their roots. Making these two subtrees the children of k_r adds one to the depth of every other key, which adds the two remaining sums to S.) The formula simplifies to

  S(i,j) = S(i,r-1) + S(r+1,j) + (p_i + ... + p_j).
(This verifies the optimal substructure property: the left and right subtrees of an optimal tree must themselves be optimal for their key ranges.) The optimal S is then the minimum over all possible root choices:

  S(i,j) = min { S(i,r-1) + S(r+1,j) + (p_i + ... + p_j) : i ≤ r ≤ j },  with S(i,j) = 0 whenever j < i.
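For concreteness, here is a sketch of a bottom-up evaluation of this recurrence. It is an illustration only - the function and matrix names are hypothetical, not taken from [Cormen] or the course library - and it assumes cost and root have been created as (n+2)×(n+1) matrices of zeros, e.g. fsu::Matrix<double> cost(n+2,n+1,0.0):

// bottom-up sketch of the optimal BST recurrence
// p[1..n] are the key probabilities; on return cost[i][j] holds the optimal S(i,j)
// and root[i][j] records a best root choice for the subrange k_i .. k_j.
// Row/column 0 and the "empty range" entries cost[i][i-1] remain 0 (base cases).
void OptimalBST (size_t n, const fsu::Vector<double>& p,
                 fsu::Matrix<double>& cost, fsu::Matrix<size_t>& root)
{
  for (size_t len = 1; len <= n; ++len)        // length of the key subrange
  {
    for (size_t i = 1; i + len - 1 <= n; ++i)  // subrange k_i .. k_j
    {
      size_t j = i + len - 1;
      double psum = 0.0;                       // p_i + ... + p_j
      for (size_t l = i; l <= j; ++l) psum += p[l];
      for (size_t r = i; r <= j; ++r)          // try each root choice
      {
        double c = cost[i][r-1] + cost[r+1][j] + psum;
        if (r == i || c < cost[i][j])
        {
          cost[i][j] = c;
          root[i][j] = r;
        }
      }
    }
  }
}

The root matrix records, for each subrange, a best root choice; the backtracking step described next uses this information to rebuild the optimal tree.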
The last recurrence translates to a bottom-up computation of S in time Θ(n^3). A Θ(n) backtracking recurrence in the 3D solution space produces the insertion sequence of keys to create the optimal search tree found by the algorithm.

4.3 Longest Common Subsequence

This problem is taken up in the Strings notes.

4.4 Edit Distance

This problem is taken up in the Strings notes.

4.5 Bellman-Ford SSSP

This algorithm is taken up in the Weighted Graphs notes.

5* More on Fibonacci

We left Fibo after giving the original naive recursive calculator a full recursion→memoization→bottom-up makeover that, if phrased as a calculator of all Fibonacci numbers ≤ n, has run time Θ(n), has constant extra space requirements (space usage = +Θ(1)), and is simple to analyze. There are two more nuances we might wish to address for Fibo:
For the remainder of this section, the Fibonacci sequence is denoted by f_0, f_1, f_2, ... , f_n, ... with f_0 = 0, f_1 = 1, and f_n = f_{n-1} + f_{n-2} for n ≥ 2.

5.1 RunTime = Θ(n) , SpaceUse = +Θ(1)

Looking at the fib_bu_Vector algorithm we see that the final return value uses only the last two calculated numbers. The simple expedient of remembering only the previous two calculated values reduces the space requirements to Θ(1):

unsigned long fib_opt(unsigned n)
{
  if (n == 0) return 0;
  if (n == 1) return 1;
  if (n == 2) return 1;
  // reduce size of vector from n to 3:
  fsu::Vector<unsigned long> f(3,0);
  f[1] = 1;  f[2] = 1;
  for (unsigned i = 3; i <= n; ++i)
    f[i % 3] = f[(i-1) % 3] + f[(i-2) % 3];
  return f[n % 3];
}
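The vector of size 3 can be dispensed with entirely; the following equivalent sketch (not from the course library) keeps just the two most recent values in scalar variables:

// same idea with two scalar variables instead of a vector
unsigned long fib_opt2(unsigned n)
{
  if (n == 0) return 0;
  unsigned long prev = 0, curr = 1;    // f_0 , f_1
  for (unsigned i = 2; i <= n; ++i)
  {
    unsigned long next = prev + curr;  // f_i = f_{i-1} + f_{i-2}
    prev = curr;
    curr = next;
  }
  return curr;                         // f_n
}

Either way, the runtime remains Θ(n) while the extra space is +Θ(1).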
5.2 Fibonacci Identities

Proposition. The Fibonacci recursion satisfies the following identities for k ≥ 1:

  f_{2k+1} = f_k^2 + f_{k+1}^2
  f_{2k+2} = (2 f_k + f_{k+1}) × f_{k+1}

(For example, with k = 2: f_5 = 5 = 1 + 4 = f_2^2 + f_3^2, and f_6 = 8 = (2×1 + 2)×2 = (2 f_2 + f_3) × f_3.)
We will use matrix algebra (confined to 2×2 matrices). Define A to be the matrix

  A = | 0  1 |
      | 1  1 |
Claim 1. A^n is equal to

  A^n = | f_{n-1}  f_n     |
        | f_n      f_{n+1} |
Proof of Claim 1. We use mathematical induction. The base case n = 1 is apparent since A^1 = A and f_0 = 0, f_1 = 1, f_2 = 1. For the inductive step, assume the result is true for all positive integers less than n and consider the matrix A^n. Because matrix powers commute, A^n = A^{n-1} × A. Using the inductive hypothesis for A^{n-1}, the product is

  A^{n-1} × A = | f_{n-2}  f_{n-1} | × | 0  1 | = | f_{n-1}  f_{n-2} + f_{n-1} | = | f_{n-1}  f_n     |
                | f_{n-1}  f_n     |   | 1  1 |   | f_n      f_{n-1} + f_n     |   | f_n      f_{n+1} |
(applying the definition of matrix multiplication and the Fibonacci recursion). ∎

Proof of Proposition. Even though matrix multiplication is not in general commutative, it is for powers of a matrix: A^{m+n} = A^m × A^n. Looking at the (2,2) entry of both sides and applying Claim 1, we have:

  f_{m+n+1} = f_m × f_n + f_{m+1} × f_{n+1}
Substituting m = n = k we get

  f_{2k+1} = f_k^2 + f_{k+1}^2.
Substituting m = k + 1 and n = k we get

  f_{2k+2} = f_{k+1} × f_k + f_{k+2} × f_{k+1} = (f_k + f_{k+2}) × f_{k+1} = (2 f_k + f_{k+1}) × f_{k+1},
thus verifying both identities. ∎ 5.3 Memoized: RunTime = Θ(log n) , SpaceUse = +Θ(log n)The identities of 5.2 provide a completely straightforward path to memoization of Fib. It seems intuitively clear that number of intermediate values to be remembered in a memo is limited to O(log n) (we will prove that later) so it makes sense to declare the memo in the executor to have about log2 n buckets. Here is a working implementation: // top-down using table f[] typedef fsu::HashTable<unsigned, unsigned long> AA; unsigned long fiblog_memo_AA(unsigned n); // executor unsigned long fiblog_memo_AA(unsigned n, AA& f); // memoized recursive function unsigned long fiblog_memo_AA(unsigned n) { unsigned b = 2; if (n > 2) b = 1 + floor(log2(n)); AA f(b); f[0] = 0; f[1] = 1; f[2] = 1; return fiblog_memo_AA(n,f); } // fiblog_memo_AA executor unsigned long fiblog_memo_AA(unsigned n, AA& f) { unsigned long value; // cases already in memo - includes recursion base cases if (f.Retrieve(n,value)) return value; // 2 memoized recursive calls bool n_is_odd = ((n&1) != 0); unsigned k = n_is_odd ? (n+1)/2 : n/2; fiblog_memo_AA(k,f); // <- defines f[k]; fiblog_memo_AA(k-1,f); // <- defines f[k-1]; f[n] = n_is_odd ? f[k]*f[k] + f[k-1]*f[k-1] : (2*f[k-1] + f[k])*f[k]; return f[n]; } // fiblog_memo_AA memoized recursive The table lookups in the calculaton can be reduced by adding two variables to remember the recursive call return values: unsigned long fk = fiblog_memo_AA(k,f); // <- defines f[k]; unsigned long fk1 = fiblog_memo_AA(k-1,f); // <- defines f[k-1]; f[n] = n_is_odd ? fk*fk + fk1*fk1 : (2*fk1 + fk)*fk; One can show that the runtime and runspace requirements of fiblog_memo_AA are logarithmic. We will instead go directly to the conversion to bottom-up version, where both facts are transparent. 5.4 Bottom-Up: RunTime = Θ(log n) , SpaceUse = +Θ(log n)The last step in optimizing Fibo will be to convert to bottom-up. The main difficulty now is finding an appropriate scaffolding to support the computation that is size O(log n). The purpose of the scaffolding is to predict the parity bottom-up. These can be computed in reverse order by successively dividing the input n by 2 until reaching 1: ... // n >= 2 // 1: Calculate ceiling(log_2(n)) unsigned long log_2n = 0; { size_t l, p; for (l = 0, p = 1; p < n; ++l, p*= 2){} log_2n = l; } // 2: Calculate ladder fsu::Vector The ladder is used to make the even/odd decision required for Fib(n) in a bottom-up process. We will maintain 3 variables p0,p1,p2 that represent f[k-1], f[k], f[k+1] as we climb the ladder. Recall that the ladder is "upside down" with the bottom step ladder[i] = 1 and the top step ladder[0] = n: // 3: calculate recurrence unsigned long p2, p1, p0, sq; p2 = 1; // f(k+1) p1 = 1; // f(k) p0 = 0; // f(k-1) for(unsigned j = i; j > 0; ) { --j; if (1 == ladder[j] % 2) { sq = p2 * p2; // essential: p2 gets redefined p2 = sq + 2 * p2 * p1; p1 = sq + p1 * p1; } else { sq = p1 * p1; // efficient, but not essential p2 = p2 * p2 + sq; p1 = sq + 2 * p1 * p0; } p0 = p2 - p1; // f(k-1) } // for After step 3, p1 is the value of Fib(n). Putting all three components together we have: unsigned long fiblog_bu(unsigned n) { if (n == 0) return 0; if (n == 1) return 1; // 1: Calculate ceiling(log_2(n)) // 2: Calculate ladder // 3. Calculate recurrence return p1; }
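As a concrete sketch of step 2 (an illustration, not the course listing): the ladder is computed exactly as described, by successively halving n with integer division; this also defines the index i of the bottom rung, which is where the step 3 loop starts.

// 2: Calculate ladder (sketch) - the indices n, n/2, n/4, ... , 1
fsu::Vector<unsigned> ladder(1 + log_2n, 0);  // log_2n was computed in step 1
unsigned i = 0;
ladder[0] = n;                 // top step: ladder[0] = n
while (ladder[i] > 1)
{
  ladder[i+1] = ladder[i] / 2;
  ++i;
}
// bottom step: ladder[i] == 1; step 3 then processes ladder[i-1], ... , ladder[0]

For example, with n = 100 the ladder is 100, 50, 25, 12, 6, 3, 1 - only 7 rungs, so step 3 needs only 6 iterations of its loop.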
Proposition. Fiblog_bu(n) has runtime Θ(log n) and runspace +Θ(log n). Proof. There are 3 blocks of code run sequentially. Each block consists of a loop of length no greater than 1 + ⌈log2n⌉, and at least one of them has length exactly ⌊log2n⌋. Therefore the runtime is Θ(log2n). The non-constant space requirement is embodied in the ladder vector. This vector is declared to size 1 + ⌈log2n⌉. ∎ 5.4 Really BIG IntegersFibo values get large rapidly. The largest Fibonacci number that can be stored in a 64-bit integer is f90 = 2,880,067,194,370,816,120. To calculate fn for n > 90 we need a larger integer type in which to calculate and store results. There is such a type in LIB/cpp: /* integer.h Definition of the class Integer Integer is a class of arbitrary precision base 10 integers, the number of digits limited only by the maximum size of an array of char. The class uses a private fsu::Vector The API for these "infinite precision" integers is essentially the same as that of native C++ integer types: namespace fsu { class Integer { friend std::istream& operator >> (std::istream& s, Integer& n); friend std::ostream& operator << (std::ostream& s, const Integer& n); friend Integer operator + (const Integer& n, const Integer& m); friend Integer operator - (const Integer& n, const Integer& m); friend Integer operator - (const Integer& n); friend Integer operator * (const Integer& n, const Integer& m); friend Integer operator / (const Integer& n, const Integer& m); friend Integer operator % (const Integer& n, const Integer& m); friend bool operator == (const Integer& n, const Integer& m); friend bool operator != (const Integer& n, const Integer& m); friend bool operator <= (const Integer& n, const Integer& m); friend bool operator < (const Integer& n, const Integer& m); friend bool operator >= (const Integer& n, const Integer& m); friend bool operator > (const Integer& n, const Integer& m); public: Integer (); // Initialize an Integer to zero Integer (long n); // Initialize: typecast long to Integer Integer (const Integer& y); // copy constructor ~Integer (); // destructor Integer& operator = (const Integer& y); Integer& operator += (const Integer& y); Integer& operator -= (const Integer& y); Integer& operator *= (const Integer& y); Integer& operator /= (const Integer& y); Integer& operator %= (const Integer& y); Integer& operator ++ (); Integer operator ++ (int); Integer& operator -- (); Integer operator -- (int); // added to test for palindroms Integer Reverse () const; // return Integer with digits reversed bool Palindrom () const; // return 1 iff Integer is a palindrom ... }; // Integer } namespace fsu Integer algorithms can often be translated to use type fsu::Integer with only a few editorial changes. In particular, the fiblog_memo_AA and fiblog_bu algorithms translate simply. The "input" types can remain the C++ natives (unsigned and size_t), but the "output" types need to be upgraded to fsu::Integer. The first two parts of the fiblog_bu algorithm need no change as they involve numbers related to the input n. The ladder is also still a vector of unsigned ints. Only the third step, calculating the recurrence, requires the large integer type: fsu::Integer Fiblog_bu(unsigned n) // return type is changed { if (n == 0) return 0; if (n == 1) return 1; // 1: Calculate ceiling(log_2(n)) - no changes // 2: Calculate ladder - no changes // 3. 
Calculate recurrence - change types of p0, p1, p2 to fsu::Integer
  return p1;
}

The memoized version also translates simply:

typedef fsu::HashTable<unsigned,fsu::Integer> AAI;
fsu::Integer Fiblog_memo_AA(unsigned n);          // executor
fsu::Integer Fiblog_memo_AA(unsigned n, AAI& f);  // recursive call

fsu::Integer Fiblog_memo_AA(unsigned n)
{
  unsigned b = 2;
  if (n > 2) b = 1 + floor(log2(n));
  AAI f(b);
  f[0] = 0; f[1] = 1; f[2] = 1;
  return Fiblog_memo_AA(n,f);
}

fsu::Integer Fiblog_memo_AA(unsigned n, AAI& f)
{
  fsu::Integer value;
  if (f.Retrieve(n,value)) return value;
  unsigned k = (0 != (n&1)) ? (n+1)/2 : n/2;
  fsu::Integer fk  = Fiblog_memo_AA(k,f);
  fsu::Integer fk1 = Fiblog_memo_AA(k-1,f);
  return f[n] = (n & 1) ? fk*fk + fk1*fk1 : (2*fk1 + fk)*fk;
}

Here are some sample values and runtimes:

Fiblog_memo_AA(100)     = [21 digits] = 354224848179261915075
Fiblog_memo_AA(1000)    = [209 digits] =
Fiblog_memo_AA(10000)   = [2090 digits] =
Fiblog_memo_AA(100,000) = [more than 20,000 digits] =

A curve-fit estimate of the time to calculate fibrec(100,000) [i.e., using the original recursive algorithm] is: estimated elapsed time = 10^19481 years.
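To tie the pieces together, here is a minimal driver sketch (hypothetical - the actual demonstration program is the fib_demo_i.x executable mentioned in Exercise 4):

// fibbig_demo.cpp - hypothetical driver sketch; compile and link with the fsu library
#include <iostream>
#include <cstdlib>
#include "integer.h"                              // fsu::Integer

fsu::Integer Fiblog_memo_AA(unsigned n);          // executor defined above

int main(int argc, char* argv[])
{
  unsigned n = (argc > 1) ? static_cast<unsigned>(atoi(argv[1])) : 100;
  fsu::Integer f = Fiblog_memo_AA(n);
  std::cout << "Fib(" << n << ") = " << f << '\n';
  return EXIT_SUCCESS;
}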