User:Wvbailey/Euclid's algorithm
A sandbox page:
Algorithms for computers: an example
Algorithms for computers: A computer (or human “computor” [1]) is a restricted type of machine, a "discrete deterministic mechanical device" [2] that blindly follows its "sequence of instructions"[3]. The issue becomes translating the algorithm into a language that the computer (or computor) can effectively "execute".
Models of computation: Arguably[4] the most primitive model of these mechanisms is Lambek’s "abacus" -- a "countably infinite number of locations (holes, wires etc.) together with an unlimited supply of counters (pebbles, beads, etc). The locations are distinguishable, the counters are not". The holes have unlimited capacity, and standing by is an agent who understands and is able to carry out the list of instructions [5]. As each location's quantity of "counters" can be mapped onto an integer (a non-negative whole number: 0, 1, 2, . . .), a location will be said to "contain a number".
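To make the model concrete, here is a minimal sketch in Python (an illustration only, not part of Lambek's or Melzak's formulations): each location is modeled as a dictionary key, and its contents as a non-negative count of counters.

```python
# A sketch of Lambek's "abacus": locations are dictionary keys, and each
# location "contains a number" equal to its count of counters.
from collections import defaultdict

abacus = defaultdict(int)  # every location starts empty (contains 0)
abacus["L"] = 3009         # put 3009 counters into location L
abacus["S"] = 884          # put 884 counters into location S

print(abacus["L"], abacus["S"], abacus["R"])  # R was never touched, so it holds 0
```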
Preliminaries: Knuth advises the reader that "the best way to learn an algorithm is to try it . . . immediately take pen and paper and work through an example"[6]. But translating the algorithm requires a language that the computer/computor understands and is capable of executing[7]:
- From a starting instruction n, instruction-execution proceeds to HALT (STOP, End) in numerical sequence n, n+1, n+2 . . . HALT unless a conditional instruction dictates otherwise,
- A conditional instruction: IF [the test here] is TRUE THEN GOTO instruction yyy ELSE if FALSE continue in numerical sequence,
- A location: a letter in upper case, e.g. location S,
- Generalizations of numbers (variables): letters in lower case, e.g. number l located in L. For example, L might contain l = 3009.
- The "all important replacement operation (sometimes called assignment or substitution)", to quote Knuth,[8]: symbolized by := (Knuth uses ← ). For example, if the contents of location R is to be copied into location S, the copy instruction will be symbolized by S := R. Afterwards both R and S will contain the number r.
- An "operation" followed by assignment: If a number s in S is to be subtracted from a number l in L, the result l-s must be put somewhere. If this place is location R, then symbolically R := L - S; if this location is L (a process known as read-modify-write) then symbolically L := L – S.
Euclid’s algorithm
Euclid’s algorithm appears as Proposition II in Book VII ("Elementary Number Theory") of his Elements [9]. Euclid poses the problem: "Given two numbers not prime to one another, to find their greatest common measure". Euclid defines "A number [to be] a multitude composed of units": a counting number, a positive integer not including 0. And to "measure" is to place a shorter measuring length s successively (q times) along longer length l until the remaining portion r is less than the shorter length s[10]. In modern words, remainder r = l - q*s, q being the quotient[11].
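As a concrete illustration (using the numbers 3009 and 884 that appear in the test cases below, and Python rather than Euclid's geometric language), repeated subtraction produces the quotient q and remainder r:

```python
# "Measure" the longer length l by the shorter length s: subtract s until the
# remaining portion r is less than s.  Then r = l - q*s.
l, s = 3009, 884
q, r = 0, l
while r >= s:
    r -= s
    q += 1
print(q, r)              # 3 884-lengths fit, leaving remainder 357
assert r == l - q * s    # the modern formula quoted above
```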
For Euclid’s method to succeed, the starting lengths must satisfy three requirements: (i) the lengths must not be 0, AND (ii) the subtraction must be “proper”: a test must guarantee that the smaller of the two numbers is subtracted from the larger (alternately, the two can be equal so their subtraction yields 0), AND (iii) the two lengths must not be prime to one another.
For Euclid, the first requirement was tacit and not discussed: a length of 0 would not meet the definition of “a number”, i.e. a multitude of units. However, Nicomachus’ example [12] hints at the problem of a remainder equal to “zero”; in fact the equivalent is what terminates his algorithm. Heath 1908:300 reports (quoting Nicomachus) “ ‘. . . but 7 cannot be subtracted from 7.’ The last phrase is curious, but the meaning of it is obvious, as also the meaning of the phrase about ending 'at one and the same number' ".
With regard to the second requirement, Euclid begins his proof with the possibility that at the outset the two lengths are the same. Indeed if this is the case then the method ends. The computer algorithm will have to account for this possibility.
With regard to the third requirement, Euclid stipulated this so that he could construct a reductio ad absurdum proof that the two numbers' common measure is in fact the greatest. While Nicomachus' algorithm is the same as Euclid's, when the numbers are prime to one another it yields the number "1" for their common measure. So to be precise the following is really Nicomachus' algorithm.
An inelegant program for Euclid's algorithm
- The reason why the following is declared "inelegant" is described in the following section.
The following algorithm is framed as Knuth's 4-step version of Euclid's and Nicomachus' algorithm, but rather than using division to find the remainder it uses successive subtractions. The bold-face headings are adapted from Knuth 1973:2-4:
E0: [Ensure l ≥ s.]
- 1 [Into two locations L and S put the numbers l and s that represent the two lengths]: INPUT L, S
- 2 [Ensure the smaller of the two numbers is in S]: IF L > S THEN the contents of L is the larger number: skip over the exchange-steps 3, 4 and 5: GOTO step 6 ELSE continue,
- 3 [The contents of L is less than or equal to the contents of S: Swap the contents of L and S. A third location (R) is necessary.] R := S
- 4 S := L
- 5 L := R
E1:[Find remainder r]: Until the remaining length r in R is less than the shorter length s in S, repeatedly subtract the measuring number s in S from the measured number l in L.
- 6 [This step is required only because the starting/initial/input length l is also considered to be the remaining length r.] R := L
- 7 [Test the measuring-number s in S for greater than the remaining portion r in R.]: IF S > R THEN done measuring: go to step 11 ELSE measure again: continue,
- 8 [Measure again: Subtract smaller number s in location S from remaining portion l in L, and deposit the difference in R]: R := L - S
- 9 [Copy the remaining portion in R into L]: L := R
- 10 [Remainder-loop]: Go to step 7.
E2: [Is the remainder 0?]: When the inner-loop test at step 7 succeeds (S > R), it sends the program forward to test whether (i) the algorithm is done: the last measure was exact and the remainder in R is 0, OR (ii) the algorithm must continue: the last measure left a remainder in R less than the measuring number in S.
- 11 IF R = 0 THEN done: go to step 15 ELSE continue,
E3: [Interchange]: The nut of Euclid's algorithm. Use the remainder r to measure what was previously the smaller number s: copy S into L and R into S and then go back to E1.
- 12 [Copy the previous measuring number s in S into L, and copy the remainder r from R into S]: L := S
- 13 S := R
- 14 [Repeat the measuring process]: Go to step 7
- 15 [Done: the contents of S is the greatest common divisor]: PRINT S
- 16 HALT, END, STOP.
Summary of the inelegant program:
E0: [Ensure l ≥ s.]
1 INPUT L, S
2 IF L > S THEN GOTO 6
3 R := S
4 S := L
5 L := R
E1: [Find remainder r.]
6 R := L
7 IF S > R THEN GOTO 11
8 R := L - S
9 L := R
10 GOTO 7
E2: [Is remainder 0?]
11 IF R = 0 THEN GOTO 15
E3: [Interchange.]
12 L := S
13 S := R
14 GOTO 7
15 PRINT S
16 HALT, END, STOP.
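For readers who want to execute the program, here is a Python transcription (a sketch: structured while-loops replace the numbered GOTOs, so only the logic, not the step count, matches the listing above):

```python
# The "inelegant" program: E0 swap, E1 remainder loop by repeated subtraction,
# E2 zero test, E3 interchange.
def inelegant_gcd(l, s):
    L, S = l, s
    if not (L > S):          # E0, steps 2-5: ensure the larger number is in L
        R = S
        S = L
        L = R
    while True:
        R = L                # E1, step 6: the input length is also the first "remainder"
        while not (S > R):   # E1, steps 7-10: measure by repeated subtraction
            R = L - S
            L = R
        if R == 0:           # E2, step 11: an exact measure ends the algorithm
            return S         # step 15: PRINT S
        L = S                # E3, steps 12-14: the remainder becomes the new measurer
        S = R

print(inelegant_gcd(3009, 884))   # 17
```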
An elegant program
The notion of "simplicty and elegance" appears informally in Knuth 1973:7 and precisely in Chaitin 2005:32:
- "I'll show that you can't prove that a program is 'elegant,' by which I mean that it's the smallest ppssible program for producing the output that it does."
The following version of Euclid's algorithm requires only 6 core instructions to do what the inelegant version needs 13 to do; worse, the "inelegant" program requires more types of instructions:
10 REM Euclid's algorithm for greatest common divisor
20 PRINT "Type two integers greater than 0"
30 INPUT A%,B%
40 IF B%=0 THEN GOTO 100
50 IF A% > B% THEN GOTO 80
60 LET B%=B%-A%
70 GOTO 40
80 LET A%=A%-B%
90 GOTO 40
100 PRINT A%
110 END
How "elegant" works: In place of an outer "Euclid loop", "elegant" shifts back and forth between two "co-loops", an A% > B% loop that computes A% := A% - B%, and a B% ≤ A% loop that computes B% := B% - A%. This works because, when at last the minuend M is less than or equal to the subtrahend S ( Difference = Minuend - Subtrahend), the minuend can become s (the new measuring length) and the subtrahend can become the new l (the length to be measured); in other words the "sense" of the subtraction reverses.
Testing the algorithms
Does an algorithm do what its author wants it to do? The exceptional cases must be identified. Will the above algorithms perform properly when B > A, A > B, A = B? (Yes to all). What happens when A = 0? B = 0? Both A and B = 0? (No: "inelegant" computes forever in all of these cases; "elegant" computes forever when A% = 0. A fully effective algorithm would have to handle these cases).
As suggested by Knuth one can proceed with paper and pencil to do a few test cases. One source[13] uses 3009 and 884. Another interesting case is the two relatively-prime numbers 14157 and 5950.
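Knuth's pen-and-paper advice can also be automated (a sketch; the expected values come from Python's math.gcd rather than from the cited sources, and the zero cases are left out because the subtraction method does not terminate on them):

```python
import math

def subtraction_gcd(a, b):   # the same co-loop logic as "elegant"
    while b != 0:
        if a > b:
            a -= b
        else:
            b -= a
    return a

# B > A, A > B, A = B, and the relatively prime pair all behave properly:
for a, b in [(884, 3009), (3009, 884), (7, 7), (14157, 5950)]:
    assert subtraction_gcd(a, b) == math.gcd(a, b)
print("all test cases agree")
# Note: subtraction_gcd(0, 5) would loop forever, as observed above.
```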
Proof of program correctness by use of mathematical induction: Knuth demonstrates the application of Mathematical induction to an "extended" version of Euclid's algorithm, and he proposes "a general method applicable to proving the validity of any algorithm" [14]. Tausworthe proposes that a measure of the complexity of a program be the length of its correctness proof [15].
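The kernel of such a correctness proof is a loop invariant: every subtraction and every interchange leaves the greatest common divisor of the two locations unchanged. A small Python sketch (not Knuth's or Tausworthe's proof, just the invariant made checkable) illustrates the idea:

```python
import math

def gcd_with_invariant(l, s):
    L, S = l, s
    goal = math.gcd(l, s)               # used only to check the invariant
    while S != 0:
        assert math.gcd(L, S) == goal   # the induction hypothesis at every step
        if L >= S:
            L = L - S                   # gcd(L - S, S) == gcd(L, S)
        else:
            L, S = S, L                 # the interchange preserves the gcd
    return L                            # when S == 0, L == gcd(l, s)

print(gcd_with_invariant(3009, 884))    # 17
```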
When it comes to simulating the algorithm on a machine, can it be so general that the mechanical "platform" does not matter? In this regard, philosopher Daniel Dennett identifies a key feature of an algorithm: what he calls its "substrate neutrality":
- "Substrate neutrality: an algorithm relies on its logical structure. Thus, the particular form in which an algorithm is manifested is not important"[16].
Van Emde Boas, on the other hand, observes "even if we base complexity theory on abstract instead of concrete machines, arbitrariness of the choice of a model remains. It is at this point that the notion of simulation enters"[17]. But what model should be used for the simulation? When speed is being measured, the instruction set matters. For example, "inelegant" would execute much faster if division were available.
Good algorithms
Knuth states that:
- "In practice we not only want algorithms, we want good algorithms. One criterion of goodness is the length of time taken to perform the algorithm . . .." (p. 7)
Knuth discusses the notion of fixing one parameter such as l (or a) and letting s (or b) range over all the positive integers, then taking an average; he gives an example of how to do this for one particular instruction (his E1')[18].
But a global count of all the steps, from the first through the HALT, will do for illustration. For example, if "inelegant" is given the numbers 3009 and 884, it produces "17" as the g.c.d. and halts in 89 steps; "elegant" requires only 71 steps. If the numbers 14157 and 5950 are input, "inelegant" halts at 371 steps while "elegant" halts at 321 steps.
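One way to reproduce this kind of global count is to instrument the program (a sketch; what counts as "one step" is a modeling choice, so the totals need not match the figures above exactly):

```python
def elegant_gcd_counted(a, b):
    steps = 0
    while True:
        steps += 1              # the B% = 0 test (line 40)
        if b == 0:
            return a, steps
        steps += 1              # the A% > B% test (line 50)
        if a > b:
            a -= b
        else:
            b -= a
        steps += 2              # one subtraction-assignment and one GOTO

print(elegant_gcd_counted(3009, 884))     # a gcd of 17 and a step total near 71
print(elegant_gcd_counted(14157, 5950))   # a gcd of 1 and a step total near 321
```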
Can the inelegant algorithm be improved?
Once the author judges the algorithm "fit" and "effective" -- that is, it does what its author wants it to do -- then the question becomes, can it be improved?
Improvement may involve accepting a tradeoff. Examination of where time is lost points to the loops, in particular the inner loop. If one were to combine the two-step sequence R := L - S followed by L := R into a single L := L - S, then "inelegant's" execution time would be shortened. (This requires other changes; see "less-inelegant" below.)
Given the two numbers 3009, 884, the count is "less-inelegant" 73, "elegant" 71; "less-inelegant" loses. But given 14157, 5950 the results are "less-inelegant" 301, "elegant" 321; "elegant" loses. Which is better? Knuth's method of averaging over a large number of trials offers a possible measure of their goodness.
A less-inelegant program:
E0. [Ensure l > s] and E1. [Find remainder.]
1 INPUT L, S
2 IF S > L THEN GOTO 5
3 L := L - S
4 GOTO 2
E2. [L now contains the remainder. Is remainder 0?]
5 IF L = 0 THEN GOTO 10
E3. [Interchange]
6 R := L
7 L := S
8 S := R
9 GOTO 3
10 PRINT S
11 HALT, END, STOP
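Here is a Python rendering of "less-inelegant" (a sketch; after the interchange the numbered program jumps straight back to the subtraction at step 3, whereas this version re-tests first, which changes the step count but not the result):

```python
def less_inelegant_gcd(l, s):
    L, S = l, s
    while True:
        while not (S > L):   # steps 2-4: L := L - S replaces R := L - S ; L := R
            L = L - S
        if L == 0:           # step 5: L now holds the remainder
            return S         # step 10: PRINT S
        R = L                # steps 6-8: interchange, using R as the spare location
        L = S
        S = R

print(less_inelegant_gcd(3009, 884))   # 17
```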
- ^ In his essay "Calculations by Man and Machine: Conceptual Analysis" Sieg 2002:390 credits this distinction to Robin Gandy, cf Wilfried Sieg, et al., 2002 Reflections on the Foundations of Mathematics: Essays in Honor of Solomon Feferman, Association for Symbolic Logic, A. K. Peters, Ltd., Natick, MA.
- ^ cf Gandy 1980:126, Robin Gandy "Church's Thesis and Principles for Mechanisms" appearing on pp. 123-148 in J. Barwise et al. 1980 The Kleene Symposium, North-Holland Publishing Company.
- ^ Or a "robot": "A computer is a robot that will perform any task that can be described as a sequence of instructions" cf Stone 1972:3
- ^ “In order to avoid discussion of electronic or mechanical details we may imagine [the machine] in crude, Stone Age terms” Boolos-Burgess-Jeffrey 2002:46; Minsky 1967:199ff, 255ff.
- ^ Lambek 1961:295. Lambek references Melzak who defines his Q-machine as "an indefinitely large number of locations . . . an indefinitely large supply of counters distributed among these locations, a program, and an operator whose sole purpose is to carry out the program." (Melzak 1961:283) B-B-J (loc. cit.) add the stipulation that the holes are "capable of holding any number of stones" (p. 46). Both Melzak and Lambek appear in The Canadian Mathematical Bulletin, vol. 4, no. 3, September 1961.
- ^ Knuth 1973:4
- ^ Stone xxx:xxx
- ^ Knuth 1973:3
- ^ Heath 1908:300; Hawking’s Dover 2005 edition derives from Heath.
- ^ " 'Let CD, measuring BF, leave FA less than itself.' This is a neat abbreviation for saying, measure along BA successive lengths equal to CD until a point F is reached such that the length FA remaining is less than CD; in other words, let BF be the largest exact multiple of CD contained in BA" (Heath 1908:297
- ^ For modern treatments using division in the algorithm see Hardy and Wright 1979:180, Knuth 1973:2.
- ^ Nicomachus’ example and Heath's discussion of it appear in Heath 1908:300 and Hawking 2005:67.
- ^ http://aleph0.clarku.edu/~djoyce/java/elements/bookVII/propVII2.html
- ^ Knuth 1973:13-18. He credits "the formulation of algorithm-proving in terms of assertions and induction" to R. W. Floyd, Peter Naur, C. A. R. Hoare, H. H. Goldstine and J. von Neumann. Tausworthe 1977 borrows Knuth's Euclid example and extends Knuth's method in section 9.1 Formal Proofs (pages 288-298).
- ^ cf p. 294 in Robert C. Tausworthe 1977 Standardized Development of Computer Software: Part I Methods, Prentice-Hall, Inc., Englewood Cliffs, NJ, ISBN: 0-13-842195-1
- ^ cf page 51 in Daniel Dennett xxx Darwin's Dangerous Idea
- ^ cf page 875 in Peter van Emde Boas' "Machine Models and Simulation" in Jan van Leeuwen ed., 1990, Handbook of Theoretical Computer Science. Volume A: Algorithms and Complexity, The MIT Press/Elsevier, ISBN 0-444-88071-2 (Volume A).
- ^ See Knuth 1973:7. His particular algorithm's instruction E1 yields the surprising constant (12 ln 2)/π².