Presentation of Algorithms and Mathematics
Jonathan Shapiro
School of Computer Science, University of Manchester
February 5, 2018
Announcements
◮ Blackboard is still not functioning for this course.
◮ Slides and resources are found on the course website, http://syllabus.cs.manchester.ac.uk/pgr/2017/COMP80142/
◮ Or, studentnet → Current PGR → Study & curriculum → COMP80142 Materials.
It is not for lack of trying
Why this topic
We have to strike a balance,
◮ Precision afforded by formalism.
◮ Lucidity afforded by explanations using words.
Words are often much easier to comprehend than formalisms. Formalisms can achieve clarity and precision.
Why this topic (cont)
Somewhat motivated by my own frustration at seeing how algorithms and mathematics are presented,
◮ In student work,
◮ In work by established researchers.
Computer scientists do many other things: they present new languages, new applications, new data sets, new hardware designs, etc. All of these are challenging to describe clearly in text. Algorithms and mathematics are what I know about best.
What is important to get across
An algorithm is of interest if it solves a problem.
◮ Perhaps a new problem.
◮ Perhaps an existing problem, but with different characteristics or properties.
Often unclear in written descriptions:
◮ What the algorithm is trying to do which is different from existing algorithms.
◮ Why the reader should believe that it is correct or reasonable.
◮ If it is supposed to be “better”, what is meant by better.
What is in an algorithm
◮ The problem it is designed to address.
◮ The steps that make up the algorithm.
◮ Input, output, and required data structures.
◮ The scope and limitations of application of the algorithm.
◮ Discussion of correctness (proofs, plausibility arguments, empirical testing).
◮ Discussion of complexity, both space and time.
◮ Experimental evaluation.
Not every presentation will contain all of these.
Avoid common flaws in evaluation of algorithms
◮ Compare your results with the best-performing algorithms.
◮ Compare your results on established benchmarks when they exist.
Formalisms for presenting algorithms
The most common way of presenting an algorithm is pseudocode.
Algorithm: Bubble Sort
Input: data $x_i$, size m
repeat
    Initialize noChange = true.
    for i = 1 to m − 1 do
        if $x_i > x_{i+1}$ then
            Swap $x_i$ and $x_{i+1}$
            noChange = false
        end if
    end for
until noChange is true
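As a hedged illustration of how directly such pseudocode translates into working code, here is a minimal Python sketch of the bubble sort above. The function name `bubble_sort` and the decision to sort a copy rather than in place are assumptions made for this example, not part of the slide.

```python
def bubble_sort(data):
    """Return a sorted copy of data using repeated adjacent swaps."""
    x = list(data)  # work on a copy (an assumption of this sketch)
    m = len(x)
    while True:  # plays the role of repeat ... until noChange is true
        no_change = True
        # Python indexes from 0, so i runs over 0 .. m-2
        # instead of 1 .. m-1 as in the pseudocode.
        for i in range(m - 1):
            if x[i] > x[i + 1]:
                x[i], x[i + 1] = x[i + 1], x[i]
                no_change = False
        if no_change:
            return x
```

Note the index shift in the inner loop: it is exactly the kind of detail a reader has to work out when the pseudocode convention differs from the target language.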
Pseudocode
Pros:
◮ Easier for others to translate into working code.
◮ Precise (if correct).
◮ Could be easier for the author to produce.
Cons:
◮ Harder for the reader to understand.
◮ Harder for the writer to debug; sometimes contains bugs.
◮ Often too close to a programming paradigm which may not be shared with the readers (e.g. Java).
You really need explanations to describe why it works, and how.
Discussion question
Is it better for others to use your code or to reimplement your algorithm
◮ In order to replicate and test your claims?
◮ In order to use your ideas?
Formalisms
Following Zobel [Writing for Computer Science]:
List style: Algorithm presented as a numbered list of steps with go-tos.
Pseudocode: These require separate prose explanations in most cases.
Prosecode: Number each step, never break a loop over steps, use subnumbering for parts of each step, and include explanatory text.
Literate code: The algorithm is introduced gradually, intermingled with discussion of underlying ideas.
Two formalisations of the same algorithm
Optimal stratified sampling algorithm of Liu and Fearnhead, “Online Inference for Multiple Changepoint Problems”, Journal of the Royal Statistical Society: Series B, 69 (4), pp. 589-605. ISSN 1467-9868.
Example of pseudocode: From Joe Mellor’s thesis (next page).
Example of literate code: See page 10 of the original source.
Algorithm 4.1 Stratified Optimal Resampling
Require: Distribution $\{p_i : 0 < i \le N\}$ of N particles with ordering σ(i) s.t. σ(i) < σ(j) for i < j
Require: Parameter M < N
Find α s.t. $\sum_{i=1}^{N} \min\left(1, \frac{p_i}{\alpha}\right) = M$
Initialise u by drawing uniformly from [0, α].
for i = 1 to N do
    if $p_i \ge \alpha$ then
        $q_i = p_i$
    else
        $u = u - p_i$
        if u ≤ 0 then
            $q_i = \alpha$
            u = u + α
        else
            $q_i = 0$
        end if
    end if
end for
Pictures are good, too
Helps explain the structure of the algorithm.
Computing means and variances
A (trivial) motivating example:
◮ A collection of numbers (real numbers or integers);
◮ You want to calculate their mean and variance;
◮ There is a right way and a wrong way to compute the variance.
How could I explain this?
Some mathematical notation
Let x be a collection of N numbers, where $[x]_i$ is the ith number. We will use the notation $\overline{f(x)}$ to denote the empirical mean of f;
$$\overline{f(x)} = \frac{1}{N}\sum_{i=1}^{N} f([x]_i).$$
So, the mean of x is given by
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} [x]_i.$$
The variance can be written
$$\sigma^2 = \overline{(x-\bar{x})^2} = \overline{x^2} - \bar{x}^2.$$
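The last equality is stated without proof. The short derivation below, which uses only the definitions above, sketches why the two forms of the variance agree in exact arithmetic.

```latex
\begin{align*}
\overline{(x-\bar{x})^2}
  &= \frac{1}{N}\sum_{i=1}^{N}\left([x]_i-\bar{x}\right)^2 \\
  &= \frac{1}{N}\sum_{i=1}^{N}[x]_i^2
     \;-\; 2\bar{x}\,\frac{1}{N}\sum_{i=1}^{N}[x]_i
     \;+\; \bar{x}^2 \\
  &= \overline{x^2} - 2\bar{x}^2 + \bar{x}^2
   = \overline{x^2} - \bar{x}^2 .
\end{align*}
```

The identity holds exactly for real numbers; whether the two forms behave the same in floating-point arithmetic is a separate question.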
Explanation based on pseudocode
(follows)
Algorithm 1 One-loop method (not recommended)
Require: A vector of numbers x.
Ensure: The mean and variance of the numbers in x.
mean ← 0
var ← 0
for i = 0 to x.Size − 1 do
    mean ← mean + x.Value(i)
    var ← var + x.Value(i) ∗ x.Value(i)
end for
mean ← mean/x.Size
var ← var/x.Size − mean ∗ mean
return mean and var
Algorithm 2 Two-loop method (recommended)
Require: A vector of numbers x.
Ensure: The mean and variance of the numbers in x.
mean ← 0
var ← 0
for i = 0 to x.Size − 1 do
    mean ← mean + x.Value(i)
end for
mean ← mean/x.Size
for i = 0 to x.Size − 1 do
    var ← var + [x.Value(i) − mean] ∗ [x.Value(i) − mean]
end for
var ← var/x.Size
return mean and var
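To make the difference between the two algorithms concrete, the sketch below implements both in Python and runs them on four values close to 10^9. The function names and the test data are assumptions chosen for illustration; the true (population) variance of this data is 22.5.

```python
def one_loop(values):
    """Algorithm 1: accumulate the sum and the sum of squares in one pass."""
    n = len(values)
    mean = 0.0
    var = 0.0
    for v in values:
        mean += v
        var += v * v
    mean /= n
    # Difference of two large, nearly equal numbers: prone to cancellation.
    var = var / n - mean * mean
    return mean, var


def two_loop(values):
    """Algorithm 2: compute the mean first, then average squared deviations."""
    n = len(values)
    mean = 0.0
    for v in values:
        mean += v
    mean /= n
    var = 0.0
    for v in values:
        var += (v - mean) * (v - mean)
    var /= n
    return mean, var


# Hypothetical data: a small spread around a large offset.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
mean1, var1 = one_loop(data)
mean2, var2 = two_loop(data)
print("one-loop variance:", var1)
print("two-loop variance:", var2)
```

In IEEE double precision the squares are near 10^18, where adjacent representable values are 128 apart, so the one-loop result is off by a wide margin. The two-loop result is exact for this data, because the deviations from the mean are small integers.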
More expository description
To compute the mean and variance of a collection of numbers, proceed in two steps.
Step 1: Compute the mean using
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} [x]_i.$$
Step 2: Compute the variance using
$$\sigma^2 = \overline{(x-\bar{x})^2} = \frac{1}{N}\sum_{i=1}^{N}\left([x]_i - \bar{x}\right)^2.$$
Do not use $\sigma^2 = \overline{x^2} - \bar{x}^2$ because that takes the difference between two large and similar numbers, which is numerically unstable.
Or maybe just words
◮ When computing the variance of a collection of numbers, average the squared difference between each value and the mean.
◮ This will be more robust than averaging the squares of the values and subtracting the square of the mean from that.
◮ The reason is that the latter takes the difference between two large yet similar numbers, which is numerically unstable.
Which description is better?
◮ Which is easier to understand?
◮ Which is clearer?
◮ Which is more precise?
Writing practice
Finding the square root of an integer using only +, −, ∗, /.
Given: An integer n bigger than 1.
Find: An approximation to √n.
To a given accuracy: ε.
On the following slide is an algorithm presented in pseudocode. Rewrite it in stepwise or mathematical fashion.
Pseudocode description
Algorithm 3 Find the square root of a positive integer n (Babylonian method)
Require: An integer n > 0. A relative accuracy ε > 0.
Ensure: Approximation $S_n$ to √n such that $|S_n^2 - n|/n \le \epsilon$.
$S_n$ ← InitialGuess(n). {See Algorithm 4}
repeat
    $S_o \leftarrow S_n$
    $S_n \leftarrow \frac{1}{2}\left(S_o + n/S_o\right)$
until $|S_n^2 - n|/n \le \epsilon$
return $S_n$
Algorithm 4 Find an initial guess of the square root of a positive integer n. To be used with the Babylonian method square root finder.
Require: An integer n > 0.
Ensure: The smallest power of 2 which is larger than √n.
fours ← 4
twos ← 2
while fours ≤ n do
    fours ← 4 × fours
    twos ← 2 × twos
end while
return twos {Returns the smallest power of two which is larger than √n}
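A minimal Python sketch of Algorithms 3 and 4 follows, under the assumption that ordinary floating-point arithmetic is acceptable. The function names `initial_guess` and `babylonian_sqrt` and the default tolerance are illustrative choices, not taken from the slides.

```python
def initial_guess(n):
    """Return the smallest power of 2 that is larger than sqrt(n) (Algorithm 4)."""
    fours = 4
    twos = 2
    # Invariant: fours == twos * twos, so fours > n implies twos > sqrt(n).
    while fours <= n:
        fours *= 4
        twos *= 2
    return twos


def babylonian_sqrt(n, eps=1e-12):
    """Approximate sqrt(n) to relative accuracy eps using only +, -, *, /."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    s_new = float(initial_guess(n))
    # Unlike the repeat/until form, this loop tests the tolerance before
    # the first update; the returned value satisfies the same guarantee.
    while abs(s_new * s_new - n) / n > eps:
        s_old = s_new
        s_new = 0.5 * (s_old + n / s_old)
    return s_new


print(initial_guess(125348))
print(babylonian_sqrt(125348))
```

Keeping the stopping test in terms of the squared estimate, as in the pseudocode, means the sketch never needs a library square root to decide when to stop.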
Students write
Stepwise description
From Wikipedia
1. Begin with an arbitrary positive starting value $s_0$ (the closer to the actual square root of n, the better).
2. Let $s_{i+1}$ be the (arithmetic) average of $s_i$ and $n/s_i$.
3. Repeat step 2 until the desired accuracy is achieved.
Mathematical description
Also from Wikipedia:
$$s_0 \approx \sqrt{n}, \qquad s_{i+1} = \frac{1}{2}\left(s_i + \frac{n}{s_i}\right), \qquad \sqrt{n} = \lim_{i\to\infty} s_i.$$
Finding an initial value
◮ My method was to use the smallest power of 2 which is larger than √n.
◮ This can be done with a loop in which 4s are multiplied together until a number is found which is larger than n.
◮ If, at the same time in this loop, 2s are multiplied together, then when the product of 4s first exceeds n, the product of 2s will first exceed √n.
[Figure: Square root of 125348 using the Babylonian method; estimate (vertical axis, 340 to 520) plotted against iteration (horizontal axis, 1 to 10).]
Presenting Mathematics
Three great resources. Links on the course webpage.
◮ “Handbook of Writing for the Mathematical Sciences”, Nicholas J. Higham, SIAM, 1998, Chapter 3.
◮ http://www.math.illinois.edu/~dwest/grammar.html
◮ “Mathematical Writing”, Donald E. Knuth, Tracy L. Larrabee, and Paul M. Roberts, 1989.
I will give some simple rules or suggestions.
Formalisms require explanation
Don’t let formal exposition replace expository writing. A complex mathematical analysis or proof requires guidance along the way. Use phrases like,
The goal is to . . .
We first need to show . . .
This is the essential part of the analysis . . .
Before we give the details, we will outline the strategy of the proof . . .
Help the reader.
Use examples
When introducing complex constructs it is very helpful to introduce simple examples before the general case.
Example: Game theory
◮ A game is a tuple ⟨N, T, U, · · ·⟩, where N is the number of players, T is a tree, etc.
◮ Before or after the definition, give examples of extensive-form games which illustrate it.
Mathematical expressions are parts of sentences
English grammar still applies.
Right or wrong?
There is a formula telling how to multiply two N × N matrices A and B to get result C.
$$C_{ij} = \sum_{k=1}^{N} A_{ik}B_{kj} \qquad (1)$$
Wrong! Even with a full stop at the end of the equation, I don’t like it.
Mathematical expressions are parts of sentences
English grammar still applies.
Better: There is a formula to multiply two N × N matrices A and B to get result C, which is
$$C_{ij} = \sum_{k=1}^{N} A_{ik}B_{kj}. \qquad (1)$$
Or: The N × N matrix multiplication C = AB can be expressed as
$$C_{ij} = \sum_{k=1}^{N} A_{ik}B_{kj}, \qquad (2)$$
where $M_{ij}$ is the $(i,j)$th component of a matrix M. (Note the comma and the spelling of $i$th, not $i^{\text{th}}$.)
What part of speech is an equation?
An equation can be a noun phrase: The formula is A = B.
An equation can be a clause (or sentence) with = as the verb: If A = B then B = A.
So, consider: The floor of a real number x is ⌊x⌋ = max {n ∈ Z | n ≤ x}.
Where is the verb in the above sentence?
◮ The verb is “is”, and this defines the floor to be an equation. Illogical.
What about: The floor of a real number x, denoted ⌊x⌋, is defined as max {n ∈ Z | n ≤ x}.
See “The Grammar According to West” on “double-duty definitions”.
Make it readable
◮ Don’t start a sentence with a symbol. It is hard to parse. $x_i^t$ is my most common variable.
◮ Don’t have a symbol at the start of a clause after one at the end of a clause.
Bad: If x > 0 y < 0.
Better: If x > 0 then y < 0.
◮ Don’t introduce too many symbols too close together.
$A = \frac{\tilde{H}\exp(-xH^{*}_{y})}{A+B}$, where $\tilde{H}$ is . . . (long list), is sure hard to read.
Smaller rules
Use LaTeX: Nothing I have seen or used produces mathematics which looks as good or is ultimately as easy to use. Sometimes hard when you collaborate with Word-using biologists, psychologists, etc.
Find a good editing tool: I use GNU Emacs, with AUCTeX and RefTeX modes. It can do a lot of LaTeX for you, and it helps you label sections, figures, and equations consistently.
Common LaTeX errors with displayed equations
See if you can find the problems (and suggest fixes).
An equation we learn in Physics class is
F = ma .
This tells us how acceleration of a mass is related to the force acting on it.
An equation we learn in Physics class is
\begin{displaymath}
F = m a
\end{displaymath}
This tells us how acceleration of a mass is related to the force acting on it.
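One possible fix is sketched below. It assumes the problems in question are the ones that commonly occur with displayed equations: blank lines around the display (which start new paragraphs in the output) and sentence punctuation that is missing from, or separated from, the displayed formula. This is a guess at the intended answer, not the author's own solution.

```latex
An equation we learn in Physics class is
\begin{displaymath}
  F = m a .
\end{displaymath}
This tells us how the acceleration of a mass is related to the force
acting on it.
```

With no blank lines before or after the display, the equation stays inside a single paragraph, and the full stop inside the display ends the sentence that the equation completes.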
Some more LaTeX things
◮ The symbol ~ produces an unbreakable space. So write Figure~3 or Equation~\ref{eq:mainEquation} to prevent a line break from separating the word from the number.
◮ Dashes: there are three kinds.
1. Hyphen: used to group modifiers. An example is “worst-case bound”. Without the hyphen, it is a worst bound and a case bound. In LaTeX, this is written with a single dash -.
2. En-dash: number ranges use something called the en-dash. This is written with two dashes. For example, pages 22--31 produces pages 22–31. Also, according to Higham, this form is used to join compound words which do not modify each other, e.g. the Turing–Church Hypothesis.
3. Em-dash: A dash separating two clauses in a sentence