15-150 Fall 2020 Lecture 18: Sequences and parallelism (Stephen Brookes)



slide-1
SLIDE 1

15-150 Fall 2020

Stephen Brookes

Lecture 18 Sequences and parallelism

slide-2
SLIDE 2

announcements

  • Election Day!
  • TAs will be posting lab solution videos

(canvas, Thursday or Friday)

  • TAs offer NEW weekly REVIEW SESSION

(Thursday, 6:30pm Pittsburgh time)

slide-3
SLIDE 3

sequences

signature SEQ =
sig
  type 'a seq
  exception Range
  val nth : int -> 'a seq -> 'a
  val length : 'a seq -> int
  val tabulate : (int -> 'a) -> int -> 'a seq
  val empty : unit -> 'a seq
  val map : ('a -> 'b) -> ('a seq -> 'b seq)
  val split : 'a seq -> 'a seq * 'a seq
  val reduce : ('a * 'a -> 'a) -> 'a -> 'a seq -> 'a
  val mapreduce : ('a -> 'b) -> ('b * 'b -> 'b) -> 'b -> 'a seq -> 'b
end

note the type of mapreduce

slide-4
SLIDE 4
  • The SEQ signature can be implemented in many different ways
  • Each implementation has its own work/span characteristics
      • lists: nth i S is O(n)
      • balanced trees: nth i S is O(log n)
      • arrays: nth i S is O(1)
slide-5
SLIDE 5

comments

  • Last time: vector-based sequences
  • When length S = n > 1 and split S = (L, R) we had

      reduce g z S = g(reduce g z L, reduce g z R)

  • L and R have length ≈ n div 2 and we assumed g is constant-time, so we said

      W_reduce(n) = 2 W_reduce(n div 2) + 1

  • We forgot the work for split S, given as O(n) earlier, so we should have said

      W_reduce(n) = 2 W_reduce(n div 2) + O(n)

  • The original answer is OK if split S has work O(1)

Thanks to Sheng-Hsiang Sun for spotting this!
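The two recurrences for the work of reduce can be unrolled to see how much the cost of split matters. The sketch below assumes n = 2^k, writes W for W_reduce, and uses a constant c (not named on the slide) for the per-call overhead.

```latex
\begin{align*}
\text{(split has work } O(1)\text{)}\quad
W(n) &= 2\,W(n/2) + c \\
     &= 2^{k}\,W(1) + c\,(2^{k} - 1) \\
     &= n\,W(1) + c\,(n - 1) \in O(n) \\[6pt]
\text{(split has work } O(n)\text{)}\quad
W(n) &= 2\,W(n/2) + c\,n \\
     &= 2^{k}\,W(1) + k\,c\,n \\
     &= n\,W(1) + c\,n \log_2 n \in O(n \log n)
\end{align*}
```

So the corrected recurrence gives O(n log n) work, while the original O(n) bound holds only when split is constant-time.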

slide-6
SLIDE 6

your task

  • Given a structure Seq : SEQ with known work/span characteristics
  • Design correct and efficient solutions to some parallelizable problems
      • prove correctness
      • calculate work and span
slide-7
SLIDE 7

comments

  • We can also talk about our sequence operations abstractly, in a way that’s independent of the implementation
  • For example, just using SEQ functions:

      map f S = tabulate (fn i => f(nth i S)) (length S)

  • Although map may not be defined this way in the Seq structure, the equation is valid
    (Both sides represent the same sequence)
  • But be careful: equal expressions may have different work, span
    (This is obvious, if you think about it!)
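The equation can be checked on a small example. The sketch below uses hypothetical list-backed stand-ins for the SEQ operations (they are not the course's Seq structure) and compares both sides for one choice of f and S.

```sml
(* Hypothetical list-backed stand-ins for the SEQ operations; the real Seq
   structure may be implemented quite differently. *)
type 'a seq = 'a list
fun nth i (s : 'a seq) = List.nth (s, i)
fun length (s : 'a seq) = List.length s
fun tabulate f n : 'a seq = List.tabulate (n, f)
fun map f (s : 'a seq) : 'b seq = List.map f s

val s = tabulate (fn i => i * i) 5     (* <0, 1, 4, 9, 16> *)
val f = fn x => x + 1

val lhs = map f s
val rhs = tabulate (fn i => f (nth i s)) (length s)
val same = (lhs = rhs)                 (* both sides denote <1, 2, 5, 10, 17> *)
```

With this list representation nth i s has work O(n), so the right-hand side does O(n^2) work while the left-hand side does O(n): the two expressions are equal as values but not equal in cost.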

slide-9
SLIDE 9

behavior

nth i ⟨v0,…,vn-1⟩ = vi    if 0 ≤ i < n
length ⟨v0,…,vn-1⟩ = n
tabulate f n = ⟨f(0), …, f(n-1)⟩
empty( ) = ⟨ ⟩
split ⟨v0,…,vn-1⟩ = (⟨v0,…,vm-1⟩, ⟨vm,…,vn-1⟩)    where m = n div 2

slide-10
SLIDE 10

reduce

fun reduce g z s =
  case (length s) of
    0 => z
  | 1 => nth 0 s
  | _ => let
           val (s1, s2) = split s
         in
           g(reduce g z s1, reduce g z s2)
         end

reduce g z ⟨v1,…,vn⟩ = v1 g v2 … g vn

when g is total & associative, z an identity for g

slide-11
SLIDE 11

mapreduce

fun mapreduce f g z s =
  case (length s) of
    0 => z
  | 1 => f(nth 0 s)
  | _ => let
           val (s1, s2) = split s
         in
           g(mapreduce f g z s1, mapreduce f g z s2)
         end

mapreduce f g z ⟨v1,…,vn⟩ = (f v1) g (f v2) … g (f vn)

when g is total & associative, z an identity for g
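As a small usage sketch, mapreduce can compute a sum of squares or the length of the longest string. The list-backed length, nth and split below are assumptions made only so the example runs on its own; they stand in for whatever Seq structure is in use.

```sml
(* Hypothetical list-backed stand-ins for the SEQ operations used by mapreduce. *)
type 'a seq = 'a list
fun length (s : 'a seq) = List.length s
fun nth i (s : 'a seq) = List.nth (s, i)
fun split (s : 'a seq) =
  let val m = length s div 2
  in (List.take (s, m), List.drop (s, m)) end

(* Same shape as the definition on the slide. *)
fun mapreduce f g z s =
  case length s of
    0 => z
  | 1 => f (nth 0 s)
  | _ => let val (s1, s2) = split s
         in g (mapreduce f g z s1, mapreduce f g z s2) end

(* (op +) is associative with identity 0. *)
val sumSquares = mapreduce (fn x => x * x) (op +) 0 [1, 2, 3, 4]

(* Int.max is associative; 0 is an identity only because sizes are non-negative. *)
val longest = mapreduce String.size Int.max 0 ["a", "bcd", "ef"]
```

Here sumSquares is 1 + 4 + 9 + 16 = 30 and longest is 3.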

slide-13
SLIDE 13

lecture notes

  • I added a new section about associativity and identity elements
  • Remember that reduce and mapreduce should only be used with suitable g, z
  • The Lecture Notes include a proof of correctness for reduce
      • Shows why you need these properties!

You should be reading the notes, too!

slide-14
SLIDE 14

example

reduce (op +) 0 ⟨v1, v2⟩
  = (op +) (reduce (op +) 0 ⟨v1⟩, reduce (op +) 0 ⟨v2⟩)
  = (reduce (op +) 0 ⟨v1⟩) + (reduce (op +) 0 ⟨v2⟩)
  = (v1 + 0) + (v2 + 0)
  = v1 + v2

reduce (op +) 0 ⟨v1, v2⟩ = v1 + v2 + 0

reduce g z behaves “correctly” when g is associative and z is an identity element

slide-15
SLIDE 15

example

reduce (op +) 21 ⟨v1, v2⟩
  = (op +) (reduce (op +) 21 ⟨v1⟩, reduce (op +) 21 ⟨v2⟩)
  = (reduce (op +) 21 ⟨v1⟩) + (reduce (op +) 21 ⟨v2⟩)
  = (v1 + 21) + (v2 + 21)
  = v1 + v2 + 42

reduce (op +) 21 ⟨v1, v2⟩ ≠ v1 + v2 + 21
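The effect of a non-identity z can be tried directly. One caveat, stated as an assumption: the step reduce (op +) 21 ⟨v1⟩ = v1 + 21 in the worked example matches an implementation that combines a singleton with z, whereas the reduce code shown on the earlier slide returns the single element unchanged. The sketch below defines both variants over plain lists (the name reduceZ is hypothetical) so the difference is visible.

```sml
fun split s =
  let val m = List.length s div 2
  in (List.take (s, m), List.drop (s, m)) end

(* Same behavior as the reduce slide: z is returned only for the empty sequence. *)
fun reduce g z s =
  case s of
    [] => z
  | [v] => v
  | _ => let val (s1, s2) = split s
         in g (reduce g z s1, reduce g z s2) end

(* Hypothetical variant matching the worked example: a singleton is combined with z. *)
fun reduceZ g z s =
  case s of
    [] => z
  | [v] => g (v, z)
  | _ => let val (s1, s2) = split s
         in g (reduceZ g z s1, reduceZ g z s2) end

val a = reduce  (op +) 0  [3, 4]   (* 3 + 4 *)
val b = reduceZ (op +) 0  [3, 4]   (* (3 + 0) + (4 + 0) *)
val c = reduce  (op +) 21 [3, 4]   (* 3 + 4: z is never used here *)
val d = reduceZ (op +) 21 [3, 4]   (* (3 + 21) + (4 + 21) = 3 + 4 + 42 *)
```

With z = 0 both variants agree on 7. With z = 21 they give 7 and 49, and neither equals 3 + 4 + 21, which is why z should be an identity element for g.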

slide-16
SLIDE 16

thinking abstractly

  • Use cost semantics to predict work and span of code
      • before testing or correctness analysis
  • Use behavioral specs to help us design correct code
  • Use inductive proofs to validate specs and confirm cost analysis

slide-17
SLIDE 17

modular thinking

  • Don’t look inside the structure implementing SEQ
  • Just refer to the signature…
  • … and the specs
slide-18
SLIDE 18

work/span

Assume we have an implementation of SEQ with

  expression           work    span
  nth i s              O(1)    O(1)
  length s             O(1)    O(1)
  tabulate f n         O(n)    O(1)
  empty( )             O(1)    O(1)
  map f s              O(n)    O(1)
  reduce g z s         O(n)    O(log n)
  mapreduce f g z s    O(n)    O(log n)
  split s              O(1)    O(1)

when length of s is n, and f, g are constant time

+ same specs as before

slide-19
SLIDE 19

gravitation

  • Newtonian laws
  • Simulate the motion of planets
      • for n bodies, this is O(n^2) work
  • Using sequences and parallel operations is very natural (!)
      • each body can calculate its step-by-step trajectory, independently
  • Will be faster than using lists and sequential evaluation

slide-20
SLIDE 20

n bodies, n^2 forces

slide-21
SLIDE 21

Newton’s laws

  • Point masses attract each other with a force proportional to the product of the masses and the inverse square of the distance

      F = G m1 m2 / r^2    (Newton, 1687)

  • Spherical bodies behave like point masses

slide-22
SLIDE 22

laws of motion

Law 1: If an object experiences no net force, its velocity is constant:

  • it moves in a straight line, with constant speed.

Law 2: The acceleration of a body is parallel and proportional to the net force acting on the body, and inversely proportional to the mass of the body, i.e., F = m a.

Law 3: When one body exerts a force F on a second body, the second body exerts an equal but opposite force −F on the first.

Law 4: There is no Law 4.

slide-23
SLIDE 23
slide-24
SLIDE 24

vectors

Velocity, force and acceleration are vectors

  • Vectors have magnitude and direction
  • Vectors can be added
  • Vectors can be multiplied by a scalar

velocity + velocity = velocity
acceleration + acceleration = acceleration
scalar * velocity = velocity
scalar * acceleration = acceleration
speed = magnitude of velocity

slide-25
SLIDE 25

our version

  • 2-dimensional universe
  • Scalars are real numbers
  • Vectors are pairs of type real * real

Easy to generalize...

slide-26
SLIDE 26

bodies

  • A body has position, mass, and velocity
  • Positions are points, pairs of real numbers
  • A mass is a (positive) real number
  • A velocity is a 2D-vector
  • also represented as a pair of reals

type point = real * real
type vect = real * real
type body = point * real * vect

slide-27
SLIDE 27

vectors

signature VECT =
sig
  type vect = real * real
  val zero : vect
  val add : vect * vect -> vect
  val scale : real * vect -> vect
  val mag : vect -> real
  …
end

slide-28
SLIDE 28

structure Vect : VECT =
struct
  type vect = real * real
  val zero = (0.0, 0.0)
  fun add ((x1, y1), (x2, y2)) = (x1+x2 , y1+y2)
  fun scale(c, (x, y)) = (c * x , c * y)
  fun mag (x, y) = Math.sqrt (x * x + y * y)
end

slide-29
SLIDE 29

points

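type point = real * real

(* diff(p1, p2) is the vector from p1 to p2 *)
fun diff ((x1,y1):point, (x2,y2):point) : vect = (x2 - x1, y2 - y1)

(* displace(p, v) is the point reached from p by moving along v *)
fun displace ((x,y):point, (x',y'):vect) : point = (x + x', y + y')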

slide-30
SLIDE 30

bodies

type body = point * real * vect   (* (position, mass, velocity) *)

val sun = ((0.0,0.0), 332000.0, (0.0,0.0))
val earth = ((1.0, 0.0), 1.0, (0.0,18.0))

  • distance from sun to earth = one “astronomical unit”
  • sun is 332000 times more massive
  • the sun’s (relative) velocity is zero

slide-31
SLIDE 31

motion

  • To calculate the motion of a body in a timestep
  • find the net acceleration due to other bodies
  • adjust the position and velocity of the body
slide-33
SLIDE 33

accel

accel : body -> body -> vect

accel b1 b2 = acceleration on b1 due to gravitational attraction of b2

fun accel (p1, _, _) (p2, m2, _) =
  let
    val d = diff(p1, p2)
    val r = mag d
  in
    if r < 0.1 then zero else scale(G * m2/(r*r*r) , d)
  end

use default of zero when bodies are too close

slide-34
SLIDE 34

accel

[Figure: bodies b1 (at p1, mass m1) and b2 (at p2, mass m2)]

r = distance from p1 to p2
G m2 / r^2 = acceleration on b1 due to b2

slide-35
SLIDE 35

accel

[Figure: bodies b1 (at p1, mass m1) and b2 (at p2, mass m2)]

r = distance from p1 to p2
G m1 / r^2 = acceleration on b2 due to b1

slide-39
SLIDE 39

accels

accels : body -> body seq -> vect

accels b s = net acceleration on b due to gravitational attraction of the bodies in s

accels b <b1,...,bn> = accel b b1 + ... + accel b bn    (vector sum)

fun accels b s = mapreduce (accel b) add zero s

slide-40
SLIDE 40

move

move : body -> vect * real -> body

fun move (p, m, v) (a, dt) =
  let
    val dp = add(scale(dt,v), scale(0.5*dt*dt, a))
    val dv = scale(dt, a)
  in
    (add(p, dp), m, add(v, dv))
  end

move (p, m, v) (a, dt) = (p', m, v') where

  v' = v + a dt
  p' = p + v dt + 1/2 a dt^2

Newtonian calculus, too!

slide-41
SLIDE 41

move

move (p, m, v) (a, dt) = (p’, m, v’)

when a body at p, with mass m and velocity v, acted on by force F = m * a for time dt, moves to p’ and its velocity changes to v’

[Figure: a body of mass m at p with velocity v, and at p’ with velocity v’, under acceleration a]

slide-42
SLIDE 42

updating position

p’ = p + v dt + 1/2 a dt^2

[Figure: position p, velocity v, acceleration a]

slide-43
SLIDE 43

step

step : real -> body seq -> body seq

step dt ⟨b1, b2, ..., bN⟩ = ⟨b1’, b2’, ..., bN’⟩
where, for each i, bi’ = move bi (ai, dt) and ai = accels bi ⟨b1, b2, ..., bN⟩

fun step dt s = map (fn b => move b (accels b s, dt)) s

parallel evaluation

  • each body calculates its own update
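The pieces introduced so far (vectors, points, accel, accels, move, step) fit together as in the following self-contained sketch. Two things here are assumptions rather than the lecture's code: sequences are replaced by plain lists with a list-backed mapreduce, and G is set to 1.0, a value the slides do not state.

```sml
type vect = real * real
type point = real * real
type body = point * real * vect          (* (position, mass, velocity) *)

val zero : vect = (0.0, 0.0)
fun add ((x1, y1) : vect, (x2, y2) : vect) : vect = (x1 + x2, y1 + y2)
fun scale (c : real, (x, y) : vect) : vect = (c * x, c * y)
fun mag ((x, y) : vect) : real = Math.sqrt (x * x + y * y)
fun diff ((x1, y1) : point, (x2, y2) : point) : vect = (x2 - x1, y2 - y1)
fun displace ((x, y) : point, (x', y') : vect) : point = (x + x', y + y')

val G = 1.0   (* ASSUMPTION: gravitational constant in the slides' units *)

(* Acceleration on the first body due to the second; zero if they are too close. *)
fun accel ((p1, _, _) : body) ((p2, m2, _) : body) : vect =
  let val d = diff (p1, p2)
      val r = mag d
  in if r < 0.1 then zero else scale (G * m2 / (r * r * r), d) end

(* List-backed stand-in for Seq.mapreduce (sequential, so no parallel speed-up). *)
fun mapreduce f g z s =
  case s of
    [] => z
  | [x] => f x
  | _ => let val m = List.length s div 2
             val (s1, s2) = (List.take (s, m), List.drop (s, m))
         in g (mapreduce f g z s1, mapreduce f g z s2) end

fun accels b s = mapreduce (accel b) add zero s

fun move ((p, m, v) : body) (a : vect, dt : real) : body =
  let val p' = displace (p, add (scale (dt, v), scale (0.5 * dt * dt, a)))
      val v' = add (v, scale (dt, a))
  in (p', m, v') end

(* List.map stands in for Seq.map. *)
fun step dt s = List.map (fn b => move b (accels b s, dt)) s

val sun : body = ((0.0, 0.0), 332000.0, (0.0, 0.0))
val earth : body = ((1.0, 0.0), 1.0, (0.0, 18.0))
val us' = step 0.01 [sun, earth]
```

After one step of 0.01 the earth's position is approximately (~15.6, 0.18) and its velocity approximately (~3320.0, 18.0), up to floating-point rounding.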

slide-44
SLIDE 44

efficiency

  • What are the work and span for

      accel bi bj
      accels bi ⟨b1, ..., bN⟩
      move b (a, dt)
      step dt ⟨b1, ..., bN⟩

    ?

slide-45
SLIDE 45

work/span

Assume we have an implementation of SEQ with

  expression           work    span
  nth i s              O(1)    O(1)
  length s             O(1)    O(1)
  tabulate f n         O(n)    O(1)
  empty( )             O(1)    O(1)
  map f s              O(n)    O(1)
  reduce g z s         O(n)    O(log n)
  mapreduce f g z s    O(n)    O(log n)
  split s              O(1)    O(1)

when length of s is n, and f, g are constant time

slide-46
SLIDE 46

accel

fun accel (p1, m1, v1) (p2, m2, v2) =
  let
    val d = diff(p1, p2)
    val r = mag d
  in
    if r < 0.1 then zero else scale(G * m2/(r*r*r) , d)
  end

The body of accel has work, span O(1), so

accel b1 b2 has work O(1), span O(1)

slide-50
SLIDE 50

accels

accels bi ⟨b1, ..., bN⟩ = mapreduce (accel bi) add zero ⟨b1, ..., bN⟩

  • accel bi and add each have work, span O(1)
  • mapreduce f g z ⟨b1, ..., bN⟩ applies f N times in parallel and combines using g

[Cost graph: f b1, f b2, ..., f bN at the leaves, combined pairwise by g in a tree of depth log2 N]

accels bi ⟨b1, ..., bN⟩ has work O(N), span O(log N)

slide-54
SLIDE 54

move

fun move (p, m, v) (a, dt) =
  let
    val p' = displace(p, add(scale(dt,v), scale(0.5*dt*dt, a)))   (* work, span O(1) *)
    val v' = add(v, scale(dt, a))                                 (* work, span O(1) *)
  in
    (p', m, v')
  end

move (p, m, v) (a, dt) has work, span O(1)

slide-58
SLIDE 58

step

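step dt s = map (fn b => move b (accels b s, dt)) s

Let s be ⟨b1, ..., bN⟩

  • Each call move b (accels b s, dt) has work O(N), span O(log N)
  • map f s makes N such calls: they add up like N sequential calls for work, and count as N parallel calls for span

step dt ⟨b1, ..., bN⟩ has work O(N^2), span O(log N)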

slide-59
SLIDE 59

cost analysis

(using sequences)

  expression                   work      span
  accel bi bj                  O(1)      O(1)
  accels bi ⟨b1, ..., bN⟩      O(N)      O(log N)
  move b (a, dt)               O(1)      O(1)
  step dt ⟨b1, ..., bN⟩        O(N^2)    O(log N)

slide-60
SLIDE 60

cost analysis

(using lists)

fun accels b (L : body list) = foldr add zero (List.map (accel b) L)

fun step dt (L : body list) = List.map (fn b => move b (accels b L, dt)) L

  expression                   work      span
  accels bi ⟨b1, ..., bN⟩      O(N)      O(N)
  step dt ⟨b1, ..., bN⟩        O(N^2)    O(N^2)

slide-61
SLIDE 61

summary

  • Simulate the motion of planets
      • for N bodies, this is O(N^2) work
      • step dt ⟨b1, ..., bN⟩ has work O(N^2)
  • Using sequences and parallel operations
      • each body can calculate its step-by-step trajectory, independently
      • step dt ⟨b1, ..., bN⟩ has span O(log N)
  • Using lists and sequential operations
      • step dt ⟨b1, ..., bN⟩ has span O(N^2)

slide-62
SLIDE 62

illustration

  • Let’s look at a simple example
  • Just the SUN and the EARTH
  • We’ll see that the Earth’s trajectory

looks like an ellipse, as predicted by Kepler

slide-63
SLIDE 63

mini-solar system

val sun = ((0.0,0.0), 332000.0, (0.0,0.0))
val earth = ((1.0, 0.0), 1.0, (0.0,18.0))

val us : body seq = tabulate (fn 0 => sun | 1 => earth | _ => raise Range) 2

us = ⟨sun, earth⟩

step 0.01 us =>* ⟨((5E~05,0.0),332000.0,(0.01,0.0)), ((~15.6,0.18),1.0,(~3320.0,18.0))⟩

slide-64
SLIDE 64

orbit

orbit : body -> int * real -> point list

orbit b (n, dt) = first n positions of b in orbit around sun

fun orbit b (n, dt) =
  if n=0 then [ ]
  else
    let
      val (p', m, v') = move b (accel b sun, dt)
    in
      p' :: orbit (p', m, v') (n-1, dt)
    end;
slide-65
SLIDE 65

results

orbit earth (10, 0.01) =

[(~15.6,0.18),(~48.7318019171,0.359213099043), (~81.7884162248,0.537587775754), (~114.835559608,0.71589462109), (~147.878962872,0.894177309445), (~180.920348361,1.07244756107), (~213.960467679,1.25071021688), (~246.999717285,1.42896774708), (~280.03833222,1.60722158368), (~313.076463412,1.78547263142)]

slide-66
SLIDE 66

shaped like an ellipse

slide-67
SLIDE 67

conclusion

  • Sequences allow efficient parallel evaluation
  • O(log N) is better than O(N), O(N^2)
  • In practice, may deliver noticeable speed-up
  • But there’s still room for improvement…
slide-68
SLIDE 68

an improvement

  • Barnes-Hut algorithm
  • Uses quad-tree representation for bodies
  • To compute gravity, replace a far-away cluster of bodies with a single point mass at its barycenter

slide-69
SLIDE 69

barycenter

barycenter : (real * point) seq -> real * point

barycenter ⟨(m1,p1),…,(mk,pk)⟩ = (M, P) where

  M = m1 + … + mk
  P = scale(1/M, scale(m1,p1) + … + scale(mk,pk))
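The barycenter spec fits the mapreduce pattern: map each (mass, point) pair to (mass, mass-weighted position), combine pairs componentwise, then divide by the total mass. The sketch below is one possible reading, using lists in place of sequences; the helper name combine and the list-backed mapreduce are assumptions, and the input is assumed to be non-empty with positive total mass.

```sml
type point = real * real

fun add ((x1, y1) : point, (x2, y2) : point) : point = (x1 + x2, y1 + y2)
fun scale (c : real, (x, y) : point) : point = (c * x, c * y)

(* List-backed stand-in for Seq.mapreduce. *)
fun mapreduce f g z s =
  case s of
    [] => z
  | [x] => f x
  | _ => let val m = List.length s div 2
             val (s1, s2) = (List.take (s, m), List.drop (s, m))
         in g (mapreduce f g z s1, mapreduce f g z s2) end

(* Hypothetical helper: adds total masses and weighted position sums.
   It is associative, with identity (0.0, (0.0, 0.0)). *)
fun combine ((m1 : real, w1 : point), (m2 : real, w2 : point)) =
  (m1 + m2, add (w1, w2))

(* Assumes s is non-empty and the total mass is positive. *)
fun barycenter (s : (real * point) list) : real * point =
  let
    val (M, W) =
      mapreduce (fn (m, p) => (m, scale (m, p))) combine (0.0, (0.0, 0.0)) s
  in
    (M, scale (1.0 / M, W))
  end

val (totalMass, center) = barycenter [(1.0, (0.0, 0.0)), (3.0, (4.0, 0.0))]
```

For a mass 1.0 at the origin and a mass 3.0 at (4.0, 0.0), the total mass is 4.0 and the barycenter is (3.0, 0.0), three quarters of the way toward the heavier body.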

slide-70
SLIDE 70

bounding boxes

signature Box =
sig
  type box
  val inside : box -> point -> bool
  val quadrants : box -> box seq
  …
end

slide-71
SLIDE 71

barnes-hut trees

datatype bhtree = Empty
                | Single of (real * point)
                | Quad of box * (real * point) * bhtree seq

INVARIANT: For every Quad (Box, (M, P), S)

  • length S = 4
  • all bodies in S are inside Box
  • M is the total mass of these bodies
  • P is their barycenter
slide-72
SLIDE 72

building a bhtree

bh : body seq -> bhtree

slide-73
SLIDE 73

too far away

aspect : body * box * (real * point) -> real

aspect(b, Box, (M, P)) is smaller when Box has small diameter, P is far away from b, M not too large

[Figure: a far-away cluster of bodies; actual acceleration ≈ approximation]

slide-74
SLIDE 74

Barnes-Hut

bh_accels : real -> bhtree -> body -> vect

ENSURES bh_accels theta T b2 = the acceleration on b2 due to the bodies in T, as computed by Barnes-Hut with threshold theta

(* accel x y is the acceleration on x due to y, so b2 comes first;
   a (mass, point) pair is viewed as a body at rest *)
fun bh_accels theta Empty b2 = zero
  | bh_accels theta (Single (m, p)) b2 = accel b2 (p, m, zero)
  | bh_accels theta (Quad (Box, (M, P), S)) b2 =
      if aspect (b2, Box, (M, P)) > theta
      then mapreduce (fn T => bh_accels theta T b2) add zero S
      else accel b2 (P, M, zero)

slide-75
SLIDE 75

summary

  • We sketched how to implement the Barnes-Hut algorithm in ML
  • Benefits of good choice of data structure, use of invariant to guide code design
  • In practice, Barnes-Hut is often used to produce graphics simulations
  • We think it’s a nice example of elegant functional programming(!)

slide-76
SLIDE 76

comments

  • In practice it can be hard to work with reals
      • Rounding errors, sensitivity to evaluation order
      • Not easy to check for “equality”
  • Need to be aware of these issues when writing code

slide-77
SLIDE 77

lessons

  • Design code to meet specifications
  • Use specs of helper functions to prove correctness of code
  • Use work/span of helper functions to determine work/span of code
  • Implement abstract types using invariants
      • An invariant expresses key insights into how data is represented
      • Good examples: red-black trees, Barnes-Hut trees
slide-78
SLIDE 78

lessons

  • Write code to allow parallel evaluation

      parallel                    sequential
      Seq.map f                   List.map f
      val (x, y) = (e1, e2)       val x = e1; val y = e2

  • Choose implementations wisely, based on signature AND specs AND work/span

slide-79
SLIDE 79

exploration

  • Sequences can be implemented as lists, arrays or balanced trees
  • Do it yourself: define structures

      ListSeq : SEQ
      ArraySeq : SEQ
      BalancedTreeSeq : SEQ

  • Compare the work/span characteristics
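As a starting point for the exercise, here is one possible ListSeq. It is a sketch of the list-based option only, not the course's reference solution; the signature is repeated so the block stands alone.

```sml
signature SEQ =
sig
  type 'a seq
  exception Range
  val nth : int -> 'a seq -> 'a
  val length : 'a seq -> int
  val tabulate : (int -> 'a) -> int -> 'a seq
  val empty : unit -> 'a seq
  val map : ('a -> 'b) -> ('a seq -> 'b seq)
  val split : 'a seq -> 'a seq * 'a seq
  val reduce : ('a * 'a -> 'a) -> 'a -> 'a seq -> 'a
  val mapreduce : ('a -> 'b) -> ('b * 'b -> 'b) -> 'b -> 'a seq -> 'b
end

structure ListSeq : SEQ =
struct
  type 'a seq = 'a list
  exception Range

  (* O(i) work: walks the list; an out-of-range index raises Range *)
  fun nth i s = List.nth (s, i) handle Subscript => raise Range

  val length = List.length                  (* O(n) work for lists *)
  fun tabulate f n = List.tabulate (n, f)
  fun empty () = []
  val map = List.map

  (* O(n) work: copies the first half *)
  fun split s =
    let val m = length s div 2
    in (List.take (s, m), List.drop (s, m)) end

  fun reduce g z s =
    case length s of
      0 => z
    | 1 => nth 0 s
    | _ => let val (s1, s2) = split s
           in g (reduce g z s1, reduce g z s2) end

  fun mapreduce f g z s =
    case length s of
      0 => z
    | 1 => f (nth 0 s)
    | _ => let val (s1, s2) = split s
           in g (mapreduce f g z s1, mapreduce f g z s2) end
end

val s = ListSeq.tabulate (fn i => i + 1) 4      (* <1, 2, 3, 4> *)
val total = ListSeq.reduce (op +) 0 s
val halves = ListSeq.split s
```

Because the structure is list-based and evaluated sequentially, span equals work here; an ArraySeq or BalancedTreeSeq would be needed to get the O(1) nth or the logarithmic span discussed in the lecture.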

slide-80
SLIDE 80

exploration

What would change if we had an implementation of SEQ with

  expression           work          span
  nth i s              O(1)          O(1)
  length s             O(1)          O(1)
  tabulate f n         O(n)          O(1)
  empty( )             O(1)          O(1)
  map f s              O(n)          O(1)
  reduce g z s         O(n log n)    O(log n)
  mapreduce f g z s    O(n log n)    O(log n)
  split s              O(n)          O(1)

when length of s is n, and f, g are constant time?