
SLIDE 1

15-150 Fall 2020

Stephen Brookes

Lecture 18 Sequences and parallelism

SLIDE 2

announcements

Election Day!
TAs will be posting lab solution videos

(canvas, Thursday or Friday)

TAs offer NEW weekly REVIEW SESSION

(Thursday, 6:30pm Pittsburgh time)

SLIDE 3

sequences

signature SEQ =
sig
  type 'a seq
  exception Range
  val nth : int -> 'a seq -> 'a
  val length : 'a seq -> int
  val tabulate : (int -> 'a) -> int -> 'a seq
  val empty : unit -> 'a seq
  val map : ('a -> 'b) -> ('a seq -> 'b seq)
  val split : 'a seq -> 'a seq * 'a seq
  val reduce : ('a * 'a -> 'a) -> 'a -> 'a seq -> 'a
  val mapreduce : ('a -> 'b) -> ('b * 'b -> 'b) -> 'b -> 'a seq -> 'b
end

note the type of mapreduce

SLIDE 4

The SEQ signature can be implemented

in many different ways

Each implementation has its own

work/span characteristics

lists: nth i S is O(n)
balanced trees: nth i S is O(log n)
arrays: nth i S is O(1)

SLIDE 5

comments

Last time: vector-based sequences
When length S = n > 1 and split S = (L, R) we had

reduce g z S = g(reduce g z L, reduce g z R)

L and R have length ≈ n div 2

and we assumed g is constant-time, so we said

Wreduce(n) = 2Wreduce(n div 2) + 1

We forgot the work for split S, given as O(n) earlier,

so we should have said

Wreduce(n) = 2Wreduce(n div 2) + O(n)

The original answer is OK if split S has work O(1)

Thanks to Sheng-Hsiang Sun for spotting this!
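As a rough sketch of why the two recurrences give different bounds, each one can be unrolled for n a power of two. Here W abbreviates Wreduce and c stands for the constant hidden in O(n); both are notational assumptions made only for this derivation.

```latex
\begin{align*}
W(n) &= 2\,W(n/2) + 1
      = 4\,W(n/4) + 2 + 1
      = \dots
      = n\,W(1) + (n - 1) \in O(n) \\
W(n) &= 2\,W(n/2) + c\,n
      = 4\,W(n/4) + 2\,c\,n
      = \dots
      = n\,W(1) + c\,n\log_2 n \in O(n \log n)
\end{align*}
```

So a constant-time split keeps reduce at linear work, while a linear-time split costs an extra factor of log n.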

SLIDE 6

your task

Given a structure Seq : SEQ

with known work/span characteristics

Design correct and efficient solutions

to some parallelizable problems

prove correctness
calculate work and span

SLIDE 7

comments

We can also talk about our sequence
operations abstractly, in a way that’s

independent of the implementation

For example, just using SEQ functions:

map f S = tabulate (fn i => f(nth i S)) (length S)

Although map may not be defined this way

in the Seq structure, the equation is valid

(Both sides represent the same sequence)

But be careful: equal expressions may have

different work, span

(This is obvious, if you think about it!)
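The equation can be checked on a small case. The sketch below is only an illustration: VecSeq is a hypothetical stand-in built on the Basis Vector structure, not the course's Seq structure, and map' is a name introduced here for the right-hand side of the equation.

```sml
(* VecSeq: a hypothetical, vector-backed stand-in for part of SEQ.
   Assumption: the real Seq structure is not shown in the lecture.
   Note that Vector.sub raises Subscript, not Range, on a bad index. *)
structure VecSeq =
struct
  type 'a seq = 'a vector
  fun nth i (s : 'a seq) = Vector.sub (s, i)
  fun length (s : 'a seq) = Vector.length s
  fun tabulate f n : 'a seq = Vector.tabulate (n, f)
  fun map f (s : 'a seq) = Vector.map f s
end

(* The right-hand side of the equation, written only with SEQ-style operations. *)
fun map' f s = VecSeq.tabulate (fn i => f (VecSeq.nth i s)) (VecSeq.length s)

val s = VecSeq.tabulate (fn i => i * i) 5        (* <0, 1, 4, 9, 16> *)
val lhs = VecSeq.map (fn x => x + 1) s
val rhs = map' (fn x => x + 1) s

(* int vectors admit equality, so the two sides can be compared directly *)
val same = (lhs = rhs)
```

Both sides denote the same sequence here, but in this stand-in map' pays for one nth call per element on top of tabulate, which is the kind of cost difference the slide warns about.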

SLIDE 9

behavior

nth i ⟨v0,…,vn-1⟩ = vi   if 0 ≤ i < n
length ⟨v0,…,vn-1⟩ = n
tabulate f n = ⟨f(0), …, f(n-1)⟩
empty( ) = ⟨ ⟩
split ⟨v0,…,vn-1⟩ = (⟨v0,…,vm-1⟩, ⟨vm,…,vn-1⟩)   where m = n div 2

SLIDE 10

reduce

fun reduce g z s =
  case (length s) of
    0 => z
  | 1 => nth 0 s
  | _ => let
           val (s1, s2) = split s
         in
           g(reduce g z s1, reduce g z s2)
         end

reduce g z ⟨v1,…,vn⟩ = v1 g v2 … g vn

when g is total & associative, z an identity for g

SLIDE 11

mapreduce

fun mapreduce f g z s =
  case (length s) of
    0 => z
  | 1 => f(nth 0 s)
  | _ => let
           val (s1, s2) = split s
         in
           g(mapreduce f g z s1, mapreduce f g z s2)
         end

mapreduce f g z ⟨v1,…,vn⟩ = (f v1) g (f v2) … g (f vn)

when g is total & associative, z an identity for g

SLIDE 13

lecture notes

I added a new section about

associativity and identity elements

Remember that reduce and mapreduce

should only be used with suitable g, z

The Lecture Notes include a proof of

correctness for reduce

Shows why you need these properties!

You should be reading the notes, too!
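To see why associativity matters, here is a small sketch using subtraction, which is not associative. It assumes a vector-backed sequence (Basis Vector) in place of the course's Seq structure, with split cutting at n div 2 as in the behavioral spec; reduce follows the definition given earlier in the lecture.

```sml
(* Assumption: sequences are modelled as Basis vectors for this demo. *)
type 'a seq = 'a vector

(* split at m = n div 2, as in the behavioral spec *)
fun split (s : 'a seq) : 'a seq * 'a seq =
  let
    val n = Vector.length s
    val m = n div 2
  in
    (Vector.tabulate (m, fn i => Vector.sub (s, i)),
     Vector.tabulate (n - m, fn i => Vector.sub (s, m + i)))
  end

(* reduce, following the lecture's definition *)
fun reduce g z (s : 'a seq) =
  case Vector.length s of
    0 => z
  | 1 => Vector.sub (s, 0)
  | _ => let
           val (s1, s2) = split s
         in
           g (reduce g z s1, reduce g z s2)
         end

val s = Vector.fromList [1, 2, 3, 4]

val sum = reduce (op +) 0 s            (* (1 + 2) + (3 + 4) = 10 *)
val tree = reduce (op -) 0 s           (* (1 - 2) - (3 - 4) = 0  *)
val leftToRight = ((1 - 2) - 3) - 4    (* ~8: a different bracketing *)
```

With (op +) every bracketing gives 10, so the tree-shaped evaluation is harmless. With (op -) the answer depends on how split happens to cut the sequence, so reduce no longer has an implementation-independent meaning.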

SLIDE 14

example

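reduce (op +) 0 ⟨v1, v2⟩ = v1 + v2 + 0

reduce (op +) 0 ⟨v1, v2⟩
  = (op +) (reduce (op +) 0 ⟨v1⟩, reduce (op +) 0 ⟨v2⟩)
  = (reduce (op +) 0 ⟨v1⟩) + (reduce (op +) 0 ⟨v2⟩)
  = (v1 + 0) + (v2 + 0)
  = v1 + v2

(The step to (v1 + 0) + (v2 + 0) treats a singleton as g(v, z); the reduce code shown earlier returns nth 0 s for a singleton, which gives v1 + v2 directly.)

reduce g z behaves “correctly” when g is associative and z is an identity element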

SLIDE 15

example

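reduce (op +) 21 ⟨v1, v2⟩
  = (op +) (reduce (op +) 21 ⟨v1⟩, reduce (op +) 21 ⟨v2⟩)
  = (reduce (op +) 21 ⟨v1⟩) + (reduce (op +) 21 ⟨v2⟩)
  = (v1 + 21) + (v2 + 21)
  = v1 + v2 + 42

reduce (op +) 21 ⟨v1, v2⟩ ≠ v1 + v2 + 21

(As before, this treats a singleton as g(v, z). With the reduce code shown earlier the result is v1 + v2, which also differs from v1 + v2 + 21.)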

SLIDE 16

thinking abstractly

Use cost semantics to predict

work and span of code

before testing or correctness analysis
Use behavioral specs to help us

design correct code

Use inductive proofs to validate specs

and confirm cost analysis

SLIDE 17

modular thinking

Don’t look inside the structure

implementing SEQ

Just refer to the signature…
… and the specs

SLIDE 18

work/span

Assume we have an implementation of SEQ with these work/span characteristics

(+ same specs as before)

expression            work    span
nth i s               O(1)    O(1)
length s              O(1)    O(1)
tabulate f n          O(n)    O(1)
empty( )              O(1)    O(1)
map f s               O(n)    O(1)
reduce g z s          O(n)    O(log n)
mapreduce f g z s     O(n)    O(log n)
split s               O(1)    O(1)

when length of s is n, and f, g are constant time

SLIDE 19

gravitation

Newtonian laws
Simulate the motion of planets
for n bodies, this is O(n^2) work
Using sequences and parallel operations

is very natural (!)

each body can calculate its

step-by-step trajectory, independently

Will be faster than using lists

and sequential evaluation

SLIDE 20

n bodies, n^2 forces

SLIDE 21

Newton’s laws

Point masses attract each other with a force

proportional to the product of the masses and the inverse square of the distance

Spherical bodies behave like point masses

F = G m1 m2 / r^2   (Newton, 1687)

SLIDE 22

laws of motion

Law 1: If an object experiences no net force, its velocity is constant:

it moves in a straight line, with constant speed.

Law 2: The acceleration of a body is parallel and proportional to the net force acting on the body, and inversely proportional to the mass of the body, i.e., F = m a.

Law 3: When one body exerts a force F on a second body, the second body exerts an equal but opposite force −F on the first.

Law 4: There is no Law 4.

SLIDE 23

SLIDE 24

vectors

Velocity, force and acceleration are vectors

Vectors have magnitude and direction
Vectors can be added
Vectors can be multiplied by a scalar

velocity + velocity = velocity
acceleration + acceleration = acceleration
scalar * acceleration = acceleration
scalar * velocity = velocity
speed = magnitude of velocity

SLIDE 25

our version
2-dimensional universe
Scalars are real numbers
Vectors are pairs of type real * real

Easy to generalize...

SLIDE 26

bodies

A body has position, mass, and velocity
Positions are points, pairs of real numbers
A mass is a (positive) real number
A velocity is a 2D-vector
also represented as a pair of reals

type point = real * real
type vect = real * real
type body = point * real * vect

SLIDE 27

vectors

signature VECT =
sig
  type vect = real * real
  val zero : vect
  val add : vect * vect -> vect
  val scale : real * vect -> vect
  val mag : vect -> real
  …
end

SLIDE 28

structure Vect : VECT =
struct
  type vect = real * real
  val zero = (0.0, 0.0)
  fun add ((x1, y1), (x2, y2)) = (x1+x2, y1+y2)
  fun scale (c, (x, y)) = (c * x, c * y)
  fun mag (x, y) = Math.sqrt (x * x + y * y)
end

SLIDE 29

points

type point = real * real

fun diff ((x1,y1):point, (x2,y2):point) : vect = (x2 - x1, y2 - y1)

fun displace ((x,y):point, (x',y'):vect) : point = (x + x', y + y')

SLIDE 30

bodies

type body = point * real * vect
  (position, mass, velocity)

val sun = ((0.0,0.0), 332000.0, (0.0,0.0))
val earth = ((1.0, 0.0), 1.0, (0.0,18.0))

distance from sun to earth = one “astronomical unit”
sun is 332000 times more massive than the earth
the sun’s (relative) velocity is zero

SLIDE 31

motion

To calculate the motion of a body in a timestep
find the net acceleration due to other bodies
adjust the position and velocity of the body

SLIDE 33

accel

accel : body -> body -> vect
accel b1 b2 = acceleration on b1 due to gravitational attraction of b2

fun accel (p1, _, _) (p2, m2, _) =
  let
    val d = diff(p1, p2)
    val r = mag d
  in
    if r < 0.1 then zero else scale(G * m2/(r*r*r), d)
  end

use default of zero when bodies are too close
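The factor G * m2/(r*r*r) can look odd next to the inverse-square law. The reason is that d has length r, so d scaled by 1/r is the unit direction from b1 towards b2, and multiplying that by G * m2/(r*r) gives the acceleration. A minimal self-contained sketch follows; the value G = 1.0 is an assumption made here (the slides do not state it), and the vector helpers are repeated from the earlier slides so the block runs on its own.

```sml
type vect = real * real
type point = real * real
type body = point * real * vect      (* position, mass, velocity *)

val zero : vect = (0.0, 0.0)
fun scale (c : real, (x, y) : vect) : vect = (c * x, c * y)
fun mag ((x, y) : vect) : real = Math.sqrt (x * x + y * y)
fun diff ((x1, y1) : point, (x2, y2) : point) : vect = (x2 - x1, y2 - y1)

(* Assumption: the gravitational constant in these units is taken to be 1.0. *)
val G = 1.0

(* acceleration on the first body due to the second *)
fun accel ((p1, _, _) : body) ((p2, m2, _) : body) : vect =
  let
    val d = diff (p1, p2)            (* vector from p1 to p2, length r *)
    val r = mag d
  in
    if r < 0.1 then zero else scale (G * m2 / (r * r * r), d)
  end

val sun : body = ((0.0, 0.0), 332000.0, (0.0, 0.0))
val earth : body = ((1.0, 0.0), 1.0, (0.0, 18.0))

val onEarth = accel earth sun        (* points from the earth towards the sun *)
val onSun = accel sun earth          (* much smaller: the earth's mass is 1.0 *)
val onSelf = accel earth earth       (* r = 0.0 < 0.1, so the default zero *)
```

The cutoff also means a body contributes nothing to its own acceleration, which is convenient once the same sequence of bodies is used on both sides of the computation.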

SLIDE 34

accel

[Diagram: b1 at p1 with mass m1 and b2 at p2 with mass m2; r = distance from p1 to p2; the acceleration on b1 due to b2 has magnitude Gm2/r^2]

SLIDE 35

accel

[Diagram: b1 at p1 with mass m1 and b2 at p2 with mass m2; r = distance from p1 to p2; the acceleration on b2 due to b1 has magnitude Gm1/r^2]

SLIDE 39

accels

accels : body -> body seq -> vect

accels b s = net acceleration on b due to gravitational attraction
of the bodies in s

accels b ⟨b1,...,bn⟩ = accel b b1 + ... + accel b bn   (vector sum)

fun accels b s = mapreduce (accel b) add zero s

SLIDE 40

move

move : body -> vect * real -> body

move (p, m, v) (a, dt) = (p', m, v')
  where v' = v + a dt
        p' = p + v dt + 1/2 a dt^2

fun move (p, m, v) (a, dt) =
  let
    val dp = add(scale(dt,v), scale(0.5*dt*dt, a))
    val dv = scale(dt, a)
  in
    (add(p, dp), m, add(v, dv))
  end

Newtonian calculus, too!

SLIDE 41

move

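move (p, m, v) (a, dt) = (p’, m, v’)

when a body at p, with mass m and velocity v, is acted on by force F = m * a for time dt: it moves to p’ and its velocity changes to v’

[Diagram: the body of mass m at p with velocity v and acceleration a, and again at p’ with velocity v’]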

SLIDE 42

updating position

p’ = p + v dt + 1/2 a dt^2

[Diagram: position p, velocity v, acceleration a]

SLIDE 43

step

step : real -> body seq -> body seq

fun step dt s = map (fn b => move b (accels b s, dt)) s

step dt ⟨b1, b2, ..., bN⟩ = ⟨b1’, b2’, ..., bN’⟩
  where, for each i, bi’ = move bi (ai, dt)
  and ai = accels bi ⟨b1, b2, ..., bN⟩

parallel evaluation

each body calculates its own update
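Putting accel, accels, move and step together gives a complete, if tiny, simulator. The sketch below is one way to run it outside the course environment, under several assumptions: sequences are modelled as Basis vectors, mapreduce is a sequential divide-and-conquer stand-in with the same recursion shape as the lecture's definition (so no actual parallelism), smap is a local name for the sequence map, and G is taken to be 1.0.

```sml
type vect = real * real
type point = real * real
type body = point * real * vect
type 'a seq = 'a vector              (* assumption: vectors stand in for Seq *)

val zero : vect = (0.0, 0.0)
fun add ((x1, y1) : vect, (x2, y2) : vect) : vect = (x1 + x2, y1 + y2)
fun scale (c : real, (x, y) : vect) : vect = (c * x, c * y)
fun mag ((x, y) : vect) : real = Math.sqrt (x * x + y * y)
fun diff ((x1, y1) : point, (x2, y2) : point) : vect = (x2 - x1, y2 - y1)
fun displace ((x, y) : point, (x', y') : vect) : point = (x + x', y + y')

fun smap f (s : 'a seq) = Vector.map f s

(* Sequential stand-in for mapreduce: combines s[lo..hi) by halving the range. *)
fun mapreduce f g z (s : 'a seq) =
  let
    fun go (lo, hi) =
      case hi - lo of
        0 => z
      | 1 => f (Vector.sub (s, lo))
      | n => let val mid = lo + n div 2
             in g (go (lo, mid), go (mid, hi)) end
  in
    go (0, Vector.length s)
  end

val G = 1.0                          (* assumption: not stated on the slides *)

fun accel ((p1, _, _) : body) ((p2, m2, _) : body) : vect =
  let
    val d = diff (p1, p2)
    val r = mag d
  in
    if r < 0.1 then zero else scale (G * m2 / (r * r * r), d)
  end

fun accels b s = mapreduce (accel b) add zero s

fun move ((p, m, v) : body) (a : vect, dt : real) : body =
  let
    val p' = displace (p, add (scale (dt, v), scale (0.5 * dt * dt, a)))
    val v' = add (v, scale (dt, a))
  in
    (p', m, v')
  end

fun step dt s = smap (fn b => move b (accels b s, dt)) s

val sun : body = ((0.0, 0.0), 332000.0, (0.0, 0.0))
val earth : body = ((1.0, 0.0), 1.0, (0.0, 18.0))
val us : body seq = Vector.fromList [sun, earth]

val us' = step 0.01 us
val ((ex, ey), _, (evx, evy)) = Vector.sub (us', 1)   (* the earth after one step *)
```

Under these assumptions one step of 0.01 moves the earth to approximately (~15.6, 0.18) with velocity approximately (~3320.0, 18.0). Every body reads the same old sequence s, so the N updates are independent of one another, which is what makes the map safe to evaluate in parallel.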

SLIDE 44

efficiency

What are the work and span for

accel bi bj
accels bi ⟨b1, ..., bN⟩
move b (a, dt)
step dt ⟨b1, ..., bN⟩

?

SLIDE 46

accel

accel (p1, m1, v1) (p2, m2, v2) =
  let
    val d = diff(p1, p2)
    val r = mag d
  in
    if r < 0.1 then zero else scale(G * m2/(r*r*r), d)
  end

each operation in the body has work, span O(1)

accel b1 b2 has work O(1), span O(1)

SLIDE 50

accels

accels bi ⟨b1, ..., bN⟩ = mapreduce (accel bi) add zero ⟨b1, ..., bN⟩

accel bi and add each have work, span O(1)

mapreduce f g z ⟨b1, ..., bN⟩ applies f N times in parallel and combines using g

[Cost graph: N parallel applications f b1, f b2, ..., f bN, combined by a balanced tree of g nodes of depth log2 N]

accels bi ⟨b1, ..., bN⟩ has work O(N), span O(log N)

SLIDE 54

move

fun move (p, m, v) (a, dt) =
  let
    val p' = displace(p, add(scale(dt,v), scale(0.5*dt*dt, a)))   (* work, span O(1) *)
    val v' = add(v, scale(dt, a))                                 (* work, span O(1) *)
  in
    (p', m, v')
  end

move (p, m, v) (a, dt) has work, span O(1)

SLIDE 58

step

step dt s = map (fn b => move b (accels b s, dt)) s

Let s be ⟨b1, ..., bN⟩

each call move b (accels b s, dt) has work O(N), span O(log N)

map f s makes N such calls: the work is that of N sequential calls, the span that of N parallel calls

step dt ⟨b1, ..., bN⟩ has work O(N^2), span O(log N)
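The bound can be written out from the cost of map: work adds up over the N calls, while span takes the maximum over them. The sketch below assumes the work/span table given earlier in the lecture; the subscripted names W and S are notation introduced here for the work and span of each function.

```latex
\begin{align*}
W_{\mathit{step}}(N)
  &= O(N) + \sum_{i=1}^{N} \bigl( W_{\mathit{accels}}(N) + W_{\mathit{move}} \bigr)
   = O(N) + N \cdot \bigl( O(N) + O(1) \bigr)
   = O(N^2) \\
S_{\mathit{step}}(N)
  &= O(1) + \max_{1 \le i \le N} \bigl( S_{\mathit{accels}}(N) + S_{\mathit{move}} \bigr)
   = O(1) + O(\log N) + O(1)
   = O(\log N)
\end{align*}
```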

SLIDE 59

cost analysis

(using sequences)

expression                 work      span
accel bi bj                O(1)      O(1)
accels bi ⟨b1, ..., bN⟩    O(N)      O(log N)
move b (a, dt)             O(1)      O(1)
step dt ⟨b1, ..., bN⟩      O(N^2)    O(log N)

SLIDE 60

cost analysis

(using lists)

fun accels b (L : body list) = foldr add zero (List.map (accel b) L)

fun step dt (L : body list) = List.map (fn b => move b (accels b L, dt)) L

expression                 work      span
accels bi ⟨b1, ..., bN⟩    O(N)      O(N)
step dt ⟨b1, ..., bN⟩      O(N^2)    O(N^2)

SLIDE 61

summary

Simulate the motion of planets
for N bodies, this is O(N^2) work
Using sequences and parallel operations
each body can calculate its

step-by-step trajectory, independently

step dt ⟨b1, ..., bN⟩ has work O(N^2)
step dt ⟨b1, ..., bN⟩ has span O(log N)

Using lists and sequential operations

step dt ⟨b1, ..., bN⟩ has span O(N^2)

SLIDE 62

illustration

Let’s look at a simple example
Just the SUN and the EARTH
We’ll see that the Earth’s trajectory

looks like an ellipse, as predicted by Kepler

SLIDE 63

mini-solar system

val sun = ((0.0,0.0), 332000.0, (0.0,0.0))
val earth = ((1.0, 0.0), 1.0, (0.0,18.0))

val us : body seq = tabulate (fn 0 => sun | 1 => earth | _ => raise Range) 2

us = ⟨sun, earth⟩

step 0.01 us =>* ⟨((5E~05,0.0),332000.0,(0.01,0.0)), ((~15.6,0.18),1.0,(~3320.0,18.0))⟩

SLIDE 64

orbit

orbit : body -> int * real -> point list
orbit b (n, dt) = first n positions of b in orbit around sun

fun orbit b (n, dt) =
  if n=0 then [ ]
  else
    let
      val (p', m, v') = move b (accel b sun, dt)
    in
      p' :: orbit (p', m, v') (n-1, dt)
    end;

SLIDE 65

results

orbit earth (10, 0.01) =

[(~15.6,0.18),(~48.7318019171,0.359213099043), (~81.7884162248,0.537587775754), (~114.835559608,0.71589462109), (~147.878962872,0.894177309445), (~180.920348361,1.07244756107), (~213.960467679,1.25071021688), (~246.999717285,1.42896774708), (~280.03833222,1.60722158368), (~313.076463412,1.78547263142)]

SLIDE 66

shaped like an ellipse

SLIDE 67

conclusion

Sequences allow efficient parallel evaluation
O(log N) is better than O(N), O(N^2)
In practice, may deliver noticeable speed-up
But there’s still room for improvement…

SLIDE 68

an improvement

Barnes-Hut algorithm
Uses quad-tree representation for bodies
To compute gravity, replace a far-away

cluster of bodies with a single point mass at its barycenter

SLIDE 69

barycenter

barycenter : (real * point) seq -> real * point

barycenter ⟨(m1,p1),…,(mk,pk)⟩ = (M, P)
  where M = m1+…+mk
        P = scale(1/M, scale(m1,p1) +…+ scale(mk, pk))
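The specification maps directly onto mapreduce: each (mass, point) pair becomes a pair of a mass and a mass-weighted position, and pairs are combined componentwise. The sketch below is one possible reading rather than the lecture's own code. It assumes vectors in place of the Seq structure, a sequential stand-in for mapreduce, and a non-empty input with positive total mass (otherwise the division by M is meaningless).

```sml
type vect = real * real
type point = real * real

fun add ((x1, y1) : vect, (x2, y2) : vect) : vect = (x1 + x2, y1 + y2)
fun scale (c : real, (x, y) : vect) : vect = (c * x, c * y)

(* Sequential stand-in for mapreduce over a vector (assumption: not Seq.mapreduce). *)
fun mapreduce f g z (s : 'a vector) =
  let
    fun go (lo, hi) =
      case hi - lo of
        0 => z
      | 1 => f (Vector.sub (s, lo))
      | n => let val mid = lo + n div 2
             in g (go (lo, mid), go (mid, hi)) end
  in
    go (0, Vector.length s)
  end

(* Assumes s is non-empty and the total mass is positive. *)
fun barycenter (s : (real * point) vector) : real * point =
  let
    (* sum the masses and the mass-weighted positions in one pass *)
    val (M, weighted) =
      mapreduce (fn (m, p) => (m, scale (m, p)))
                (fn ((m1 : real, w1), (m2, w2)) => (m1 + m2, add (w1, w2)))
                (0.0, (0.0, 0.0))
                s
  in
    (M, scale (1.0 / M, weighted))
  end

(* masses 1.0 at (0,0) and 3.0 at (4,0): total mass 4.0, centre at (3.0, 0.0) *)
val (M, P) = barycenter (Vector.fromList [(1.0, (0.0, 0.0)), (3.0, (4.0, 0.0))])
```

The combining function is associative (up to floating-point rounding) and (0.0, (0.0, 0.0)) is its identity, which is what the earlier slides require of g and z.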

SLIDE 70

bounding boxes

signature Box =
sig
  type box
  val inside : box -> point -> bool
  val quadrants : box -> box seq
  …
end

SLIDE 71

barnes-hut trees

datatype bhtree = Empty
                | Single of (real * point)
                | Quad of box * (real * point) * bhtree seq

INVARIANT: For every Quad (Box, (M, P), S)

length S = 4
all bodies in S are inside Box,
M is the total mass of these bodies
P is their barycenter

SLIDE 72

building a bhtree

bh : body seq -> bhtree

SLIDE 73

too far away

aspect : body * box * (real * point) -> real

aspect(b, Box, (M, P)) is smaller when Box has small diameter, P is far away from b, M not too large

[Diagram: a distant cluster of bodies; actual acceleration ≈ approximation]

SLIDE 74

Barnes-Hut

bh_accels : real -> bhtree -> body -> vect

ENSURES bh_accels theta T b2 = the acceleration on b2 due to the bodies in T, as computed by Barnes-Hut with threshold theta

(* a (mass, point) pair is passed to accel as a body at rest; accel ignores that velocity *)
fun bh_accels theta Empty b2 = zero
  | bh_accels theta (Single (m, p)) b2 = accel b2 (p, m, zero)
  | bh_accels theta (Quad (Box, (M, P), S)) b2 =
      if aspect (b2, Box, (M, P)) > theta
      then mapreduce (fn T => bh_accels theta T b2) add zero S
      else accel b2 (P, M, zero)

SLIDE 75

summary

We sketched how to implement the

Barnes-Hut algorithm in ML

Benefits of good choice of data structure,

use of invariant to guide code design

In practice, Barnes-Hut is often used to

produce graphics simulations

We think it’s a nice example of elegant

functional programming(!)

SLIDE 76

comments

In practice it can be hard to work with reals
Rounding errors, sensitivity to evaluation order
Not easy to check for “equality”
Need to be aware of these issues

when writing code

SLIDE 77

lessons

Design code to meet specifications
Use specs of helper functions

to prove correctness of code

Use work/span of helper functions

to determine work/span of code

Implement abstract types using invariants
An invariant expresses key insights

into how data is represented

Good examples: red-black trees, Barnes-Hut trees

SLIDE 78

lessons

Write code to allow parallel evaluation
Choose implementations wisely

based on signature AND specs AND work/span

parallel                     sequential
Seq.map f                    List.map f
val (x, y) = (e1, e2)        val x = e1; val y = e2

SLIDE 79

exploration

Sequences can be implemented as lists,

arrays or balanced trees

Do it yourself: define structures

ListSeq : SEQ
ArraySeq : SEQ
BalancedTreeSeq : SEQ

Compare the work/span characteristics

SLIDE 80

exploration

What would change if we had an implementation of SEQ with

expression            work          span
nth i s               O(1)          O(1)
length s              O(1)          O(1)
tabulate f n          O(n)          O(1)
empty( )              O(1)          O(1)
map f s               O(n)          O(1)
reduce g z s          O(n log n)    O(log n)
mapreduce f g z s     O(n log n)    O(log n)
split s               O(n)          O(1)

when length of s is n, and f, g are constant time