Putting the Science in Computer Science


slide-1
SLIDE 1

Firstname Lastname (Your response)

  • Th. 10 / 13

Putting the “Science” in Computer Science

What makes for a good program, and how can we measure / evaluate programs for “goodness”? Write as many definitions of “good” as you can, and describe how you would measure each one.

slide-2
SLIDE 2

Given a computational problem Is there a solution? What is it? How good is it?

slide-3
SLIDE 3

Is it efficient?

slide-4
SLIDE 4
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 2, 4, 6, 8). Lower is better.]

slide-5
SLIDE 5
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 2, 4, 6, 8). Lower is better.]

slide-6
SLIDE 6
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 5, 10, 15, 20). Lower is better.]

slide-7
SLIDE 7
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 10, 20, 30, 40). Lower is better.]

slide-8
SLIDE 8
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 20, 40, 60, 80). Lower is better.]

slide-9
SLIDE 9
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 20, 40, 60, 80, 100, 120). Lower is better.]

slide-10
SLIDE 10
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 50, 100, 150). Lower is better.]

slide-11
SLIDE 11
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 50, 100, 150, 200). Lower is better.]

slide-12
SLIDE 12
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 100, 200, 300). Lower is better.]

slide-13
SLIDE 13
[Plot: “Data: which algorithm is best?” x-axis: Problem Size (ticks at 200, 400, 600, 800). Lower is better.]

slide-14
SLIDE 14

Interpreting empirical data

We can measure

  • a particular algorithm
  • written in a particular language
  • as a particular program
  • compiled using a particular version of a particular compiler
  • with particular settings (e.g., enabling / disabling optimizations)
  • running on a particular data set, of a particular size
  • on a particular computer
  • with particular resources (CPUs, memory, hard drive, …)
  • under a particular version of a particular operating system
  • in a particular environment (e.g., with other programs running in the background)

Key take-away: it’s messy and incomplete!

slide-15
SLIDE 15

Interpreting a theoretical model

A theory abstracts away certain details.

A cost metric:

  • corresponds to one “step”
  • highlights the essence of the work (e.g., multiplications, comparisons, function calls…)
  • serves as a proxy for an empirical measurement

Instead of measuring time, we count steps.

e.g., “This algorithm costs n² multiplications.”

Key take-away: it’s lossy!
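
Counting steps instead of seconds can be sketched in code. The example below is only an illustration: the names mult-count, counted* and power are hypothetical (they do not come from the slides), and multiplications are chosen as the cost metric.

```racket
#lang racket

;; Hypothetical counter: the number of multiplications performed so far.
(define mult-count 0)

;; Multiply two numbers and record one "step" of the cost metric.
(define (counted* a b)
  (set! mult-count (+ mult-count 1))
  (* a b))

;; Illustrative function: x raised to the power n by repeated multiplication.
(define (power x n)
  (if (= n 0)
      1
      (counted* x (power x (- n 1)))))

(power 2 10)   ; 1024
mult-count     ; 10: one multiplication per recursive case
```

The count does not depend on the computer, the compiler, or what else is running, which is what makes it a useful (if lossy) proxy for time.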

slide-16
SLIDE 16

good data + good theory = good science

we can make predictions and 
 we can communicate with other scientists

slide-17
SLIDE 17

Decidability

e.g., Can it be solved at all?

Complexity Class

e.g., Can it be solved in polynomial time?

Asymptotic Analysis

e.g., O(n) time, where n is list size

Exact Theory

e.g., 7n + 2 multiplications, where n is list size

Empirical Data

e.g., This run took 17.3 seconds on this data.

(The levels above run from general, at the top, to specific, at the bottom.)

Diagram labels: Recurrence relation, “Big O”

Course labels: CS 81, CS 140, MA 167, CS 42, CS 70, CS 105, HPC

slide-18
SLIDE 18

Asymptotic Analysis (Big O)

slide-19
SLIDE 19

Asymptotic analysis

We’re always answering the same question:

How does the cost scale (when we try larger and larger inputs)?

Not:

  • Exactly how many steps will it execute?
  • How many seconds will it take?
  • How many megabytes of memory will it need?

slide-20
SLIDE 20

The informal definition of “Big O”

A reasonable upper bound on (an abstraction of) a problem’s difficulty or a solution’s performance, for reasonably large input sizes.

slide-21
SLIDE 21

In the limit (for VERY LARGE inputs)

O(1): The running time is bounded regardless of the input size.
O(n): An input twice as big takes no more than twice as long.
O(n²): An input twice as big takes no more than four times as long.
O(2ⁿ): An input one bigger takes no more than twice as long.
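
These scaling rules can be checked numerically. The sketch below uses four made-up cost functions (constant, linear, quadratic and exponential are illustrative names, not from the slides) and compares the cost at two input sizes.

```racket
#lang racket

;; Illustrative cost functions, one per growth class.
(define (constant n) 6)
(define (linear n) (* 100 n))
(define (quadratic n) (* 2 n n))
(define (exponential n) (expt 2 n))

;; How many times more expensive is input size m than input size n?
(define (growth cost n m)
  (/ (cost m) (cost n)))

(growth constant 1000 2000)      ; 1: doubling the input changes nothing
(growth linear 1000 2000)        ; 2: twice as big, twice as long
(growth quadratic 1000 2000)     ; 4: twice as big, four times as long
(growth exponential 1000 1001)   ; 2: one bigger, twice as long
```

Racket's exact integers and rationals keep the ratios exact, so no floating-point rounding is involved.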

slide-22
SLIDE 22

What are the consequences?

If We Only Care About Scalability…

Constant factors can be ignored.
 n and 6n and 200n scale identically (“linearly”).

Small summands can be ignored.
 n² and n² + n + 999999 are indistinguishable when n is huge.
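
A quick numeric check of the claim about small summands, as a sketch (the helper name ratio is hypothetical):

```racket
#lang racket

;; Ratio of n² + n + 999999 to n² for a given n.
(define (ratio n)
  (exact->inexact (/ (+ (* n n) n 999999) (* n n))))

(ratio 10)        ; 10001.09: the summands dominate when n is small
(ratio 1000000)   ; about 1.000002: the summands barely matter when n is huge
```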

slide-23
SLIDE 23

Grouping Algorithms by Scalability

O(1)

  • takes 6 steps
  • takes 1 (big) step
  • no more than 4000 steps
  • somewhere between 2 and 47 steps, depending on the input

O(n)

  • takes 100n + 3 steps
  • takes n/20 + 10,000,000 steps
  • anywhere between 3 and 68 steps per item, for n items

O(n²)

  • takes 2n² + 100n + 3 steps
  • takes n²/17 steps
  • somewhere between 1 and 40 steps per item, for n² items
  • anywhere between 1 and 7n steps per item, for n items

slide-24
SLIDE 24

How hard is the problem?

From hardest to easiest:

O(nⁿ), O(n!), O(2ⁿ), O(n³), O(n²), O(n log(n)), O(n), O(√n), O(log(n)), O(1)

Regions, from hardest to easiest:

  • Intractable problems (exponential)
  • Tractable problems (polynomial)
  • No problem!

slide-25
SLIDE 25

logs aren’t scary!

log is the inverse of exponentiation.

  • How many times can I cut N in half?
  • Can I avoid looking at all the input?!

They’re our friends.

[Plot: “log”, x-axis from 1 to 63, y-axis from 0 to 70.]

log₂(1) = 0        // 2⁰ = 1
log₂(2) = 1        // 2¹ = 2
log₂(3) ≈ 1.58
log₂(4) = 2        // 2² = 4
log₂(5) ≈ 2.32
log₂(6) ≈ 2.58
log₂(7) ≈ 2.81
log₂(8) = 3        // 2³ = 8

slide-26
SLIDE 26

How hard are these problems?

function     cost metric       cost
double       multiplications
sum          additions
half-count   divisions

slide-27
SLIDE 27

How hard are these problems?

function     cost metric       cost
double       multiplications   O(1)
sum          additions         O(n)
half-count   divisions         O(log n)

slide-28
SLIDE 28

What’s the cost, T, for each function?

        double             sum           half-count
        (multiplications)  (additions)   (divisions)
T(0)                                     n/a
T(1)
T(2)
T(3)
T(4)
…
T(n)

(define (double n)
  (* n 2))

(define (sum n)
  (if (= n 0)
      0
      (+ n (sum (- n 1)))))

(define (half-count n)
  (if (= n 1)
      0
      (+ 1 (half-count (quotient n 2)))))

slide-29
SLIDE 29

What’s the cost, T, for each function?

        double             sum           half-count
        (multiplications)  (additions)   (divisions)
T(0)    1                  0             n/a
T(1)    1                  1             0
T(2)    1                  2             1
T(3)    1                  3             1
T(4)    1                  4             2
…       …                  …             …
T(n)    1                  n             ⌊log₂ n⌋

(define (double n)
  (* n 2))

(define (half-count n)
  (if (= n 1)
      0
      (+ 1 (half-count (quotient n 2)))))

(define (sum n)
  (if (= n 0)
      0
      (+ n (sum (- n 1)))))
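
For experimenting, here is a runnable sketch of the three functions. It assumes each recursive function returns 0 in its base case, which is consistent with the cost table (sum needs no additions at n = 0, half-count no divisions at n = 1).

```racket
#lang racket

;; One multiplication, whatever the input.
(define (double n)
  (* n 2))

;; Adds the integers n, n-1, ..., 1: one addition per recursive call.
(define (sum n)
  (if (= n 0)
      0                                     ; assumed base-case value
      (+ n (sum (- n 1)))))

;; Counts how many times n can be halved before reaching 1.
(define (half-count n)
  (if (= n 1)
      0                                     ; assumed base-case value
      (+ 1 (half-count (quotient n 2)))))

(double 21)         ; 42
(sum 4)             ; 10
(half-count 8)      ; 3
(half-count 1000)   ; 9
```

For half-count, the returned value happens to equal the number of divisions performed, so the function reports its own cost.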

slide-30
SLIDE 30

Recurrence Relations
 (translating code to math)

slide-31
SLIDE 31

Translating recursion to recurrence relations

  • 1. Translate the base case(s), using specific input sizes

How many steps does this base case take?

  • 2. Translate the recursive case(s), using input size N

Define T(N) in terms of smaller cost

(define (sum n)
  (if (= n 0)
      0                        ; base case
      (+ n (sum (- n 1)))))    ; recursive case

For a given cost metric: additions

Recurrence relation (N is the input size):

T(0) = 0
T(N) = 1 + T(N-1)

slide-32
SLIDE 32
  • 1. Translate the base case(s), using specific input sizes

How many steps does this base case take?

  • 2. Translate the recursive case(s), using input size N

Define T(N) in terms of smaller cost

Translating recursion to recurrence relations

(define (sum n)
  (if (= n 0)
      0                        ; base case
      (+ n (sum (- n 1)))))    ; recursive case

For a given cost metric: additions

Recurrence relation (N is the input size):

T(0) = 0
T(N) = 1 + T(N-1)

Unrolling the recurrence:

T(N) = 1 + T(N-1)                        = 1*1 + T(N-1)
     = 1 + 1 + T(N-2)                    = 2*1 + T(N-2)
     = 1 + 1 + 1 + T(N-3)                = 3*1 + T(N-3)
     …
     = 1 + 1 + 1 + … + 1 + T(N-N)        = N*1 + T(N-N)
     = N                                 (closed form)
     ∈ O(N)                              (asymptotic form)
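
The recurrence for additions can itself be written as a function and compared with the closed form N. This is only a sketch; T here is the cost function from the recurrence, not part of the original code.

```racket
#lang racket

;; The recurrence, transcribed directly:
;; T(0) = 0 and T(N) = 1 + T(N-1), with additions as the cost metric.
(define (T n)
  (if (= n 0)
      0
      (+ 1 (T (- n 1)))))

;; Compare the recurrence with the closed form T(N) = N for a few sizes.
(for ([n '(0 1 5 100)])
  (printf "T(~a) = ~a\n" n (T n)))
```

Each printed value equals its input size, as the closed form predicts.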

slide-33
SLIDE 33
  • 1. Translate the base case(s), using specific input sizes

How many steps does this base case take?

  • 2. Translate the recursive case(s), using input size N

Define T(N) in terms of smaller cost

Translating recursion to recurrence relations

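(define (sum n)
  (if (= n 0)
      0                        ; base case
      (+ n (sum (- n 1)))))    ; recursive case

For a given cost metric: arithmetic operations and comparisons

Recurrence relation (N is the input size):

T(0) = 1                ; base case: one comparison
T(N) = 3 + T(N-1)       ; recursive case: one comparison, one subtraction, one addition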

slide-34
SLIDE 34
  • 1. Translate the base case(s), using specific input sizes

How many steps does this base case take?

  • 2. Translate the recursive case(s), using input size N

Define T(N) in terms of smaller cost

Translating recursion to recurrence relations

(define (sum n)
  (if (= n 0)
      0                        ; base case
      (+ n (sum (- n 1)))))    ; recursive case

For a given cost metric: arithmetic operations and comparisons

Recurrence relation (N is the input size):

T(0) = 1
T(N) = 3 + T(N-1)

Unrolling the recurrence:

T(N) = 3 + T(N-1)                        = 1*3 + T(N-1)
     = 3 + 3 + T(N-2)                    = 2*3 + T(N-2)
     = 3 + 3 + 3 + T(N-3)                = 3*3 + T(N-3)
     …
     = 3 + 3 + 3 + … + 3 + T(N-N)        = N*3 + T(N-N)
     = 3N + 1                            (closed form)
     ∈ O(N)                              (asymptotic form)
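
The closed form 3N + 1 can also be checked empirically by instrumenting sum. In this sketch, steps, tick! and counted-sum are hypothetical names; each tick! marks one comparison or arithmetic operation, and the base case is assumed to return 0.

```racket
#lang racket

;; Hypothetical step counter for the chosen cost metric.
(define steps 0)
(define (tick!) (set! steps (+ steps 1)))

;; sum, with one tick per comparison or arithmetic operation.
(define (counted-sum n)
  (tick!)                                  ; the comparison (= n 0)
  (if (= n 0)
      0
      (begin
        (tick!)                            ; the subtraction (- n 1)
        (tick!)                            ; the addition (+ n ...)
        (+ n (counted-sum (- n 1))))))

(counted-sum 10)   ; 55
steps              ; 31, which is 3*10 + 1
```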

slide-35
SLIDE 35
  • 1. Translate the base case(s), using specific input sizes

How many steps does this base case take?

  • 2. Translate the recursive case(s), using input size N

Define T(N) in terms of smaller cost

Translating recursion to recurrence relations

(define (half-count n)
  (if (= n 1)
      0                                     ; base case
      (+ 1 (half-count (quotient n 2)))))   ; recursive case

For a given cost metric: divisions

Recurrence relation (N is the input size):

T(1) = 0
T(N) = 1 + T(N/2)

slide-36
SLIDE 36
  • 1. Translate the base case(s), using specific input sizes

How many steps does this base case take?

  • 2. Translate the recursive case(s), using input size N

Define T(N) in terms of smaller cost

Translating recursion to recurrence relations

(define (half-count n)
  (if (= n 1)
      0                                     ; base case
      (+ 1 (half-count (quotient n 2)))))   ; recursive case

For a given cost metric: divisions

Recurrence relation (N is the input size):

T(1) = 0
T(N) = 1 + T(N/2)

Unrolling the recurrence (exact when N is a power of 2):

T(N) = 1 + T(N/2)                        = 1 + T(N/2)
     = 1 + 1 + T(N/4)                    = 2 + T(N/4)
     = 1 + 1 + 1 + T(N/8)                = 3 + T(N/8)
     …
     = 1 + 1 + 1 + … + 1 + T(N/N)        = log₂ N + T(N/N)
     = log₂ N                            (closed form)
     ∈ O(log N)                          (asymptotic form)
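
For inputs that are not powers of 2, the division count is ⌊log₂ N⌋, because quotient rounds down. The sketch below checks this against the recurrence; floor-log2 is a hypothetical helper built on Racket's integer-length, which avoids floating-point logarithms.

```racket
#lang racket

;; The recurrence, transcribed directly:
;; T(1) = 0 and T(N) = 1 + T(N/2), with integer division.
(define (T n)
  (if (= n 1)
      0
      (+ 1 (T (quotient n 2)))))

;; floor(log2 n) for n >= 1, computed from the bit length of n.
(define (floor-log2 n)
  (- (integer-length n) 1))

;; Do the recurrence and the closed form agree for every N from 1 to 1024?
(for/and ([n (in-range 1 1025)])
  (= (T n) (floor-log2 n)))   ; #t
```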

slide-37
SLIDE 37

Three problems to consider

uniq

Given a list L, create a new list L' that contains only the unique elements of L, in the order they appear in L.

sublists

Given a list L, generate a list L' of all sublists that can be made from the elements of L (elements must appear in same order).

reachable?

Given a graph G, starting point a and ending point b, determine whether b is reachable from a in G.

slide-38
SLIDE 38

> (uniq '(c a l i f o r n i a))
'(c l f o r n i a)
> (uniq '(m u d d))
'(m u d)
> (uniq '(m i s s i p p i))
'(m s p i)

slide-39
SLIDE 39

> (sublists '())
'(())
> (sublists '(1))
'((1) ())
> (sublists '(1 2))
'((1 2) (1) (2) ())

slide-40
SLIDE 40

> (reachable? 'C 'A)
#t
> (reachable? 'A 'C)
#f

[Graph: nodes A, B, C, D; edge labels 100, 90, 85, 10, 10, 10, 10.]

slide-41
SLIDE 41

How hard are these problems?

problem      cost metric                   worst-case input   worst-case cost
uniq         number of comparisons made
sublists     number of sublists created
reachable?   number of edges traversed

slide-42
SLIDE 42

How hard are these problems?

problem      cost metric                   worst-case input                          worst-case cost
uniq         number of comparisons made    all are unique                            O(n²)
sublists     number of sublists created    everything is the worst (and the best!)   O(2ⁿ)
reachable?   number of edges traversed     b is not reachable from a                 O(n)

slide-43
SLIDE 43

“Use it or lose it”


(a recursive search technique)

slide-44
SLIDE 44

Review: recursion

a problem-solving strategy: fill in the pieces

Base case(s) Recursive case(s)

  • 1. It: a piece of the problem
  • 2. Lose-it solution: How could we solve a smaller version without it?
  • 3. How would we combine it and lose-it to solve the full problem?
slide-45
SLIDE 45

> (sum '())
0
> (sum '(1 2 3))
6
> (sum '(7 7 7 7 7 7))
42

slide-46
SLIDE 46

Review: recursion

a problem-solving strategy: fill in the pieces

Base case(s)

       empty list → 0

Recursive case(s)

  • 1. It: a piece of the problem
       the first element
  • 2. Lose-it solution: How could we solve a smaller version without it?
       sum of the rest of the elements in the list
  • 3. How would we combine it and lose-it to solve the full problem?
       add them
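
Put together, the pieces give a list-summing function along these lines. This is a sketch: the variable names it and lose-it follow the vocabulary of the slide, but the code itself is not from the slides.

```racket
#lang racket

;; Sum the numbers in a list, following the "it" / "lose-it" recipe.
(define (sum L)
  (if (empty? L)
      0                                   ; base case: empty list → 0
      (let ([it (first L)]                ; it: the first element
            [lose-it (sum (rest L))])     ; lose-it: sum of the rest
        (+ it lose-it))))                 ; combine: add them

(sum '())              ; 0
(sum '(1 2 3))         ; 6
(sum '(7 7 7 7 7 7))   ; 42
```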

slide-47
SLIDE 47

use it or lose it

a recursive strategy: fill in the pieces

Base case(s) Recursive case(s)

  • 1. It: a piece of the problem
  • 2. Lose-it solution: How could we solve a smaller version without it?
  • 3. Use-it solution: How could we solve a smaller version with it?
  • 4. How would we combine the solutions to solve the full problem?

multiple recursive calls!

slide-48
SLIDE 48

> (uniq '(c a l i f o r n i a))
'(c l f o r n i a)
> (uniq '(m u d d))
'(m u d)
> (uniq '(m i s s i p p i))
'(m s p i)

slide-49
SLIDE 49

unique: use it or lose it

a recursive strategy: fill in the pieces

Base case(s)

       empty list → empty list

Recursive case(s)

  • 1. It: a piece of the problem
       the first element
  • 2. Smaller version: What does the input look like without it?
       the rest of the list
  • 3. Lose-it solution: How could we solve a smaller version without it?
       unique-ify the rest of the list
  • 4. Use-it solution: How could we solve a smaller version with it?
       pre-pend “it” to the “lose-it” solution
  • 5. How would we combine the solutions to solve the full problem?
       if “it” is in “lose-it”, then “lose-it”; else “use-it”

> (uniq '(c a l i f o r n i a))
'(c l f o r n i a)
> (uniq '(m u d d))
'(m u d)
> (uniq '(m i s s i p p i))
'(m s p i)

slide-50
SLIDE 50

a recursive strategy

unique: use it or lose it

(define (uniq L)
  (if (empty? L)
      '()
      (let* ([it (first L)]
             [lose-it (uniq (rest L))]
             [use-it (cons it lose-it)])
        (if (member it lose-it)
            lose-it
            use-it))))
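
To see how many comparisons this strategy makes, member can be swapped for a version that counts. The sketch below is an illustration only: comparisons and counted-member are hypothetical names, and counted-member stands in for Racket's built-in member.

```racket
#lang racket

;; Hypothetical counter for the cost metric "number of comparisons made".
(define comparisons 0)

;; Like member, but records one comparison per element examined.
(define (counted-member x L)
  (cond
    [(empty? L) #f]
    [else
     (set! comparisons (+ comparisons 1))
     (if (equal? x (first L))
         L
         (counted-member x (rest L)))]))

(define (uniq L)
  (if (empty? L)
      '()
      (let* ([it (first L)]
             [lose-it (uniq (rest L))]
             [use-it (cons it lose-it)])
        (if (counted-member it lose-it)
            lose-it
            use-it))))

(uniq '(m u d d))        ; '(m u d)

(set! comparisons 0)
(uniq '(a b c d e))      ; worst case: all elements are unique
comparisons              ; 10, which is 0 + 1 + 2 + 3 + 4
```

With n unique elements, each element is compared against everything kept so far, so this particular implementation makes n(n-1)/2 comparisons in the worst case.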

slide-52
SLIDE 52

> (sublists '())
'(())
> (sublists '(1))
'((1) ())
> (sublists '(1 2))
'((1 2) (1) (2) ())

slide-53
SLIDE 53

sublists: use it or lose it

a recursive strategy: fill in the pieces

Base case(s)

       empty list → list of empty list

Recursive case(s)

  • 1. It: a piece of the problem
       the first element
  • 2. Smaller version: What does the input look like without it?
       the rest of the list
  • 3. Lose-it solution: How could we solve a smaller version without it?
       find the sublists of the rest of the list
  • 4. Use-it solution: How could we solve a smaller version with it?
       pre-pend “it” to each element of the “lose-it” solution
  • 5. How would we combine the solutions to solve the full problem?
       append the lose-it solution to the use-it solution

> (sublists '())
'(())
> (sublists '(1))
'((1) ())
> (sublists '(1 2))
'((1 2) (1) (2) ())

slide-54
SLIDE 54

a recursive search strategy

Solution technique: “use it or lose it”

(define (sublists L)
  (if (empty? L)
      '(())
      (let* ([it (first L)]
             [lose-it (sublists (rest L))]
             [use-it (map (lambda (l) (cons it l)) lose-it)])
        (append use-it lose-it))))
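
A runnable sketch of the same strategy, with the base case written as '(()) (a list containing the empty list) so that it matches the REPL examples. Note that in Racket '(empty) would be a list holding the symbol empty, not the empty list.

```racket
#lang racket

;; All sublists of L, keeping elements in their original order.
(define (sublists L)
  (if (empty? L)
      '(())                                              ; the only sublist of '() is '()
      (let* ([it (first L)]
             [lose-it (sublists (rest L))]               ; sublists without it
             [use-it (map (lambda (l) (cons it l))       ; sublists with it
                          lose-it)])
        (append use-it lose-it))))

(sublists '(1 2))                  ; '((1 2) (1) (2) ())
(length (sublists '(a b c d e)))   ; 32, which is 2⁵
```

Every element doubles the number of sublists, which is where the O(2ⁿ) cost comes from.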

slide-55
SLIDE 55

reachable?: use it or lose it

a recursive strategy: fill in the pieces

Base case(s)

       a node is always reachable from itself
       no nodes are reachable in the empty graph

Recursive case(s)

  • 1. It: a piece of the problem
       an edge, c → d
  • 2. Smaller version: What does the input look like without it?
       the graph without that edge, called “sub-graph”
  • 3. Lose it: How could we solve a smaller version without it?
       is b reachable from a in sub-graph?
  • 4. Use it: How could we solve a smaller version with it?
       is c reachable from a AND is b reachable from d in sub-graph?
  • 5. How would we combine the solutions to solve the full problem?
       use-it OR lose-it
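
One way these pieces could fit together is sketched below. Several things here are assumptions rather than the slides' own code: the graph is passed as a third argument, it is represented as a list of directed edges of the form '(from to), and edge weights are ignored.

```racket
#lang racket

;; Is b reachable from a, using only the directed edges in G?
;; G is assumed to be a list of edges, each written '(from to).
(define (reachable? a b G)
  (cond
    [(equal? a b) #t]                          ; a node is always reachable from itself
    [(empty? G) #f]                            ; nothing else is reachable in the empty graph
    [else
     (let* ([it (first G)]                     ; it: an edge, c → d
            [c (first it)]
            [d (second it)]
            [sub-graph (rest G)])              ; the graph without that edge
       (or (reachable? a b sub-graph)          ; lose it
           (and (reachable? a c sub-graph)     ; use it: get from a to c ...
                (reachable? d b sub-graph))))])) ; ... then from d to b

;; A small illustrative graph (not the one pictured on the slide).
(define G '((A B) (B C)))

(reachable? 'A 'C G)   ; #t
(reachable? 'C 'A G)   ; #f
```

Using or and and directly, rather than binding lose-it and use-it first, lets the search stop as soon as one branch succeeds.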