Going deep and learning to love the haters: Advice for graduate school



slide-1
SLIDE 1

Going deep and learning to love the haters

Advice for graduate school

Kay Ousterhout UC Berkeley PhD

slide-2
SLIDE 2
slide-3
SLIDE 3

Starting grad school: Depth first
Starting a project: Draw an outline
Running experiments: Ask questions, Log log log
Getting feedback: Love the haters
Presenting: Write sh***y slides

slide-4
SLIDE 4

Depth First

slide-5
SLIDE 5

Pick a project, any project

Depth first

slide-6
SLIDE 6

“Facts precede concepts”

Depth first

slide-7
SLIDE 7

Go deep

(learn as much as possible, even the boring stuff)

Depth first

slide-8
SLIDE 8

2013: Monotasks: promising performance improvement
How much might they improve performance by?
Pretty confusing performance measurements
Concurrently: trying to publish work based on Yahoo! traces

Depth first

slide-9
SLIDE 9

2013: Monotasks: promising performance improvement
2015: NSDI paper on understanding performance
Monotasks more useful as mechanism to reason about performance?

Depth first

slide-10
SLIDE 10

Depth first

Pick a project, any project
“Facts precede concepts”
Go deep

slide-11
SLIDE 11

Draw an outline

slide-12
SLIDE 12

Draw an outline

Make an outline

7.3 How do task constraints affect performance?
7.4 How do scheduler failures impact job response time?
7.6 How does Sparrow compare to Spark’s native, centralized scheduler?
7.7 How well can Sparrow’s distributed fairness enforcement maintain fair shares?
7.8 How much can low priority users hurt response times for high priority users?
5.2 How much do stragglers affect job completion time?
5.3 Are these results inconsistent with past work?

Write it as questions

eval

slide-13
SLIDE 13

Draw an outline
Draw all of your graphs
Ask:
Is this the graph we want in the paper?
Do you agree these are the expected results?
Is there anything else I should graph?

slide-14
SLIDE 14

Draw an outline
Draw all of your graphs
Fail fast
Know what to measure
Sanity

slide-15
SLIDE 15

“Use your intuition to ask questions …not to answer them”

slide-16
SLIDE 16

Ask questions

Measure one level deeper
Corroborate results with multiple views of data
Running experiments you trust is hard
Distrust good results
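One way to act on "corroborate results with multiple views of data" is to compute the same quantity from two independent sources and flag any disagreement. The sketch below is only an illustration under assumed inputs: a job runtime reported by one source (say, the scheduler) and per-task start and end timestamps from a second source (say, worker logs). The function name `corroborate` and the 5% tolerance are hypothetical choices, not something from the talk.

```python
def corroborate(reported_runtime_s, task_intervals, tolerance=0.05):
    """Compare a reported job runtime against one derived from task timestamps.

    reported_runtime_s: runtime from one source (hypothetical: the scheduler).
    task_intervals: list of (start_s, end_s) pairs from a second source
        (hypothetical: worker logs).
    Returns (derived_runtime_s, agrees).
    """
    if not task_intervals:
        raise ValueError("need at least one task interval")
    first_start = min(start for start, _ in task_intervals)
    last_end = max(end for _, end in task_intervals)
    derived = last_end - first_start
    # Relative disagreement between the two views of the same quantity.
    gap = abs(derived - reported_runtime_s) / max(reported_runtime_s, 1e-9)
    return derived, gap <= tolerance


derived, agrees = corroborate(10.0, [(0.0, 4.0), (1.0, 9.8)])
print(derived, agrees)
```

When the two views disagree, that is a prompt to measure one level deeper before trusting either number.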

slide-17
SLIDE 17

Log log log

slide-18
SLIDE 18

Log log log

Script experiments:
Copy configuration
Copy Git commit history
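A scripted experiment can save the configuration and the Git history alongside its results so that every run can be traced back to the code that produced it. The following is a minimal sketch, not the speaker's actual tooling: the function name `snapshot_experiment`, the directory layout, and the output file names are all assumptions made for illustration. It relies only on `git log` and `git diff`, and records "unavailable" if Git cannot be run.

```python
import shutil
import subprocess
import time
from pathlib import Path


def snapshot_experiment(config_path, results_root, repo_dir="."):
    """Create a results directory holding the config and Git history used."""
    run_dir = Path(results_root) / time.strftime("run_%Y%m%d_%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)

    # Copy the exact configuration next to the results.
    shutil.copy(config_path, run_dir / Path(config_path).name)

    # Record recent commits plus any uncommitted changes.
    commands = [
        ("git_log.txt", ["git", "-C", str(repo_dir), "log", "-n", "10", "--oneline"]),
        ("git_diff.txt", ["git", "-C", str(repo_dir), "diff", "HEAD"]),
    ]
    for name, cmd in commands:
        try:
            out = subprocess.run(
                cmd, capture_output=True, text=True, check=True
            ).stdout
        except (OSError, subprocess.CalledProcessError) as err:
            # Not a repository, or git is not installed: note it and move on.
            out = f"unavailable: {err}\n"
        (run_dir / name).write_text(out)
    return run_dir
```

Calling `snapshot_experiment("my.conf", "results")` at the start of each experiment script (with `my.conf` standing in for whatever configuration file the experiment uses) leaves a timestamped directory to write results into.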

slide-19
SLIDE 19

Log log log

Write-optimized

(I used LaTeX. Don’t do this.)

Keep a log on a computer

One useful thing for debugging is vmstat -SM 1, which basically runs free -m at regular intervals and prints the output.
Sangjin suggested this: echo 1000 > /proc/sys/vm/vfs_cache_pressure
Another idea in a similar vein is to attempt the suggestions described here: http://serverfault.com/questions/516074/why-are-applications-in-a-memory-
Also re-launch as ext4 / xfs. Also consider turning journalling off; this means that for every write, a bunch of extra data needs to be written. To figure out whether there’s journalling in the file system, do dumpe2fs /dev/xvdb
To turn journalling off, you can run this command: tune2fs -O^has_journal /dev/xdy
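A write-optimized log can be as simple as a plain text file that is only ever appended to and later searched. The sketch below assumes that setup; the helper names `log_entry` and `search_log` and the timestamp format are illustrative choices, not the speaker's tooling.

```python
import time
from pathlib import Path


def log_entry(log_path, text):
    """Append a timestamped entry; appending keeps writing cheap."""
    stamp = time.strftime("%Y-%m-%d %H:%M")
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"[{stamp}] {text}\n")


def search_log(log_path, term):
    """Return log lines containing term (case-insensitive), like Ctrl+F."""
    lines = Path(log_path).read_text(encoding="utf-8").splitlines()
    return [line for line in lines if term.lower() in line.lower()]
```

A plain text editor does the same job; the point is that adding an entry should take seconds and finding an old one should take a single search.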

slide-20
SLIDE 20

Love the haters

slide-21
SLIDE 21

Love the haters

You are a scientist. Not a salesperson.

slide-22
SLIDE 22

Love the haters

Talk to the people who hate your work
No email
Assume they’re correct
Explain it back to them

slide-23
SLIDE 23

Love the haters

Put the limitations in the paper

jobs on higher priority users.
Constraints: Our current design does not handle inter-job constraints (e.g. “the tasks for job A must not run on racks with tasks for job B”). Supporting inter-job constraints across frontends is difficult to do without significantly altering Sparrow’s design.
Gang scheduling: Some applications require gang scheduling, a feature not implemented by Sparrow. Gang scheduling is typically implemented using bin-packing algorithms that search for and reserve time slots in which an entire job can run. Because Sparrow queues tasks on several machines, it lacks a central point from which to perform bin-packing. While Sparrow often places all jobs on entirely idle machines, this is not guaranteed, and deadlocks between multiple jobs that require gang scheduling may occur. Sparrow is not alone: many cluster schedulers do not support gang scheduling [8, 9, 16].
Query-level policies: Sparrow’s performance could be

slide-24
SLIDE 24

Write sh***y slides

slide-25
SLIDE 25

Write sh***y slides

Outline the presentation first

(every slide should look bad)

Give this version to someone you trust

slide-29
SLIDE 29

Starting grad school: Depth first (pick any project, learn as much as possible)
Starting a project: Draw an outline (draw graphs, get buy-in)
Running experiments: Ask questions (measure deeper, fear good results); Log log log (save commits + conf, Ctrl+F!)
Getting feedback: Love the haters (talk to the haters, put limitations in the paper)
Presenting: Write sh***y slides (get feedback early, avoid paralysis)

slide-30
SLIDE 30

Don’t give up
