Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc.

SLIDE 1

Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc.

Kenneth Church
Kenneth.Church@jhu.edu

Dec 9, 2009

SLIDE 2

Solitaire → Multiplayer Games: Auctions (Ads)

http://www.scienceoftheweb.org/15-396/lectures/lecture09.pdf

  • Mainline Ad
  • Right Rail
  • Avoid distortions from commercial interests

SLIDE 3

A Single Auction → A Stream of Continuous Auctions

  • Standard Example of Second Price Auction
    – Single Auction for a Single Apple
  • Theoretical Result
    – Second Price Auction ⇒ Truth Telling
    – http://en.wikipedia.org/wiki/Vickrey_auction
    – Optimal Strategy:
       • Bid what the apple is worth to you
       • Don’t worry about what it is worth to others
    – First Price Auction ⇏ Truth Telling
  • Does theory generalize to a continuous stream?
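The truth-telling result for a single second-price auction can be checked numerically. The sketch below is an illustration only, not material from the slides: the function names, the bid grid, and the tie-breaking rule are all assumptions. It compares bidding your true value against every other bid on a grid, and contrasts that with a first-price auction, where shading the bid pays off.

```python
def second_price_payoff(my_bid, my_value, other_bids):
    """Payoff in a sealed-bid second-price auction for a single item.

    The highest bid wins and pays the highest competing bid.
    Ties are resolved against us (an arbitrary, pessimistic assumption).
    """
    best_other = max(other_bids)
    if my_bid > best_other:
        return my_value - best_other
    return 0.0


def first_price_payoff(my_bid, my_value, other_bids):
    """Same auction, except that the winner pays their own bid."""
    return my_value - my_bid if my_bid > max(other_bids) else 0.0


def truthful_is_never_worse(my_value, bid_grid, rival_grid):
    """Check on a grid that bidding my_value is at least as good as any other bid."""
    for rival in rival_grid:
        truthful = second_price_payoff(my_value, my_value, [rival])
        for bid in bid_grid:
            if second_price_payoff(bid, my_value, [rival]) > truthful + 1e-12:
                return False
    return True


grid = [x / 10 for x in range(0, 21)]  # bids from 0.0 to 2.0 in steps of 0.1
print(truthful_is_never_worse(1.0, grid, grid))

# In a first-price auction the truthful bidder never gains anything,
# while a shaded bid can win at a profit.
print(first_price_payoff(1.0, 1.0, [0.5]), first_price_payoff(0.8, 1.0, [0.5]))
```

The grid check is not a proof, but it shows the mechanism: in the second-price case your bid decides only whether you win, never what you pay.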

SLIDE 4

Pricing: Cost Per Click (CPC)

  • B_i = your bid
  • B_{i+1} = next bid
  • CTR_i = your click through rate
  • CTR_{i+1} = next click through rate
  • CPC_i = your price
    – (if we show your ad and user clicks)
  • Single Auction:
    – CPC_i = B_{i+1}
  • Continuous Stream:
    – CPC_i = B_{i+1} CTR_{i+1} / CTR_i
  • Truth Telling?

  • Equilibrium
    – Advertisers
       • Awareness
       • Sales
       • New Customers
       • ROI
    – Users
       • Minimize pain
       • Obtain Value
    – Market Maker
       • Maximize Revenue
  • Improvement: CTR → Q (Prior)
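A small worked example of the continuous-stream price, CPC_i = B_{i+1} CTR_{i+1} / CTR_i. This is a sketch under stated assumptions: the slide gives only the pricing formula, so ranking ads by bid × CTR, the zero price for the last ad, and the `Ad` type and sample numbers are all hypothetical.

```python
from typing import NamedTuple


class Ad(NamedTuple):
    name: str
    bid: float  # B_i
    ctr: float  # CTR_i


def price_ads(ads):
    """Return (name, cpc) pairs, with ads ranked by bid * CTR (assumed ranking rule)."""
    ranked = sorted(ads, key=lambda ad: ad.bid * ad.ctr, reverse=True)
    prices = []
    for i, ad in enumerate(ranked):
        if i + 1 < len(ranked):
            nxt = ranked[i + 1]
            cpc = nxt.bid * nxt.ctr / ad.ctr  # CPC_i = B_{i+1} CTR_{i+1} / CTR_i
        else:
            cpc = 0.0  # no competitor below; a real market would set a reserve price
        prices.append((ad.name, cpc))
    return prices


# Hypothetical bids and click-through rates.
ads = [Ad("a", 2.00, 0.05), Ad("b", 3.00, 0.02), Ad("c", 1.00, 0.04)]
for name, cpc in price_ads(ads):
    print(name, round(cpc, 4))
```

Under this ranking rule each advertiser pays no more than their own bid, because B_i CTR_i ≥ B_{i+1} CTR_{i+1}.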

SLIDE 5

Many Technical Opportunities

  • Economics
    – http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics?currentPage=all
  • Machine Learning
    – Learning to Rank
    – Estimate CTR (Q/Priors)
    – Sparse Data:
       • What is the CTR for a new ad?
    – Errors can be expensive
       • If CTR is too low for new ad → Penalize Growth
       • If too high → Reward Bad Guys to do Bad Things
  • Truth Telling for Continuous Auctions?
    – Probably not, especially if participants can estimate Q better than market maker
  • Machine Learning: Solitaire → Multi-Player Games
    – Can I estimate Q better than you can?
    – Man-eating tiger
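One common way to handle the sparse-data question ("What is the CTR for a new ad?") is to shrink the observed rate toward a prior. The sketch below uses a Beta prior; this is a standard technique chosen for illustration, not necessarily what the lecture had in mind, and `prior_strength` is a hypothetical tuning knob.

```python
def smoothed_ctr(clicks, impressions, prior_ctr, prior_strength):
    """Posterior-mean CTR under a Beta prior.

    prior_ctr      : Q, the CTR expected before any data is seen (assumed given)
    prior_strength : number of pseudo-impressions the prior is worth (hypothetical)
    """
    pseudo_clicks = prior_ctr * prior_strength
    return (clicks + pseudo_clicks) / (impressions + prior_strength)


# A brand-new ad falls back to the prior; a well-observed ad is driven by its data.
print(smoothed_ctr(0, 0, 0.02, 100))
print(smoothed_ctr(1, 10, 0.02, 100))
print(smoothed_ctr(500, 10000, 0.02, 100))
```

A prior that is too low penalizes new ads and a prior that is too high rewards bad ones, which is the trade-off the slide describes.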

SLIDE 6

Applications

  • Recognition: Shannon’s Noisy Channel Model
    – Speech, Optical Character Recognition (OCR), Spelling
  • Transduction
    – Part of Speech (POS) Tagging
    – Machine Translation (MT)
  • Parsing: ???
  • Ranking
    – Information Retrieval (IR)
    – Lexicography
  • Discrimination:
    – Sentiment, Text Classification, Author Identification, Word Sense Disambiguation (WSD)
  • Segmentation
    – Asian Morphology (Word Breaking), Text Tiling
  • Alignment: Bilingual Corpora, Dotplots
  • Compression
  • Language Modeling: good for everything

SLIDE 7

Speech Language

Shannon’s Noisy Channel Model

  • I → Noisy Channel → O
  • I′ ≈ ARGMAX_I Pr(I|O) = ARGMAX_I Pr(I) Pr(O|I)
    – Pr(I): Language Model
    – Pr(O|I): Channel Model

Trigram Language Model (Application Independent)

Word        Rank   More likely alternatives
We          9      The This One Two A Three Please In
need        7      are will the would also do
to          1
resolve     85     have know do…
all         9      …
of          2      …
the         1
important   657    document question first…
issues      14     thing point to

Channel Model

Application                           Input        Output
Speech Recognition                    writer       rider
OCR (Optical Character Recognition)   all          a1l
Spelling Correction                   government   goverment
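The decision rule I′ ≈ ARGMAX_I Pr(I) Pr(O|I) can be sketched in a few lines. Everything below is illustrative: the function name, the candidate list, and all probabilities are invented for the example, using the slide's spelling case (government → goverment).

```python
import math


def decode(observed, candidates, language_model, channel_model):
    """Return the candidate I that maximizes Pr(I) * Pr(O|I), computed in log space."""
    def score(candidate):
        prior = language_model[candidate]                            # Pr(I)
        likelihood = channel_model.get((observed, candidate), 0.0)  # Pr(O|I)
        if prior <= 0.0 or likelihood <= 0.0:
            return float("-inf")
        return math.log(prior) + math.log(likelihood)
    return max(candidates, key=score)


# Toy numbers, invented for illustration only.
language_model = {"government": 1e-4, "governments": 2e-5}
channel_model = {
    ("goverment", "government"): 1e-3,   # one dropped letter
    ("goverment", "governments"): 1e-6,  # two dropped letters
}
print(decode("goverment", list(language_model), language_model, channel_model))
```

The language model is shared across applications; only the channel table changes between speech, OCR, and spelling.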

SLIDE 8

Using (Abusing) Shannon’s Noisy Channel Model: Part of Speech Tagging and Machine Translation

  • Speech
    – Words → Noisy Channel → Acoustics
  • OCR
    – Words → Noisy Channel → Optics
  • Spelling Correction
    – Words → Noisy Channel → Typos
  • Part of Speech Tagging (POS):
    – POS → Noisy Channel → Words
  • Machine Translation: “Made in America”
    – English → Noisy Channel → French

Didn’t have the guts to use this slide at Eurospeech (Geneva)
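Treating tagging as POS → Noisy Channel → Words means searching for the tag sequence that maximizes Pr(tags) Pr(words | tags). A minimal sketch follows, assuming a bigram model over tags and a per-tag word distribution, decoded with the Viterbi algorithm. The tag set and every probability are made up for the example.

```python
def viterbi(words, tags, start, trans, emit):
    """Most probable tag sequence under Pr(tags) * Pr(words | tags)."""
    # best[t][tag] = (probability of the best path ending in tag at position t, previous tag)
    best = [{tag: (start[tag] * emit[tag].get(words[0], 0.0), None) for tag in tags}]
    for word in words[1:]:
        column = {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[-1][p][0] * trans[p][tag])
            prob = best[-1][prev][0] * trans[prev][tag] * emit[tag].get(word, 0.0)
            column[tag] = (prob, prev)
        best.append(column)
    # Trace back from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


# Toy model, invented for illustration only.
tags = ["DET", "NOUN", "VERB"]
start = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}            # Pr(first tag)
trans = {                                                  # Pr(tag | previous tag)
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit = {                                                   # Pr(word | tag), the channel
    "DET": {"the": 0.7},
    "NOUN": {"can": 0.1, "rusts": 0.01},
    "VERB": {"can": 0.2, "rusts": 0.05},
}
print(viterbi(["the", "can", "rusts"], tags, start, trans, emit))
```

Although "can" is more likely as a verb in the toy channel table, the tag model prefers a noun after a determiner, so the combined score picks NOUN.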

SLIDE 10

Spelling Correction

SLIDE 15

Evaluation

SLIDE 16

Performance

SLIDE 17

The Task is Hard without Context

SLIDE 18

Easier with Context

  • actuall, actual, actually
    – … in determining whether the defendant actually will die.
  • constuming, consuming, costuming
  • conviced, convicted, convinced
  • confusin, confusing, confusion
  • workern, worker, workers
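Each typo above is one edit away from both of its candidate corrections, so the spelling alone cannot decide between them. The sketch below is an assumption-laden illustration rather than the method from the lecture: it finds candidates by a single insertion, deletion, or substitution (transpositions are not handled), then ranks them with invented bigram counts for the neighboring words.

```python
def within_one_edit(typo, word):
    """True if typo differs from word by one insertion, deletion, or substitution."""
    if typo == word or abs(len(typo) - len(word)) > 1:
        return False
    # Skip the common prefix, then compare what is left.
    i = 0
    while i < min(len(typo), len(word)) and typo[i] == word[i]:
        i += 1
    if len(typo) == len(word):
        return typo[i + 1:] == word[i + 1:]  # one substituted letter
    if len(typo) > len(word):
        return typo[i + 1:] == word[i:]      # one extra letter in the typo
    return typo[i:] == word[i + 1:]          # one missing letter in the typo


def rank_with_context(typo, left, right, vocab, bigram_count):
    """Rank candidate corrections by add-one bigram counts with both neighbors."""
    candidates = [w for w in vocab if within_one_edit(typo, w)]

    def score(w):
        return (bigram_count.get((left, w), 0) + 1) * (bigram_count.get((w, right), 0) + 1)

    return sorted(candidates, key=score, reverse=True)


vocab = ["actual", "actually", "worker", "workers", "convicted", "convinced"]
# Hypothetical counts, invented for illustration only.
bigram_count = {("defendant", "actually"): 3, ("actually", "will"): 5}
print(rank_with_context("actuall", "defendant", "will", vocab, bigram_count))
```

Without the neighboring words both candidates for "actuall" tie on edit distance; the context counts break the tie in favor of "actually".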

SLIDE 19

Easier with Context

SLIDE 20

Context Model

SLIDE 25

Future Improvements

  • Add More Factors
    – Trigrams
    – Thesaurus Relations
    – Morphology
    – Syntactic Agreement
    – Parts of Speech
  • Improve Combination Rules
    – Shrink (Meaty Methodology)

SLIDE 27

Conclusion (Spelling Correction)

  • There has been a lot of interest in smoothing
    – Good-Turing estimation
    – Kneser-Ney
  • Is it worth the trouble?
  • Ans: Yes (at least for recognition applications)
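For readers unfamiliar with Good-Turing estimation, the basic idea is to replace an observed count r with r* = (r + 1) N_{r+1} / N_r, where N_r is the number of distinct items seen exactly r times. The sketch below shows only this unsmoothed textbook form on made-up counts; the fallback for N_{r+1} = 0 is a simplifying assumption, and practical systems smooth the N_r values first.

```python
from collections import Counter


def good_turing_adjusted_counts(counts):
    """Map each observed count r to r* = (r + 1) * N_{r+1} / N_r.

    When N_{r+1} is zero the raw count is kept (a crude fallback, assumed here).
    """
    freq_of_freq = Counter(counts.values())  # N_r for each observed r
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        adjusted[r] = (r + 1) * n_next / n_r if n_next else float(r)
    return adjusted


def unseen_mass(counts):
    """Good-Turing estimate of the total probability of unseen items: N_1 / N."""
    total = sum(counts.values())
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


# Hypothetical word counts.
counts = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
print(good_turing_adjusted_counts(counts))
print(unseen_mass(counts))
```

The mass reserved for unseen words is what keeps a recognizer from assigning zero probability to a correct but previously unobserved candidate.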

SLIDE 51

Aligning Words
