Noisy Channel Models
CMSC 723 / LING 723 / INST 725 MARINE CARPUAT
marine@cs.umd.edu
Today
– HW1 Q&A
– Weighted FSAs
– Noisy Channel Models
– Project 1
HW1: your goals for the class, based on word frequency

All words (top counts): i 58, to 57, and 53, in 45, the 33, learn 20, nlp 19, a 18, my 18, this 18, research 17, linguistics 15, language 14, some 14, processing 12, be 11, computational 11, natural 11, want 11, have 10, how 10, is 10, like 10
HW1: your goals for the class, based on word frequency (content words)

learn 20, nlp 19, research 17, linguistics 15, language 14, processing 12, want 11, natural 11, computational 11, like 10, understanding 9, machine 9, techniques 6, projects 6, class 6, apply 6, models 5, interested 5, goal 5, work 4, systems 4, study 4, human 4, data 4, computer 4, applications 4, security 2, hci 2, visualization 1, social 1, search 1, probabilistic 1, news 1, media 1, linguistic 1, interactive 1, interaction 1, human-in-the-loop 1, human-computer 1
Suppose that 1/100,000 of the population has the ability to read other people's minds. You have a test that reads positive with 95% probability if someone can read minds, and reads negative with 99.5% probability if someone cannot. I take the test and it reads positive. What is the probability that I can read minds? (Express your answer as a real number in [0,1].)
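The question is a direct application of Bayes' rule. A minimal sketch of the computation, using only the numbers given on the slide:

```python
# Bayes' rule for the mind-reading test (numbers from the slide).
p_reader = 1 / 100_000          # prior: P(mind reader)
p_pos_given_reader = 0.95       # P(positive | reader)
p_neg_given_nonreader = 0.995   # P(negative | non-reader)
p_pos_given_nonreader = 1 - p_neg_given_nonreader

# P(positive) by the law of total probability
p_pos = (p_pos_given_reader * p_reader
         + p_pos_given_nonreader * (1 - p_reader))

# P(reader | positive) by Bayes' rule
p_reader_given_pos = p_pos_given_reader * p_reader / p_pos
print(round(p_reader_given_pos, 4))  # ≈ 0.0019
```

Despite the positive test, the posterior stays tiny because the prior is so small: the 0.5% false-positive rate among the huge non-reader population swamps the true positives.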
Training corpus:
<s> I am Sam </s>
<s> Sam I am </s>
<s> I do not like green eggs and ham </s>

Bigram probability estimates:
P( I | <s> ) = 2/3 = 0.67
P( Sam | <s> ) = 1/3 = 0.33
P( am | I ) = 2/3 = 0.67
P( do | I ) = 1/3 = 0.33
P( </s> | Sam ) = 1/2 = 0.50
P( Sam | am ) = 1/2 = 0.50
...
Example sentences accepted by an FSA:
– he saw me
– he ran home
– she talked
How does this FSA language model differ from a bigram model?
– The FSA defines exactly the set of strings it accepts
– Strings the FSA does not accept have probability zero
– But acceptance alone is not necessarily probabilistic: a plain FSA does not assign probabilities to the strings it accepts
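The acceptance behavior can be illustrated with a tiny hand-built FSA. The states and transitions below are invented for this sketch (they are not from the lecture), but the point carries over: any sentence with no path to an accepting state gets probability zero.

```python
# A tiny deterministic FSA over words (illustrative states/transitions,
# invented for this sketch). It accepts sentences like "he saw me".
TRANSITIONS = {
    ("q0", "he"): "q1", ("q0", "she"): "q1",
    ("q1", "saw"): "q2", ("q1", "ran"): "q2", ("q1", "talked"): "q3",
    ("q2", "me"): "q3", ("q2", "home"): "q3",
}
ACCEPT = {"q3"}

def accepts(sentence):
    state = "q0"
    for word in sentence.split():
        state = TRANSITIONS.get((state, word))
        if state is None:       # no transition defined: reject
            return False
    return state in ACCEPT

print(accepts("he saw me"))    # True
print(accepts("me saw he"))    # False -> probability zero under this model
```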
Weighted finite state automata and transducers
– A weighted transducer generates pairs of strings and assigns a weight to each pair
– The weight can often be interpreted as a conditional probability P(output-string | input-string)
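Collapsing a weighted transducer down to the string pairs it generates gives a table of (input, output) pairs with weights. A sketch with invented pairs and probabilities (not from the lecture), where each weight is read as P(output | input):

```python
# A weighted transducer viewed as its set of weighted string pairs
# (illustrative pairs and probabilities, invented for this sketch).
channel = {
    ("fox^s", "foxes"): 0.9,
    ("fox^s", "foxs"):  0.1,
    ("cat^s", "cats"):  1.0,
}

def p_output_given_input(out, inp):
    """Weight of the pair, interpreted as P(output-string | input-string)."""
    return channel.get((inp, out), 0.0)

print(p_output_given_input("foxes", "fox^s"))  # 0.9
```

A real weighted FST would represent these weights compactly on transitions rather than enumerating whole string pairs, but the interpretation of the path weight is the same.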
The noisy channel model for NLP:
– P(X): source model
– P(Y | X): channel model
– Decoding: X* = argmax_x P(X | Y) = argmax_x P(Y | X) P(X)
Example applications:
– Machine translation from French to English
– Question answering
What you should know:
– Weighted FSAs and FSTs, and how they relate to n-gram models
– The noisy channel framework: source model, channel model, decoding
hand…
– Chomsky and Halle notation: a → b / c __ d means "rewrite a as b when it occurs between c and d"
– E-insertion rule: ε → e / {x, s, z} ^ __ s #
  (insert e between a morpheme ending in x, s, or z and the suffix s; ^ marks a morpheme boundary, # the word boundary)
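The rule can be applied directly with a regular-expression substitution. A minimal sketch, treating `^` and `#` as literal boundary symbols in the lexical string (the function name is my own):

```python
import re

# E-insertion rule as a regex substitution: insert 'e' between a stem
# ending in x, s, or z and the suffix s ('^' = morpheme boundary,
# '#' = word boundary, written as literal characters here).
def e_insertion(form):
    return re.sub(r"([xsz])\^(s)#", r"\1^e\2#", form)

print(e_insertion("fox^s#"))  # 'fox^es#' -> surface form "foxes"
print(e_insertion("cat^s#"))  # unchanged: 'cat^s#'
```

A full morphological analyzer would compile such rules into finite state transducers and compose them, but the single-rule behavior is exactly this substitution.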
www.cs.umd.edu/class/fall2015/cmsc723/p1.html
Teams of 2 or 3
Due before class on Tue Sep 29
Submit code and outputs using handin (see details in the Piazza post)
and neural language modeling