ToTS Dyer, Simianer, Riezler, Blunsom, Hasler
Tuning SMT Systems on the Training Set
Chris Dyer, Patrick Simianer, Stefan Riezler, Phil Blunsom, Eva Hasler
Project Report MT Marathon 2011 FBK Trento
132,755 parallel sentences, German-to-English
First sample translations according to their model score. Then sample pairs.
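The two sampling steps above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes "according to their model score" means sampling with probability proportional to exp(score), and that both translations and pairs are drawn with replacement. The function names are hypothetical.

```python
import math
import random

def sample_translations(hypotheses, scores, k, rng):
    """Draw k hypotheses with probability proportional to exp(model score).

    Assumption: scores are log-linear model scores; the slides do not
    state the exact sampling distribution.
    """
    top = max(scores)
    # Shift by the maximum score for numerical stability.
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(hypotheses, weights=weights, k=k)

def sample_pairs(sampled, m, rng):
    """Draw m pairs of distinct positions from the sampled translations."""
    pairs = []
    for _ in range(m):
        i, j = rng.sample(range(len(sampled)), 2)
        pairs.append((sampled[i], sampled[j]))
    return pairs
```

Each sampled pair would then serve as one training example for a pairwise ranker.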
Hiero SCFG rule identifier
target n-grams within rule
target n-gram with gaps (X) within rule
binned rule counts in full training set
rule length features
rule shape features
word alignments in rules
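To make the feature templates above concrete, here is a small sketch that derives most of them from a single rule. The rule representation (token lists with X1/X2 as nonterminals), the feature names, the use of bigrams, and the log2 count binning are all assumptions made for illustration; word-alignment features are left out.

```python
import math

def is_nonterminal(tok):
    # Assumption: nonterminals are written X1 and X2.
    return tok in ("X1", "X2")

def shape(tokens):
    """Collapse each run of terminals to 'w' and each nonterminal to 'X'."""
    out = []
    for tok in tokens:
        sym = "X" if is_nonterminal(tok) else "w"
        if sym == "X" or not out or out[-1] != "w":
            out.append(sym)
    return "".join(out)

def rule_features(rule_id, source, target, train_count):
    """Return a sparse feature dict for one rule (train_count must be >= 1)."""
    feats = {f"rule_id:{rule_id}": 1.0}
    # Target bigrams; nonterminals are replaced by the gap symbol X.
    gapped = ["X" if is_nonterminal(t) else t for t in target]
    for a, b in zip(gapped, gapped[1:]):
        kind = "tgt_ngram_gap" if "X" in (a, b) else "tgt_ngram"
        key = f"{kind}:{a}_{b}"
        feats[key] = feats.get(key, 0.0) + 1.0
    # Log2 bin of the rule's count in the full training set.
    feats[f"count_bin:{int(math.log2(train_count))}"] = 1.0
    # Rule length: number of source and target symbols.
    feats[f"len:{len(source)}-{len(target)}"] = 1.0
    # Rule shape: pattern of terminal runs and gaps on both sides.
    feats[f"shape:{shape(source)}>{shape(target)}"] = 1.0
    return feats
```

For a rule with source `X1 , dass X2` and target `X1 that X2`, both sides get the shape `XwX`.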
Compute ℓ2-norm of column vectors (= vector of examples/shards for each of n features), then ℓ1-norm of resulting n-dimensional vector.
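The norm described above can be written out in a few lines. The sketch assumes the weights are stored as a matrix W with one row per example/shard and one column per feature; the function names are illustrative.

```python
import math

def column_l2_norms(W):
    """W is a list of rows, one weight vector per example/shard.

    Returns one l2-norm per feature (column).
    """
    n = len(W[0])
    return [math.sqrt(sum(row[j] ** 2 for row in W)) for j in range(n)]

def l1_l2_norm(W):
    """l1-norm of the n-dimensional vector of column l2-norms."""
    return sum(abs(v) for v in column_l2_norms(W))
```

Because the outer norm is ℓ1, penalizing this quantity tends to push entire columns to zero, so a feature is either used across shards or dropped everywhere.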
Keep only those features with large enough ℓ2-norm computed over examples/shards. Then average feature values over examples/shards.
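A minimal sketch of this select-then-average step, assuming "large enough" means keeping the k features with the largest ℓ2-norm (a fixed threshold would work the same way) and setting the weights of dropped features to zero. The matrix layout (rows are examples/shards, columns are features) and the function name are assumptions.

```python
import math

def select_and_average(W, k):
    """Keep the k features with the largest column l2-norm, then average.

    W is a list of rows, one weight vector per example/shard.
    Returns a single weight vector; dropped features get weight 0.0.
    """
    n = len(W[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in W)) for j in range(n)]
    ranked = sorted(range(n), key=lambda j: norms[j], reverse=True)
    keep = set(ranked[:k])
    return [
        sum(row[j] for row in W) / len(W) if j in keep else 0.0
        for j in range(n)
    ]
```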
150k parallel sentences from news commentary data, German-to-English
pairwise ranking perceptron
sample 100 translations from chart, use all 100*(99)/2 = 4,950 pairs
OR: use n-best list
sparse rule-id features AND/OR dense features
200 shards (25 machines with 8 cores)
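The pairwise ranking perceptron in this setup can be sketched as below. This is a generic textbook-style update, not the project's code: it assumes each translation comes with a sparse feature dict and some per-sentence quality score (the slides do not name the metric), and it updates whenever the model fails to rank the better translation strictly higher. The function names and the learning rate parameter are illustrative.

```python
from itertools import combinations

def dot(w, feats):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(k, 0.0) * v for k, v in feats.items())

def perceptron_epoch(w, hyps, lr=1.0):
    """One pass over all pairs of hypotheses for a single sentence.

    hyps: list of (feature dict, quality score). Updates w in place and
    returns the number of updates made.
    """
    updates = 0
    for (fa, qa), (fb, qb) in combinations(hyps, 2):
        if qa == qb:
            continue  # no preference between equally good translations
        better, worse = (fa, fb) if qa > qb else (fb, fa)
        if dot(w, better) <= dot(w, worse):
            # Misranked pair: move w toward the better translation.
            for k, v in better.items():
                w[k] = w.get(k, 0.0) + lr * v
            for k, v in worse.items():
                w[k] = w.get(k, 0.0) - lr * v
            updates += 1
    return updates
```

With 100 sampled translations, `combinations(hyps, 2)` yields the 4,950 pairs mentioned above.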
Best rule: X → ⟨X1 , dass X2 | X1 that X2⟩
Bad rule: X → ⟨X1 oder X2 | X1 and X2⟩