LTI Orientation Deep Learning Research for NLP Graham Neubig
Language Processing Mary prevents Peter from scoring a goal. John passes the ball upfield to Peter, who shoots for the goal. The shot is deflected by Mary and the ball goes out of bounds.
Structured Prediction • Map an input X to a structured output Y
Supervised Learning • Learn the mapping from paired examples (X, Y)
Supervised Learning w/ Neural Nets • A neural net with learned parameters θ maps X to Y
Structured Prediction w/ Neural Nets
Neural Structured Prediction • A model with parameters w maps input X to output Y, trained to minimize a loss
Neural Structured Prediction • Prediction additionally requires search over the space of structured outputs Y
The Problem of Discrete Decisions • The argmax prediction ŷᵢ(w) jumps between discrete words ('dog', 'the', 'cat') as the parameters w change, so the loss g(w) is a discontinuous step function of w
Soft Search [Goyal+18] (Faculty: Neubig) • Replace the hard argmax over words with a peaked softmax (α-soft argmax): the next input embedding is ẽᵢ = (1/Z) Σ_y e(y) · exp[α · sᵢ(y)], a weighted mixture of the embeddings e(the), e(cat), e(dog) that is fed into the next hidden state hᵢ₊₁
Smoothed Surface • Under the soft argmax, the loss g(w) becomes a smooth function of w; a large α (e.g. α = 10) stays close to the discrete surface, while a small α (e.g. α = 1) smooths it further
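The α-soft argmax above can be sketched in a few lines of NumPy. The helper name `soft_argmax_embedding` is mine, not from the slides; it computes the expected embedding under the peaked softmax, which approaches the hard argmax embedding as α grows.

```python
import numpy as np

def soft_argmax_embedding(scores, embeddings, alpha):
    """Peaked softmax: return the expected embedding
    e_tilde = (1/Z) * sum_y e(y) * exp(alpha * s(y)).
    As alpha -> infinity this approaches the embedding of the
    argmax word; a smaller alpha gives a smoother mixture."""
    logits = alpha * scores
    logits = logits - logits.max()   # subtract max for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()
    # weighted mixture over the vocabulary's embeddings (V x d)
    return weights @ embeddings

# toy vocabulary {the, cat, dog} with 2-d embeddings
emb = np.array([[1.0, 0.0],   # 'the'
                [0.0, 1.0],   # 'cat'
                [1.0, 1.0]])  # 'dog'
scores = np.array([2.0, 1.0, 0.5])

soft = soft_argmax_embedding(scores, emb, alpha=1.0)
hard = soft_argmax_embedding(scores, emb, alpha=100.0)
# hard is essentially emb[0], the embedding of the argmax word 'the'
```

Because the mixture is differentiable in the scores, gradients flow through the "decision" instead of stopping at a discrete argmax.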
Prediction over Word Embedding Space [Kumar+18] (Faculty: Tsvetkov)
Structured Modeling w/ Neural Nets
Why Structured Neural Nets • In pre-neural NLP we did feature engineering to capture the salient features of text • Now, neural nets learn features for us • But given too much freedom, they may fail to learn, or may overfit • So we do architecture engineering to add inductive bias
Structure in Language • Words • Phrases (e.g. a parse tree with S, NP, VP, PP nodes over "Alice gave a message to Bob") • Sentences • Documents (e.g. "This film was completely unbelievable. The characters were wooden and the plot was absurd. That being said, I liked it.")
BiLSTM Conditional Random Fields [Ma+15] (Faculty: Hovy) • Add an additional layer that ensures consistency between adjacent tags (e.g. "I hate this movie" → PRP VBP DT NN) • Training and prediction use dynamic programming
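The dynamic program used for prediction in a linear-chain CRF is the Viterbi algorithm. A minimal NumPy sketch (toy scores, not the BiLSTM-CRF of [Ma+15] itself):

```python
import numpy as np

def viterbi(emissions, transitions):
    """Find the highest-scoring tag sequence for a linear-chain CRF.
    emissions: (T, K) per-position tag scores (e.g. from a BiLSTM).
    transitions: (K, K) score of moving from tag i to tag j."""
    T, K = emissions.shape
    score = emissions[0].copy()          # best score ending in each tag
    back = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        # cand[i, j] = best score at t-1 in tag i, then moving to tag j
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # follow backpointers from the best final tag
    tags = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        tags.append(int(back[t, tags[-1]]))
    return tags[::-1]

# toy example: 3 words, 2 tags; transitions forbid tag 1 -> tag 1
em = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 2.0]])
tr = np.array([[0.0, 0.0], [0.0, -5.0]])
path = viterbi(em, tr)   # the transition penalty rules out 1 -> 1
```

The transition matrix is exactly the "additional layer" on the slide: it scores tag pairs, so the best path trades off per-word emission scores against tag-sequence consistency.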
Neural Factor Graph Models [Malaviya+18] (Faculty: Gormley, Neubig) • Problem: Neural CRFs can only handle a single tag per word • Idea: Expand to multiple tags using graphical models
Stack LSTM [Dyer+15] (Faculty: Dyer (now DeepMind)) • Transition-based dependency parsing with SHIFT, REDUCE-LEFT, and REDUCE-RIGHT actions (e.g. REDUCE-LEFT(amod) attaches "overhasty" to "decision" in "an overhasty decision was made") • LSTMs encode the stack S and buffer B, and compositional representations are built as subtrees are reduced
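The transition system itself (without the LSTM scoring) can be sketched as follows, using the slide's example sentence; the `parse` helper and the exact dependency labels are my illustration, not taken from [Dyer+15]:

```python
def parse(words, actions):
    """Run an arc-standard transition sequence: SHIFT moves a word from
    the buffer to the stack; REDUCE-LEFT/RIGHT pop one of the top two
    stack items and attach it as a dependent of the other, recording
    the labeled arc as (head, label, dependent)."""
    stack, buffer, arcs = [], list(words), []
    for act in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))
        elif act.startswith("REDUCE-LEFT"):   # second-from-top <- top
            label = act.split("(")[1].rstrip(")")
            dep = stack.pop(-2)
            arcs.append((stack[-1], label, dep))
        elif act.startswith("REDUCE-RIGHT"):  # second-from-top -> top
            label = act.split("(")[1].rstrip(")")
            dep = stack.pop()
            arcs.append((stack[-1], label, dep))
    return arcs

arcs = parse(
    ["an", "overhasty", "decision", "was", "made"],
    ["SHIFT", "SHIFT", "SHIFT",
     "REDUCE-LEFT(amod)",       # overhasty <- decision
     "REDUCE-LEFT(det)",        # an <- decision
     "SHIFT", "SHIFT",
     "REDUCE-LEFT(auxpass)",    # was <- made
     "REDUCE-LEFT(nsubjpass)"]) # decision <- made
```

In the Stack LSTM model, separate LSTMs summarize the stack, the buffer, and the action history, and a classifier over those summaries chooses the next action.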
Morphological Language Models [Matthews+18] (Faculty: Neubig, Dyer) • Problem: Language modeling for morphologically rich languages is hard • Idea: Specifically decompose input and output using morphological structure
Neural-Symbolic Integration
Neural-Symbolic Hybrids • Neural and symbolic models are better at different things • Neural: smoothing over differences using similarity • Symbolic: remembering individual one-shot events • How can we combine the two?
Discrete Lexicons in Neural Seq2seq [Arthur+15] (Faculty: Neubig)
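One way to combine a discrete lexicon with a seq2seq model is to bias the decoder's next-word distribution with the lexicon's translation probabilities. The sketch below shows this bias idea in NumPy; the function name and constants are mine, and the details of [Arthur+15] may differ:

```python
import numpy as np

def lexicon_biased_softmax(nn_logits, lex_probs, eps=1e-3):
    """Bias the neural net's next-word distribution with a discrete
    translation lexicon: add log(p_lex + eps) to the logits, then
    renormalize. eps keeps words absent from the lexicon possible."""
    logits = nn_logits + np.log(lex_probs + eps)
    logits = logits - logits.max()   # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# toy: the neural net is indifferent, but the lexicon prefers word 2
nn_logits = np.array([1.0, 1.0, 1.0])
lex = np.array([0.0, 0.1, 0.9])
p = lexicon_biased_softmax(nn_logits, lex)
# p concentrates on word 2, following the lexicon
```

The symbolic lexicon supplies reliable one-shot word correspondences (e.g. rare words), while the neural model handles fluency and context.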
NNs + Logic Rules [Hu+16] (Faculty: Hovy, Xing) • Problem: It is difficult to explicitly incorporate knowledge into neural-net-based models • Idea: Use logical rules to constrain space of predicted probabilities
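Constraining predicted probabilities with a rule can be sketched as a projection: downweight outputs that violate the rule and renormalize. This is only an illustrative sketch of the idea (the helper name and the rule weight C are my choices, not from [Hu+16]):

```python
import numpy as np

def rule_projected_distribution(p, rule_satisfaction, strength=6.0):
    """Project a predicted distribution p toward a logic rule:
    q(y) is proportional to p(y) * exp(-C * (1 - r(y))), where
    r(y) in [0, 1] is the rule's truth value for output y and
    C (strength) controls how hard the rule is enforced."""
    q = p * np.exp(-strength * (1.0 - rule_satisfaction))
    return q / q.sum()

# toy sentiment example: a rule (e.g. about a "but"-clause)
# is satisfied only by label 1
p = np.array([0.6, 0.4])          # network slightly prefers label 0
r = np.array([0.0, 1.0])          # rule truth value per label
q = rule_projected_distribution(p, r)
# q flips the decision toward the rule-consistent label 1
```

The projected distribution q can then serve as a soft training target, letting the rule's knowledge flow into the network's weights.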
Latent Variable Models
Latent Variable Models • Observed input X and output Y
Latent Variable Models • Introduce an unobserved variable Z that mediates between X and Y
Neural Latent Variable Models • A neural net defines the relationship between the latent variable Z and the observed data X
Generating Text from Latent Space • Encode text X into a latent Z, then decode X back from Z
Example: Discourse Level Modeling with VAE [Zhao+17] (Faculty: Eskenazi) • Use latent variable as a way to represent entire discourse in dialog
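The core trick that makes VAEs like this trainable is the reparameterization of the latent sample. A minimal NumPy sketch of the two VAE-specific pieces (sampling and the KL regularizer), independent of any particular paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), so the sample
    is a differentiable function of the encoder outputs (mu, log_var)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), the regularizer in the VAE loss
    that keeps the latent space close to the prior."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

mu = np.zeros(16)         # encoder outputs for one utterance (toy)
log_var = np.zeros(16)
z = reparameterize(mu, log_var)
kl = kl_to_standard_normal(mu, log_var)   # 0 when q(z|x) = N(0, I)
```

For discourse-level modeling, z summarizes the whole dialog context, and the decoder generates the next utterance conditioned on z.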
Handling Discrete Latent Variables [Zhou+17] (Faculty: Neubig)
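Discrete latent variables block gradients, since sampling a category is not differentiable. One widely used workaround is the Gumbel-softmax relaxation, sketched below as a general technique (not necessarily the exact approach of [Zhou+17]):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, temperature=0.5):
    """Draw a relaxed one-hot sample over discrete categories.
    Adding Gumbel noise to the logits and applying a temperature-
    controlled softmax approximates sampling an argmax while staying
    differentiable; low temperatures give near-one-hot samples."""
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()   # numerical stability
    y = np.exp(y)
    return y / y.sum()

# toy: 3 discrete latent categories
sample = gumbel_softmax(np.array([2.0, 0.5, 0.1]))
```

As the temperature anneals toward 0, the samples approach true discrete draws, recovering the original discrete latent variable.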
Structured Latent Variables [Yin+18] (Faculty: Neubig) • Problem: Paucity of training data for structured prediction problems • Idea: Treat the structure as a latent variable in a VAE model
Better Learning Algorithms for Latent Variable Models [He+2019] (Faculty: Neubig) • Problem: When learning latent variable models, predicting the latent variables can be difficult • Solution: Perform aggressive updates of the part of the model that predicts these variables before each update of the rest
Any Questions?