Recurrent Neural Networks (RNN) Artificial Intelligence @ Allegheny College Janyl Jumadinova March 9, 2020 Alex Graves, Supervised Sequence Labelling with Recurrent Neural Networks http://colah.github.io/posts/2015-08-Understanding-LSTMs/


SLIDE 1

Recurrent Neural Networks (RNN)

Artificial Intelligence @ Allegheny College Janyl Jumadinova March 9, 2020

Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks” http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Janyl Jumadinova Recurrent Neural Networks (RNN) March 9, 2020 1 / 16

SLIDE 2

Word2Vec Model

Word2Vec is used to learn vector representations of words, called “word embeddings”. This is typically a preprocessing step: the learned vectors are fed into a discriminative model (such as an RNN). Word2vec is a computationally efficient predictive model for learning word embeddings from raw text.

SLIDE 3

Word2Vec Model

Word2Vec is used to learn vector representations of words, called “word embeddings”. This is typically a preprocessing step: the learned vectors are fed into a discriminative model (such as an RNN). Word2vec is a computationally efficient predictive model for learning word embeddings from raw text.

(1) Continuous Bag-of-Words model (CBOW): predicts target words from context words.
(2) Skip-Gram model: predicts source context words from target words.
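As an illustration (not from the slides), the Skip-Gram setup can be sketched by generating (target, context) training pairs from a toy corpus with a fixed context window; a real Word2vec model would then learn embeddings that predict the context word from the target word:

```python
# Hypothetical sketch: building Skip-Gram (target, context) training pairs.
corpus = "the quick brown fox jumps over the lazy dog".split()
window = 2  # how many words on each side count as "context"

pairs = []
for i, target in enumerate(corpus):
    # Every word within the window (excluding the target itself) is a context word.
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            pairs.append((target, corpus[j]))

print(pairs[:3])  # first few (target, context) pairs
```

CBOW simply reverses the roles: the surrounding context words jointly predict the target word.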

SLIDE 4

Word2Vec Model

https://www.tensorflow.org/tutorials/representation/word2vec

SLIDE 5

Recurrent Neural Networks

SLIDE 6

Recurrent Neural Networks

SLIDE 7

Recurrent Neural Networks

Sequence-to-sequence (Seq2Seq) models are based on an encoder-decoder scheme.
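As a minimal NumPy sketch (an illustration, not the course's implementation; all parameter names and sizes here are made up): the encoder RNN folds the source sequence into one context vector, and the decoder RNN unrolls from that context to produce the output sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, emb = 8, 4  # hypothetical hidden-state and embedding sizes

# Hypothetical shared RNN parameters; a trained Seq2Seq model learns these.
W_xh = rng.normal(size=(hidden, emb)) * 0.1
W_hh = rng.normal(size=(hidden, hidden)) * 0.1

def rnn_step(h, x):
    return np.tanh(W_xh @ x + W_hh @ h)

# Encoder: consume the source sequence, keeping only the final hidden state.
src = [rng.normal(size=emb) for _ in range(5)]
h = np.zeros(hidden)
for x in src:
    h = rnn_step(h, x)
context = h  # the "summary" vector handed to the decoder

# Decoder: initialized from the context, unrolled for a few output steps.
h, x = context, np.zeros(emb)
outputs = []
for _ in range(3):
    h = rnn_step(h, x)
    outputs.append(h.copy())

print(len(outputs), outputs[0].shape)
```

A real model would add learned input/output embeddings and a softmax over the vocabulary at each decoder step.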

SLIDE 8

Recurrent Neural Networks

SLIDE 9

Recurrent Neural Networks

SLIDE 10

Recurrent Neural Networks

SLIDE 11

Recurrent Neural Networks

SLIDE 12

Long Short-Term Memory (LSTM)

Based on a standard RNN whose neurons use the tanh activation. Christopher Olah, “Understanding LSTM Networks” (2015)
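The standard tanh RNN that LSTM builds on can be sketched in a few lines of NumPy (an illustration; the parameter names and sizes are invented for the example):

```python
import numpy as np

def rnn_cell(x_t, h_prev, W_xh, W_hh, b_h):
    # Standard RNN update: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(0)
d_in, d_h = 3, 5  # hypothetical input and hidden sizes

W_xh = rng.normal(size=(d_h, d_in)) * 0.1
W_hh = rng.normal(size=(d_h, d_h)) * 0.1
b_h = np.zeros(d_h)

# Unroll over a short sequence, carrying the hidden state forward in time.
h = np.zeros(d_h)
for x_t in rng.normal(size=(4, d_in)):
    h = rnn_cell(x_t, h, W_xh, W_hh, b_h)
print(h.shape)
```

Because the same tanh squashing is applied at every step, gradients through many steps tend to vanish or explode, which is the problem LSTM's cell state and gates are designed to ease.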

SLIDE 13

Long Short-Term Memory (LSTM)

SLIDE 14

Long Short-Term Memory (LSTM)

Each line carries an entire vector from the output of one node to the inputs of others. Pointwise operations are operations such as vector addition. Yellow boxes are learned neural network layers. A “Copy” line denotes its content being copied, with the copies going to different locations.

SLIDE 15

Long Short-Term Memory (LSTM)

The cell state runs through the entire chain, with only some minor linear interactions.

SLIDE 16

Long Short-Term Memory (LSTM)

The gate structures allow information to be removed from or added to the cell state.
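One LSTM step with its forget, input, and output gates can be sketched as follows (a NumPy illustration with invented parameter names; the gate equations follow the standard LSTM formulation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    # Compute all four pre-activations at once, then split them:
    # f = forget gate, i = input gate, o = output gate, g = candidate values.
    f, i, o, g = np.split(W @ x + U @ h + b, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # gates squashed into (0, 1)
    c_new = f * c + i * np.tanh(g)  # forget part of the old state, add new info
    h_new = o * np.tanh(c_new)      # expose a gated view of the cell state
    return h_new, c_new

rng = np.random.default_rng(0)
d_in, d_h = 3, 4  # hypothetical input and hidden sizes
W = rng.normal(size=(4 * d_h, d_in)) * 0.1
U = rng.normal(size=(4 * d_h, d_h)) * 0.1
b = np.zeros(4 * d_h)

h, c = np.zeros(d_h), np.zeros(d_h)
h, c = lstm_step(rng.normal(size=d_in), h, c, W, U, b)
print(h.shape, c.shape)
```

Note that the cell state `c` is updated only by the pointwise multiply and add, matching the “minor linear interactions” on the previous slide.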

SLIDE 17

Long Short-Term Memory (LSTM)

The gate structures allow information to be removed from or added to the cell state.

Disadvantage of RNN/LSTM: they suffer from memory-bandwidth-limited computation.

Alternative? The Transformer architecture, which replaces recurrence/convolution with attention.
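The attention operation at the heart of the Transformer can be sketched in NumPy (an illustration, not from the slides; shapes are invented). Unlike an RNN, it relates all positions to each other in one matrix product rather than step by step:

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)  # each row of weights sums to 1
    return w @ V  # each output is a weighted average of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))  # 3 query positions, dimension 8
K = rng.normal(size=(5, 8))  # 5 key/value positions
V = rng.normal(size=(5, 8))

out = attention(Q, K, V)
print(out.shape)
```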

SLIDE 18

TensorFlow Tutorial

TensorFlow Recurrent Neural Networks
