CS 4501 Machine Learning for NLP
Text Classification (I): Logistic Regression
Yangfeng Ji
Department of Computer Science University of Virginia
Overview
1. Problem Definition
2. Bag-of-Words Representation
3. Case Study: Sentiment Analysis
Problem Definition
◮ Input: a text $\boldsymbol{x}$ (example: a product review on Amazon)¹
◮ Output: a label $y$ from a predefined label set $\mathcal{Y}$ (example: $\mathcal{Y} = \{\text{Positive}, \text{Negative}\}$)

¹In this course, we use $\boldsymbol{x}$ for both a text and its representation, with no distinction.
Classification

$$\hat{y} = \operatorname*{argmax}_{y \in \mathcal{Y}} P(y \mid \boldsymbol{x})$$

In this lecture:
◮ Bag-of-words representation
◮ Logistic regression models
◮ Neural network classifiers
Bag-of-Words Representation

What is kept:
◮ words in texts

What is discarded:
◮ word order
◮ sentence boundary
◮ paragraph boundary
◮ · · ·
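As a concrete illustration, the representation above can be sketched in plain Python; the whitespace tokenizer and the four-word vocabulary are illustrative assumptions, not the course's exact preprocessing.

```python
from collections import Counter

def bag_of_words(text, vocab):
    # Count word occurrences; word order, sentence boundaries, and
    # paragraph boundaries are all discarded.
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

# Hypothetical 4-word vocabulary for illustration
vocab = ["great", "boring", "movie", "not"]
x1 = bag_of_words("great movie not boring", vocab)
x2 = bag_of_words("boring movie not great", vocab)
# x1 == x2: two differently ordered texts map to the same vector.
```

Note that the two example texts arguably express opposite sentiments, yet they receive identical vectors — exactly the information loss the bullets above describe.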
Linear Classification

Predict Positive ($y = 1$) when $\boldsymbol{w}_{\text{Pos}}^{\top}\boldsymbol{x} > \boldsymbol{w}_{\text{Neg}}^{\top}\boldsymbol{x}$, and Negative ($y = 0$) otherwise.
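A minimal sketch of this decision rule; the weight values are made-up assumptions over the toy vocabulary from the bag-of-words example.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical weights over the vocabulary ["great", "boring", "movie", "not"]
w_pos = [2.0, -1.5, 0.1, -0.3]
w_neg = [-2.0, 1.5, 0.1, 0.3]

x = [1, 0, 1, 0]  # bag-of-words vector for "great movie"
label = "Positive" if dot(w_pos, x) > dot(w_neg, x) else "Negative"
```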
Logistic Regression

◮ For each label $y$, compute a score $\boldsymbol{w}_y^{\top}\boldsymbol{x} + b_y$
◮ The softmax function converts scores into a probability distribution:
$$P(y \mid \boldsymbol{x}) = \frac{\exp(\boldsymbol{w}_y^{\top}\boldsymbol{x} + b_y)}{\sum_{y' \in \mathcal{Y}} \exp(\boldsymbol{w}_{y'}^{\top}\boldsymbol{x} + b_{y'})}$$
◮ By construction, $\sum_{y \in \mathcal{Y}} P(y \mid \boldsymbol{x}) = 1$
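The softmax step can be computed directly from the definition above; the two label scores below are made-up numbers.

```python
import math

def softmax_prob(scores):
    # Exponentiate each label's score w_y^T x + b_y, then normalize.
    exp_s = {y: math.exp(s) for y, s in scores.items()}
    z = sum(exp_s.values())
    return {y: e / z for y, e in exp_s.items()}

# Made-up scores for the two labels
p = softmax_prob({"Positive": 2.1, "Negative": -1.9})
# The resulting probabilities sum to 1 by construction.
```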
Simplification: append a constant feature $1$ to $\boldsymbol{x}$ and fold the bias into the weight vector, $\boldsymbol{w}_y = [w_1, w_2, \cdots, w_V, b_y]$, so that
$$P(y \mid \boldsymbol{x}) = \frac{\exp(\boldsymbol{w}_y^{\top}\boldsymbol{x})}{\sum_{y'} \exp(\boldsymbol{w}_{y'}^{\top}\boldsymbol{x})}$$
Binary Classification

With two labels, writing $z$ for the score, the model reduces to
$$P(y = 1 \mid \boldsymbol{x}) = \frac{\exp(z)}{1 + \exp(z)},$$
where $\frac{\exp(z)}{1+\exp(z)}$ is the Sigmoid function.
◮ Will be discussed in lecture 04
Maximum Likelihood Estimation

Given training examples $\{(\boldsymbol{x}^{(i)}, y^{(i)})\}_{i=1}^{m}$, the log-likelihood is
$$\ell(\boldsymbol{W}) = \sum_{i=1}^{m}\left[\boldsymbol{w}_{y^{(i)}}^{\top}\boldsymbol{x}^{(i)} - \log \sum_{y'} \exp(\boldsymbol{w}_{y'}^{\top}\boldsymbol{x}^{(i)})\right]$$
Given the training set $\{(\boldsymbol{x}^{(i)}, y^{(i)})\}_{i=1}^{m}$, $\ell(\boldsymbol{W})$ is a function of the parameters $\boldsymbol{W}$.
Equivalently, training minimizes the negative log-likelihood:
$$\mathrm{NLL}(\boldsymbol{W}) = \sum_{i=1}^{m}\left[-\boldsymbol{w}_{y^{(i)}}^{\top}\boldsymbol{x}^{(i)} + \log \sum_{y'} \exp(\boldsymbol{w}_{y'}^{\top}\boldsymbol{x}^{(i)})\right]$$
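The NLL can be computed literally from its definition; the weight values and the two training pairs below are toy assumptions.

```python
import math

def nll(W, data):
    # Negative log-likelihood of a softmax classifier.
    # W maps each label to its weight vector; data is a list of (x, y).
    total = 0.0
    for x, y in data:
        scores = {lab: sum(wi * xi for wi, xi in zip(w, x))
                  for lab, w in W.items()}
        log_z = math.log(sum(math.exp(s) for s in scores.values()))
        total += -scores[y] + log_z
    return total

# Toy weights and training pairs (illustrative values only)
W = {"Pos": [0.5, -0.2], "Neg": [-0.5, 0.2]}
data = [([1, 0], "Pos"), ([0, 1], "Neg")]
loss = nll(W, data)
```

With all-zero weights every label gets probability $1/|\mathcal{Y}|$, so the NLL is $m \log |\mathcal{Y}|$ — a useful sanity check.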
Gradient Descent

Update each weight vector by stepping against the gradient of the NLL, with learning rate $\eta$:
$$\boldsymbol{w}_y \leftarrow \boldsymbol{w}_y - \eta \cdot \frac{\partial\, \mathrm{NLL}(\{\boldsymbol{w}_y\})}{\partial \boldsymbol{w}_y} \qquad (18)$$
[Jurafsky and Martin, 2019]
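A full-batch version of this update can be sketched as follows, using the standard softmax-regression gradient $\partial \mathrm{NLL}/\partial \boldsymbol{w}_y = \sum_i \bigl(P(y \mid \boldsymbol{x}^{(i)}) - \mathbb{1}\{y = y^{(i)}\}\bigr)\,\boldsymbol{x}^{(i)}$; the toy dataset is an assumption for illustration.

```python
import math

def softmax(scores):
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}

def gradient_step(W, data, eta):
    # One full-batch step: w_y <- w_y - eta * dNLL/dw_y, where
    # dNLL/dw_y = sum_i (P(y | x_i) - 1{y == y_i}) * x_i
    grads = {y: [0.0] * len(w) for y, w in W.items()}
    for x, y_true in data:
        p = softmax({y: sum(wi * xi for wi, xi in zip(w, x))
                     for y, w in W.items()})
        for y in W:
            err = p[y] - (1.0 if y == y_true else 0.0)
            for j, xj in enumerate(x):
                grads[y][j] += err * xj
    return {y: [wj - eta * gj for wj, gj in zip(W[y], grads[y])]
            for y in W}

# Toy 2-feature, 2-label problem (illustrative data)
W = {"Pos": [0.0, 0.0], "Neg": [0.0, 0.0]}
data = [([1.0, 0.0], "Pos"), ([0.0, 1.0], "Neg")]
for _ in range(50):
    W = gradient_step(W, data, eta=0.5)
```

After a few dozen steps the model assigns most of the probability mass to the correct label on each training example.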
$L_2$ Regularization

Add a squared-norm penalty with coefficient $C$ to the NLL:
$$\sum_{i=1}^{m}\left[-\boldsymbol{w}_{y^{(i)}}^{\top}\boldsymbol{x}^{(i)} + \log \sum_{y'} \exp(\boldsymbol{w}_{y'}^{\top}\boldsymbol{x}^{(i)})\right] + \frac{C}{2}\|\boldsymbol{W}\|_2^2$$
◮ A tiny $C = 0.001$ approximates the case without regularization
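The penalized objective can be evaluated directly; the $C/2$ scaling follows the formula above, while the weight values and data below are toy assumptions (one weight is deliberately large to show the effect of the penalty).

```python
import math

def l2_penalized_nll(W, data, C):
    # NLL plus the penalty (C / 2) * ||W||_2^2 over all weight vectors.
    total = 0.0
    for x, y in data:
        scores = {lab: sum(wi * xi for wi, xi in zip(w, x))
                  for lab, w in W.items()}
        total += -scores[y] + math.log(sum(math.exp(s)
                                           for s in scores.values()))
    penalty = 0.5 * C * sum(wj * wj for w in W.values() for wj in w)
    return total + penalty

# Toy weights with one very large entry (illustrative)
W = {"Pos": [0.2, -14.16], "Neg": [-0.2, 14.16]}
data = [([1.0, 0.0], "Pos")]
# A tiny C barely changes the objective; a large C makes the
# big weights expensive, pushing the optimum toward smaller ones.
```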
Example: weights learned without and with $L_2$ regularization ($C = 10^2$) for selected words; values not recovered from the slides are marked "…".

                interesting   pleasure   boring   zoe   write   workings
Without Reg     0.011         1.80       14.16    …     …       …
With Reg        0.16          0.36       0.040    …     …       …
Summary
◮ Bag-of-words representations
◮ Text classifiers
◮ Overfitting
◮ $L_2$ regularization
References

Jurafsky, D. and Martin, J. (2019). Speech and Language Processing.

Pang, B., Lee, L., and Vaithyanathan, S. (2002). Thumbs up?: Sentiment classification using machine learning techniques. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, pages 79–86. Association for Computational Linguistics.