END-TO-END ARGUMENT MINING FOR DISCUSSION THREADS BASED ON PARALLEL CONSTRAINED POINTER ARCHITECTURE
Tokyo University of Agriculture and Technology, Japan. Gaku Morio (Master course 2nd) Katsuhide Fujita (Supervisor)
ArgMining 2018 @ EMNLP 2018
2
3
Takayuki Ito, Yuma Imi, Takanori Ito, and Eizo Hideshima. Collagree: A facilitator-mediated large-scale consensus support system. In Proceedings of the 2nd International Conference of Collective Intelligence, 2014.
Joonsuk Park and Claire Cardie. A corpus of eRulemaking user comments for measuring evaluability of arguments. In Proceedings of the Eleventh International Conference on LREC, 2018.
4
Gaku Morio and Katsuhide Fujita. Predicting argumentative influence probabilities in large-scale online civic engagement. In Companion Proceedings of The Web Conference 2018, WWW ’18, pp. 1427–1434.
5
6
7
8
9
[Figure: overview of the PCPA architecture, with output layers for type classification, IPR extraction (inner-post pointer distribution), and IPI extraction (inter-post pointer distribution).]
Proceedings of the 2017 Conference on EMNLP, 2017.
10
11
discussion thread, the work doesn’t distinguish between inner-post and inter-post schemes.
the semantic types of claims and premises in an online persuasive forum,” in Proceedings of the 4th Workshop on Argument Mining. 2017, pp. 11–21.
12
Post:170 I think the municipal subway should introduce an around-the-clock
Post:171 Yes, I think making the subway operating 24 hours is appealing. I want to enjoy Nagoya until late at night.
Premise Claim
Depth = 0 Depth = 1
Inner-post relation (IPR)
structures in persuasive essays,” Computational Linguistics, vol. 43, no. 3, pp. 619–659, 2017.
i.e., claim and premise argument [Stab 2017]
13
Post:170 I think the municipal subway should introduce an around-the-clock
Post:171 Yes, I think making the subway operating 24 hours is appealing. I want to enjoy Nagoya until late at night.
Premise Claim
Inter-post interaction (IPI) Target Callout
Depth = 0 Depth = 1
A callout must be a claim and has at most one target. This restriction keeps the relations in a tree structure.
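The two restrictions stated above can be checked mechanically. The following is a minimal sketch, not the authors' code: the function name `check_ipi_constraints`, the dictionary of sentence types, and the list of (callout, target) pairs are all hypothetical names chosen for illustration.

```python
def check_ipi_constraints(sentence_types, ipi_pairs):
    """Return a list of violation messages (empty if both restrictions hold).

    sentence_types: dict mapping a sentence id to "claim" or "premise" (assumed format).
    ipi_pairs: list of (callout_id, target_id) tuples (assumed format).
    """
    violations = []
    seen_callouts = set()
    for callout, target in ipi_pairs:
        # Restriction 1: a callout must be a claim.
        if sentence_types.get(callout) != "claim":
            violations.append(f"callout {callout} is not a claim")
        # Restriction 2: a callout has at most one target.
        if callout in seen_callouts:
            violations.append(f"callout {callout} has more than one target")
        seen_callouts.add(callout)
    return violations


types = {1: "claim", 2: "claim", 3: "premise"}
print(check_ipi_constraints(types, [(2, 1)]))          # no violations
print(check_ipi_constraints(types, [(3, 1)]))          # callout is a premise
print(check_ipi_constraints(types, [(2, 1), (2, 3)]))  # two targets for one callout
```

Because every callout has at most one outgoing link, each sentence has at most one parent, which is the property that keeps the relations tree-shaped.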
14
in cooperation with the local government.
proposition appears per sentence in most cases.
15
confirmed.
[Stab2017] [ours] 1 2
16
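[Figure: example thread of four posts containing sentences 1-4, 5-7, 8-9, and 10-13. Each sentence is labeled claim or premise; IPR links connect sentences within a post, and IPI links connect a callout to its target in another post.]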
17
advantages in terms of “low error propagation.”
relation target in arguments.
i.e., parallel architecture.
explicit constraints of discussion threads i.e., constrained pointer architecture.
the 2017 Conference on EMNLP, 2017.
18
[Figure: PCPA architecture. Word representations are combined by attention into a sentence representation for each sentence of the sequence 1 2 3 4 ⊥ 5 6 7 (Post 251: sentences 1-4; Post 253: sentences 5-7, a reply). A BiLSTM encodes the sentence representations, which feed three output layers: type classification (softmax, e.g. Claim), IPR extraction (inner-post pointer distribution), and IPI extraction (inter-post pointer distribution).]
19
For example, assume we are given the following thread with two posts.
[Figure: example thread. The first post contains sentences 1-4; the second post contains sentences 5-7 and is a reply to the first.]
20
In the input module, each sentence is converted into a sentence representation.
[Figure: the sentences of both posts pass through the embedding layer and form the sequence 1 2 3 4 ⊥ 5 6 7, where ⊥ is the separation symbol between the post and its reply.]
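As a rough illustration of how a thread might be laid out before embedding, the sketch below flattens a list of posts into one sequence with a separation symbol between posts. The function name `flatten_thread` and the list-of-lists input are assumptions made for this example; the slides do not specify the data format.

```python
SEPARATOR = "⊥"  # separation symbol placed between consecutive posts


def flatten_thread(posts):
    """Concatenate the sentences of each post, inserting SEPARATOR between posts."""
    sequence = []
    for index, post in enumerate(posts):
        if index > 0:
            sequence.append(SEPARATOR)
        sequence.extend(post)
    return sequence


# Two posts: sentences 1-4, then a reply with sentences 5-7.
thread = [[1, 2, 3, 4], [5, 6, 7]]
print(flatten_thread(thread))  # [1, 2, 3, 4, '⊥', 5, 6, 7]
```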
21
Next, the encoding module with BiLSTM acquires context-aware sentence representations.
[Figure: the sequence 1 2 3 4 ⊥ 5 6 7 is fed to the BiLSTM.]
22
The output module is PCPA’s classification module, which has three output classification layers:
1. Component Classifier
2. IPR Classifier
3. IPI Classifier
23
First, we explain the Component Classifier.
24
This layer classifies the type of each sentence (premise, claim, or non-argumentative).
[Figure: the Component Classifier applies a softmax to each sentence of the sequence 1 2 3 4 ⊥ 5 6 7; the example outputs shown are claim, claim, premise, premise, premise, premise, premise.]
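The type classification step can be sketched as a softmax over three class scores followed by an argmax. This is only an illustration: the raw scores, the label order, and the helper names `softmax` and `classify` are assumptions, and in the real model the scores would come from the encoded sentence representation.

```python
import math

LABELS = ["premise", "claim", "non-argumentative"]  # assumed label order


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most probable label and the full probability distribution."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


# Toy scores for one sentence; the second class has the largest score.
label, probs = classify([0.1, 2.0, -1.0])
print(label)
```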
25
26
Next, the IPR Classifier identifies inner-post relations using Pointer Networks.
A Pointer Network can estimate the relation target from a pointer distribution.
27
For example, consider how to search for the inner-post relation (IPR) target of sentence “3”.
[Figure: the Pointer Network computes a pointer distribution for sentence 3.]
28
In this case, the IPR target is sentence “4”, which has the maximum value in the pointer distribution.
[Figure: pointer distribution for sentence 3, peaking at sentence 4.]
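A pointer distribution can be pictured as a softmax over one score per candidate sentence, with the target chosen by argmax. The sketch below uses dot-product scores purely as a simplifying assumption (the slides do not state the scoring function), and the vectors and the name `pointer_target` are invented for illustration.

```python
import math


def pointer_distribution(query, candidates):
    """Softmax over dot-product scores between a source vector and candidate vectors.

    Dot-product scoring is an assumption of this sketch, not the documented method.
    """
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_target(query, candidate_ids, candidates):
    """Return the candidate id with the highest pointer probability, plus the distribution."""
    dist = pointer_distribution(query, candidates)
    best = max(range(len(dist)), key=dist.__getitem__)
    return candidate_ids[best], dist


# Toy vectors: sentence 3 is the source; sentences 1, 2 and 4 are candidates.
source_vector = [1.0, 0.0]
ids = [1, 2, 4]
vectors = [[0.1, 0.9], [0.2, 0.5], [0.9, 0.1]]
target, dist = pointer_target(source_vector, ids, vectors)
print(target)  # 4
```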
29
There is a problem; we noticed that the computation space of an
scheme.
Too wide!
30
Therefore, PCPA constrains the computation space: in IPR it does not scan distributions outside the post, because IPR is an inner-post relation.
Constrain!
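One way to picture this constraint is as a mask applied before the softmax, so that only sentences in the same post as the source receive probability. This is a hedged sketch: the `post_of` mapping, the raw scores, and the function name `constrained_ipr_pointer` are hypothetical, and the actual model may implement the restriction differently.

```python
import math


def constrained_ipr_pointer(scores, post_of, source):
    """Pointer distribution restricted to sentences in the same post as `source`.

    scores: dict mapping candidate sentence id -> raw score (assumed format).
    post_of: dict mapping sentence id -> post index (assumed format).
    """
    allowed = [s for s in scores if s != source and post_of[s] == post_of[source]]
    if not allowed:
        return {}
    highest = max(scores[s] for s in allowed)
    exps = {s: math.exp(scores[s] - highest) for s in allowed}
    total = sum(exps.values())
    return {s: e / total for s, e in exps.items()}


post_of = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1}
# Sentence 5 has the largest raw score but lies in another post, so it is never scanned.
raw_scores = {1: 0.2, 2: 0.1, 4: 1.5, 5: 3.0, 6: 0.0, 7: 0.3}
dist = constrained_ipr_pointer(raw_scores, post_of, source=3)
print(sorted(dist))           # [1, 2, 4]
print(max(dist, key=dist.get))  # 4
```

In this toy thread the constraint reduces the candidates for sentence 3 from six sentences to three.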
31
Finally, we explain the inter-post interaction (IPI) layer, the IPI Classifier.
32
For the IPI Classifier, we employ a Pointer Network similar to the one used for IPR. For example, let’s search for the IPI target of sentence “5”.
33
We can constrain! In IPI, PCPA can also constrain the computation space: we don’t need to scan irrelevant sentences such as “6” and “7”, because IPI is a post-to-post relation.
34
35
[Figure: pointer distribution for sentence 5 over the constrained candidates. Found! The IPI target of sentence 5 is identified.]
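The post-to-post restriction can be sketched as a candidate filter. The example below assumes that the candidates for a callout are exactly the sentences of the post it replies to; that reading, the post indices, and the names `ipi_candidates` and `reply_to` are assumptions for illustration rather than details confirmed by the slides.

```python
def ipi_candidates(posts, reply_to, source_post):
    """Sentences that a callout in `source_post` may point to.

    posts: dict mapping post index -> list of sentence ids (assumed format).
    reply_to: dict mapping a reply's post index -> the index of the post it answers.
    Returns an empty list for a post that is not a reply.
    """
    parent = reply_to.get(source_post)
    if parent is None:
        return []
    return list(posts[parent])


posts = {0: [1, 2, 3, 4], 1: [5, 6, 7]}
reply_to = {1: 0}  # the second post replies to the first

# Sentence 5 belongs to post 1, so only sentences of post 0 are scanned;
# sentences 6 and 7 (same post as the callout) are skipped.
print(ipi_candidates(posts, reply_to, source_post=1))  # [1, 2, 3, 4]
```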
36
37
& ∗ !" while the standard Pointer
& ∗ !" & .
& ∗ !" is large enough, though, the number of
sentences per post is not so large in the real world.
38
39
Pointer Networks.
40
Claim F1 Premise F1 NA F1 IPR F1 IPI F1
PCPA (ours)
Pointer Network (Seq2Seq)
Pointer Network (no Seq2Seq)
MTL-BiLSTM
For each model, we show the best score; * indicates significance at p < 0.01 (two-sided Wilcoxon signed-rank test).
in terms of IPR and IPI classifications.
41
→ Thread depth.
42
[Figure legend: Ours; Pointer Networks w/o seq2seq; Pointer Networks; MTL-BiLSTM. Horizontal axis: thread depth.]
43
[Figure legend: Ours; Pointer Networks w/o seq2seq; Pointer Networks; MTL-BiLSTM.]
44
time complexity.
post relation (IPR) and inter-post interaction (IPI).
45
46
47
48
49
50
51
Post:175
It's not realistic as long as we keep the municipal operations. We should entrust not only to the subway but such business parts to private sectors. Privatized parks are getting better and better
Depth = 1
52
53
54
55