Tong Gao
March 2020
MattNet: Modular Attention Network for Referring Expression Comprehension
Background
Referring expressions are natural language
Constructed on word embeddings, with 3 individual embeddings $f_m$
V
1. Compute attention scores based on V and $q^{subj}$
2. Get the subject visual representation $v_i^{subj}$
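The two steps (score each location of the feature grid V against the subject phrase embedding, then pool V with the resulting weights) can be sketched as below. This is a minimal illustration, not the presented model: the dot-product scoring, the function name `attend`, and the toy vectors are all assumptions. The slide only says the score is "based on" V and the phrase embedding.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(V, q):
    """Attention pooling over a feature grid.

    V: list of feature vectors, one per grid location (hypothetical layout).
    q: phrase embedding with the same dimensionality as each vector in V.
    """
    # Step 1: one attention score per grid location.
    # A dot product is assumed here purely for illustration.
    scores = [sum(v_k * q_k for v_k, q_k in zip(v, q)) for v in V]
    weights = softmax(scores)
    # Step 2: attention-weighted sum of the grid features.
    dim = len(q)
    pooled = [sum(w * v[k] for w, v in zip(weights, V)) for k in range(dim)]
    return weights, pooled


# Toy example: three grid locations with 2-d features.
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
q_subj = [2.0, 0.0]
weights, v_subj = attend(V, q_subj)
```

Locations whose features align with the phrase embedding (the first and third here) receive the larger weights, so they dominate the pooled representation.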
image
(Up to five)
(Up to five)
$(o_i, r_j)$, $(o_k, r_i)$
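A minimal sketch of how mismatched pairs could enter a hinge-style ranking loss. It assumes the pairs are negatives for a matched pair (o_i, r_i): (o_i, r_j) keeps the object and swaps the expression, and (o_k, r_i) keeps the expression and swaps the object. The function name `ranking_loss`, the margin, the weights `lam1` and `lam2`, and the scores are hypothetical values chosen for illustration.

```python
def ranking_loss(s_pos, s_neg_expr, s_neg_obj, margin=0.1, lam1=1.0, lam2=1.0):
    """Hinge ranking loss for one matched object-expression pair.

    s_pos:      score of the matched pair (o_i, r_i)
    s_neg_expr: score of (o_i, r_j), same object with a mismatched expression
    s_neg_obj:  score of (o_k, r_i), same expression with a mismatched object
    margin, lam1, lam2 are illustrative defaults, not values from the slides.
    """
    term_expr = max(0.0, margin + s_neg_expr - s_pos)
    term_obj = max(0.0, margin + s_neg_obj - s_pos)
    return lam1 * term_expr + lam2 * term_obj


# The matched pair already beats both negatives by more than the margin.
loss_easy = ranking_loss(s_pos=0.9, s_neg_expr=0.2, s_neg_obj=0.3)

# One negative outscores the matched pair, the other is within the margin.
loss_hard = ranking_loss(s_pos=0.5, s_neg_expr=0.45, s_neg_obj=0.7)
```

The loss is zero once the matched pair outscores both negatives by at least the margin, so only confusable pairs contribute gradient.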
Property | RefCOCO, RefCOCO+ | RefCOCOg
Collected in | Interactive game interface | Non-interactive setting
Average length of expressions | 3.5 | 8.4
Same-type object | 3.9 | 1.63
Absolute location words | Yes | No
Splitting
RefCOCO, RefCOCO+
– For evaluation:
– No overlap between training, validation and testing sets
RefCOCOg
– First partition: training and validation sets
– Second partition: validation and test set
prior knowledge
features from ResNet
prediction)
should this case be considered?
locations, dependent on the width & height of given object $o_i$ – why not use $W$ and $H$?
attribute information?
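The question about normalizing relative locations by the given object's own width and height, rather than by the image size, can be made concrete with a small sketch. The box layout `(x_tl, y_tl, x_br, y_br)`, the five-component offset, both function names, and the example boxes are assumptions for illustration.

```python
def box_size(box):
    """Width and height of a box given as (x_tl, y_tl, x_br, y_br)."""
    return box[2] - box[0], box[3] - box[1]


def offset_by_object(obj, other):
    """Offset of `other` relative to `obj`, normalized by obj's width/height."""
    w, h = box_size(obj)
    w_o, h_o = box_size(other)
    return [
        (other[0] - obj[0]) / w,
        (other[1] - obj[1]) / h,
        (other[2] - obj[2]) / w,
        (other[3] - obj[3]) / h,
        (w_o * h_o) / (w * h),
    ]


def offset_by_image(obj, other, W, H):
    """Same offset, normalized by the image width W and height H instead."""
    w_o, h_o = box_size(other)
    return [
        (other[0] - obj[0]) / W,
        (other[1] - obj[1]) / H,
        (other[2] - obj[2]) / W,
        (other[3] - obj[3]) / H,
        (w_o * h_o) / (W * H),
    ]


# Hypothetical boxes: a neighbor of the same size, 1.5 object-widths to the right.
obj = (100.0, 100.0, 200.0, 300.0)
other = (250.0, 100.0, 350.0, 300.0)

rel_obj = offset_by_object(obj, other)
rel_img = offset_by_image(obj, other, W=640.0, H=480.0)
```

With object-based normalization the offset reads as "1.5 object-widths to the right, same size" regardless of how large the object appears in the image. With image-based normalization the same layout yields different numbers at different scales, which is one possible reason for the object-relative choice.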