Applications of Latent Entity Networks in Information Retrieval

SLIDE 1

Applications of Latent Entity Networks in Information Retrieval

Andreas Spitz, Michael Gertz

Heidelberg University, Institute of Computer Science
Database Systems Research Group
{spitz,gertz}@informatik.uni-heidelberg.de

Workshop Internationale Klima- und Energiediskurse
Darmstadt, May 26, 2017

SLIDE 2

  • Motivation
  • Latent Network Extraction
  • Contextual Entity Topics
  • Network Information Retrieval
  • Summary

Latent Entity Networks in Information Retrieval Andreas Spitz 1 of 11

SLIDE 4

Motivation

Definition: Event

“Something that happens at a given place and time between a group of actors.”

[CSG+02]

For large document collections such as corpora of newspapers, how can we...

  • obtain events from unstructured text?
  • identify connections across documents?
  • support entity-centric event search?

SLIDE 9

Information Network Extraction from Text

[SG16]

SLIDE 11

Edge Weight Generation

For edges (x, y) for which y is a page or sentence, count only (co-)occurrences:

\[
\omega(x, y) =
\begin{cases}
1 & \text{if } y \text{ contains } x \\
0 & \text{otherwise}
\end{cases}
\]

For edges (x, y) between entity types and terms, aggregate co-occurrence instances I: sum over similarities derived from sentence distances s.

\[
\omega(x, y) := \sum_{i \in I} \exp(-s(x, y, i))
\]

[SG16]
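
The two edge weight definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the input format (a list of (node, sentence index) pairs for one document), and the reading of the sentence distance s as the absolute difference of sentence indices are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def containment_weight(container_items, x):
    """Binary weight for an edge (x, y) where y is a page or sentence.

    container_items is the set of nodes mentioned in y (assumed format).
    """
    return 1 if x in container_items else 0


def cooccurrence_weights(mentions):
    """Sum exp(-s) over all co-occurrence instances of each node pair.

    mentions: list of (node, sentence_index) pairs from one document.
    Assumption: s is the absolute difference of the sentence indices.
    """
    weights = defaultdict(float)
    for (x, sx), (y, sy) in combinations(mentions, 2):
        if x == y:
            continue  # no self-loops
        s = abs(sx - sy)
        # Store each undirected edge under a canonical (sorted) key.
        weights[tuple(sorted((x, y)))] += math.exp(-s)
    return dict(weights)


# Toy input with placeholder node names (not data from the talk).
mentions = [("actor_a", 0), ("place_b", 0), ("place_b", 2)]
weights = cooccurrence_weights(mentions)
```

In this toy input, "actor_a" and "place_b" co-occur once in the same sentence (s = 0) and once two sentences apart (s = 2), so the two instances contribute exp(0) and exp(-2) to the same edge. Mentions that are close together therefore dominate the weight.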

SLIDE 12

Entity Topics: Brexit

[Figure: Topics for David Cameron (Q192) − UK (Q145). x-axis: date (Jun to Oct); y-axis: relative frequency of mentions (0.00 to 1.00). Terms shown: brexit, nation, favour, demand, govern, referendum, ukip, vote, westminst, campaign, prime, minist, leader, resign, pro−brexit]

SLIDE 13

Entity Topics: Olympic Games

[Figure: Topics for Brazil (Q155) − IOC (Q40970). x-axis: date (Jun to Oct); y-axis: relative frequency of mentions (0.00 to 1.00). Terms shown: region, decad, crisis, insist, corrupt, olymp, game, athlet, sport, event, silver, bronz, gold, medal, medalist]

SLIDE 14

Event Extraction and Search

Intuition:

  • Events correspond to patterns in the network (e.g., triangular structures)
  • Participating entities can be used to complete events
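
The intuition above can be sketched in Python as follows. This is a hedged illustration only: the adjacency-dictionary representation, the function names, and the scoring of candidates by summed edge weight are assumptions, not the ranking used in [SG16] or [SAG17].

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn {(x, y): weight} into a symmetric adjacency dictionary."""
    adj = {}
    for (x, y), w in edges.items():
        adj.setdefault(x, {})[y] = w
        adj.setdefault(y, {})[x] = w
    return adj


def triangles(adj):
    """Return every triangle once, as a sorted triple of node names."""
    found = set()
    for x in adj:
        for y, z in combinations(sorted(adj[x]), 2):
            if z in adj[y]:  # y and z are both neighbours of x and adjacent
                found.add(tuple(sorted((x, y, z))))
    return sorted(found)


def complete_event(adj, known, k=3):
    """Suggest up to k nodes that are connected to all known entities.

    Assumption: candidates are ranked by the sum of their edge weights
    to the known entities (ties broken alphabetically).
    """
    known = list(known)
    if not known:
        return []
    candidates = set(adj.get(known[0], {}))
    for entity in known[1:]:
        candidates &= set(adj.get(entity, {}))
    candidates -= set(known)
    scored = [(sum(adj[e][c] for e in known), c) for c in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [c for _, c in scored[:k]]


# Toy network with placeholder node names (not data from the talk).
edges = {
    ("actor_a", "place_b"): 2.0,
    ("actor_a", "date_c"): 1.5,
    ("place_b", "date_c"): 1.0,
    ("place_b", "actor_d"): 0.2,
}
adj = build_adjacency(edges)
event_patterns = triangles(adj)
suggestions = complete_event(adj, ["actor_a", "place_b"])
```

Here the only triangle is formed by "actor_a", "place_b", and "date_c", which matches the definition of an event as actors, a place, and a time. Querying with the actor and the place returns the date as the node that completes the pattern.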

SLIDE 15

Event and Entity Search and Exploration

EVELIN: Exploration of Event and Entity Links in Implicit Networks

Available for:

  • Wikipedia: http://evelin.ifi.uni-heidelberg.de/
  • News Corpus: http://evelin.ifi.uni-heidelberg.de:7777

[SAG17]

SLIDE 16

Summary

Latent Entity Networks:

  • enable fast entity and event exploration
  • can support most entity-related information extraction tasks
  • can be extended to any kind of entity
  • are scalable
  • are language-agnostic, given entity linking

SLIDE 17

Available for download:

  • Wikipedia latent entity networks
  • Code for generating latent entity networks
  • Code for the query interface

http://dbs.ifi.uni-heidelberg.de/index.php?id=load

SLIDE 19

Bibliography I

[CSG+02] Christopher Cieri, Stephanie Strassel, David Graff, Nii Martey, Kara Rennert, and Mark Liberman. Corpora for topic detection and tracking. In Topic Detection and Tracking. Springer, 2002.

[SAG17] Andreas Spitz, Satya Almasian, and Michael Gertz. EVELIN: Exploration of event and entity links in implicit networks. In WWW, 2017.

[SG16] Andreas Spitz and Michael Gertz. Terms over LOAD: Leveraging named entities for cross-document extraction and summarization of events. In SIGIR, 2016.
