Developing Meeting Support Technologies: From Data to Demonstration




SLIDE 1

Data Demonstrations Beyond Coda

Developing Meeting Support Technologies: From Data to Demonstration (and Beyond)

Jean Carletta

The AMI Consortium and University of Edinburgh

17th Nordic Conference on Computational Linguistics May 15th 2009

Carletta Developing Meeting Support Technologies

SLIDE 2

Outline

  1. Data
  2. Demonstrations
  3. Beyond
  4. Coda

SLIDE 3

The AMI Consortium

Goal: to develop technologies that assist people during meetings and that help them make use of recorded meeting archives.

SLIDE 4

Outline

  1. Data
  2. Demonstrations
  3. Beyond
  4. Coda

SLIDE 5

The basics

100 hrs of well-recorded meetings with native and non-native speakers of English

  • orthographically transcribed with word timings, ASR output
  • annotated by hand and machine for many communicative behaviours
  • system for handling the dependencies among different versions of annotations

70 hrs are groups playing roles in a fictitious design exercise

SLIDE 6

Why role play?

In real meetings...

  • Can’t control (or even know) participants’ motivations
  • Can’t understand domain
  • Can’t collect enough material in one domain to build ontologies
  • Can’t easily measure meeting outcomes

SLIDE 7

The roles

  • Project Manager
  • Industrial Designer
  • Interface Designer
  • Marketing Expert

SLIDE 8

Zingers

SLIDE 9

IS1004d, 3:07 - 4:11

SLIDE 10

Best of both worlds

100 hours of small group meetings

  • 70% role-plays of remote control design teams
  • 30% from a variety of genres (mostly real)

find out where our methods and results generalize

find out where they don’t

SLIDE 11

Hand Annotations

transcription with word-level timings from forced alignment (100%)

timestamping against signal (10-30%)

  • head gestures; hand gestures for addressing and interactions with objects; location in room; gaze; emotion?

discourse structure (70%)

  • dialogue acts (some w/ addressing), named entities, topic segments, linked extractive and abstractive summaries

SLIDE 12

Costs in person-hours per meeting hour

Annotation                                   Person-hours
transcription                                30
topic segments + abstractive summaries       6-10
dialogue acts w/ some relations              20
addressing                                   12
extractive summaries linked to abstract      1
named entities                               2-5
hand gestures (rough timings)                6
head gestures (rough timings)                6
head gestures (precision timings)            20
movement around room                         4
Total                                        110-115
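As a rough check on the arithmetic, the per-annotation costs on this slide can be summed as low and high ranges. The sketch below is illustrative only: the function names and the 10-hour example are assumptions, not part of the talk. A straight sum of the listed rows gives 107-114 person-hours per meeting hour, so the slide's "110-115" total is presumably rounded.

```python
# Person-hours of annotation per meeting hour, as (low, high) ranges.
# Figures are copied from the slide; single values are stored as a
# range with equal endpoints.
COSTS = {
    "transcription": (30, 30),
    "topic segments + abstractive summaries": (6, 10),
    "dialogue acts w/ some relations": (20, 20),
    "addressing": (12, 12),
    "extractive summaries linked to abstract": (1, 1),
    "named entities": (2, 5),
    "hand gestures (rough timings)": (6, 6),
    "head gestures (rough timings)": (6, 6),
    "head gestures (precision timings)": (20, 20),
    "movement around room": (4, 4),
}


def total_range(costs):
    """Sum the low and high ends of every cost range."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


def budget(costs, meeting_hours):
    """Scale the per-hour range to a number of meeting hours (hypothetical helper)."""
    low, high = total_range(costs)
    return low * meeting_hours, high * meeting_hours


if __name__ == "__main__":
    print(total_range(COSTS))
    print(budget(COSTS, 10))
```

Scaling to the whole corpus would overestimate the effort, since the Hand Annotations slide states that some layers cover only part of the data (10-30% or 70%).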

SLIDE 13

Outline

  1. Data
  2. Demonstrations
  3. Beyond
  4. Coda

SLIDE 14

Core technologies

  • Visual focus of attention
  • Automatic speech recognition
  • Disfluency detection and removal
  • Dialogue act segmentation and recognition
  • Who was addressed
  • Topic segmentation
  • Extractive summarization
  • Video editing

SLIDE 15

Key Infrastructure: The Hub

SLIDE 19

The problem with not being there

If most people are face-to-face, people who “connect” in:

  • Can’t tell who’s speaking
  • Can’t see what’s going on
  • Can’t jump in as readily

SLIDE 20

The problem with not being there

If most people are face-to-face, people who “connect” in:

  • Find it (even more) socially acceptable to multi-task

SLIDE 21

One solution

SLIDE 23

Enriched conferencing

SLIDE 24

What enriched conferencing needs

  • speech recognition or keyword spotting
  • where people are looking (“visual focus of attention”)
  • dialogue act segmentation and labeling
  • addressee detection

We have demonstrated it in real-time using keyword spotting and showing where just one of the people in the room is looking, and off-line with everything.

SLIDE 25

Mobile conferencing application

SLIDE 26

Content Linking

SLIDE 28

Content Linking

[Architecture diagram. Component labels: Media server; Previous meetings; Current meeting; Documents; User interface (appears twice); System controller; ASR; Summarizer; Spurt & speaker segmenter; KWS; Hub & middleware; Document bank creator and indexer; Query aggregator; Web search.]

SLIDE 29

Meeting Browsing

SLIDE 31

Outline

  1. Data
  2. Demonstrations
  3. Beyond
  4. Coda

SLIDE 32

Meeting Browsing

SLIDE 33

Community of Interest

SLIDE 35

Synthetron

  • collaborative brainstorming on a given topic
  • on-line, real-time, large scale discussions
  • uses instant messaging
  • requires manual analysis to identify the key themes

SLIDE 36

Comic Book Summaries

SLIDE 37

COI Experiences

“Being a SME, following AMI since a few years, we are very enthusiastic with the mini-project experience... We were able to translate [AMIDA’s latest research results] into a practical application with AMIDA researchers in a very short time, testing the first comics format reports with several of our end clients in less than a month, allowing pragmatic and quick cycle time.”

Joanne Celens, CEO, Synthetron

SLIDE 39

Outline

  1. Data
  2. Demonstrations
  3. Beyond
  4. Coda

SLIDE 40

Data and tools make this possible - but what kind of data, and what kind of tools?

SLIDE 41

A Toy Example

SLIDE 42

NXT-format Switchboard Corpus

[Figure: multi-layer annotation of a Switchboard excerpt on a time axis, t (s), from about 47.0 to 49.0. Layers shown: phones (ph), syllables (syl), words with part-of-speech tags, phonwords, syntax nodes (nt), disfluency and repair (reparandum), movement traces (source, target), markables, kontrast, dialogue acts (da), accents, and prosodic phrases. Words in the excerpt: "have to deal with it the the government doesn’t". Phonword timings shown: "the" 47.48-47.61; "doesn’t" 47.96-48.18, drawn alongside the words "does" (VBZ) and "n’t" (RB).]

SLIDE 43

NITE XML Toolkit

  • Open source toolkit for handling annotations with temporal ordering and full structural relations
  • Data storage format designed to support distributed corpus development
  • Libraries for data handling, query, and writing graphical user interfaces
  • End user annotation tools for common tasks
  • Command line utilities for analysis, feature extraction
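To make "annotations with temporal ordering and full structural relations" concrete, here is a minimal Python sketch of a stand-off annotation graph. It is not NXT's API or storage format: the `Node` class, its fields, and the `overlapping` helper are all hypothetical names chosen for illustration. The two phonword timings come from the Switchboard figure; the parent "S" node is an assumed grouping added only to show how an untimed node can derive its span from its children.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One annotation element on some layer (hypothetical model, not NXT's)."""
    layer: str
    label: str
    start: Optional[float] = None
    end: Optional[float] = None
    children: List["Node"] = field(default_factory=list)

    def span(self) -> Optional[Tuple[float, float]]:
        """Own timing if present, else the extent of all timed descendants."""
        if self.start is not None and self.end is not None:
            return (self.start, self.end)
        spans = [s for s in (c.span() for c in self.children) if s is not None]
        if not spans:
            return None
        return (min(s[0] for s in spans), max(s[1] for s in spans))


def overlapping(nodes, start, end):
    """Return the nodes whose (possibly inherited) span overlaps [start, end)."""
    hits = []
    for node in nodes:
        s = node.span()
        if s is not None and s[0] < end and start < s[1]:
            hits.append(node)
    return hits


# Timed phonwords, with timings taken from the Switchboard figure.
the_pw = Node("phonword", "the", 47.48, 47.61)
doesnt_pw = Node("phonword", "doesn't", 47.96, 48.18)

# Untimed words pointing at phonwords; two words share one phonword,
# which is why a single tree is not enough and a graph is needed.
the_w = Node("word", "the/DT", children=[the_pw])
does_w = Node("word", "does/VBZ", children=[doesnt_pw])
nt_w = Node("word", "n't/RB", children=[doesnt_pw])

# Assumed syntax node, for illustration only.
clause = Node("nt", "S", children=[the_w, does_w, nt_w])

if __name__ == "__main__":
    print(clause.span())
    print([n.label for n in overlapping([the_w, does_w, nt_w], 47.9, 48.0)])
```

Keeping timings only on the lowest layer and letting higher layers inherit them is one common design for multi-layer corpora, because a retimed signal then needs correcting in just one place.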

SLIDE 44

Butterflies: deixis

SLIDE 45

Butterflies: Bible studies

SLIDE 46

Butterflies: opinions about movies

SLIDE 47

Butterflies: eyetracking

SLIDE 48

Butterflies: dialogue system strategy

SLIDE 49

Community building

A well-functioning research community needs data, tools, and resources — and most of all, if we want to understand each other and break out of our ruts, we need to figure out how to share them both among ourselves and among people we don’t talk to every day.

SLIDE 50

Links

  • The NITE XML Toolkit, http://www.ltg.ed.ac.uk/NITE
  • The NXT-format Switchboard Corpus, http://groups.inf.ed.ac.uk/switchboard
  • The AMI Corpus, http://corpus.amiproject.org
  • Me, http://homepages.inf.ed.ac.uk/jeanc
