Coordination via Dialogue Interaction
Raquel Fernández
Institute for Logic, Language & Computation, University of Amsterdam
Dialogue Modelling
My area of research falls under the heading of Dialogue Modelling:
- a fairly new field at the interface of (computational) linguistics,
artificial intelligence, psychology, cognitive science, . . .
- concerned with language as it is used in conversation.
In particular, my interests focus on the semantic, pragmatic, and coordination-related aspects of dialogue.
Methodologically:
- interest in empirical evidence (from corpora or experiments);
- interest in computational methods of enquiry and evaluation.
Research area connected to both the Logic & Language and Language & Computation groups at the ILLC.
Raquel Fernández LoLaCo 2012 2 / 31
Outline
Two examples of research projects connected to dialogue interaction and coordination:
- Colour terms in collaborative reference tasks
- Adaptation in child-adult dialogues
Interpretation is Flexible
Speakers do not always share identical semantic representations nor identical lexicons. But they are able to communicate successfully most of the time.
Sometimes interlocutors negotiate expressions explicitly
A: A docksider.
B: A what?
A: Um.
B: Is that a kind of dog?
A: No, it’s a kind of um leather shoe, kinda pennyloafer.
B: Okay, okay, got it.
⇒ Thereafter “the pennyloafer”
Susan Brennan & Herbert Clark (1996). Conceptual Pacts and Lexical Choice, Journal of Experimental Psychology, 22(6):1482–1493.
Herbert Clark & Donna Wilkes-Gibbs (1986). Referring as a collaborative process. Cognition, 22:1–39.
Sometimes they implicitly guess their partners’ intentions
Addressees relax the interpretation of their partners’ utterances and look for the referent that best matches this looser interpretation.
A: a diamond
B: ok
[A must mean the tilted square]
A: the salmon shoes B: ok
[A must mean those pink shoes]
Can we implement an artificial dialogue agent that is capable of implicit coordination?
Bert Baumgaertner, Raquel Fernández, and Matthew Stone (2012). Towards a Flexible Semantics: Colour Terms in Collaborative Reference Tasks. In Proceedings of the First Joint Conference on Lexical and Computational Semantics (*SEM), Montreal, Canada.
Aims
We are interested in modelling implicit adaptation computationally
- to get a better understanding of this process
- to contribute to the development of dialogue systems that are
able to better coordinate with their human partners.
Our focus is on collaborative referential tasks, taking colour terms as a case study.
Our aim is to develop dialogue agents that employ flexible semantic representations.
Intuitions
Our view of how colour terms are used in referential tasks follows basic pragmatic principles: speakers and addressees tend to maximise the success of their joint task while minimising costs.
- Gricean maxims of conversation: say enough but not more than is
required (quantity).
- Clark & colleagues’ principle of least collaborative effort: minimise the
joint effort of the interlocutors
In the domain of colours we take this to mean:
Addressees
- are able to relax the interpretation of the speaker’s terms and
look for the referent that best matches this looser interpretation.
Speakers
- tend to use a basic colour term whenever this is enough
- but resort to alternative terms (e.g., ‘bordeaux’ or ‘navy blue’)
in contexts where the basic term is deemed insufficient because there are “competitors”.
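The speaker-side intuition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' algorithm: the tiny lexicon, its RGB values, the `compdist` value, and all function names are hypothetical.

```python
import math

# Hypothetical mini-lexicon: term -> prototype RGB code (illustrative values only).
BASIC = {"red": (229, 0, 0), "blue": (3, 67, 223), "pink": (255, 129, 192)}
SPECIFIC = {"bordeaux": (123, 0, 44), "navy blue": (0, 17, 70), "salmon": (255, 121, 108)}


def nearest_term(colour, lexicon):
    """Term whose prototype lies closest (Euclidean distance in RGB) to the colour."""
    return min(lexicon, key=lambda term: math.dist(colour, lexicon[term]))


def describe(target, distractors, compdist=120.0):
    """Use a basic term unless some distractor lies within `compdist` of the target.

    `compdist` is a made-up value; a distractor that close counts as a competitor.
    """
    has_competitor = any(math.dist(d, target) <= compdist for d in distractors)
    if not has_competitor:
        return nearest_term(target, BASIC)
    # With a competitor present, fall back on the closest term of any kind.
    return nearest_term(target, {**BASIC, **SPECIFIC})
```

With a dark red target such as `(130, 5, 40)`, this sketch returns a basic term when the only distractor is a distant blue, and a more specific term once a similar red is added to the scene.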
Our Agent’s Lexicon
Data: publicly available database of RGB codes and colour terms created by Randall Munroe (author of the webcomic xkcd.com)
- colour naming survey taken by around two hundred thousand
participants
- 954 colour terms (the most frequently used by the participants)
- each term is paired with a unique RGB code (the location in the RGB
colour space most frequently named with the colour term in question).
http://blog.xkcd.com/2010/05/03/color-survey-results/
Colour Model and Algorithms
We treat colours as points in a conceptual space
- RGB dimensions (ranging from 0 to 255)
- each RGB code in the lexicon is considered a prototype colour.
- amongst the 954 colour terms in the lexicon, we pick out 10
which we consider basic colours.
- we measure colour proximity in terms of Euclidean distances
between RGB values.
Gärdenfors (2000). Conceptual Spaces. MIT Press, Cambridge.
Our algorithms make use of three thresholds:
- min: minimum distance required for two colours to be considered
different.
- max: maximum range of allowable search for alternative colours
- compdist: distance range within which a colour is considered a
competitor
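To make the role of the thresholds concrete, here is a minimal resolution sketch. It is an assumption-laden illustration rather than the algorithm from the paper: the three prototypes, the threshold values, and the way `min` is used to detect ambiguity are all hypothetical choices.

```python
import math

# Hypothetical prototypes (term -> RGB); the real lexicon has 954 entries.
LEXICON = {"pink": (255, 129, 192), "salmon": (255, 121, 108), "brown": (101, 55, 0)}

# Hypothetical threshold values; the slides do not report the ones actually used.
MIN_DIST = 10.0    # below this distance, two colours count as the same
MAX_DIST = 150.0   # how far from the prototype the relaxed search may go


def resolve(term, scene, lexicon=LEXICON, min_dist=MIN_DIST, max_dist=MAX_DIST):
    """Return the index of the scene colour that best matches `term`, or None."""
    prototype = lexicon[term]
    ranked = sorted((math.dist(prototype, colour), i) for i, colour in enumerate(scene))
    in_range = [(d, i) for d, i in ranked if d <= max_dist]
    if not in_range:
        return None  # nothing close enough, even under the relaxed reading
    best = in_range[0][1]
    # Assumption: the reference fails if the runner-up is indistinguishable
    # from the best match, i.e. the two colours are closer than `min_dist`.
    if len(in_range) > 1 and math.dist(scene[best], scene[in_range[1][1]]) < min_dist:
        return None
    return best
```

In a scene containing a pinkish object but no exact salmon, `resolve("salmon", scene)` still picks the pinkish object, which mirrors the "salmon shoes" example earlier in the talk.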
What do people actually do?
We conducted two small experiments to collect data about how speakers and addressees use colour terms in referential tasks. The two experiments were run online, with 36 native-English participants: 19 in ExpA and 17 in ExpB.
- Generation (ExpA):
  ∗ participants were shown a series of scenes, each with a target
  ∗ they were asked to refer to the target with a colour term that would allow a potential addressee to identify it in the current context
- Resolution (ExpB):
  ∗ participants were shown a series of scenes, each with a colour term
  ∗ they were asked to pick out the intended referent
  ∗ the colour terms used were selected from those produced in ExpA
- Scenes generated according to two parameters:
  ∗ basic vs. non-basic target colour (brown or magenta vs. rose or blue)
  ∗ with or without competitors (colours at a distance threshold)
[Figure: colour terms produced by participants in ExpA, in four panels: basic colour w/o competitors, basic colour with competitors, non-basic colour w/o competitors, non-basic colour with competitors. The basic-colour panels include terms such as "brown", "chocolate brown" and "dark brown"; the non-basic panels include terms such as "dusty rose", "salmon pink", "orangish pink" and "terra cotta".]
Main Experimental Results
ExpA showed that speakers attempt to adapt their colour descriptions to the context and that there is high variability in the terms they choose to do this. ExpB showed that reference resolution is almost always successful despite the high variation of terms observed in ExpA.
- Basic colours:
  ∗ without competitors: participants successfully identified the targets in all cases (100% success rate)
  ∗ with competitors: 98% success rate
  ∗ the same results for terms with proportionally high and low frequency
- Non-basic colours:
  ∗ without competitors: 100% success in all cases (low/high frequency)
  ∗ with competitors: differences as an effect of frequency
    ◮ terms produced with high frequency: no resolution errors
    ◮ low-frequency terms: resolution success rate dropped to 78%
Comparing Our Model to the Human Data
- The experimental data allows us to make informative
comparisons between humans and our model.
- The data is not sufficient for a proper evaluation
- but the comparison illuminates how the model can be refined
and what the setup required for a proper evaluation would be.
Comparing resolution: success rate
                                 Basic Colours       Non-basic Colours
                        high freq.   low freq.  high freq.   low freq.
%                         nc     c    nc     c    nc     c    nc     c
Humans ExpB              100    98   100    98   100   100   100    78
Resolution algorithm     100    71   100    71    50   100    75    71
c = competitors, nc = no competitors
- An agent that rigidly associates colours and terms would have
successfully resolved only 4 of the 29 cases, 3 of which were basic colours with no distractors – a 7.25% success rate.
- A random algorithm would have an average success rate of 25% (four
potential targets)
- Our algorithm is closer to human performance
Summary of Results and Open Issues
Our aim has been to model implicit processes of adaptation in referring tasks, focusing on the specific case of colours. The experiments show that speakers differ greatly in the expressions they use, but addressees are nevertheless able to coordinate. Some open issues:
- Euclidean distances over RGB values seem too crude; a better
approach would be closer to human perception (Lab model with Delta-E values?)
- We need a more systematic and empirically motivated way to set the
thresholds used by the algorithms.
- How to evaluate automatic generation given the amount of variation
observed?
- The performance of the artificial agent should be evaluated in
interaction (integration with a dialogue system)
- Can the approach be extended to other types of expressions?
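The first open issue above suggests measuring colour distance in a perceptual space. The following is a minimal sketch of that idea, assuming the RGB codes are sRGB values with a D65 white point and using the simplest Delta-E variant (CIE76, i.e. Euclidean distance in CIELAB). The slides do not say which variant was intended, and the function names are illustrative.

```python
import math

# Reference white for D65, the illuminant assumed by sRGB.
WHITE_D65 = (0.95047, 1.0, 1.08883)


def _linearise(channel):
    """Undo the sRGB gamma encoding for one channel given in 0..255."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Non-linearity used in the XYZ -> CIELAB conversion."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(rgb):
    """Convert an sRGB triple (0..255) to CIELAB (L*, a*, b*)."""
    r, g, b = (_linearise(v) for v in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / n) for v, n in zip((x, y, z), WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(rgb1, rgb2):
    """CIE76 colour difference: Euclidean distance between the two Lab points."""
    return math.dist(rgb_to_lab(rgb1), rgb_to_lab(rgb2))
```

Swapping `delta_e76` in for the plain RGB distance would leave the rest of a threshold-based model unchanged, although the thresholds themselves would need to be re-tuned for the new scale.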
Outline
Two examples of research projects connected to dialogue interaction and coordination:
- Colour terms in collaborative reference tasks
- Adaptation in child-adult dialogues
Convergence in Dialogue
Humans have a strong tendency to align with their interlocutors when they are engaged in conversation
- convergence on the same vocabulary
- adaptation of speech rate, accent, pronunciation
- syntactic structures
- posture, gestures, facial expressions
A variety of terms are used in the literature: convergence, alignment, accommodation, tuning, adaptation, chameleon effect, ...
Convergence in Asymmetric Interaction
Convergence has been attested mostly in symmetric situations: dialogue between speakers with equivalent linguistic abilities. However, there is also evidence of convergence / adaptation in asymmetric situations:
- Human–computer interaction: humans adapt features of their
language to the production of dialogue systems or virtual characters
- Native–non-native speakers: native speakers adapt features of
their speech (articulation, speech rate, lexical choice) when talking to non-natives
- Child-adult interaction: our focus
It is well known that child-directed speech (CDS) exhibits distinct features at many levels of linguistic processing.
Main Aims of Our Study
- Contribute to current research on the role of adaptation in CDS.
- Corroborate quantitatively the dynamic character of CDS.
  ∗ by examining real corpus data
  ∗ by developing quantitative measures that are easy to derive
- Study the scope of the adaptation process by looking at different
levels of language processing.
Kunert, Fernández, and Zuidema (2011). Adaptation in Child Directed Speech: Evidence from Corpora, in Proceedings of SemDial 2011, the 15th Workshop on the Semantics and Pragmatics of Dialogue, pp. 112-119, Los Angeles, California.
Data
We use the Brown Corpus from the CHILDES database:
- 3 children: Adam (2;3–5;2), Sarah (2;3–5;1), and Eve (1;6–2;3)
- 214 transcribed longitudinal conversations (one per corpus file)
An excerpt from the CHILDES Corpus (Adam sub-corpus):
CHI: Why it got a little tire?
MOT: Because it’s a little truck.
CHI: can’t it be a bigger truck?
MOT: that one can’t be a bigger truck but there are bigger trucks.
Measures of Speech Complexity
Four simple measures to quantify speech complexity:
- Mean Utterance Length: ∼ syntactic complexity
- Mean Word Length: ∼ morphological complexity
- Mean Number of Word Types: ∼ lexical complexity
- Mean Number of Consonant Triples: ∼ phonological complexity
These are combined to obtain a measure of the overall language complexity that acts as a kind of average of the basic measures:
- General Complexity (GC): the sum of UL, WL, WT and CT,
after applying the z-score-transform to each (common scale).
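The combination step can be sketched as follows. This is a rough illustration under assumptions, not the study's code: utterances are taken to be plain strings split on whitespace, word length is approximated by character count, the z-score uses the population standard deviation, and the function names are hypothetical. The slides do not define WT and CT precisely, so the sketch takes them as given per-file values.

```python
from statistics import mean, pstdev


def mean_utterance_length(utterances):
    """UL: mean number of whitespace-separated words per utterance."""
    return mean(len(u.split()) for u in utterances)


def mean_word_length(utterances):
    """WL: mean number of characters per word (an orthographic proxy)."""
    return mean(len(w) for u in utterances for w in u.split())


def z_scores(values):
    """Standardise a series to zero mean and unit (population) variance."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]  # fails if all values are identical


def general_complexity(ul, wl, wt, ct):
    """GC: sum of the four z-transformed measures, one value per corpus file."""
    columns = [z_scores(series) for series in (ul, wl, wt, ct)]
    return [sum(row) for row in zip(*columns)]
```

Each argument of `general_complexity` is a list with one value per corpus file for one speaker; the z-transform puts the four measures on a common scale before they are added.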
Complexity against Age
Correlation between WT and UL complexity (vertical axis) and the age of the child in months (horizontal axis) in the Adam corpus.
[Figure: four panels, each plotting a complexity measure against age in months (20 to 70)]
Child-WT vs. age: r=0.82***
Child-UL vs. age: r=0.93***
Mother-WT vs. age: r=0.65***
Mother-UL vs. age: r=0.60***
Measuring Correlations
Our interest is in investigating whether the child’s and caretaker’s utterances are correlated and what the possible causes for these correlations are.
- We use the Pearson product-moment correlation coefficient: we
calculate Pearson’s r for each measure X and pair of DPs j, k.
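For reference, Pearson's r for two equally long series can be computed as in the sketch below. The function name is illustrative, and the sketch assumes that each series holds one value of a given measure per corpus file, one series for each of the two dialogue participants.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    # Numerator: sum of co-deviations; denominator: product of deviation norms.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

For example, with made-up values, `pearson_r([1, 2, 3, 4], [1, 3, 2, 4])` gives 0.8; in the study the inputs would be, say, the child's and the mother's mean utterance length per file.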
Correlation between complexity of child utterances (horizontal axis) and the mother’s utterances (vertical axis) in the Adam corpus:
[Figure: four panels, one per measure]
utterance length: r=0.67***
word length: r=0.57***
word types: r=0.55***
consonant triples: r=0.62***
Results across Corpora
[Figure: Pearson's r between child and mother complexity for CT, WT, WL, UL and general complexity, in three panels: ADAM-MOT (N=55), SARAH-MOT (N=132), EVE-MOT (N=20). All five measures are marked *** for Adam; Sarah has four measures marked *** and one **; Eve has three marked measures (two ** and one ***).]
Correlations are robust across measures and child-mother pairs.
Summary of Results and Open Issues
We have investigated the dynamics of CDS by quantifying linguistic complexity with simple corpus-based measures.
- There are strong correlations between the complexity of child
and mother utterances.
  ∗ there is dynamic adaptation between mother and child
  ∗ these correlations are not entirely explained by the child’s age and the repetitions in the dialogue
  ∗ they seem to depend on adaptation at the micro-level of dialogue interaction
Some issues that need further investigation:
- What dialogue mechanisms may explain the observed
correlations (beyond age and repetition factors)?
- Does convergence / adaptation enhance language acquisition?
- Challenge: model dynamic alignment (coordination + change)
Conclusions
- Some of the most interesting aspects of language are tied to dialogue.
- Speakers do not always share identical semantic representations nor
identical lexicons: coordination is critical.
- Our linguistic theories should be able to account for (or at least be
compatible with) dialogue coordination mechanisms.
- One of the key challenges is to model coordination and learning
exploiting processes at the microlevel of dialogue interaction.
- The most fun and scientifically interesting way to investigate these
issues is by combining formal theories with processing actual data and implementing artificial agents.
Raquel Fernández (in press). Dialogue, in Oxford Handbook of Computational Linguistics, 2nd Ed., Oxford University Press.