Carnegie Mellon
Inferring User Intent for Learning by Observation
Kevin R. Dixon
krd@cs.cmu.edu
Department of Electrical & Computer Engineering Carnegie Mellon University
2004-01-23, Inferring User Intent for LBO – p.1
([...] et al., 2003)
(Nicolescu & Matarić, 2001)
[Figure: a sensor observing the environment. Environment: ω0, ω1, ..., ωM. Subgoals: y0, y1, ..., yn.]
[Figure: a sensor observing the environment. Environment: ω0, ω1, ..., ωM. Estimated subgoals: y1, ..., yn.]
(Abe & Warmuth, 1992)
[Figure: two panels, each with axes x1, x2, and x3.]
[Figure: plot with axes labeled Number of Tasks and Probability of Error.]
[Figure: two 3-D plots with axes x (m), y (m), and z (m); the starting point in each panel is labeled "Start".]
[Figure: block diagram. Labels: Subgoals: y0, y1, ..., yn; Environment; Hypothesis; Prediction; + 1.]
[Flowchart nodes: Initialize CDHMM; Get Next Subroutine; Another Subroutine? (yes/no); Add Subroutine to CDHMM; Get Waypoint; Another Waypoint? (yes/no); Predict Next Waypoint; Sufficient Confidence? (yes/no); Compute Prediction Error.]
[Figure: two plots against δ (0.1 to 1): number of States (about 140 to 190) and Time (s) (about 0.018 to 0.03).]
[Figure: Avg Median (m), about 1.15 × 10^-3 to 1.5 × 10^-3, plotted against δ (0.1 to 1).]
[Figure: Percentage (20 to 100) versus Confidence Threshold (0.2 to 1), with curves for Useful Predictions and Total Predictions; annotated (p ≪ 0.01).]
[Figure: Median (m), logarithmic scale from 10^-4 to 10^-1, versus Waypoint Number (200 to 1600).]
[Flowchart nodes: Observe User; Another Demo? (yes/no); Map Demos to Same Environment; Compute Subgoals; Associate Subgoals with Environment; Learn from Demonstrations; Perform Task.]
[Figure: labeled scene. Labels: Wall, Chair Legs, Agent Orange, Desk, Human Legs, Computer.]
(Nicolescu & Matarić, 2001)
[Figure: two panels titled "Prediction-error threshold = 0.4" and "Prediction-error threshold = 0.8".]
[Figure: two plots against the prediction-error threshold (0.1 to 1): number of Subgoals (10 to 50) and Avg Trajectory Error (0.005 to 0.03).]
Abe, N., & Warmuth, M. K. (1992). On the computational complexity [...] Machine Learning, 9.
Alissandrakis, A., Nehaniv, C. L., & Dautenhahn, K. (2002). Imitation with ALICE: Learning to imitate corresponding actions across dissimilar embodiments. IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 32.
Asada, H., & Asari, Y. (1988). The direct teaching of tool manipulation skills via the impedance identification of human motions. Proceedings of the IEEE International Conference on Robotics and Automation.
Billard, A., Epars, Y., Cheng, G., & Schaal, S. (2003). Discovering imitation strategies through categorization of multi-dimensional data. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems.
Chen, J., & Zelinsky, A. (2003). Programming by demonstration: Coping with suboptimal teaching actions. International Journal of Robotics Research, 22.
Craig, J. J. (1989). Introduction to robotics: Mechanics and control. Addison Wesley. Second edition.
Cypher, A. (Ed.). (1993). Watch what I do: Programming by demonstration [...]
Ephraim, Y., & Merhav, N. (2002). Hidden Markov processes. IEEE Transactions on Information Theory, 48.
Forsythe, C., & Xavier, P. G. (2002). Human emulation: Progress toward realistic synthetic human agents. Proceedings of the 11th Conference on Computer-Generated Forces and Behavior Representation.
Friedrich, H., Münch, S., Dillmann, R., Bocionek, S., & Sassin, M. (1996). Robot programming by demonstration (RPD): Supporting the induction by human interaction. Machine Learning, 23, 163–189.
Gertz, M. W., Maxion, R. A., & Khosla, P. K. (1995). Visual programming and hypermedia implementation within a distributed laboratory environment. Journal of Intelligent Automation and Soft Computing.
Gillman, D., & Sipser, M. (1994). Inference and minimization of hidden Markov chains. Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory (COLT).
Hannaford, B., & Lee, P. (1991). Hidden Markov model analysis of force/torque information in telemanipulation. International Journal [...]
Hovland, G., Sikka, P., & McCarragher, B. (1996). Skill acquisition from human demonstration using a hidden Markov model. Proceedings [...]
Hwang, J.-H., Arkin, R. C., & Kwon, D.-S. (2003). Mobile robots at your fingertip: Bezier curve on-line trajectory generation for supervisory [...] Intelligent Robots and Systems.
Iba, S., Paredis, C. J., & Khosla, P. K. (2002). Interactive multi-modal robot programming. Proceedings of the IEEE International Conference on Robotics and Automation.
Montemerlo, M., Roy, N., & Thrun, S. (2003). Perspectives on standardization in mobile robot programming: The Carnegie Mellon navigation (CARMEN) toolkit. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems.
Nicolescu, M. N., & Matarić, M. J. (2001). Learning and interacting in human-robot domains. IEEE Transactions on Systems, Man, and Cybernetics – Part A: Systems and Humans, 31.
Papadimitriou, C. H., & Steiglitz, K. (1998). Combinatorial optimization: Algorithms and complexity. Mineola, New York: Dover Publications.
Pomerleau, D. (1991). Efficient training of artificial neural networks for autonomous navigation. Neural Computation, 3, 88–97.
Pomerleau, D. (1996). Neural network vision for robot driving. Early Visual Learning. Oxford University Press.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77, 257–286.
Ron, D., Singer, Y., & Tishby, N. (1998). On the learnability and usage [...] System Sciences, 56.
Schaal, S., Ijspeert, A., & Billard, A. (2003). Computational approaches to motor learning by imitation. Philosophical Transaction [...] 537–547.
Singh, R., Raj, B., & Stern, R. M. (2002). Automatic generation of sub-word units for speech recognition systems. IEEE Transactions on Speech and Audio Processing, 10, 89–99.
Skubic, M., & Volz, R. A. (2000). Acquiring robust, force-based assembly skills from human demonstration. IEEE Transactions on Robotics and Automation, 16.
Stolcke, A., & Omohundro, S. (1994). Inducing probabilistic grammars by Bayesian model merging. International Conference on Grammatical Inference.
Algorithm Learn-Structure
X = {X_0, X_1, ..., X_M} is the multiset of all observation sequences.
ε ≥ 0 is the similarity threshold.

    V := ∅, E := ∅
    G_X := (V, V^0, E, X, V, f, g)
    for all X_i ∈ {X_0, X_1, ..., X_M}
        for all x_n ∈ {x^i_0, x^i_1, ..., x^i_{N_i}}
            ε_min := min over v_i ∈ V of μ_C(V_{v_i} ∪ {x_n}, V_{v_i} ∪ {x_n})
            if ε_min ≤ ε then
                v_new := arg min over v_i ∈ V of μ_C(V_{v_i} ∪ {x_n}, V_{v_i} ∪ {x_n})
                V_{v_new} := V_{v_new} ∪ {x_n}
            else (ε_min > ε)
                create empty node v_new
                V := V ∪ {v_new}
                V_{v_new} := {x_n}
                g_{v_new} := 0
                if n > 0 then
                    E := E ∪ {e_{prev→new}}
                    f_{e_{prev→new}} := 0
                end if
            end if
            if n > 0 then
                f_{e_{prev→new}} := f_{e_{prev→new}} + 1
            else (n = 0)
                g_{v_new} := g_{v_new} + 1
            end if
            v_prev := v_new
        end for
    end for
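The Learn-Structure listing can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the definition of μ_C does not survive in the slides, so `spread` (largest distance from a cluster's centroid) is a hypothetical stand-in, and the names `Node`, `learn_structure`, `start_count`, and `edge_counts` are invented for the example. Edge counts live in a dictionary, so an edge is created the first time it is traversed, including when an observation is merged into an existing node.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Node:
    members: list          # observations merged into this node (V_v in the listing)
    start_count: int = 0   # g_v: number of sequences that start at this node


def spread(points):
    """Hypothetical stand-in for mu_C: largest distance from the centroid."""
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    return max(dist(p, centroid) for p in points)


def learn_structure(sequences, eps, measure=spread):
    """Greedy graph construction: merge each observation into the most
    similar node if the merged node stays within eps, else open a new node."""
    nodes = []
    edge_counts = {}  # f: (previous node index, current node index) -> count
    for seq in sequences:
        prev = None
        for n, x in enumerate(seq):
            # Find the node whose dissimilarity is smallest after adding x.
            best, best_cost = None, float("inf")
            for i, node in enumerate(nodes):
                cost = measure(node.members + [x])
                if cost < best_cost:
                    best, best_cost = i, cost
            if best is not None and best_cost <= eps:
                nodes[best].members.append(x)
                new = best
            else:
                nodes.append(Node(members=[x]))
                new = len(nodes) - 1
            if n > 0:
                key = (prev, new)
                edge_counts[key] = edge_counts.get(key, 0) + 1
            else:
                nodes[new].start_count += 1
            prev = new
    return nodes, edge_counts


demos = [[(0.0, 0.0), (1.0, 0.0)], [(0.05, 0.0), (1.05, 0.0)]]
nodes, edge_counts = learn_structure(demos, eps=0.1)
```

With the two short demonstrations above and `eps=0.1`, nearby observations from the second sequence are merged into the nodes opened by the first, so the graph keeps two nodes and a single edge whose count records both traversals.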
[Equation fragment: ... −1 ≤ ε]
[Figure: cumulative distribution function of χ² with 7 degrees of freedom; ε and 1 − δ are marked on the axes.]
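The χ² plot with 7 degrees of freedom appears to relate a threshold ε to a confidence level 1 − δ. A minimal sketch of that relationship, assuming ε is taken as the (1 − δ) quantile of the χ² distribution (the slides do not state this explicitly). The function names `chi2_cdf` and `chi2_quantile` are invented for the example, and only the Python standard library is used.

```python
import math


def chi2_cdf(x, dof):
    """Chi-square CDF via the power series for the regularized lower
    incomplete gamma function P(dof/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(p, dof, upper=200.0):
    """Invert the CDF by bisection on [0, upper]; upper is an assumed bound."""
    lo, hi = 0.0, upper
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


delta = 0.05                             # assumed example value
eps = chi2_quantile(1.0 - delta, dof=7)  # threshold with CDF(eps) = 1 - delta
```

For δ = 0.05 this gives ε of about 14.07, the standard 95% point of χ² with 7 degrees of freedom; a smaller δ pushes ε further along the horizontal axis of the plot.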
[Figure: three panels titled "Error threshold = 0.4", "Error threshold = 0.61", and "Error threshold = 0.8".]
[Figure: two plots against the prediction-error threshold (0.1 to 1): number of Subgoals (10 to 50) and Avg Trajectory Error (0.005 to 0.03).]
[Figure: 8 × 8 grid labeled 1 to 8 and a to h.]
(Papadimitriou & Steiglitz, 1998)