Biologically Inspired Machine Perception, Nicholas Butko (PowerPoint presentation)



slide-1
SLIDE 1

Biologically Inspired Machine Perception

Nicholas Butko, Machine Perception Lab, Winter 2010

slide-2
SLIDE 2

<Chapter 1>

Artificial Intelligence vs. Natural Intelligence
Borrowed Intelligence vs. Owned Intelligence
Hard Things are Easy, Easy Things are Hard

slide-3
SLIDE 3

Inspiration

Early on, Artificial Intelligence grabbed hold of my imagination and wouldn’t let go.
“The Age of Spiritual Machines”
By 2020, computers will have more transistors than brains have neurons.
That won’t be sufficient for computers to be intelligent:
  • Can’t write a summary of a movie
  • Can’t tie shoe-laces
  • Can’t recognize humor
AI is not limited by computing power, but by our understanding of “intelligence.”
A revolution in that understanding is required before we can create truly cognitive machines.
I wanted to be part of that revolution.

slide-4
SLIDE 4

First Steps

Freshman year of undergrad: Volunteered in lab of AI prof in CSE.
  • “Labeling” eyes and mouths. Thousands of images.
  • Computer used this information to help figure out facial expression.
One of the most successful paradigms in AI: “Supervised Learning”
  • “Learn” about facial expressions from thousands of examples.
  • Use statistics, calculus, and linear algebra.

slide-5
SLIDE 5

Computer Expression Recognition Toolbox

“Supervised Learning” has been very successful; my own lab uses it extensively to develop sophisticated facial-expression recognizers.

[Demo at end, if we have time]

slide-6
SLIDE 6

Computer Expression Recognition Toolbox

Widely Applicable:

Driver Drowsiness, Lie Detection, Real/Fake Pain, Autism Therapy, Tutoring, Smile Shutter, Art

Different from how humans learn: Nobody points out thousands of eyes and mouths to babies to help them learn about faces.

slide-7
SLIDE 7

May 11, 1997

slide-8
SLIDE 8

What’s wrong?

slide-9
SLIDE 9

Simple Is Hard

Daniel Wolpert, “The Master Puppeteer” Crick Memorial Lecture, 2005

http://royalsociety.org/event.asp?id=3773

slide-10
SLIDE 10

Why is simple hard?

Artificial domains like chess have a clear, well defined structure. Natural domains like “seeing” are rife with ambiguity. Consider a simple problem like “how to look at something.”


slide-11
SLIDE 11

Dealing with Ambiguities

To know “how to look somewhere”, it is helpful to know “where did I look?” From many experiences of sending signals to your eye-muscle neurons, your brain can learn the relationship between actions and consequences. Even the question “Where did I look?” is hard to answer! Lots of things could go wrong. Can we ever make explicit rules for all of them?

[Table: each row shows a Whole Scene, View 1, and View 2 as images, with the Difficulty it illustrates:]
  • No match
  • Same object? (Which lightpost?)
  • Same object type? (Lake or Cloud?)
  • Same location? (Moving Target)

slide-12
SLIDE 12

</Chapter 1>

1) Which of these is easiest for a computer program: Seeing, Doing your laundry, Playing Sudoku, Writing a Book Report, Laughing at funny jokes?
2) We gave four reasons that it’s tough to know where you’re looking. Can you remember them? What’s the main difficulty that unites them?
3) If you were going to use today’s state-of-the-art approaches to make an intelligent computer program that “Knows how to teach,” what is the first thing you should do?

slide-13
SLIDE 13

<Chapter 2>

The Computational Approach: Do we need feathers to fly?
Define the Problem with a Generative Model
Algebra to the Rescue: Finding all the rules.

slide-14
SLIDE 14

How to study Natural Intelligence?

Study the “aerodynamics” of natural intelligence -- the underlying principles and objectives organizing behavior.

Want a theory that’s not just about humans.
  • Flying is not about birds and feathers.
  • Different organisms or systems may not have access to the type of actuators and sensors that humans have, but we still want to understand and build intelligent systems.

Choose problems that will help us understand behavior in real life.

E.g. “Learning how to look somewhere.”

Trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: it just cannot be done. In order to understand bird flight, we have to understand aerodynamics; only then do the structure of the feathers and the different shapes of bird’s wings make sense.

  - Marr, Vision, 1982
slide-15
SLIDE 15

Defining the Problem: A Generative Model

A “Generative Model” is a tool to describe the structure of the problems organisms face. You must describe how the things you can see relate to the things you want to know. You must describe your uncertainty about how things are and how things will be. Probability theory tells us how to make the best guess about how the things you want to know are and how they will be based on everything you’ve seen before.

[Figure: graphical model unrolled over t=1, t=2, t=3. Node labels: Motor command value (a), How the motors work, World appearance, Where the camera is looking, Camera image. Variables are grouped as Sensory and Motor (a). Example values shown: {0.3, 0.2}, {0.25, 0.5}, {0.1, 0.1}.]
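The motor half of such a generative model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model from the slides: it assumes each motor command is a 2-D displacement, that "how the motors work" is a fixed gain plus Gaussian noise, and that the three value pairs shown on the slide are the motor commands at t=1, 2, 3. The names `simulate_looking`, `gain`, and `motor_noise` are hypothetical.

```python
import random

def simulate_looking(motor_commands, gain=(1.0, 1.0), motor_noise=0.05, seed=0):
    """Sample where the camera is looking after each motor command.

    Assumption (not from the slides): each command (ax, ay) shifts the gaze
    by gain * command plus Gaussian noise, so gaze is a noisy running sum.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    trajectory = []
    for ax, ay in motor_commands:
        x += gain[0] * ax + rng.gauss(0.0, motor_noise)
        y += gain[1] * ay + rng.gauss(0.0, motor_noise)
        trajectory.append((x, y))
    return trajectory

# Values taken from the slide figure; treating them as motor commands is an assumption.
commands = [(0.3, 0.2), (0.25, 0.5), (0.1, 0.1)]
for t, gaze in enumerate(simulate_looking(commands), start=1):
    print(f"t={t}: looking at ({gaze[0]:.3f}, {gaze[1]:.3f})")
```

Because the gaze is only known up to noise, the camera image is needed to correct the estimate, which is the job of the inference rules that follow.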

slide-16
SLIDE 16

Finding all the rules

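Where am I looking?

A little probability theory:

$$p(\tau_t \mid \tau_{1:t-1}, \psi_{1:t}, a_{1:t}) = \frac{p(\tau_t \mid \tau_{1:t-1}, a_{1:t})\; p(\psi_t \mid \psi_{1:t-1}, \tau_{1:t-1})\; p(\psi_{1:t-1} \mid \tau_{1:t-1})}{p(\psi_{1:t} \mid \tau_{1:t-1})}$$

And a little algebra:

$$g(\tau_t) = \underbrace{-\tfrac{1}{2}\,(\tau_t - C_t \alpha_{K_t})^T \left(C_t \Sigma_{\alpha_{K_t}} C_t^T + Q_\alpha\right)^{-1} (\tau_t - C_t \alpha_{K_t})}_{\text{Predicted Motion Match}} \; \underbrace{-\,\tfrac{1}{2} \sum_{xy} \frac{\left(\psi_t^{xy} - \lambda_{K_t}^{xy}\right)^2}{\sigma_{\lambda_{K_t}}^{xy\,2} + q_\lambda^2}}_{\text{Image Match}} \; \underbrace{-\,\tfrac{1}{2} \sum_{xy} \log\left(\sigma_{\lambda_{K_t}}^{xy\,2} + q_\lambda^2\right)}_{\text{Uncertainty Penalty}}$$

Give us all the rules for making the best guess about where we are looking.

Labels on the observations: “Everything you’ve seen so far”; “What you see right now.”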
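To make the three terms concrete, here is a minimal Python sketch that scores candidate gaze positions in a one-dimensional toy world and picks the best one. It is a simplification under assumptions of my own, not the implementation behind the slides: gaze is an integer offset into a 1-D world, the predicted motion is a scalar with a scalar variance, and the names `best_gaze`, `world_mean`, `world_var`, `motion_var`, and `pixel_noise_var` are hypothetical stand-ins for the predicted-motion covariance, the learned world appearance, its variance, and the pixel noise.

```python
import math

def best_gaze(image, world_mean, world_var, predicted_tau, motion_var, pixel_noise_var):
    """Return the candidate gaze offset with the highest score g(tau).

    g(tau) adds three terms, mirroring the labels on the slide:
    predicted motion match, image match, and uncertainty penalty.
    """
    width = len(image)

    def g(tau):
        mean = world_mean[tau:tau + width]
        var = world_var[tau:tau + width]
        # How far tau is from where the motor command predicted we would look.
        motion = -0.5 * (tau - predicted_tau) ** 2 / motion_var
        # How well the current image matches the remembered world at tau.
        match = -0.5 * sum((p - m) ** 2 / (v + pixel_noise_var)
                           for p, m, v in zip(image, mean, var))
        # Penalty for claiming to look at poorly known parts of the world.
        penalty = -0.5 * sum(math.log(v + pixel_noise_var) for v in var)
        return motion + match + penalty

    candidates = range(len(world_mean) - width + 1)
    return max(candidates, key=g)

world_mean = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0]
world_var = [0.1] * len(world_mean)
image = [1.1, 4.8, 0.9]   # what the camera sees right now
guess = best_gaze(image, world_mean, world_var,
                  predicted_tau=3, motion_var=4.0, pixel_noise_var=0.1)
print("best guess for gaze offset:", guess)
```

In this toy run the motor prediction is off by one position, and the image match term pulls the estimate back to the offset where the bright feature lines up.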

slide-17
SLIDE 17

Finding all the rules

[Same derivation as on the previous slide, with one annotation added.]

Predicted Motion Match: where you think you’re looking based on the neural signals sent to your eyes.
slide-18
SLIDE 18

Finding all the rules

[Same derivation as on the previous slide, with one annotation added.]

Image Match: Possible Match, OK Match, Good Match.

slide-19
SLIDE 19

Finding all the rules

[Same derivation as on the previous slide, with one annotation added.]

Uncertainty Penalty: avoid if possible.

slide-20
SLIDE 20

Finding all the rules

[Same derivation as on the previous slide, with one annotation added.]

Best Guess!

slide-21
SLIDE 21

Learning to Look

[Plot: Error on Desired Eye-Movement against number of Eye-Movements.]

slide-22
SLIDE 22

What the brain does

In 1992, Duhamel et al. showed that the parietal cortex does something similar to what we just described. Just before an eye-movement, cells “remap” their visual representation to be in line with what they expect to see. This does not mean the brain is doing probability theory and algebra. It may mean the brain found a way to implement the solution probability theory and algebra give.


The Updating of the Representation of Visual Space in Parietal Cortex by Intended Eye Movements

Jean-René Duhamel, Carol L. Colby, Michael E. Goldberg*

Every eye movement produces a shift in the visual image on the retina. The receptive field, or retinal response area, of an individual visual neuron moves with the eyes so that after an eye movement it covers a new portion of visual space. For some parietal neurons, the location of the receptive field is shown to shift transiently before an eye movement. In addition, nearly all parietal neurons respond when an eye movement brings the site of a previously flashed stimulus into the receptive field. Parietal cortex both anticipates the retinal consequences of eye movements and updates the retinal coordinates of remembered stimuli to generate a continuously accurate representation of visual space.

As we move our eyes, a stationary object excites successive locations on the retina. Despite this constantly shifting input, we perceive a stable visual world. This perception is presumably based on an internal representation derived from both visual and nonvisual information. Helmholtz proposed that the brain uses information about intended movement to interpret retinal displacements (1). We show that single neurons in monkey parietal cortex use information about intended eye movements to update the representation of visual space (2).

Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Building 10, Room 10C 101, Bethesda, MD 20892. *To whom correspondence should be addressed.

The shift in the visual image on the retina produced by a saccade is determined by the size and direction of the eye movement. This predictability enables the representation of visual space in parietal cortex to be remapped in advance of the eye movement. At the single cell level, the intention to move the eyes evokes a transient shift in the retinal location at which a stimulus can excite the neuron. Our results are summarized schematically in Fig. 1, in which an observer transfers fixation from the mountain top to the tree. During fixation, the representation of the visual scene in parietal cortex is stable. A given neuron encodes the stimulus at a certain retinal location (the cloud). Immediately before and during the saccade, the cortical representation shifts into the coordinates of the next intended fixation. The neuron now responds to the stimulus at a new retinal location (the sun) and stops responding to the stimulus at the initial location (the cloud). The neuron thus anticipates the retinal consequences of the intended eye movement: the cortical representation shifts first, and then the eye catches up. After the eye movement, the representation in parietal cortex matches the reafferent visual input and the neuron continues to respond to the stimulus (the sun). This process constitutes a remapping of the stimulus from the coordinates of the initial fixation to those of the intended fixation.

We demonstrated this remapping by studying the visual responsiveness of neurons in the lateral intraparietal area (LIP) of alert monkeys performing fixation and saccade tasks (3). Neurons in LIP have retinocentric receptive fields and carry visual and visual memory signals (4). An example is shown in Fig. 2. When the monkey fixates, this neuron responds to the onset of a visual stimulus in its receptive field at a latency of 70 ms (Fig. 2A). Receptive field borders were defined while the monkey maintained fixation, and, under these conditions, stimuli presented outside these borders never activated the neuron. In the saccade task, the fixation target jumps at the same time that a visual stimulus appears. The visual stimulus is positioned so that it will be in the receptive field when the monkey has completed the saccade. If there were no predictive remapping, the cell would be expected to begin discharging 70 ms after the eye movement brings the stimulus into

Fig. 1. Remapping of the visual representation in parietal cortex. Each panel represents the visual image at a point in time relative to a sequence of oculomotor events. Receptive field of a parietal neuron, dashed circle; center of current gaze location, solid circle; and coordinates of the cortical representation, cross hairs. [Panel labels: Oculomotor events; Visual events; Fixate; Intend eye movement; Refixate.]

Science, vol. 255, p. 90

slide-23
SLIDE 23

</Chapter 2>

1) True or false?: Scientists know that intelligence definitely requires neurons.
2) Make a Generative Model for the question, “Is this street safe to cross?”:
   A) What do you want to know?
   B) What can you do?
   C) What can you see?
   D) How is the answer likely to change in the future?
3) What is the role of “Nurture” in learning to look? What is the role of “Nature?”

slide-24
SLIDE 24

<Chapter 3>

Exquisitely tuned information consumers.
Bamboozled by Math? Ask a different question.
Generative models let you reward yourself for a job well done.

slide-25
SLIDE 25

Where do we find information?

Information helps us answer a question:
  • Who was the 17th president?
  • Will it rain tomorrow?
  • What am I supposed to talk about next?

We constantly gather visual information by moving our eyes!

slide-26
SLIDE 26

Choosing where to look

People don’t closely examine every inch of the world. Eye-movements are tuned to optimally gather information. We turned two of the shortcuts that people use into new machine perception technologies:

1) Fast Visual Saliency
2) Digital Eye

slide-27
SLIDE 27

How do we measure Information?

  • Maximum: uniform distribution has most information, because we can’t make a good guess.
  • Additivity: we get as much information from two events as we get from each one separately.
  • Continuity: small changes in probability give small changes in information.
  • Symmetry: reordering/renaming outcomes doesn’t change information.

$$-\int_{-\infty}^{\infty} p(x) \log p(x)\, dx$$
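The properties above can be checked numerically. The sketch below uses the discrete analogue of the integral (a sum over outcomes, measured in bits) rather than the continuous form on the slide; the function name `entropy` and the example distributions are illustrative choices.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution (zero-probability terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.7, 0.1, 0.1, 0.1]
coin = [0.5, 0.5]

# Maximum: the uniform distribution has the largest entropy.
print(entropy(uniform), ">", entropy(peaked))

# Additivity: for independent events, entropies add.
joint = [a * b for a in coin for b in peaked]
print(entropy(joint), "=", entropy(coin) + entropy(peaked))

# Symmetry: reordering the outcomes changes nothing.
print(entropy(peaked), "=", entropy(list(reversed(peaked))))
```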

slide-28
SLIDE 28

Visual Salience

Salient objects “pop out” of visual scenes.

Simple preprocessing step directs computational resources.
Rare (improbable) image features are more salient than common (probable) ones.
Improbable events carry more information.
We developed an efficient way to model the statistics of a video stream, and analyze it for salient “pop out”.
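The link between rarity and salience can be illustrated by scoring each feature with its self-information, -log2 p(feature). This is a toy sketch and not the Fast Visual Saliency algorithm from the talk: it assumes features are discrete labels and estimates their probabilities by simple counting; `salience_map` is a hypothetical name.

```python
import math
from collections import Counter

def salience_map(features):
    """Score each feature by its self-information in bits: rare features score high."""
    counts = Counter(features)
    total = len(features)
    return [-math.log2(counts[f] / total) for f in features]

# Seven common "green" pixels and one rare "red" pixel.
pixels = ["green"] * 7 + ["red"]
scores = salience_map(pixels)
most_salient = max(range(len(pixels)), key=lambda i: scores[i])
print("most salient pixel:", most_salient, pixels[most_salient])
```

The single red pixel carries 3 bits (probability 1/8), while each green pixel carries well under 1 bit, so the red pixel "pops out".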

slide-29
SLIDE 29

Two Examples

Offline: Video Analysis
Online: Camera Control

slide-30
SLIDE 30

Empirically Useful

Tracks people in pre-school:

68.04% of salience-tracking images contained people. 34.81% of playback images contained people.

Predicts Key-frames in Video Annotation:

Video sequence labeled by coders for “Change in activity.” [RED]
Initial attempts at salience-based video statistics can give up to 70% signal correlation. [BLUE]
Can also be used to make a “virtual cameraman” to focus on areas of a scene.

slide-31
SLIDE 31

What information?

Salience approaches don’t really pay attention to what they see.
  • Inhibition of return
  • Can pre-compute saccade trajectory from first image.
  • Not reacting to information in the image.
  • The image is constant, and all image analysis is pre-computed.
What is the consequence of each eye movement?
Information-gathering model, but what information was gathered?
What question were we trying to answer?


slide-32
SLIDE 32

Task Directed Looking Behavior

Visual Popout can be useful for robots, and it seems to be important in people, but it can’t account for task-specific looking behavior.
It has long been known that where people look depends on what questions they are trying to answer. [Yarbus 1967]
Current studies have difficulty making quantitative claims: “Fixations are tightly linked in time to the evolution of the task. Very few irrelevant regions are fixated.” [Hayhoe & Ballard 2005]

slide-33
SLIDE 33

How do we measure Information?

Mutual Information is “How much was my uncertainty about a question I have reduced by the things I do and see?”

$$I(S; A, O) = \underbrace{\int_S p(S \mid A, O) \log p(S \mid A, O)\, dS}_{\text{Uncertainty after I open my eyes}} - \underbrace{\int_S p(S) \log p(S)\, dS}_{\text{Uncertainty with my eyes closed}} = H(S) - H(S \mid A, O)$$

Is this street safe to cross?
  • Don’t look: Very uncertain
  • Look left: Somewhat uncertain
  • Look right: Not uncertain
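A small numeric version of the street-crossing example: the information gained is the entropy before looking minus the entropy after. The probabilities below are made-up illustrative numbers, not values from the talk, and the question is reduced to a yes/no variable ("safe" or "not safe") so that the integral becomes a two-term sum.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a yes/no question answered 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gained(prior, posterior):
    """H(S) - H(S | A, O): uncertainty removed by acting and observing."""
    return binary_entropy(prior) - binary_entropy(posterior)

prior_safe = 0.5             # don't look: very uncertain (hypothetical value)
after_look_left = 0.8        # somewhat uncertain (hypothetical value)
after_look_both_ways = 0.99  # not uncertain (hypothetical value)

print("bits gained by looking left:     ", information_gained(prior_safe, after_look_left))
print("bits gained by looking both ways:", information_gained(prior_safe, after_look_both_ways))
```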

slide-34
SLIDE 34

Infomax Principle: “Feeling of learning”

Supervised: Student / teacher model of learning; teacher knows right answer. Learning judged by %Correct.
Infomax: Confidence in response to a question (Information).
Reinforcement Learning: Given a reinforcement signal, learn how to act to optimally accrue that reinforcer.

In the Infomax approach, learn to gain information in order to become confident (can’t be confident without information).

[Figure: two Confidence plots with No/Yes labels, axis ticks at 50 and 100.]

slide-35
SLIDE 35

Searching for Faces

[Figure: image with four numbered search locations.]

1: No Face
2: No Face
3: No Face
4: Face!

slide-36
SLIDE 36

A Generative Model for Visual Search

[Adapted from Najemnik & Geisler 2005]

[Figure: Target Signal Strength (0.5 to 3.0) against Target-Eye Distance (Degrees, -8 to 8). Curves: Signal; Signal+Noise, with noise ~N(0,1). Panels at t=0, t=1, t=2, t=3 show State / Action, Likelihood, Belief, and the Infomax Reward.]
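The search loop in the figure can be sketched as follows. This is a toy illustration under my own assumptions, not the Digital Eye implementation or the exact Najemnik & Geisler model: locations lie on a line, the signal strength curve `signal_strength` is a hypothetical exponential falloff with peak 3.0, each observation is signal plus N(0, 1) noise, and the next fixation is chosen by a one-step Monte Carlo estimate of the expected belief entropy (a greedy stand-in for a full Infomax policy).

```python
import math
import random

def signal_strength(distance, peak=3.0, falloff=2.0):
    """Hypothetical acuity curve: strongest at the fovea, weaker in the periphery."""
    return peak * math.exp(-abs(distance) / falloff)

def entropy(belief):
    return -sum(p * math.log(p) for p in belief if p > 0)

def observe(target, fixation, n, rng):
    """Signal plus N(0, 1) noise at every location; signal is present only at the target."""
    return [(signal_strength(i - fixation) if i == target else 0.0) + rng.gauss(0.0, 1.0)
            for i in range(n)]

def update(belief, fixation, obs):
    """Bayes rule: reweight each location by the likelihood ratio 'target here' vs. 'noise only'."""
    weights = []
    for i, (b, o) in enumerate(zip(belief, obs)):
        s = signal_strength(i - fixation)
        weights.append(b * math.exp(s * o - 0.5 * s * s))
    total = sum(weights)
    return [w / total for w in weights]

def expected_entropy(belief, fixation, rng, samples=30):
    """Monte Carlo estimate of the belief entropy after looking at `fixation`."""
    n = len(belief)
    total = 0.0
    for _ in range(samples):
        target = rng.choices(range(n), weights=belief)[0]
        total += entropy(update(belief, fixation, observe(target, fixation, n, rng)))
    return total / samples

def search(target, n=9, steps=6, seed=0):
    """Repeatedly look where the expected remaining uncertainty is lowest."""
    rng = random.Random(seed)
    belief = [1.0 / n] * n
    for _ in range(steps):
        fixation = min(range(n), key=lambda k: expected_entropy(belief, k, rng))
        belief = update(belief, fixation, observe(target, fixation, n, rng))
    return belief

final_belief = search(target=6)
print("most probable target location:", max(range(len(final_belief)), key=lambda i: final_belief[i]))
```

The reward being optimized here is the drop in entropy of the belief, which is why the agent cannot gain by simply declaring itself certain: the belief only sharpens when the observations support it.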

slide-37
SLIDE 37

Digital Eye in Action

960x540 Video (1/2 Mpx). Digital Retina: 25 FPS. Viola-Jones: 1.25 FPS.

slide-38
SLIDE 38

</Chapter 3>

1) True or False?: If you close your eyes (and ears, nose, etc.), you get no information about whether a street is safe to cross.
2) Why is it a good idea to be bad at “Where’s Waldo?”
3) In Infomax Control approaches, you reward yourself for doing things that make you more certain about the answer to a question. What keeps you from just tricking yourself into believing things with complete certainty?

slide-39
SLIDE 39

<Chapter 4>

Moving up: Social Awareness
Baby Einstein: Doing the right experiment at the right time.
Generative Models let you own your intelligence.

slide-40
SLIDE 40

Learning Contingencies

It takes 2-month-old infants about 40 minutes to learn new contingencies (head moves → mobile moves).
By 10 months, infants have become experts at learning new contingencies: it takes them only a few seconds to detect one.

[Movellan & Watson 1985]

slide-41
SLIDE 41
slide-42
SLIDE 42

A Generative Model for Contingency

Example: vocalization contingency
Actions: Vocalize, Remain Quiet
Question: Are the sound statistics after my vocalization different from background?
Goal: Choose length of waiting period to quickly become confident in the answer to this question.

[Figure: Volume trace with two Vocalization events marked; two Confidence plots with No/Yes labels, axis ticks at 50 and 100.]
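One way to turn the question above into a confidence value is a Bayesian comparison of two hypotheses: "sound is equally likely after my vocalization and in the background" versus "the two rates differ". The sketch below is an illustration under assumptions of my own, not the model from the talk: sound is reduced to a yes/no event per time window, each rate gets a uniform Beta(1, 1) prior, and both hypotheses start with equal odds. It covers only the confidence computation, not the choice of waiting period. The function names are hypothetical.

```python
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_marginal(successes, trials):
    """Log marginal likelihood of a yes/no sequence under a uniform prior on its rate."""
    return log_beta(successes + 1, trials - successes + 1) - log_beta(1, 1)

def contingency_confidence(sound_after_vocal, n_vocal, sound_background, n_background):
    """Posterior probability that the sound rate after vocalizing differs from background."""
    log_different = (log_marginal(sound_after_vocal, n_vocal)
                     + log_marginal(sound_background, n_background))
    log_same = log_marginal(sound_after_vocal + sound_background, n_vocal + n_background)
    return 1.0 / (1.0 + math.exp(log_same - log_different))

# Sound followed 9 of 10 vocalizations but only 1 of 10 quiet windows.
print("responsive partner:  ", contingency_confidence(9, 10, 1, 10))
# Sound is equally common either way.
print("unresponsive setting:", contingency_confidence(5, 10, 5, 10))
```

With no data the confidence is exactly 0.5, and it moves toward "Yes" or "No" as evidence accumulates, which matches the picture of confidence growing over the waiting period.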

slide-43
SLIDE 43

Infomax Control Demo

slide-44
SLIDE 44

Developmental Result

[Plot: Minutes of Interaction for Accurate Contingency Detection against Months of Development. 2 mo: 18 minutes; 10 mo: 3.4 minutes.]

*Butko, Movellan, ICDL 2007

slide-45
SLIDE 45

Learning to See Humans

Is it possible to learn about the visual appearance of people based on contingency? [John Watson (1972), 2-month infants]
  • Contingency is the driver of social development
  • Contingency defines the concept of “caregiver”
Computational Analysis
  • Is Watson’s hypothesis computationally plausible?
  • If so, how long does it take to gather enough information to learn reliably?

slide-46
SLIDE 46

Testing the Hypothesis

Infomax model of detecting contingencies
  • High reliability in real-world, real-time robotic applications.
Movellan and Fasel (2006): Segmental Boltzmann Fields
  • Identify and locate objects in cluttered scenes
  • Weak training label: “A leopard is probably in this scene”
Use contingency to teach yourself about people: “A social being is probably in this scene”

slide-47
SLIDE 47

Autonomous Robotic Learner
BEV: A Baby’s Eye-View Robot

slide-48
SLIDE 48

GA

Nothing Responsive. Take a Picture

slide-49
SLIDE 49

GA Nice Baby

Something Responsive! Take a Picture

slide-50
SLIDE 50

  • 3700 images collected over 90 minutes of interaction
  • No experimenter intervention
  • Variety of lighting and background conditions
  • No post-processing of images (rectification, etc.)

“Baby’s Eye View”

Learning in the Wild

18% - No face ; 4% - No Person

17% - Face ; 20% - Person

Contingency

No Contingency

slide-51
SLIDE 51

Key Results

  • Learns what people look like with high accuracy very quickly (6 minutes).
  • Shows the preference for schematic faces that infants show shortly after birth (40 minutes).
  • Shows the preference for caregivers above other people that infants show shortly after birth (2 days).

[Plot: 2AFC Performance (Face v. No Person), 0.5 to 1, against Number of Training Images, 50 to 250. Curves: Caregivers, Other People.]
slide-52
SLIDE 52

</Chapter 4>

1) From a computational point of view, in what ways is social intelligence “special,” or fundamentally different from low-level perceptual intelligence?
2) Under the Infomax Hypothesis, how do babies learn to be good scientists, i.e. ask the right question at the right time?
3) What ultimately enabled BEV to learn what people look like, without borrowing the expertise of human teachers?
slide-53
SLIDE 53

Advice

Artificial Intelligence is an exciting field where we are constantly pushing the boundaries of imagination.
Get involved in research labs early.
Take lots of math classes:

Calculus, Linear Algebra, Probability, Statistics, Discrete Math, Algorithms & Data Structures

slide-54
SLIDE 54

Thanks, Pr. Belew!

For more info: http://mplab.ucsd.edu nbutko@ucsd.edu