A Model of the Development of the Fusiform Face Area
Garrison W. Cottrell
Gary's Unbelievable Research Unit (GURU)
Computer Science and Engineering Department
Institute for Neural Computation, UCSD

Collaborators, Past & Present: Ralph Adolphs, Luke Barrington, Serge Belongie, Kristin Branson, Tom Busey, Andy Calder, Eric Christiansen, Matthew Dailey, Piotr Dollar, Michael Fleming, Afm Zakaria Haque, Janet Hsiao, Carrie Joyce, Brenden Lake, Kang Lee, Tim Marks, Joe McCleery, Janet Metcalfe, Jonathan Nelson, Nam Nguyen, Curt Padgett, Angelina Saldivar, Honghao Shan, Maki Sugimoto, Matt Tong, Brian Tran, Danke Xie, Keiji Yamada, Lingyun Zhang
Why model?
- Models rush in where theories fear to tread.
- Models can be manipulated in ways people cannot.
- Models can be analyzed in ways people cannot.
Models rush in where theories fear to tread
- Theories are high-level descriptions of the processes underlying behavior.
- They are often not explicit about the processes involved.
- They are difficult to reason about if no mechanisms are explicit.
- Theory formation itself is difficult.
- Using machine learning techniques, one can often build a working model of the task.
- A working model provides an “intuition pump” for how things might work.
- A working model may make unexpected predictions.
Models can be manipulated in ways people cannot
- We can see the effects of variations in cortical architecture (e.g., a split fovea).
- We can see the effects of variations in processing resources (e.g., the number of hidden units).
- We can see the effects of variations in environment.
- We can see variations in behavior due to different kinds of brain damage.
Models can be analyzed in ways people cannot
In the following, I specifically refer to neural network models.
- We can do single-unit recordings.
- We can selectively ablate and restore parts of the network, even down to individual units and connections.
- We can measure the individual connections -- e.g., the receptive and projective fields of a unit.
- We can measure responses at different layers of processing (e.g., the hidden layers).
How (I like) to build Cognitive Models
- I like to be able to relate them to the brain, so “neurally plausible” models are preferred.
- The model should be a working model of the actual task, rather than a cartoon version of it.
- Of course, the model should nevertheless be simplifying.
- Do we really need to model the (supposed) translation invariance of the visual system?
- As far as I can tell, NO!
- Then, take the model “as is” and fit the experimental data.
The other way (I like) to build Cognitive Models
- Same as above, except:
- Use them as exploratory models -- in domains where there is little direct data.
- Use them (they are cheaper than monkeys or undergraduates) to suggest what we might find if we look.
- Examples:
- Why we might get specialized face processors
- Why those face processors get recruited for other tasks
Outline
- Review of our model of face and object processing
- Some insights from modeling:
- Does a specialized processor for faces need to be innately specified?
- Why is there a left-side face bias?
The Face Processing System

[Diagram: Pixel (Retina) Level → Perceptual (V1) Level: Gabor Filtering → Object (IT) Level: PCA → Category Level: Neural Net (Happy, Sad, Afraid, Angry, Surprised, Disgusted)]

[Diagram variant: same pipeline with identity/object outputs (Bob, Carol, Ted, Cup, Can, Book) and a feature level]

[Diagram variant: the PCA stage split into LSF PCA and HSF PCA streams]

The Gabor Filter Layer
- Basic feature: the 2-D Gabor wavelet filter (Daugman, 1985):
- These model the processing in early visual areas
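As a concrete illustration, here is a minimal NumPy sketch of a 2-D Gabor wavelet (a sinusoidal carrier under a Gaussian envelope). The sizes and parameters are illustrative, not the ones used in the model:

```python
import numpy as np

def gabor_kernel(size, wavelength, orientation, sigma, phase=0.0):
    """2-D Gabor wavelet: a cosine grating under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate frame to the filter's preferred orientation
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength + phase)
    return envelope * carrier

# A "jet" samples several orientations (and, in the model, scales) at one location
kernels = [gabor_kernel(31, wavelength=8.0, orientation=o, sigma=6.0)
           for o in np.linspace(0, np.pi, 8, endpoint=False)]
```

Convolving an image with a bank of such kernels at several scales gives the V1-like responses used as input to the next layer.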
How to do PCA with a neural network
(Cottrell, Munro & Zipser, 1987; Cottrell & Fleming, 1990; Cottrell & Metcalfe, 1990; O’Toole et al., 1991)

A self-organizing network that learns whole-object representations (features, Principal Components, Holons, eigenfaces).

[Diagram: Input from Perceptual Layer → Holons (Gestalt layer)]

The “Gestalt” Layer: Holons
- They act like face cells (Desimone, 1991):
- Response of single units is strong despite occlusion (e.g., of the eyes)
- Response drops off with rotation
- Some fire to my dog’s face
- A novel representation: distributed templates --
- each unit’s optimal stimulus is a ghostly looking face (template-like)
- but many units participate in the representation of a single face (distributed)
- For this audience: Neither exemplars nor prototypes!
- They explain holistic processing:
- Why? If stimulated with a partial match, the firing of every unit is reduced, so the representation of the whole face degrades together.
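As an illustration of “PCA with a neural network,” here is a minimal self-organizing sketch using Sanger’s generalized Hebbian rule. The cited papers used backprop autoencoders, so treat this as one standard way to arrive at the same principal-component representation, with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_holons = 16, 3
W = rng.normal(0, 0.1, (n_holons, n_in))   # holon weight vectors (rows)

def sanger_step(W, x, lr=0.01):
    """One step of Sanger's rule: rows of W converge to the leading PCs."""
    h = W @ x                                   # holon activations
    # Hebbian growth, minus a term that keeps successive units decorrelated
    W += lr * (np.outer(h, x) - np.tril(np.outer(h, h)) @ W)

# Train on inputs that lie in a 3-D subspace of the 16-D input space
basis = rng.normal(size=(3, n_in))
for _ in range(4000):
    sanger_step(W, rng.normal(size=3) @ basis * 0.2)
```

After training, the rows of `W` are approximately orthonormal directions spanning the principal subspace of the inputs, i.e., eigenface-like holons.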
The Final Layer: Classification
(Cottrell & Fleming, 1990; Cottrell & Metcalfe, 1990; Padgett & Cottrell, 1996; Dailey & Cottrell, 1999; Dailey et al., 2002)

The holistic representation is then used as input to a categorization network trained by supervised learning.
- Excellent generalization performance demonstrates the power of the holon representation.
- Categories can be at different levels: basic, subordinate.
- Simple learning rule (~delta rule). It says (mild lie here):
- add inputs to your weights (synaptic strengths) when you are supposed to be on,
- subtract them when you are supposed to be off.
- This makes your weights “look like” your favorite patterns.
- When there are no hidden units => no back propagation of error.
- When there are hidden units: we get task-specific features.
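The rule above can be sketched for a single linear output unit (a simplified illustration, not the exact training code):

```python
import numpy as np

def delta_step(w, x, target, lr=0.1):
    """One delta-rule update: add the input when the unit should be on and
    responds too weakly, subtract it when it should be off and responds
    too strongly."""
    y = w @ x                          # the unit's current response
    w += lr * (target - y) * x         # weights move toward "favorite" patterns
    return w

# After training on a single "on" pattern, the weights look like that pattern
w = np.zeros(4)
pattern = np.array([1.0, -1.0, 0.5, 0.0])
for _ in range(50):
    delta_step(w, pattern, target=1.0)
```

With no hidden layer this is the whole learning story; adding hidden units turns the same error signal into backpropagation.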
Outline
- Review of our model of face and object processing
- Some insights from modeling:
- Does a specialized processor for faces need to be innately specified?
- Why is there a left-side face bias?
Introduction
- The brain appears to devote specialized resources to face processing.
- The issue: innate or learned?
- Our approach: computational models guided by what is known about development.
- The model: competing neural networks + biologically motivated developmental biases.
- Results: an interaction between face discrimination and low spatial frequency input produces a specialized module.
- No innateness necessary!
Step one: a model with parts
- Independent networks compete to perform new tasks
- A mediator rewards winners
- The question: What might cause a specialized face processor?
Developmental biases in learning
- The task: we have a strong need to discriminate between individual faces from birth:
- Mother’s face recognition at 4 days (Pascalis et al., 1995)
- The input: low spatial frequencies, which tend to dominate early vision:
- Infant sensitivity to high spatial frequencies is low at birth
Neural Network Implementation
[Diagram: Input Stimulus → Image Preprocessing → competing modules → gated output]
- Networks in competition
- Outputs mixed by a gating network
- More error feedback to the “winner”
- Rich-get-richer effect
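A toy version of the competing-modules idea might look like this (my construction with illustrative sizes, not the original implementation): two modules’ outputs are mixed by softmax gating weights, and the module that currently fits the target better gets both a larger gate weight and more error feedback:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 8, 3
experts = [rng.normal(0, 0.1, (n_out, n_in)) for _ in range(2)]  # two modules
gate = np.zeros(2)                       # gating logits, one per module

def forward(x):
    g = np.exp(gate) / np.exp(gate).sum()        # softmax mixing weights
    outs = [W @ x for W in experts]
    return g, outs, g[0] * outs[0] + g[1] * outs[1]

def train_step(x, target, lr=0.05):
    g, outs, y = forward(x)
    err = target - y
    for i in range(2):
        # Each module's error feedback is scaled by its gating weight,
        # so the current winner learns fastest (rich get richer)
        experts[i] += lr * g[i] * np.outer(err, x)
    # The gate shifts toward the module whose own output fits better
    losses = np.array([np.sum((target - o) ** 2) for o in outs])
    gate[:] += lr * (losses.mean() - losses)
```

Specialization is then read off the gating weights: whichever module the gate favors for a given stimulus class has "won" that task.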
Experimental methods
- Image data: 12 faces, 12 books, 12 cups, 12 soda cans
- 8-bit grayscale, cropped and scaled to a standard size
Image Preprocessing
[Diagram: filter responses (512x5 elements) → dimensionality reduction (one PCA per spatial-frequency band) → Gabor jet pattern vector (8x5 elements)]

Effects of filtering with different spatial frequencies
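The per-band dimensionality reduction in the preprocessing can be sketched as PCA via SVD, applied separately to the Gabor responses in each spatial-frequency band (48 images = 12 per class; the random data here just stands in for real filter responses):

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X (one pattern per row) onto their top-k PCs."""
    Xc = X - X.mean(axis=0)                       # center each response
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # k-dimensional projections

rng = np.random.default_rng(1)
# 5 frequency bands, 48 images x 512 filter responses per band
bands = [rng.normal(size=(48, 512)) for _ in range(5)]
# Reduce each band to 8 components, then concatenate: one 40-d vector/image
pattern = np.hstack([pca_reduce(b, 8) for b in bands])
```

Reducing each frequency band separately is what makes it possible to hand one module only the low-frequency components and the other only the high-frequency ones.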
Task Manipulation
- To investigate the question of task effects, we manipulated the classification task:
- Superordinate four-way classification (face? book? cup? can?)
- Subordinate classification within one class (e.g., face identification); simple superordinate classification of the other classes
Input spatial frequency manipulation
- To investigate the effects of spatial frequencies, we manipulated the modules’ inputs:
- Each module receives the same full pattern vector, or
- One module receives low spatial frequencies; the other receives high spatial frequencies
Conditions summary
- Within the subordinate training condition, we also varied which class was identified at the subordinate level.
- Thus we have a simple 2x2 design:
- Two task conditions
- Two input conditions
- Within the subordinate task condition, there are four sub-conditions (face, book, cup, or can identification)
Measuring specialization
- Train the network
- Record how the gate network’s outputs change with each class of stimulus
Specialization Results
[Figure: average gating-unit weight for Module 1 vs. Module 2, crossed by TASK (four-way classification: Face, Book, Cup, Can?; book identification: Face, Cup, Can, Book1, Book2, ...?; face identification: Book, Cup, Can, Bob, Carol, Ted, ...?) and INPUT (all frequencies vs. hi/lo split)]

Why does this happen?
- To investigate why the low spatial frequency network wins the face identification task:
- We measured how well these networks generalized to new examples.
- The results show that low spatial frequencies generalize better for face identification.
- This means that a network receiving low spatial frequencies will learn the face task faster, and hence win the competition for it.
Results from a single network
Modeling prosopagnosia
- We can “damage” the specialized network to model prosopagnosia.
Conclusions so far…
- There is a strong interaction between task and spatial frequency content.
- The model suggests that the infant’s low visual acuity, plus the pressure to individuate faces before other objects, could “lock in” a special face processor.
- => General mechanisms (competition, known innate biases in acuity) are sufficient.
- No need for an innately-specified face processor.
Outline
- Review of our model of face and object processing
- Some insights from modeling:
- Does a specialized processor for faces need to be innately specified?
- Why is there a left-side face bias?
Which of these two people looks the most like the middle one?
Why is that?
Modeling a split fovea
- Three models for comparison:
- No split
- Split, early convergence of information
- Split, intermediate: like early, but with half the weights
- Split, late convergence
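One way to sketch the difference between these architectures (my own construction, with illustrative layer sizes) is as connectivity masks over the input-to-hidden weights: the input is split into left and right hemifields, and the variants differ in where the two streams converge.

```python
import numpy as np

n_half, n_hidden = 10, 6   # illustrative: 10 inputs per hemifield, 6 hidden units

def make_mask(kind):
    """Connectivity mask from the 2*n_half inputs to the hidden layer."""
    full = np.ones((n_hidden, 2 * n_half))
    if kind == "early":
        return full                          # both halves converge immediately
    if kind == "intermediate":
        # like early convergence, but with half the connections removed
        rng = np.random.default_rng(0)
        return (rng.random(full.shape) < 0.5).astype(float)
    if kind == "late":
        mask = np.zeros_like(full)           # each hemifield drives its own
        mask[: n_hidden // 2, :n_half] = 1   # half of the hidden layer; the
        mask[n_hidden // 2 :, n_half:] = 1   # streams converge only at output
        return mask
    raise ValueError(kind)
```

During training a mask is applied multiplicatively to the weight matrix (e.g., `W *= make_mask("late")` after each update), so masked connections stay zero.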
Spatial Frequency bias
- Two methods for biasing the spatial frequencies:
- No bias
- Biased “sigmoidally” - one side gets more LSF, the other more HSF
- This corresponds to attentional filtering in the DFF (Double Filtering by Frequency) theory
Experiment 1 Training Data
- Faces vary in expression
- From CAFÉ dataset (California Facial Expressions)
Data Analysis
- We will compare the three architectures on classification accuracy.
- But most importantly, we will compare them on the left-side bias:
- Given a left-left face and a right-right face, how strongly does each activate the original identity?
- Left Side Bias effect:
- Activation(left,left) - Activation(right,right)
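The comparison can be sketched as follows: build left-left and right-right chimeric faces by mirroring one half of the image, and score the left-side bias as the activation difference. The `activation` callable here is a hypothetical stand-in for the trained network, and an even image width is assumed:

```python
import numpy as np

def chimeric(face, side="left"):
    """Pair one half of a face image with its own mirror image."""
    h, w = face.shape
    if side == "left":
        half = face[:, : w // 2]
        return np.hstack([half, half[:, ::-1]])     # left-left chimera
    half = face[:, w // 2 :]
    return np.hstack([half[:, ::-1], half])         # right-right chimera

def left_side_bias(activation, face):
    """Activation(left,left) - Activation(right,right); positive = left bias."""
    return activation(chimeric(face, "left")) - activation(chimeric(face, "right"))
```

A positive score across test faces means the model, like human observers, judges the left-left chimera to look more like the original.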
Experiment 1 results: Accuracy
- Having a biased input reduces accuracy
- Why?
Experiment 1 results: Left Side Bias
- The Late and Intermediate architectures show a left-side bias
Experiment 2 data: Greebles from San Francisco and Ketchikan, AK
- Lighting varies from morning to late afternoon
- Train on one, test on the other
Experiment 2 results: LSB in late and intermediate architectures
Experiment 3 Data: faces with different lighting
(Yale face database)

Experiment 3 Results: LSB in late and intermediate architectures
Early, Intermediate and Late architectures
- The “intermediate” architecture = early convergence with half the weights
- LSF information is more redundant, and can probably work well with fewer connections
Wrap up
- We are able to explain a variety of results in face processing:
- How a specialized area might arise for faces, and why it prefers low spatial frequencies
- Why there might be a left-side bias in face recognition
- And a whole lot more I didn’t talk about today!