Computational Models of Vision

Symposium organized by Laurenz Wiskott and Gustavo Deco
at the 31st Göttingen Neurobiology Conference,
Göttingen, Germany, March 29 - April 1, 2007.
The symposium will take place on March 30.


The visual system is particularly attractive to theoreticians for several reasons: a large body of experimental results is available to inspire and constrain models; as a sensory system, its inputs are fairly well defined; it is sufficiently complex to offer a number of interesting research questions; it is homogeneous enough that approaches based on only a few principles can be expected to have large explanatory power; and any progress made in modelling biological vision is likely to foster progress in technical vision applications as well. However, despite the considerable effort and progress that theoreticians have made in modelling vision, very basic questions are still open and subject to controversial discussion: What are optimal visual features and how can they self-organize? How are visual invariances achieved? How is visual attention implemented and controlled? How are object classes learned? This symposium focuses on the systems level of visual processing, from feature extraction to visual perception, and brings together researchers with different perspectives and modelling approaches.

9:00-9:05 Opening remarks

9:05-9:35 Laurenz Wiskott (Institute for Theoretical Biology, Humboldt-University Berlin, Germany)
"Towards an analytical derivation of complex cell receptive field properties"
Slow Feature Analysis (SFA) [1] is an algorithm for extracting slowly varying features from a quickly varying signal. If applied to image sequences generated from natural images using a range of spatial transformations, SFA yields units that share many properties with complex and hypercomplex cells of early visual areas [2].  All units are responsive to Gabor stimuli with phase invariance, some show sharpened or widened orientation or frequency tuning, secondary response lobes, end/side-inhibition, or selectivity for direction of motion. Interestingly, the results do not depend on the higher-order statistics of natural images. We get virtually identical results with colored noise images. This permits a clear formulation of the conditions under which complex cell properties emerge and makes the problem amenable to an analytical treatment. Here we show that important complex cell properties can be derived by means of variational calculus from first principles and a few basic assumptions.
[1] Wiskott, L. and Sejnowski, T.J. (2002). Slow Feature Analysis: Unsupervised Learning of Invariances. Neural Computation, 14(4):715-770. http://www.ini.rub.de/PEOPLE/wiskott/Abstracts/WisSej2002.html
[2] Berkes, P. and Wiskott, L. (2005). Slow feature analysis yields a rich repertoire of complex cell properties. Journal of Vision, 5(6):579-602. http://journalofvision.org/5/6/9/
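
As an illustration of the objective behind SFA, the following is a minimal linear SFA sketch in NumPy. The toy two-dimensional signal and the omission of the quadratic expansion used in [2] are simplifications for illustration; this is not the image-sequence setup of the talk.

```python
import numpy as np

def linear_sfa(x, n_components=1):
    """x: array of shape (T, N), a multivariate time series."""
    x = x - x.mean(axis=0)                         # zero mean
    # Whiten the input so that all directions have unit variance.
    d, U = np.linalg.eigh(np.cov(x, rowvar=False))
    z = x @ (U / np.sqrt(d))
    # Slowest features = whitened directions in which the temporal
    # derivative has the smallest variance.
    dd, W = np.linalg.eigh(np.cov(np.diff(z, axis=0), rowvar=False))
    return z @ W[:, :n_components]                 # slowest first

# Toy example: recover a slow sine hidden in a fast-varying mixture.
t = np.linspace(0, 2 * np.pi, 1000)
slow, fast = np.sin(t), np.sin(37 * t)
x = np.column_stack([slow + 0.5 * fast, fast - 0.3 * slow])
y = linear_sfa(x)
print(abs(np.corrcoef(y[:, 0], slow)[0, 1]))       # close to 1
```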

9:35-10:05 Robert Shapley (Center for Neural Science, New York University, New York, USA)
"Large scale models of the V1 cortical network"
I will describe a large-scale (order 10^4 units) computational model of a local patch of the input layer 4Cα of the primary visual cortex (V1) of the macaque monkey. The model attempts to be realistic by including anatomical and biophysical data from many sources. The model neurons are integrate-and-fire neurons with biologically plausible synaptic conductances. This model can account for the distributions of orientation and spatial frequency selectivity across the population of V1, and also the relative prevalence of linear and nonlinear spatial summation in V1 neurons. The crucial controlling variable appears to be the strength of cortico-cortical inhibition relative to excitation. Indeed, strong cortico-cortical inhibition in the model is a prerequisite for cortical stability, for feature selectivity, and for spatial-summation linearity.
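
For illustration, here is a minimal conductance-based integrate-and-fire unit in Python, the kind of model neuron such network models are built from. All parameter values below are assumptions chosen for the sketch, not the values of the model described in the abstract.

```python
import numpy as np

def run_lif(g_exc, g_inh, dt=1e-4, C=0.5e-9, g_leak=25e-9,
            E_leak=-70e-3, E_exc=0.0, E_inh=-80e-3,
            V_thresh=-55e-3, V_reset=-70e-3):
    """Integrate-and-fire neuron driven by excitatory and inhibitory
    synaptic conductance time courses g_exc, g_inh (in siemens)."""
    V = E_leak
    spike_times = []
    for i, (gE, gI) in enumerate(zip(g_exc, g_inh)):
        # Leak, excitatory and inhibitory currents (conductance-based).
        I = g_leak * (E_leak - V) + gE * (E_exc - V) + gI * (E_inh - V)
        V += dt * I / C
        if V >= V_thresh:                 # threshold crossing -> spike
            spike_times.append(i * dt)
            V = V_reset                   # reset after the spike
    return spike_times

# One second of noisy excitation and weaker inhibition.
rng = np.random.default_rng(0)
steps = 10000
g_exc = 20e-9 * rng.random(steps)
g_inh = 5e-9 * rng.random(steps)
print(len(run_lif(g_exc, g_inh)), "spikes")
```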

10:05-10:35 Fred Wolf (Dept. of Nonlinear Dynamics, MPI für Strömungsforschung, Göttingen, Germany)
"Pinwheels - A structure without a function?"

10:35-11:00 Break

11:00-11:30 Edmund Rolls (Dept of Experimental Psychology, University of Oxford, England)
"Models of invariant object recognition and global motion recognition in the ventral and dorsal visual systems"
The ventral visual system operates to form invariant representations of faces and objects in the inferior temporal visual cortex. It has been shown that a multistage feed-forward architecture with convergence and competition at each stage is able to learn invariant representations of objects, including faces, by use of a Hebbian synaptic modification rule which incorporates a short memory trace (0.5 s) of preceding activity. This trace rule enables the network to learn the properties of objects which are spatio-temporally invariant over this time scale (Rolls and Deco, 2002). A new learning principle utilises continuous spatial transformations to compute invariant representations (Stringer et al., 2006). It has been found that in complex natural scenes, the receptive fields of inferior temporal cortex neurons shrink to approximately the size of an object, and are centred on or close to the fovea (Rolls et al., 2003). It is proposed that this provides a solution to reading the output of the ventral visual system, since it is primarily the object close to the fovea that is represented by inferior temporal visual cortex neuronal activity. The effect is captured in models that use competition to weight the representation towards what is at the fovea. The model has been extended to account for covert attentional effects such as finding the location of a target object in a complex scene, by incorporating modules to represent the dorsal visual system, backprojections, and short-term memory networks in the prefrontal cortex to keep active the representation of the object of attention, and does not require temporal synchronization to implement binding (Aggelopoulos et al., 2005). The model has also been extended to a theory of how invariant global motion such as rotation is computed in the dorsal visual system.
Rolls, E.T. and Deco, G. (2002) Computational Neuroscience of Vision. Oxford University Press: Oxford.
Rolls, E.T., Aggelopoulos, N.C. and Zheng, F. (2003) The receptive fields of inferior temporal cortex neurons in natural scenes. Journal of Neuroscience 23: 339-348.
Aggelopoulos, N.C., Franco, L. and Rolls, E.T. (2005) Object perception in natural scenes: encoding by inferior temporal cortex simultaneously recorded neurons. Journal of Neurophysiology 93: 1342-1357.
Stringer, S.M., Perry, G., Rolls, E.T. and Proske, H. (2006) Learning invariant object recognition in the visual system with continuous transformations. Biological Cybernetics 94: 128-142.
Rolls, E.T. and Stringer, S.M. (2006) Invariant global motion recognition in the dorsal visual system: a unifying theory. Neural Computation, in press.
Rolls, E.T. and Stringer, S.M. (2007) Invariant visual object recognition: a model, with lighting invariance. Journal de Physiologie Paris, in press.
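
As an illustration of the trace rule referred to above, the following is a minimal sketch of Hebbian learning in which the postsynaptic term is a short memory trace of recent activity, so that inputs presented in temporal succession (e.g. transformed views of one object) become associated onto the same output units. Network size, trace constant and learning rate are illustrative assumptions, not those of the published model.

```python
import numpy as np

def trace_rule_training(inputs, n_out=10, eta=0.8, lr=0.05, seed=0):
    """inputs: array (T, N) of presynaptic rates, presented in temporal order,
    e.g. successive transformed views of one object on consecutive rows."""
    rng = np.random.default_rng(seed)
    W = rng.random((n_out, inputs.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    trace = np.zeros(n_out)
    for x in inputs:
        y = W @ x                                   # feed-forward activation
        # Memory trace: mixture of current and preceding postsynaptic activity.
        trace = (1.0 - eta) * y + eta * trace
        # Hebbian update with the trace as the postsynaptic term.
        W += lr * np.outer(trace, x)
        W /= np.linalg.norm(W, axis=1, keepdims=True)   # keep weights bounded
    return W
```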

canceled: Gustavo Deco (Computational Neuroscience Group, Universitat Pompeu Fabra, Barcelona, Spain)
"Competition and Cooperation Cortical Mechanisms in Visual Cognition"
Cognitive behaviour requires complex context-dependent processing of information that emerges from the links between attentional perceptual processes, working memory and reward-based evaluation of the performed actions. We describe a computational neuroscience theoretical framework which shows how an attentional state held in a short-term memory in the prefrontal cortex can, by top-down processing, influence ventral and dorsal stream cortical areas using biased competition to account for many aspects of visual attention. We also show how within the prefrontal cortex an attentional bias can influence the mapping of sensory inputs to motor outputs, and thus play an important role in decision making. We also show how the absence of expected rewards can switch the attentional bias signal, and thus rapidly and flexibly alter cognitive performance. This theoretical framework incorporates spiking and synaptic dynamics which enable single neuron responses, fMRI activations, psychophysical results, the effects of pharmacological agents, and the effects of damage to parts of the system to be explicitly simulated and predicted. This computational neuroscience framework provides an approach for integrating different levels of investigation of brain function, and for understanding the relations between them. The models also directly address how bottom-up and top-down processes interact in visual cognition, and show how some apparently serial processes reflect the operation of interacting parallel distributed systems.
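
Below is a schematic rate-model sketch of biased competition, the mechanism invoked above: two neuronal pools receive the same bottom-up input and inhibit each other, and a small top-down bias (e.g. from a prefrontal short-term memory) decides which pool wins. The equations and parameters are illustrative stand-ins, not the spiking and synaptic dynamics of the framework described in the abstract.

```python
import numpy as np

def biased_competition(bias=(0.1, 0.0), bottom_up=(1.0, 1.0),
                       w_inh=1.5, tau=10.0, dt=0.1, steps=2000):
    """Two threshold-linear pools with mutual inhibition and a top-down bias."""
    r = np.zeros(2)                                 # firing rates of the pools
    for _ in range(steps):
        drive = np.array(bottom_up) + np.array(bias) - w_inh * r[::-1]
        r += (dt / tau) * (-r + np.maximum(drive, 0.0))
    return r

print(biased_competition(bias=(0.1, 0.0)))   # pool 0 wins the competition
print(biased_competition(bias=(0.0, 0.1)))   # the same bias now favours pool 1
```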

11:30-12:00 Peter König (Institut für Kognitionswissenschaft, Universität Osnabrück, Germany)
"A model of the ventral visual system based on temporal stability and local memory"
The cerebral cortex is a remarkably homogeneous structure suggesting a rather generic computational machinery. Indeed, under a variety of conditions, functions attributed to specialized areas can be supported by other regions. However, a host of studies have laid out an ever more detailed map of functional cortical areas. This leaves us with the puzzle of whether different cortical areas are intrinsically specialized, or whether they differ mostly by their position in the processing hierarchy and their inputs but apply the same computational principles. Here we show that the computational principle of optimal stability of sensory representations combined with local memory gives rise to a hierarchy of processing stages resembling the ventral visual pathway when it is exposed to continuous natural stimuli. Early processing stages show receptive fields similar to those observed in the primary visual cortex. Subsequent stages are selective for increasingly complex configurations of local features, as observed in higher visual areas. The last stage of the model displays place fields as observed in entorhinal cortex and hippocampus. The results suggest that functionally heterogeneous cortical areas can be generated by only a few computational principles and highlight the importance of the variability of the input signals in forming functional specialization.
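
As a schematic of the principle described above, the sketch below applies the same temporal-stability step repeatedly, each stage operating on the previous stage's output together with a local (leaky) memory trace of that output. The stage implementation (slowest principal direction plus a leaky trace) is a deliberately simple stand-in, not the model of the talk.

```python
import numpy as np

def slowest_feature(x):
    """Return the unit-variance projection of x (T, N) that varies most slowly."""
    x = x - x.mean(axis=0)
    d, U = np.linalg.eigh(np.cov(x, rowvar=False))
    z = (x @ U) / np.sqrt(np.maximum(d, 1e-12))       # whiten
    dd, W = np.linalg.eigh(np.cov(np.diff(z, axis=0), rowvar=False))
    return z @ W[:, :1]                               # slowest direction

def hierarchy(x, n_stages=3, memory=0.9):
    """Stack stages: each extracts the slowest feature of the previous stage's
    output augmented with a leaky local-memory trace of that output."""
    signal, outputs = x, []
    for _ in range(n_stages):
        s = slowest_feature(signal)
        trace = np.zeros_like(s)
        for t in range(1, len(s)):                    # local memory trace
            trace[t] = memory * trace[t - 1] + (1 - memory) * s[t]
        signal = np.column_stack([s, trace])
        outputs.append(s)
    return outputs
```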



set up March 21, 2006; last updated March 28, 2007
Laurenz Wiskott, http://www.ini.rub.de/PEOPLE/wiskott/