Can you do shift- and size-invariant recognition? If you want to test it, look at the fixation point in the center of the figure below and then, without moving the eyes away from the fixation point, scan the letters and name them. How many Ms are there, for instance? Usually it is hard to keep the fixation but I guess you have no problems naming the letter.
Figure: A test pattern for shift- and size-invariant recognition.
How does our visual system achieve shift- and size-invariance? This is the question I am addressing in this project from a general perspective. I am trying to get and give an overview over what we know about this issue from psychophysics, neurobiology and in particular computational modelling. I focus on invariant representation rather than recognition and I do not consider models which are application oriented and not biologically plausible. There are mainly two classes of models, the dynamic routing circuit model by Bruno Olshausen et al., and what I call invariant feature networks. The later are networks, such as the neocognitron, that achieve invariance by pooling over many units with identical receptive fields. I also consider some open questions regarding both types of networks.
This project has benefit greatly from an extended email discussion with Bruno Olshausen.
Black colored reference are the principal ones. Gray colored references are listed for the sake of completeness only. They contain little additional information. .ps-files are optimized for printing; .pdf-files are optimized for viewing at the computer.