Segmentation from motion

Most segmentation-from-motion algorithms rely on a single technique to segment an image. Depending on whether they are based on feature matching or on spatio-temporal gradients of the grey value, these techniques suffer from either the correspondence problem or the aperture problem. While the aperture problem is inherent to gradient methods, the correspondence problem can be reduced by using more complex, i.e. larger, features. Larger features, however, lower the spatial resolution.
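To make the aperture problem concrete, here is a minimal sketch in Python. It is not part of the system described on this page; it only assumes the standard brightness-constancy constraint I_x*u + I_y*v + I_t = 0 for a locally linear grey-value patch, and the function names are hypothetical. Two different true motions across the same straight edge yield the same measurable flow, because only the component along the grey-value gradient is constrained.

```python
def temporal_derivative(ix, iy, u, v):
    """Temporal grey-value change of a locally linear patch moving with (u, v).

    Follows from the brightness-constancy constraint I_x*u + I_y*v + I_t = 0.
    """
    return -(ix * u + iy * v)


def normal_flow(ix, iy, it):
    """Flow component along the gradient, the only part a gradient method can measure."""
    g2 = ix * ix + iy * iy
    if g2 == 0:
        return None  # no gradient, hence no constraint at all
    return (-it * ix / g2, -it * iy / g2)


# A vertical edge: the grey-value gradient points purely in x.
ix, iy = 2.0, 0.0

# Two different true motions that share the same x-component.
flow_a = normal_flow(ix, iy, temporal_derivative(ix, iy, 1.0, 0.0))
flow_b = normal_flow(ix, iy, temporal_derivative(ix, iy, 1.0, 5.0))

print(flow_a, flow_b)  # both are (1.0, 0.0)
```

The motion along the edge (the y-component here) leaves no trace in the local measurement, which is why gradient methods cannot resolve it without further information.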

The key idea of this project is to integrate information from the Gabor- and Mallat-wavelet transforms to overcome the aperture and the correspondence problems. The Gabor-wavelet transform can be used to compute image flow with high precision and reliability but poor spatial resolution. A histogram over the image flow field is then used to infer motion hypotheses, that is, hypotheses about how many objects are in the scene and in which directions they are moving. The basic assumption is that objects move only translationally, though the system degrades gracefully as this assumption is violated. The motion hypotheses are then used to reduce the correspondence problem on the high-resolution representation of the Mallat-wavelet transform. Integration over time further improves reliability.

No object models are used, so the algorithm is able to segment several objects of arbitrary shape; see Figures 1 and 2. No assumptions about motion continuity are made, so the algorithm is able to track objects that jump back and forth arbitrarily; see Figure 2. Segmentation is performed only on edges, because only edges provide enough evidence for reliable motion estimation and edge information suffices to reconstruct images.
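The histogram step can be illustrated with a small sketch in Python. This is not the original program: the flow vectors are made up, the quantization to whole pixels, the `min_fraction` threshold, and the function names are all assumptions, and the edge matching is reduced to a simple set lookup instead of a comparison of Mallat-wavelet responses. It only shows the principle that a few dominant peaks in the flow histogram serve as motion hypotheses, and that each edge point then needs to be tested against these few displacements only.

```python
from collections import Counter


def motion_hypotheses(flow, min_fraction=0.1):
    """Return dominant displacements of a flow field, strongest first.

    flow: iterable of (dx, dy) flow vectors, one per sampled position.
    min_fraction: assumed threshold; a histogram bin must hold at least
    this fraction of all vectors to count as a hypothesis.
    """
    # Histogram over the flow field, quantized to whole pixels.
    bins = Counter((round(dx), round(dy)) for dx, dy in flow)
    total = sum(bins.values())
    hypotheses = []
    for (bx, by), count in bins.items():
        if count < min_fraction * total:
            continue
        # Keep a bin only if no neighbouring bin holds more vectors.
        neighbours = [
            bins.get((bx + i, by + j), 0)
            for i in (-1, 0, 1)
            for j in (-1, 0, 1)
            if (i, j) != (0, 0)
        ]
        if all(count >= n for n in neighbours):
            hypotheses.append(((bx, by), count))
    hypotheses.sort(key=lambda h: -h[1])
    return hypotheses


def assign_edge(point, next_edges, hypotheses):
    """Return the first hypothesis that maps an edge point onto an edge
    point of the next frame, or None if no hypothesis fits."""
    x, y = point
    for (dx, dy), _count in hypotheses:
        if (x + dx, y + dy) in next_edges:
            return (dx, dy)
    return None


# Made-up flow field: resting background, one object moving right, one left.
flow = [(0.1, 0.0)] * 50 + [(3.0, 0.2)] * 30 + [(-2.1, 0.0)] * 20
hyps = motion_hypotheses(flow)
print(hyps)  # [((0, 0), 50), ((3, 0), 30), ((-2, 0), 20)]

# An edge point at (10, 5) that reappears at (13, 5) in the next frame.
print(assign_edge((10, 5), {(13, 5)}, hyps))  # (3, 0)
```

Note that the resting background appears as the hypothesis (0, 0), in line with treating it as just one more object.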

moving animal sequence (25 kB)

Figure 1: One frame of the moving-animal sequence and the segmentation result. The zebra moves to the right, the elephant to the left. The background is at rest and is treated as just one more object.

dot-pattern sequence (6 kB)

Figure 2: One frame of the dot-pattern sequence and the segmentation result. The pattern consists of a circle of eight dots and a moving background. In addition to the continuous motion, each frame is arbitrarily displaced by up to about 6 pixels relative to the previous frame; the images are 128x128 pixels.

Image sequences: Below are the five image sequences I used in the experiments presented in the publications. Feel free to use them for your research and scientific publications as long as you cite the source. The sequences are stored as animated .gif files. You can view them with your browser or with xanim, for instance. You can decompose a sequence into single images with the command `convert -deconstruct sequence_name.gif sequence_name%02d.gif'. With `convert -deconstruct sequence_name.gif[14] sequence_name14.gif' you can select a single frame, here frame 14. The frames shown on this page are the ones used in the Pattern Recognition paper.

elephant and zebra passing each other (968 kB), frame 14
grasping a letter scale (1.14 MB), frame 40
circle of dots in moving cloud plus random displacements (387 kB), frame 06
Laurenz entering a room (581 kB), frame 11
in depth rotating plant (1.16 MB), frame 44

Here are two more sequences not used for the publications.

circle of dots in moving cloud (581 kB), frame 06
in plane rotating hobbit book (1.16 MB), frame 00

I agree that it would be great if I could also present the segmentation results for these sequences as animated gifs. However, I have not used the program for years, and I am reluctant to go through the hassle of recompiling and rerunning it.

Thanks go to Arne Jacobs in Bremen for technical assistance in converting my image sequences into animated gif files.

The Institut für Neuroinformatik (INI) is a central research unit of the Ruhr-Universität Bochum. We aim to understand the fundamental principles through which organisms generate behavior and cognition while linked to their environments through sensory systems and while acting in those environments through effector systems. Inspired by our insights into such natural cognitive systems, we seek new solutions to problems of information processing in artificial cognitive systems. We draw from a variety of disciplines that include experimental approaches from psychology and neurophysiology as well as theoretical approaches from physics, mathematics, electrical engineering and applied computer science, in particular machine learning, artificial intelligence, and computer vision.

Universitätsstr. 150, Building NB, Room 3/32
D-44801 Bochum, Germany

Tel: (+49) 234 32-28967
Fax: (+49) 234 32-14210