2018–2019
Gradient-Based Training of Slow Feature Analysis and Spectral Embeddings

In this project we investigate methods that receive high-dimensional (e.g., visual) data and extract low-dimensional representations that are coherent under temporal proximity or any other symmetric similarity metric.
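Temporal coherence is typically quantified by the slowness (Δ-value) of a feature: the mean of its squared temporal differences. A minimal sketch of this measure (our own illustration, not code from the project):

```python
import numpy as np

def slowness(y):
    """Delta-value of each feature: mean squared temporal difference.

    y: array of shape (T, d), a time series of d-dimensional features.
    Lower values mean the feature varies more slowly over time.
    """
    dy = np.diff(y, axis=0)        # finite-difference approximation of dy/dt
    return (dy ** 2).mean(axis=0)  # one Delta-value per feature

# A slowly varying sine is "slower" than white noise of the same variance.
t = np.linspace(0, 2 * np.pi, 500)
rng = np.random.default_rng(0)
slow = np.sin(t)[:, None]
fast = rng.standard_normal((500, 1)) * np.std(slow)
```

A slow-feature extractor minimizes this quantity over its outputs, subject to constraints that rule out trivially constant solutions.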

Previous work on the topic has shown the efficacy of such representations as a base for goal-directed learning [1, 2] as well as structural data analysis [3, 4]. Furthermore, temporal coherence has been proposed as a principle to model neurophysiological phenomena, such as the formation of distinct oriospatial responses in the mammalian hippocampus, a hypothesis that has been substantiated both theoretically and in simulated experiments [5].

Figure 1: NORB [6] photographs of a toy plane embedded by a version of gradient-based SFA. Coherence is based on rotation and elevation similarity; the architecture is Google's MobileNet [7].

When the embedding is colored by rotation angle (pink-to-pink) or by elevation (red-to-pink), it is apparent that these properties are disentangled and well preserved in the structure of the three-dimensional embedding. This holds even for unseen photographs of the same toy. Collapse onto a single point is prevented by differentiable whitening.
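Differentiable whitening can be sketched as an inverse matrix square root of the batch covariance, built only from differentiable operations (centering, matrix products, a symmetric eigendecomposition). Because the whitened output has unit covariance by construction, the trivial solution of mapping everything onto one point is ruled out. A NumPy sketch (the function name and the `eps` regularizer are our own assumptions, not taken from the paper):

```python
import numpy as np

def whiten(y, eps=1e-6):
    """Whitening through a differentiable inverse matrix square root.

    Every step (centering, covariance, eigendecomposition, matrix
    products) is differentiable, so the same computation can sit on top
    of a neural network and be trained end-to-end by backpropagation.
    y: (n_samples, d) batch of network outputs.
    """
    y = y - y.mean(axis=0)                    # center the batch
    cov = y.T @ y / (len(y) - 1)              # sample covariance
    vals, vecs = np.linalg.eigh(cov)          # symmetric eigendecomposition
    w = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T  # cov^(-1/2)
    return y @ w                              # whitened: unit covariance

rng = np.random.default_rng(1)
z = whiten(rng.standard_normal((1000, 3)) @ rng.standard_normal((3, 3)))
# the covariance of z is (approximately) the identity matrix, so the
# output features cannot all collapse onto a single point
```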


However, most of this work was limited in the choice of function approximators used for extraction: past models were either shallow or relied on stacking shallowly trained layers to obtain hierarchical representations. Using a differentiable whitening procedure, we were able to achieve comparable results in multiple proof-of-concept settings [8] with deep feed-forward neural networks trained by stochastic gradient descent to optimize a global slowness objective. This hybrid approach allows us to tap not only into an extensive and active body of research on deep model design, but also into the well-understood theoretical foundations of slow feature analysis (SFA) and spectral embeddings.
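As a toy illustration of the idea (our own minimal sketch, not the paper's implementation): train a linear feature by plain gradient descent on the slowness of its normalized output. With a single output, whitening reduces to mean/variance normalization, and a finite-difference gradient stands in for backpropagation:

```python
import numpy as np

# Two input channels: a slow sine and fast noise. Gradient descent on the
# slowness of the normalized linear output should converge to a weight
# vector that picks out the slow channel.
rng = np.random.default_rng(2)
t = np.linspace(0, 4 * np.pi, 1000)
x = np.stack([np.sin(t), rng.standard_normal(len(t))], axis=1)

def loss(w):
    y = x @ w
    y = (y - y.mean()) / y.std()     # 1-D "whitening": zero mean, unit variance
    return np.mean(np.diff(y) ** 2)  # global slowness objective

w = np.array([1.0, 1.0])
for _ in range(300):
    # finite-difference gradient (stand-in for backpropagation)
    g = np.zeros(2)
    for i in range(2):
        e = np.zeros(2)
        e[i] = 1e-5
        g[i] = (loss(w + e) - loss(w - e)) / 2e-5
    w -= 0.1 * g
```

In a deep model, the same objective is applied to the whitened network outputs and optimized end-to-end; the normalization inside the loss is what prevents the weights from collapsing to zero.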

Figure 2: Comparison of different methods for training linear SFA on synthetic data. From left to right: gradient-based without constraints, gradient-based with a unit-variance constraint, gradient-based with differentiable whitening, and the optimal analytical solution.
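For reference, the optimal analytical solution in the linear case (rightmost panel) is a closed-form two-step eigendecomposition: whiten the input, then rotate onto the directions whose temporal derivatives have the smallest variance. A sketch under those assumptions (the function name is ours):

```python
import numpy as np

def linear_sfa(x, n_out):
    """Closed-form linear SFA (sketch).

    1. Whiten the input so every direction has unit variance.
    2. In the whitened space, the slowest directions are the eigenvectors
       of the covariance of the temporal differences with the smallest
       eigenvalues.
    x: (T, d) time series; returns the (T, n_out) slowest features.
    """
    x = x - x.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(x.T))
    z = x @ (vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T)  # whitened input
    dvals, dvecs = np.linalg.eigh(np.cov(np.diff(z, axis=0).T))
    return z @ dvecs[:, :n_out]  # eigh sorts ascending: slowest first

# Recover a slow sine that was linearly mixed with fast noise.
rng = np.random.default_rng(3)
t = np.linspace(0, 4 * np.pi, 2000)
sources = np.stack([np.sin(t), rng.standard_normal(len(t))], axis=1)
mixed = sources @ np.array([[1.0, 0.7], [0.5, 1.0]])
slow_feature = linear_sfa(mixed, 1)[:, 0]
```

Gradient-based variants approximate this solution iteratively, which is what allows the same objective to be attached to a deep, non-linear network.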


We believe that by bridging the gap between these fields it is possible to gain a better understanding of how spectral methods can be used in high-dimensional machine learning, as well as to develop new tools for investigating coherence principles in neuroscientific modelling research.


Publications

    2019

  • Gradient-based Training of Slow Feature Analysis by Differentiable Approximate Whitening
    Schüler, M., Hlynsson, H. D., & Wiskott, L.
    In W. S. Lee & T. Suzuki (Eds.), Proceedings of The Eleventh Asian Conference on Machine Learning (Vol. 101, pp. 316–331). Nagoya, Japan: PMLR.

The Institut für Neuroinformatik (INI) is a central research unit of the Ruhr-Universität Bochum. We aim to understand the fundamental principles through which organisms generate behavior and cognition while linked to their environments through sensory systems and while acting in those environments through effector systems. Inspired by our insights into such natural cognitive systems, we seek new solutions to problems of information processing in artificial cognitive systems. We draw from a variety of disciplines that include experimental approaches from psychology and neurophysiology as well as theoretical approaches from physics, mathematics, electrical engineering and applied computer science, in particular machine learning, artificial intelligence, and computer vision.

Universitätsstr. 150, Building NB, Room 3/32
D-44801 Bochum, Germany

Tel: (+49) 234 32-28967
Fax: (+49) 234 32-14210