Activity and parameter sparsity in recurrent networks
Sustainable Machine Learning

Description

The growing energy requirements of modern machine learning restrict models to large compute clusters owned by a handful of companies and prevent many interesting end-user and edge applications. It is therefore becoming increasingly important to optimize not only the task performance of these models, but also their energy efficiency. Recurrent neural networks are well suited for energy-efficient edge computing. A recently developed model [1] has demonstrated the potential of activity sparsity in recurrent networks, and it has further been shown that recurrent networks can operate with sparse parameters if the sparsity pattern is adapted online during training [2]. In this project, we will combine activity sparsity with online parameter sparsity, investigate how the two interact, and identify optimal trade-offs between task performance and energy efficiency.
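
To make the two ingredients concrete, the sketch below combines a thresholded GRU cell (activity sparsity in the spirit of [1]) with a simple prune-and-regrow update for the recurrent weights (online parameter sparsity loosely inspired by Deep Rewiring [2]). It is a minimal PyTorch illustration, not the model or algorithm from the references; the names ThresholdedGRUCell and rewire_step, the thresholding rule, and all hyperparameters are placeholders chosen for readability.

    import torch
    import torch.nn as nn


    class ThresholdedGRUCell(nn.Module):
        """GRU cell whose hidden state is gated by a learned per-unit threshold,
        so that only units above threshold emit non-zero activity
        (activity sparsity in the spirit of [1]; simplified placeholder)."""

        def __init__(self, input_size, hidden_size):
            super().__init__()
            self.cell = nn.GRUCell(input_size, hidden_size)
            self.threshold = nn.Parameter(torch.zeros(hidden_size))

        def forward(self, x, h):
            h_new = self.cell(x, h)
            # Only units above their threshold produce output; the rest are
            # zeroed, reducing downstream computation.
            active = (h_new > self.threshold).float()
            return h_new * active


    def rewire_step(weight, mask, prune_fraction=0.05):
        """One simplified online rewiring step, loosely inspired by [2] (not the
        algorithm from the paper): prune the smallest-magnitude active weights
        and regrow the same number of connections at random inactive positions,
        keeping the total number of non-zero parameters constant."""
        with torch.no_grad():
            active_idx = mask.nonzero(as_tuple=False)
            n_prune = max(1, int(prune_fraction * active_idx.shape[0]))
            # Prune: drop the active weights with the smallest magnitude.
            magnitudes = weight[mask.bool()].abs()
            prune_order = magnitudes.argsort()[:n_prune]
            for idx in active_idx[prune_order]:
                mask[idx[0], idx[1]] = 0.0
                weight[idx[0], idx[1]] = 0.0
            # Regrow: activate the same number of random inactive connections.
            inactive_idx = (1 - mask).nonzero(as_tuple=False)
            regrow = inactive_idx[torch.randperm(inactive_idx.shape[0])[:n_prune]]
            for idx in regrow:
                mask[idx[0], idx[1]] = 1.0
                weight[idx[0], idx[1]] = 0.01 * torch.randn(())
        return weight, mask


    if __name__ == "__main__":
        # Hypothetical usage with illustrative shapes.
        cell = ThresholdedGRUCell(input_size=16, hidden_size=32)
        mask = (torch.rand_like(cell.cell.weight_hh) < 0.1).float()
        cell.cell.weight_hh.data *= mask              # start from a sparse recurrent matrix
        x = torch.randn(4, 16)
        h = torch.zeros(4, 32)
        h = cell(x, h)                                # activity-sparse hidden state
        rewire_step(cell.cell.weight_hh.data, mask)   # one online rewiring step

In a training loop, a step like rewire_step would run after each optimizer update so that the parameter sparsity pattern adapts online, while the activity sparsity of the hidden state reduces the computation needed per time step; the project would build on the actual implementations of [1] and [2] rather than this sketch.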

Keywords: Efficient Machine Learning, Recurrent Neural Networks, GRU, Sparsity

Prerequisites
  • Experience with the Python programming language, ideally including prior experience with machine learning frameworks such as PyTorch or JAX.
  • Completed courses, internships, or projects related to artificial intelligence or machine learning.

References

[1] Anand Subramoney, Khaleelulla Khan Nazeer, Mark Schöne, Christian Mayr, David Kappel. Efficient recurrent architectures through activity sparsity and sparse back-propagation through time. ICLR 2023. https://arxiv.org/pdf/2206.06178.pdf

[2] Guillaume Bellec, David Kappel, Wolfgang Maass, Robert Legenstein. Deep rewiring: Training very sparse deep networks. ICLR 2018. https://arxiv.org/pdf/1711.05136

The Institut für Neuroinformatik (INI) is a central research unit of the Ruhr-Universität Bochum. We aim to understand the fundamental principles through which organisms generate behavior and cognition while being linked to their environments through sensory systems and acting in those environments through effector systems. Inspired by our insights into such natural cognitive systems, we seek new solutions to problems of information processing in artificial cognitive systems. We draw on a variety of disciplines, including experimental approaches from psychology and neurophysiology as well as theoretical approaches from physics, mathematics, electrical engineering, and applied computer science, in particular machine learning, artificial intelligence, and computer vision.

Universitätsstr. 150, Building NB, Room 3/32
D-44801 Bochum, Germany

Tel: (+49) 234 32-28967
Fax: (+49) 234 32-14210