2025
2026
Is Hierarchical Quantization Essential for Optimal Reconstruction?
Funding:

DFG-funded research unit FOR 2812 "Constructing scenarios of the past: A new framework in episodic memory"

The Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) funded this project by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS (Jülich Supercomputing Centre, 2021) at Jülich Supercomputing Centre (JSC).


Vector-quantized variational autoencoders (VQ-VAEs) are central to models that rely on high reconstruction fidelity, from neural compression to generative pipelines. Hierarchical extensions, such as VQ-VAE2, are often credited with superior reconstruction performance because they split global and local features across multiple latent levels. However, since higher-level latents derive all their information from lower levels, they should not carry additional reconstructive content beyond what the lower level already encodes. Combined with recent advances in training objectives and quantization mechanisms, this leads us to ask whether a single-level VQ-VAE, with a matched representational budget and no codebook collapse, can equal the reconstruction fidelity of its hierarchical counterpart. Although the multi-scale structure of hierarchical models may improve perceptual quality in downstream tasks, the effect of hierarchy on reconstruction accuracy, isolated from codebook utilization and overall representational capacity, remains empirically underexamined. We revisit this question by comparing a two-level hierarchical VQ-VAE and a capacity-matched single-level model on high-resolution ImageNet images. Consistent with prior observations, we confirm that inadequate codebook utilization limits single-level VQ-VAEs and that overly high-dimensional embeddings destabilize quantization and increase codebook collapse. We show that lightweight interventions such as initialization from data, periodic reset of inactive codebook vectors, and systematic tuning of codebook size and dimension significantly reduce collapse and enable the single-level model to make effective use of its available capacity.
Our results demonstrate that when representational budgets are matched and codebook collapse is mitigated, single-level VQ-VAEs can match the reconstruction fidelity of hierarchical variants, challenging the assumption that hierarchical quantization is inherently superior for high-quality reconstructions. The code for reproducing our experiments is available at https://github.com/wiskott-lab/single-vs-hier-recon.
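
To make two of the interventions named above more concrete (initialization from data and reset of inactive codebook vectors), the following is a minimal NumPy sketch on synthetic data. It is not the authors' implementation, which is available in the repository linked above. The function names (`nearest_code`, `init_from_data`, `reset_inactive`), the random stand-in latents, and the codebook size of 64 with dimension 8 are all assumptions chosen for illustration.

```python
import numpy as np


def nearest_code(latents, codebook):
    """Index of the nearest codebook vector (squared Euclidean) per latent."""
    diff = latents[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=-1).argmin(axis=1)


def count_active(latents, codebook):
    """Number of codebook vectors selected by at least one latent."""
    return np.unique(nearest_code(latents, codebook)).size


def init_from_data(latents, num_codes, rng):
    """Initialize the codebook with distinct, randomly chosen latents."""
    idx = rng.choice(len(latents), size=num_codes, replace=False)
    return latents[idx].copy()


def reset_inactive(latents, codebook, rng):
    """Overwrite unused codebook vectors with randomly chosen latents.

    Assumes there are at least as many latents as unused codes.
    Returns the new codebook and the number of vectors that were reset.
    """
    usage = np.bincount(nearest_code(latents, codebook), minlength=len(codebook))
    dead = np.flatnonzero(usage == 0)
    new_codebook = codebook.copy()
    if dead.size:
        idx = rng.choice(len(latents), size=dead.size, replace=False)
        new_codebook[dead] = latents[idx]
    return new_codebook, dead.size


rng = np.random.default_rng(0)
latents = rng.normal(size=(2048, 8))  # hypothetical stand-in for encoder outputs

# A fully collapsed codebook: every vector is identical, so only one is ever used.
collapsed = np.tile(latents.mean(axis=0), (64, 1))
active_collapsed = count_active(latents, collapsed)

# One reset step replaces the unused vectors with actual latents.
repaired, num_reset = reset_inactive(latents, collapsed, rng)
active_after_reset = count_active(latents, repaired)

# Initializing from data avoids the collapsed starting point altogether.
active_data_init = count_active(latents, init_from_data(latents, 64, rng))

print("active codes (collapsed):  ", active_collapsed)
print("codes reset:               ", num_reset)
print("active codes (after reset):", active_after_reset)
print("active codes (data init):  ", active_data_init)
```

In a training loop, a reset of this kind would typically be applied periodically, with usage counts accumulated over many batches rather than a single one. How often to reset and how to measure usage are tuning choices that this sketch does not address.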


Publications

    2026

  • Is Hierarchical Quantization Essential for Optimal Reconstruction?
    Reyhanian, S., & Wiskott, L.
    In Proceedings of the 15th International Conference on Pattern Recognition Applications and Methods - ICPRAM 2026 (pp. 671–679). SciTePress - Science and Technology Publications.

The Institut für Neuroinformatik (INI) is a research unit of the Faculties of Computer Science and Medicine at the Ruhr-Universität Bochum. Its scientific goal is to understand the fundamental principles through which organisms generate behavior and cognition while linked to their environments through sensory and effector systems. Inspired by our insights into such natural cognitive systems, we seek new solutions to problems of information processing in artificial cognitive systems. We draw from a variety of disciplines that include experimental psychology and neurophysiology as well as machine learning, neural artificial intelligence, computer vision, and robotics.

Universitätsstr. 150, Building NB, Room 3/32
D-44801 Bochum, Germany

Tel: (+49) 234 32-28967
Fax: (+49) 234 32-14210