Non-uniform quantization achieves promising performance for compressing neural networks because it adapts better to the distribution of weights. However, traditional non-uniform quantization methods rely solely on the density of the weight distribution, which degrades model performance after quantization. To tackle this challenge, we propose a novel non-uniform quantization method that not only automatically learns the clipping threshold but also adaptively adjusts the quantization levels, effectively reducing the quantization error. Specifically, we first develop a local uniform quantization strategy that assigns finer quantization levels to dense regions of the weight distribution. The gradients of the weights are also taken into account when assigning quantization levels. Furthermore, to reduce the quantization error still further, we propose a linear interpolation-based clipping method with a learnable threshold, which minimizes the impact of outlier values on quantization. The efficacy of our method is validated on the CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet-100 datasets, yielding improved model performance and reduced quantization error.
Keywords: Learnable clipping threshold; Neural network; Non-uniform quantization.
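Since the abstract describes the method only at a high level, the following is a minimal, hypothetical PyTorch sketch of the general ideas it names: quantization levels placed at quantiles of the weight distribution (so dense regions receive finer levels), a PACT-style learnable clipping threshold, and a straight-through estimator for the rounding step. The class name, level count, and initialization are illustrative assumptions; the paper's gradient-aware level assignment and linear interpolation-based clipping are not reproduced here.

```python
import torch
import torch.nn as nn

class NonUniformQuantizer(nn.Module):
    """Hypothetical sketch of density-aware non-uniform quantization with a
    learnable clipping threshold (not the paper's exact algorithm)."""

    def __init__(self, num_levels: int = 16, init_clip: float = 1.0):
        super().__init__()
        self.num_levels = num_levels
        # Learnable clipping threshold, trained jointly with the weights
        # (the initial value is an illustrative assumption).
        self.clip = nn.Parameter(torch.tensor(init_clip))

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        # PACT-style clipping: gradients reach self.clip through the
        # min/max boundaries, so the threshold is learned automatically.
        t = torch.abs(self.clip)
        w_c = torch.max(torch.min(w, t), -t)
        # Place levels at empirical quantiles of the clipped weights, so
        # dense regions of the distribution receive finer levels.
        probs = torch.linspace(0.0, 1.0, self.num_levels, device=w.device)
        levels = torch.quantile(w_c.detach().flatten(), probs)
        # Snap each weight to its nearest quantization level.
        idx = torch.argmin((w_c.detach().unsqueeze(-1) - levels).abs(), dim=-1)
        w_q = levels[idx]
        # Straight-through estimator: the forward pass uses the quantized
        # values, while gradients flow back through the clipped weights.
        return w_c + (w_q - w_c).detach()

# Usage: quantize a layer's weight tensor during a forward pass.
quantizer = NonUniformQuantizer(num_levels=16)
w = torch.randn(64, 64)
w_q = quantizer(w)
```

The quantile-based placement is one simple way to realize "finer levels in dense regions": equally spaced probability masses translate into closely spaced levels wherever weights cluster.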