An In-Depth Overview of Existing Quantization Strategies for Neural Networks
Authors: Vishakha Agrawal
DOI: https://doi.org/10.5281/zenodo.14607968
Short DOI: https://doi.org/g8xxt2
Country: USA
Abstract: Neural network quantization has emerged as a crucial technique for efficient deployment of deep learning models on resource-constrained devices. This paper provides a detailed survey of existing quantization strategies, analyzing their theoretical foundations, algorithmic details, and empirical performance. We compare and contrast various quantization techniques, including post-training quantization, quantization-aware training, and knowledge distillation-based methods, to provide insights into their strengths, limitations, and applications.
Keywords: Quantization, QAT, PTQ, Dynamic Quantization, Fixed-Point Quantization, Mixed-Precision Quantization
Paper Id: 231990
Published On: 2020-12-03
Published In: Volume 8, Issue 6, November-December 2020
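
For readers unfamiliar with the primitive the abstract refers to, below is a minimal sketch of uniform affine quantization, the operation underlying the post-training and quantization-aware schemes the paper surveys. It is not drawn from the paper itself; the helper names quantize_affine and dequantize_affine and the NumPy-based example are illustrative assumptions.

    import numpy as np

    def quantize_affine(x, num_bits=8):
        # Illustrative sketch (not from the paper): map float values to
        # unsigned integers via a per-tensor scale and zero point.
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = max((x.max() - x.min()) / (qmax - qmin), 1e-8)
        zero_point = int(round(qmin - x.min() / scale))
        q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
        return q.astype(np.uint8), scale, zero_point

    def dequantize_affine(q, scale, zero_point):
        # Recover approximate float values from the integer codes.
        return scale * (q.astype(np.float32) - zero_point)

    # Quantize a small weight tensor and measure the reconstruction error.
    w = np.random.randn(4, 4).astype(np.float32)
    q, s, z = quantize_affine(w)
    print("max abs error:", np.abs(w - dequantize_affine(q, s, z)).max())

In broad terms, post-training quantization estimates the scale and zero point from calibration data after training, while quantization-aware training simulates this quantize-dequantize round trip during training so the model learns to tolerate the rounding error.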