International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences
E-ISSN: 2349-7300 | Impact Factor: 9.907


An In-Depth Overview of Existing Quantization Strategies for Neural Networks

Authors: Vishakha Agrawal

DOI: https://doi.org/10.5281/zenodo.14607968

Short DOI: https://doi.org/g8xxt2

Country: USA



Abstract: Neural network quantization has emerged as a crucial technique for the efficient deployment of deep learning models on resource-constrained devices. This paper provides a detailed survey of existing quantization strategies, analyzing their theoretical foundations, algorithmic details, and empirical performance. We compare and contrast quantization techniques, including post-training quantization, quantization-aware training, and knowledge distillation-based methods, to provide insights into their strengths, limitations, and applications.
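To make the post-training quantization strategy surveyed above concrete, the following is a minimal NumPy sketch of affine (asymmetric) uint8 quantization, the scheme most PTQ pipelines build on. The function names and the 8-bit unsigned range are illustrative assumptions, not drawn from the paper itself:

```python
import numpy as np

def quantize_affine(x, num_bits=8):
    """Illustrative affine post-training quantization of a float tensor.

    Maps float values in [x.min(), x.max()] onto unsigned integers in
    [0, 2**num_bits - 1] via a per-tensor scale and zero-point.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Guard against a constant tensor, where the float range collapses to zero.
    scale = (x_max - x_min) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))  # keep inside the integer range
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Recover an approximate float tensor from its quantized form."""
    return (q.astype(np.float32) - zero_point) * scale

# Quantize a random weight matrix and measure the reconstruction error.
weights = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_affine(weights)
error = np.abs(weights - dequantize_affine(q, scale, zp)).max()
print(f"scale={scale:.6f} zero_point={zp} max error={error:.6f}")
```

Quantization-aware training differs from this sketch mainly in that the quantize/dequantize round trip is simulated inside the forward pass during training, so the network learns weights that are robust to the rounding error measured here.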

Keywords: Quantization, QAT, PTQ, Dynamic Quantization, Fixed-Point Quantization, Mixed-Precision Quantization


Paper Id: 231990

Published On: 2020-12-03

Published In: Volume 8, Issue 6, November-December 2020
