An In-Depth Overview of Existing Quantization Strategies for Neural Networks
Authors: Vishakha Agrawal
DOI: https://doi.org/10.5281/zenodo.14607968
Short DOI: https://doi.org/g8xxt2
Country: USA
Abstract: Neural network quantization has emerged as a crucial technique for efficient deployment of deep learning models on resource-constrained devices. This paper provides a detailed survey of existing quantization strategies, analyzing their theoretical foundations, algorithmic details, and empirical performance. We compare and contrast various quantization techniques, including post-training quantization, quantization-aware training, and knowledge distillation-based methods, to provide insights into their strengths, limitations, and applications.
Keywords: Quantization, QAT, PTQ, Dynamic Quantization, Fixed-Point Quantization, Mixed-Precision Quantization
Paper Id: 231990
Published On: 2020-12-03
Published In: Volume 8, Issue 6, November-December 2020
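
For readers unfamiliar with the primitive the abstract refers to, below is a minimal sketch of uniform affine quantization, the operation underlying the post-training and quantization-aware schemes the paper surveys. It is not drawn from the paper itself; the helper names quantize_affine and dequantize_affine and the NumPy-based example are illustrative assumptions.

    import numpy as np

    def quantize_affine(x, num_bits=8):
        # Illustrative sketch (not from the paper): map float values to
        # unsigned integers via a per-tensor scale and zero point.
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = max((x.max() - x.min()) / (qmax - qmin), 1e-8)
        zero_point = int(round(qmin - x.min() / scale))
        q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
        return q.astype(np.uint8), scale, zero_point

    def dequantize_affine(q, scale, zero_point):
        # Recover approximate float values from the integer codes.
        return scale * (q.astype(np.float32) - zero_point)

    # Quantize a small weight tensor and measure the reconstruction error.
    w = np.random.randn(4, 4).astype(np.float32)
    q, s, z = quantize_affine(w)
    print("max abs error:", np.abs(w - dequantize_affine(q, s, z)).max())

In broad terms, post-training quantization estimates the scale and zero point from calibration data after training, while quantization-aware training simulates this quantize-dequantize round trip during training so the model learns to tolerate the rounding error.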