||Local Non-linear Quantization for Neural Network Compression in MPEG-NNR
||Hyeon-Cheol Moon; Jae-Gon Kim
||Neural network compression; NNR; NCTM; CNN; Non-linear quantization
||Deep Convolutional Neural Networks (CNNs) have demonstrated excellent performance in various visual applications, but their enormous computational complexity and memory requirements limit their deployment, especially in resource-constrained environments. Compressing network models while preserving the task performance of the trained model is therefore an active research topic. Recently, the Moving Picture Experts Group (MPEG) developed a standard called Neural Network Compression and Representation (NNR), which provides a compressed representation of trained neural networks in an interoperable form. In this paper, we propose a local non-linear quantization (LNQ) method for compressing the weight parameters of neural network models. Experimental results show that the proposed LNQ achieves about a 29% gain in compression efficiency with virtually no loss of task performance, compared to the NNR test model, the Neural Network Compression Test Model (NCTM) version 3.0.
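The abstract does not spell out how LNQ quantizes the weights, so the following is only a generic illustration of the underlying idea of non-linear (codebook-based) quantization, not the authors' method: weight values are clustered with a simple 1-D k-means, and each weight is replaced by the index of its nearest centroid, so only the small codebook plus per-weight indices need to be stored. The function name, level count, and iteration budget are illustrative choices.

```python
import numpy as np

def nonlinear_quantize(weights, num_levels=16, iters=20, seed=0):
    """Non-uniform (codebook) quantization of a weight tensor via 1-D k-means.

    Returns per-weight centroid indices and the learned codebook; the
    dequantized tensor is simply codebook[indices].
    """
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    # Initialize the codebook by sampling distinct weight values.
    centroids = rng.choice(flat, size=num_levels, replace=False)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for k in range(num_levels):
            members = flat[idx == k]
            if members.size:
                centroids[k] = members.mean()
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return idx.reshape(weights.shape), centroids

# Example: quantize one random "layer" and reconstruct it.
w = np.random.default_rng(1).normal(size=(64, 64)).astype(np.float32)
indices, codebook = nonlinear_quantize(w, num_levels=16)
w_hat = codebook[indices]
```

With 16 levels, each weight index fits in 4 bits (versus 32-bit floats), and because the centroids adapt to the weight distribution the reconstruction error is lower than that of a uniform grid with the same number of levels.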