Paper

Quantized Guided Pruning for Efficient Hardware Implementations of\n Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are state-of-the-art in numerous\ncomputer vision tasks such as object classification and detection. However, the\nlarge amount of parameters they contain leads to a high computational\ncomplexity and strongly limits their usability in budget-constrained devices\nsuch as embedded devices. In this paper, we propose a combination of a new\npruning technique and a quantization scheme that effectively reduce the\ncomplexity and memory usage of convolutional layers of CNNs, and replace the\ncomplex convolutional operation by a low-cost multiplexer. We perform\nexperiments on the CIFAR10, CIFAR100 and SVHN and show that the proposed method\nachieves almost state-of-the-art accuracy, while drastically reducing the\ncomputational and memory footprints. We also propose an efficient hardware\narchitecture to accelerate CNN operations. The proposed hardware architecture\nis a pipeline and accommodates multiple layers working at the same time to\nspeed up the inference process.\n

arXiv (Cornell University)Published 2018-12-29Paper linkPDF

Authors: Hacene, Ghouthi Boukli · Gripon, Vincent · Arzel, Matthieu · Farrugia, Nicolas · Bengio, Yoshua

Topics

Relevant entities

People

Related coverage

Linked coverage will appear here.

Related events

Linked events will appear here.

Related discussions

Related discussion nodes will appear here.