Seminar: Graduate Seminar
AMED: Automatic Mixed-Precision Quantization for Edge Devices
Mixed-precision quantization offers better utilization of customized hardware that supports arithmetic operations at different precisions.
Hardware-aware quantization methods commonly optimize a dependent variable (such as FLOPs) for a specified property of the model or impose constraints on the model size.
Both approaches leave the model inefficient when deployed on specific hardware. Our work proposes Automatic Mixed-Precision Quantization for Edge Devices (AMED), which quantizes the model
to different precisions during the training procedure and treats the bit allocation as a Markov Decision Process driven by a direct signal from the hardware architecture.
A comprehensive evaluation of the proposed method demonstrates its superiority over current state-of-the-art schemes in terms of the trade-off between neural network accuracy and hardware efficiency, on different simulated hardware architectures of edge devices.
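To make the idea of a per-layer bit allocation concrete, here is a minimal sketch of mixed-precision quantization in plain Python. It is not the AMED algorithm: the symmetric uniform quantizer, the two-layer toy model, the layer names, and the fixed allocation of 8 and 4 bits are all assumptions chosen for illustration. A method such as AMED would choose the allocation during training from a hardware signal rather than fixing it by hand.

```python
def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for symmetric quantization")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1  # largest positive integer code
    scale = max_abs / levels      # step size between adjacent quantized values
    return [round(w / scale) * scale for w in weights]


def mean_squared_error(a, b):
    """Average squared difference between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical toy model: layer name -> weights (illustrative values only).
layers = {
    "conv1": [0.50, -0.25, 0.125, 0.75, -1.0],
    "fc": [0.9, -0.3, 0.45, 0.05, -0.6],
}

# Hypothetical bit allocation: one precision per layer.
bit_allocation = {"conv1": 8, "fc": 4}

quantized = {
    name: quantize_uniform(w, bit_allocation[name]) for name, w in layers.items()
}
errors = {
    name: mean_squared_error(layers[name], quantized[name]) for name in layers
}

# Total weight storage in bits under this allocation.
model_bits = sum(len(w) * bit_allocation[name] for name, w in layers.items())

for name in layers:
    print(name, bit_allocation[name], "bits, MSE =", errors[name])
print("total weight bits:", model_bits)
```

The sketch only exposes the two quantities being traded off: the quantization error each layer incurs and the storage cost of the allocation. The weight-bit count here stands in for a hardware cost; the abstract's point is that such proxies differ from a signal measured on the target hardware.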
* M.Sc. student under the supervision of Prof. Avi Mendelson.