Quantization | Manas Sahni

Making Neural Nets Work With Low Precision

Deploying efficient neural nets on mobiles is becoming increasingly important. This post explores the concept of quantized inference, and how it works in TensorFlow Lite.