Model Optimization and Quantization

The TF Lite converter generates lightweight TF models suitable for resource-constrained mobile and edge devices. We can make TF Lite models even smaller and faster by applying optimization and quantization techniques, at the cost of a small reduction in model accuracy. Let’s discuss the process of quantization and the model optimization techniques offered by the TF Lite framework.

Quantization

Quantization is the process of mapping input values drawn from a large set to output values in a relatively smaller set. The range of input values can be infinite (continuous) or finite but large (numbers stored using many bits). The following figure shows the quantization of a continuous information source.
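This mapping can be illustrated with a minimal NumPy sketch of asymmetric (affine) 8-bit quantization, the scheme commonly used for integer quantization of model weights. The `quantize` and `dequantize` helpers here are illustrative and are not part of the TF Lite API; TF Lite applies an equivalent transformation internally.

```python
import numpy as np

def quantize(values, num_bits=8):
    """Map float values onto the integer grid [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = float(values.min()), float(values.max())
    # One quantization step covers this much of the float range.
    scale = (hi - lo) / (qmax - qmin)
    q = np.clip(np.round((values - lo) / scale) + qmin, qmin, qmax)
    return q.astype(np.uint8), scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float values from the integer codes."""
    return q.astype(np.float32) * scale + lo

# Example: quantize a small tensor of float32 "weights" to 8 bits.
weights = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, lo = quantize(weights)
recovered = dequantize(q, scale, lo)
```

Each 32-bit float is replaced by an 8-bit code (a 4x size reduction for the stored values), and the round-trip error of any value is at most half a quantization step, which is the "little reduction in model performance" referred to above.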
