Serving, hardware and MLOps.
Quantization is a model optimization technique that lowers the numerical precision of weights and activations, typically by converting 32-bit floating-point values to 8-bit integers or other lower-bit formats, which shrinks the model and can speed up inference.
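As a rough illustration of the definition above, the sketch below maps a small float32 array to int8 with a scale and zero point, then maps it back. It assumes one common scheme (per-tensor affine quantization) and uses NumPy; the function names quantize_int8 and dequantize and the sample weights are hypothetical and are not taken from any particular framework.

```python
import numpy as np

def quantize_int8(x):
    """Per-tensor affine quantization of a non-empty float array to int8 (illustrative)."""
    x = np.asarray(x, dtype=np.float32)
    qmin, qmax = -128, 127
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo = min(float(x.min()), 0.0)
    hi = max(float(x.max()), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # all-zero input: any positive scale works
    # Integer that represents the real value 0.0.
    zero_point = int(round(qmin - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 codes back to approximate float32 values."""
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical weights, chosen only for demonstration.
weights = np.array([-1.0, -0.5, 0.0, 0.25, 1.0], dtype=np.float32)
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

print("int8 codes:", q)
print("max abs error:", float(np.max(np.abs(restored - weights))))
print("bytes before/after:", weights.nbytes, q.nbytes)
```

The int8 array takes a quarter of the memory of the float32 original, at the cost of a reconstruction error on the order of one quantization step (scale). Real toolchains usually add calibration data, per-channel scales, or quantization-aware training, none of which this sketch attempts.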