Documentation
A deep dive into model compression techniques: how knowledge distillation, quantization, and pruning make AI models smaller and faster.
Knowledge Distillation
Learn how dark knowledge and soft targets compress large models into efficient students.
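The soft targets mentioned above come from running the teacher's logits through a temperature-scaled softmax; a higher temperature spreads probability mass onto non-target classes, exposing the "dark knowledge" the student learns from. A minimal sketch (the function name and example logits are illustrative, not from any specific library):

```python
import math

def softmax(logits, temperature=1.0):
    # Divide logits by the temperature before the softmax.
    # T > 1 softens the distribution, revealing how the teacher
    # ranks the wrong classes, not just which class it picks.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

teacher_logits = [8.0, 2.0, 1.0]
hard = softmax(teacher_logits, temperature=1.0)  # near one-hot
soft = softmax(teacher_logits, temperature=4.0)  # softened targets for the student
```

At T=1 nearly all mass sits on the top class; at T=4 the runner-up classes receive visibly more probability, which is what the student is trained to match.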
Read guide
Quantization
Reduce model size by converting weights from high-precision to lower-precision formats.
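One common version of this conversion is symmetric linear quantization to int8: pick a single scale so the largest-magnitude weight maps to 127, then round. A minimal sketch under that assumption (function names are illustrative):

```python
def quantize_int8(weights):
    # Symmetric linear quantization: one scale maps floats to [-127, 127].
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Approximate recovery; each weight is off by at most scale / 2.
    return [qi * scale for qi in q]

weights = [0.12, -0.5, 0.33, -0.07]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
```

Storing int8 values instead of float32 cuts the weight footprint to a quarter, at the cost of a bounded rounding error per weight.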
Read guide
Pruning
Remove redundant parameters while preserving accuracy for leaner, faster models.
Read guide
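A simple instance of removing redundant parameters is magnitude pruning: zero out the weights with the smallest absolute values, on the assumption that they contribute least to the output. A minimal sketch (the function name and the flat weight list are illustrative; real frameworks prune tensors layer by layer):

```python
def magnitude_prune(weights, sparsity):
    # Zero the fraction `sparsity` of weights with the smallest |w|.
    # Ties at the threshold may prune slightly more than requested.
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = magnitude_prune([0.1, -0.8, 0.05, 0.6, -0.02, 0.3], sparsity=0.5)
```

Zeroed weights can then be stored in a sparse format or skipped at inference time; in practice the model is usually fine-tuned afterwards to recover any lost accuracy.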