A technique where a smaller model learns from a larger, more complex model, retaining critical knowledge while reducing size.
More about Knowledge Distillation
Knowledge Distillation is a machine learning process in which a smaller model, called the "student," learns to replicate the performance of a larger, more complex model, called the "teacher." This is achieved by training the student to match the teacher's outputs, often softened probability distributions, or its intermediate representations, rather than only the ground-truth labels.
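As a minimal sketch of the idea, the snippet below shows a common distillation loss in PyTorch: the student is trained on a blend of a KL-divergence term against the teacher's temperature-softened output distribution and a standard cross-entropy term against the true labels. The function name, the temperature value, and the alpha weighting are illustrative choices, not a prescribed recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soften both output distributions with the temperature so the
    # student can learn from the teacher's relative class probabilities.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between student and teacher soft distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    distill = F.kl_div(soft_student, soft_targets,
                       reduction="batchmean") * temperature ** 2

    # Standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)

    # Blend the two objectives.
    return alpha * distill + (1 - alpha) * hard
```

During training, the teacher's logits are computed in inference mode (no gradients), and only the student's parameters are updated with this combined loss.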
This technique is widely used to optimize models for deployment in resource-constrained environments, allowing them to preserve critical capabilities for tasks like document retrieval and semantic search at a fraction of the original model's size and compute cost.