no code implementations • 23 Nov 2022 • Philip de Rijk, Lukas Schneider, Marius Cordts, Dariu M. Gavrila
Knowledge Distillation (KD) is a well-known training paradigm for deep neural networks in which knowledge acquired by a large teacher model is transferred to a smaller student model.
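For orientation, a minimal sketch of the classic KD objective (Hinton-style soft-target distillation) is shown below; this illustrates the general paradigm, not this paper's specific method, and `temperature`, `alpha`, and the function name are illustrative choices:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Generic KD loss: blend soft-target KL divergence with hard-label CE.

    `temperature` and `alpha` are illustrative hyperparameters,
    not values taken from this paper.
    """
    # Soften both output distributions with the temperature,
    # then match the student to the teacher with KL divergence.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean")
    kd = kd * temperature ** 2  # standard correction for gradient scale

    # Ordinary cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Weighted combination of the distillation and supervised terms.
    return alpha * kd + (1.0 - alpha) * ce
```

The temperature softens the teacher's output distribution so the student can learn from the relative probabilities the teacher assigns to incorrect classes, which carry more information than the hard label alone.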