no code implementations • 23 Jun 2022 • Durga Prasad Ganta, Himel Das Gupta, Victor S. Sheng
In this research, we show that using multiple teaching assistant models can further improve the student model (the smaller model).
Ensemble Learning • Knowledge Distillation