SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning

Backward locking and update locking are well-known sources of inefficiency in backpropagation that prevent layers from being updated concurrently. Several works have recently suggested using local error signals to train network blocks asynchronously to overcome these limitations. However, they often require numerous iterations of trial and error to find the best configuration for local training, including how to decouple network blocks and which auxiliary networks to use for each block. In this work, we propose a differentiable search algorithm named SEDONA to automate this process. Experimental results show that our algorithm can consistently discover transferable decoupled architectures for VGG and ResNet variants, and significantly outperforms models trained with end-to-end backpropagation and other state-of-the-art greedy learning methods on CIFAR-10, Tiny-ImageNet, and ImageNet. Thanks to the improved parallelism from local training, we also report up to 2.02× speedup over backpropagation in total training time.
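To make the greedy block-wise setting concrete, the sketch below shows one way a network split into blocks with auxiliary heads can be trained from purely local losses, with features detached between blocks so no gradient crosses block boundaries. This is a minimal illustration in plain PyTorch, assuming a standard classification setup; the names `LocalBlock` and `train_step` are hypothetical and this is not the paper's SEDONA procedure, which additionally searches over how to split the network and which auxiliary networks to attach to each block.

```python
import torch
import torch.nn as nn

class LocalBlock(nn.Module):
    """A network block paired with an auxiliary head that supplies a local error signal."""
    def __init__(self, block: nn.Module, aux_head: nn.Module):
        super().__init__()
        self.block = block        # main feature extractor for this block
        self.aux_head = aux_head  # auxiliary network producing local predictions

    def forward(self, x):
        h = self.block(x)
        logits = self.aux_head(h)
        # Detach the features passed to the next block, so gradients from later
        # blocks never reach this one (removing backward locking between blocks).
        return h.detach(), logits

def train_step(blocks, optimizers, x, y, criterion=nn.CrossEntropyLoss()):
    """One decoupled training step: each block is updated only from its own
    local loss; in principle these updates can run concurrently."""
    h = x
    for block, opt in zip(blocks, optimizers):
        h, logits = block(h)
        loss = criterion(logits, y)
        opt.zero_grad()
        loss.backward()  # gradient stays inside this block and its aux head
        opt.step()
    return h
```

In this sketch the choice of where to cut the network into blocks and what each auxiliary head looks like is fixed by hand; SEDONA's contribution is making that configuration searchable in a differentiable way.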


Results from the Paper


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Neural Architecture Search | ImageNet | SEDONA (ResNet-152 Aux. Ens., K=4) | Top-1 Error Rate | 20.2 | #25
Neural Architecture Search | ImageNet | SEDONA (ResNet-152, K=4) | Top-1 Error Rate | 21.09 | #38
