Learnable Extended Activation Function (LEAF) for Deep Neural Networks

This paper introduces the Learnable Extended Activation Function (LEAF), an adaptive activation function that combines the properties of squashing functions and rectifier units. Depending on the target architecture and the data processing task, LEAF adapts its form during training to achieve lower loss values and better training results. LEAF does not suffer from the "vanishing gradient" effect and can directly replace SiLU, ReLU, Sigmoid, Tanh, Swish, and AHAF in feed-forward, recurrent, and many other neural network architectures. Training with LEAF follows a two-stage approach in which the activation function parameters are updated before the synaptic weights. Experimental evaluation on an image classification task shows that LEAF outperforms the non-adaptive alternatives; in particular, LEAF-as-Tanh provides 7% better classification accuracy than the hyperbolic tangent on the CIFAR-10 dataset. Empirically, LEAF-as-SiLU and LEAF-as-Sigmoid in convolutional networks tend to "evolve" into SiLU-like forms. The proposed activation function and the corresponding training algorithm are computationally simple and can be easily applied to existing deep neural networks.
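
The abstract does not spell out LEAF's functional form or the exact two-stage update rule, so the sketch below is only an illustration of the general idea: a trainable activation with a few per-feature parameters (here a hypothetical form f(x) = γ·x·σ(βx) + δ·σ(βx) + ε, which reduces to SiLU, a sigmoid, or tanh for particular parameter values) and a training step that updates the activation parameters before the synaptic weights. The class name `LEAFLike`, the parameterization, and the optimizer choices are assumptions for illustration, not the authors' definitions.

```python
import torch
import torch.nn as nn


class LEAFLike(nn.Module):
    """Illustrative learnable activation with per-feature trainable parameters.

    ASSUMPTION: the exact LEAF parameterization is defined in the paper; this
    sketch uses a generic form
        f(x) = gamma * x * sigmoid(beta * x) + delta * sigmoid(beta * x) + eps
    which reduces to SiLU (gamma=1, beta=1, delta=0, eps=0), a sigmoid
    (gamma=0, beta=1, delta=1, eps=0), or tanh (gamma=0, beta=2, delta=2,
    eps=-1), and approaches ReLU as beta grows with gamma=1, delta=eps=0.
    """

    def __init__(self, num_features: int, init: str = "silu"):
        super().__init__()
        presets = {
            "silu":    (1.0, 1.0, 0.0, 0.0),
            "sigmoid": (0.0, 1.0, 1.0, 0.0),
            "tanh":    (0.0, 2.0, 2.0, -1.0),
        }
        gamma, beta, delta, eps = presets[init]
        # One parameter set per feature; assumes inputs of shape (batch, num_features).
        self.gamma = nn.Parameter(torch.full((num_features,), gamma))
        self.beta = nn.Parameter(torch.full((num_features,), beta))
        self.delta = nn.Parameter(torch.full((num_features,), delta))
        self.eps = nn.Parameter(torch.full((num_features,), eps))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = torch.sigmoid(self.beta * x)
        return self.gamma * x * s + self.delta * s + self.eps


# Hypothetical usage: a small MLP with LEAF-like activations and a two-stage
# update in the spirit of the abstract (activation parameters first, then
# synaptic weights), each stage with its own optimizer and backward pass.
model = nn.Sequential(nn.Linear(784, 256), LEAFLike(256, init="tanh"),
                      nn.Linear(256, 10))

act_params, weight_params = [], []
for module in model.modules():
    bucket = act_params if isinstance(module, LEAFLike) else weight_params
    bucket.extend(module.parameters(recurse=False))

opt_act = torch.optim.Adam(act_params, lr=1e-3)
opt_w = torch.optim.Adam(weight_params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()


def train_step(x, y):
    # Stage 1: update only the activation function parameters.
    opt_act.zero_grad()
    loss_fn(model(x), y).backward()
    opt_act.step()

    # Stage 2: update the synaptic weights with a fresh forward/backward pass.
    opt_w.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt_w.step()
    return loss.item()
```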
