no code implementations • 10 Dec 2021 • Qi Sun, Hexin Dong, Zewei Chen, Jiacheng Sun, Zhenguo Li, Bin Dong
Gradient-based methods for the distributed training of residual networks (ResNets) typically require a forward pass of the input data, followed by back-propagating the error gradient to update model parameters, which becomes time-consuming as the network goes deeper.
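As a point of reference, here is a minimal PyTorch sketch of that sequential pattern; the toy residual blocks, dimensions, and loss are illustrative assumptions, not the paper's setup:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Toy residual block: identity skip plus a small transform."""
    def __init__(self, d):
        super().__init__()
        self.fc = nn.Linear(d, d)
    def forward(self, x):
        return x + torch.relu(self.fc(x))

net = nn.Sequential(*[ResBlock(64) for _ in range(16)])
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x, y = torch.randn(32, 64), torch.randn(32, 64)

# One training step: the full forward pass through all 16 blocks...
loss = nn.functional.mse_loss(net(x), y)

# ...must complete before the error gradient can be back-propagated
# through the same blocks in reverse order; this serial dependency
# grows with network depth.
opt.zero_grad()
loss.backward()
opt.step()
```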
no code implementations • NeurIPS Workshop DLDE 2021 • Qi Sun, Hexin Dong, Zewei Chen, Weizhen Dian, Jiacheng Sun, Yitong Sun, Zhenguo Li, Bin Dong
The backpropagation algorithm is indispensable for training modern residual networks (ResNets) but tends to be time-consuming due to its inherent algorithmic lockings.
no code implementations • ICCV 2021 • Yao Zhu, Jiacheng Ma, Jiacheng Sun, Zewei Chen, Rongxin Jiang, Zhenguo Li
We find that adversarial training contributes to obtaining an energy function that is flat and has low energy around the real data, which is the key to its generative capability.
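One standard way to read a classifier as an energy-based model is to take the energy as the negative logsumexp of its logits, so that low, flat energy around real data corresponds to confident, stable predictions. A minimal sketch of that reading (the classifier and data below are placeholders, and this is the conventional logsumexp energy, not necessarily the paper's exact construction):

```python
import torch
import torch.nn as nn

# Hypothetical classifier standing in for an adversarially trained model.
classifier = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

def energy(x):
    # EBM reading of a classifier: E(x) = -logsumexp_y f(x)[y].
    # Low energy corresponds to high model confidence at x.
    return -torch.logsumexp(classifier(x), dim=-1)

x = torch.randn(8, 784)                   # stand-in for real data
x_near = x + 0.01 * torch.randn_like(x)   # small perturbation

# "Flat and low around the real data" would mean energy(x) is small
# and energy(x_near) - energy(x) is close to zero.
print(energy(x).mean().item(),
      (energy(x_near) - energy(x)).abs().mean().item())
```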
2 code implementations • CVPR 2021 • Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li
While existing NAS methods mostly design architectures on a single task, algorithms that look beyond single-task search are emerging in pursuit of a more efficient and universal solution across various tasks.
no code implementations • 1 Jan 2021 • Yao Zhu, Jiacheng Sun, Zewei Chen, Zhenguo Li
We justify the algorithm with a linear model, showing that the added saliency maps pull the data away from its closest decision boundary.
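A minimal sketch of that intuition, assuming the saliency map is taken as the input gradient of the classification margin; the toy model, step size, and exact saliency definition are illustrative assumptions, not the paper's algorithm:

```python
import torch
import torch.nn as nn

# Toy linear classifier and data (assumptions for illustration).
model = nn.Linear(2, 2)
x = torch.randn(16, 2, requires_grad=True)
y = torch.randint(0, 2, (16,))

# Margin to the closest decision boundary: true-class logit minus the
# best competing logit. Its input gradient points away from that boundary.
logits = model(x)
true = logits.gather(1, y.unsqueeze(1)).squeeze(1)
other = logits.scatter(1, y.unsqueeze(1), float('-inf')).max(dim=1).values
(true - other).sum().backward()

# Augmented sample: add the scaled saliency map to the input. For a
# linear model this increases the margin, i.e. pulls the point away
# from its closest decision boundary.
eps = 0.1  # illustrative step size (assumption)
x_aug = (x + eps * x.grad).detach()
```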
2 code implementations • 1 Jan 2021 • Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li
While existing NAS methods mostly design architectures on a single task, algorithms that look beyond single-task search are emerging in pursuit of a more efficient and universal solution across various tasks.
no code implementations • 3 Sep 2020 • Qi Sun, Hexin Dong, Zewei Chen, Weizhen Dian, Jiacheng Sun, Yitong Sun, Zhenguo Li, Bin Dong
Gradient-based algorithms for training ResNets typically require a forward pass of the input data, followed by back-propagating the objective gradient to update parameters, which is time-consuming for deep ResNets.
no code implementations • ECCV 2020 • Xin Chen, Yawen Duan, Zewei Chen, Hang Xu, Zihao Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li
Despite the remarkable progress of neural architecture search (NAS), many algorithms are restricted to particular search spaces.
Ranked #14 on Neural Architecture Search on NAS-Bench-201 (ImageNet-16-120), by the search time (s) metric.
no code implementations • 16 Jun 2020 • Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen, Zhenguo Li
In recent years, a variety of normalization methods have been proposed to help train neural networks, such as batch normalization (BN), layer normalization (LN), weight normalization (WN), group normalization (GN), etc.
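For reference, these normalizers differ mainly in which axes the statistics are computed over, while WN reparameterizes weights instead of normalizing activations. A quick PyTorch sketch with illustrative shapes:

```python
import torch
import torch.nn as nn

x = torch.randn(8, 32, 14, 14)   # (batch, channels, H, W)

bn = nn.BatchNorm2d(32)          # stats over (batch, H, W), per channel
ln = nn.LayerNorm([32, 14, 14])  # stats over all features, per sample
gn = nn.GroupNorm(4, 32)         # stats over channel groups, per sample

# Weight normalization reparameterizes a layer's weight as g * v/||v||
# rather than normalizing activations.
wn = nn.utils.weight_norm(nn.Conv2d(32, 32, 3, padding=1))

for layer in (bn, ln, gn, wn):
    print(type(layer).__name__, layer(x).shape)
```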
no code implementations • 23 Jan 2020 • Zewei Chen, Fengwei Zhou, George Trimponias, Zhenguo Li
Despite recent progress, the problem of approximating the full Pareto front accurately and efficiently remains challenging.
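For context, the Pareto front is the set of non-dominated points: no other candidate is at least as good on every objective and strictly better on at least one. A small NumPy sketch, assuming both objectives are minimized:

```python
import numpy as np

def pareto_front(points):
    """Return the non-dominated subset of `points`
    (rows = candidates, columns = objectives to minimize)."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(p)
    return np.array(keep)

pts = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
print(pareto_front(pts))   # [3.0, 3.0] is dominated by [2.0, 2.0]
```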
no code implementations • 3 Sep 2019 • Vasco Lopes, Fabio Maria Carlucci, Pedro M Esperança, Marco Singh, Victor Gabillon, Antoine Yang, Hang Xu, Zewei Chen, Jun Wang
The Neural Architecture Search (NAS) problem is typically formulated as a graph search problem where the goal is to learn the optimal operations over edges in order to maximise a graph-level global objective.
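One common way to make that edge-level choice differentiable, in the style of DARTS rather than this paper's specific method, is to relax each edge into a softmax-weighted mixture of candidate operations. A minimal sketch:

```python
import torch
import torch.nn as nn

class MixedEdge(nn.Module):
    """One graph edge relaxed into a weighted sum of candidate ops."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.MaxPool2d(3, stride=1, padding=1),
        ])
        # Architecture parameters: one logit per candidate operation,
        # optimized with respect to the graph-level objective.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

edge = MixedEdge(16)
print(edge(torch.randn(2, 16, 8, 8)).shape)
```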