Search Results for author: Carter Yang

Understanding Why ViT Trains Badly on Small Datasets: An Intuitive Perspective

The Vision Transformer (ViT) is an attention-based neural network architecture that has been shown to be effective for computer vision tasks.


