1 code implementation • 13 Mar 2024 • Feng Cheng, Ziyang Wang, Yi-Lin Sung, Yan-Bo Lin, Mohit Bansal, Gedas Bertasius
Our DAM model outperforms prior state-of-the-art continual learning approaches by 9.1% while exhibiting 1.9% less forgetting on 6 VidQA datasets spanning various domains.
no code implementations • 11 Mar 2024 • Jialu Li, Jaemin Cho, Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
In this paper, we introduce SELMA: Skill-Specific Expert Learning and Merging with Auto-Generated Data, a novel paradigm to improve the faithfulness of T2I models by fine-tuning models on automatically generated, multi-skill image-text datasets, with skill-specific expert learning and merging.
no code implementations • 4 Oct 2023 • Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
We first determine the sparsity ratios of different layers or blocks by leveraging the global importance score, which is efficiently computed based on the zeroth-order approximation of the global model gradients.
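As a rough illustration of this idea (a minimal sketch, not the paper's implementation), the snippet below estimates per-parameter importance scores with a forward-only, SPSA-style finite-difference approximation of the gradients in PyTorch; the `loss_fn` callable, the `batch` argument, and the per-layer aggregation at the end are assumptions made for the example.

```python
import torch

def zeroth_order_importance(model, loss_fn, batch, eps=1e-3, n_samples=4):
    """Estimate per-parameter importance as |weight * gradient| without backprop,
    using an SPSA-style finite-difference (zeroth-order) gradient estimate."""
    params = {n: p for n, p in model.named_parameters() if p.requires_grad}
    scores = {n: torch.zeros_like(p) for n, p in params.items()}
    with torch.no_grad():
        for _ in range(n_samples):
            # Sample a random direction z and evaluate the loss at theta +/- eps * z.
            z = {n: torch.randn_like(p) for n, p in params.items()}
            for n, p in params.items():
                p.add_(eps * z[n])
            loss_plus = loss_fn(model, batch)
            for n, p in params.items():
                p.sub_(2 * eps * z[n])
            loss_minus = loss_fn(model, batch)
            for n, p in params.items():
                p.add_(eps * z[n])  # restore the original weights
            coeff = (loss_plus - loss_minus) / (2 * eps)  # directional-derivative estimate
            for n, p in params.items():
                scores[n] += (p * coeff * z[n]).abs() / n_samples
    # Aggregate per layer; layers with higher totals would be assigned lower sparsity.
    return {n: s.sum().item() for n, s in scores.items()}
```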
1 code implementation • 2 Oct 2023 • Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen
Sparsely activated Mixture-of-Experts (SMoE) has shown promise for scaling up the learning capacity of neural networks; however, it suffers from (a) high memory usage, due to duplicating network layers into multiple copies as experts, and (b) redundancy in experts, as common learning-based routing policies are prone to representational collapse.
1 code implementation • ICCV 2023 • Ziyang Wang, Yi-Lin Sung, Feng Cheng, Gedas Bertasius, Mohit Bansal
Specifically, our model captures cross-modal similarity information at different granularity levels.
Ranked #11 on Video Retrieval on MSR-VTT
1 code implementation • 28 Apr 2023 • Yi-Lin Sung, Linjie Li, Kevin Lin, Zhe Gan, Mohit Bansal, Lijuan Wang
In this paper, we expand on this concept to a multimodal setup by merging transformers trained on different modalities.
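As a minimal sketch of the general idea of weight-space merging (not the paper's exact method), models that share an architecture can be combined by interpolating their parameters; the coefficient argument and the usage lines below are illustrative assumptions.

```python
import torch

def merge_state_dicts(state_dicts, coeffs=None):
    # Interpolate the parameters of several models that share one architecture.
    # `coeffs` are hypothetical per-model mixing weights; default is a uniform average.
    coeffs = coeffs or [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(c * sd[key].float() for c, sd in zip(coeffs, state_dicts))
    return merged

# Hypothetical usage with two modality-specific checkpoints of the same backbone:
# merged = merge_state_dicts([vision_model.state_dict(), language_model.state_dict()])
# shared_model.load_state_dict(merged)
```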
1 code implementation • CVPR 2023 • Yan-Bo Lin, Yi-Lin Sung, Jie Lei, Mohit Bansal, Gedas Bertasius
To do so, we propose a latent audio-visual hybrid (LAVISH) adapter that adapts pretrained ViTs to audio-visual tasks by injecting a small number of trainable parameters into every layer of a frozen ViT.
Ranked #4 on Audio-visual Question Answering on MUSIC-AVQA
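As a rough sketch of the adapter idea described above (module names, dimensions, and the timm-style `.blocks` attribute are assumptions, and this omits the cross-modal attention used by LAVISH), a small trainable bottleneck can be attached to each layer of an otherwise frozen ViT:

```python
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """A small residual bottleneck; only these parameters are trained."""
    def __init__(self, dim, reduction=8):
        super().__init__()
        self.down = nn.Linear(dim, dim // reduction)
        self.act = nn.GELU()
        self.up = nn.Linear(dim // reduction, dim)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

def add_adapters(vit, dim=768):
    # Freeze the backbone and attach one small adapter per transformer block.
    for p in vit.parameters():
        p.requires_grad = False
    return nn.ModuleList(BottleneckAdapter(dim) for _ in vit.blocks)
```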
2 code implementations • 13 Jun 2022 • Yi-Lin Sung, Jaemin Cho, Mohit Bansal
LST saves 69% of the memory cost relative to fine-tuning the whole network, while other methods only save 26% at similar parameter usage (hence, 2.7x more memory savings).
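A simplified sketch of the ladder-side idea (assuming a transformer backbone with activations of shape batch x sequence x hidden, not the released LST code): a small side network reads detached intermediate activations of the frozen backbone, so backpropagation and activation storage never traverse the large model.

```python
import torch
import torch.nn as nn

class LadderSideNetwork(nn.Module):
    """Lightweight side path over a frozen backbone's intermediate activations."""
    def __init__(self, hidden_dim, side_dim, num_layers, num_classes):
        super().__init__()
        self.downs = nn.ModuleList(nn.Linear(hidden_dim, side_dim) for _ in range(num_layers))
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(side_dim, side_dim), nn.GELU()) for _ in range(num_layers))
        self.head = nn.Linear(side_dim, num_classes)

    def forward(self, hidden_states):
        # hidden_states: list of activations, one per frozen backbone layer.
        side = 0
        for h, down, block in zip(hidden_states, self.downs, self.blocks):
            # Detaching keeps gradients (and activation memory) off the backbone.
            side = block(side + down(h.detach()))
        return self.head(side.mean(dim=1))
```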
1 code implementation • CVPR 2022 • Yi-Lin Sung, Jaemin Cho, Mohit Bansal
Our results demonstrate that training the adapter with the weight-sharing technique (4.18% of total parameters for image-text tasks and 3.39% for video-text tasks) can match the performance of fine-tuning the entire model.
1 code implementation • NeurIPS 2021 • Yi-Lin Sung, Varun Nair, Colin Raffel
In this paper, we show that it is possible to induce a fixed sparse mask on the model's parameters that selects a subset to update over many iterations.
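A condensed sketch of this general recipe (function and variable names are illustrative, and the scoring uses an empirical Fisher approximation via squared gradients): score every parameter, keep only the top fraction, and reuse that fixed mask to zero out the gradients of all other parameters at every optimizer step.

```python
import torch

def compute_fixed_sparse_mask(model, loss_fn, data_batches, keep_ratio=0.005):
    """Score parameters by squared gradients, then keep the top-k entries as a fixed mask."""
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    for batch in data_batches:
        model.zero_grad()
        loss_fn(model, batch).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.detach() ** 2
    all_scores = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(keep_ratio * all_scores.numel()))
    threshold = torch.topk(all_scores, k).values.min()
    return {n: (s >= threshold) for n, s in scores.items()}

# During training, the mask stays fixed; before each optimizer step:
# for n, p in model.named_parameters():
#     if p.grad is not None:
#         p.grad.mul_(mask[n])
```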
no code implementations • ICLR 2019 • Yi-Lin Sung, Sung-Hsien Hsieh, Soo-Chang Pei, Chun-Shien Lu
DSGAN considers the scenario in which training samples from the target distribution, $p_{t}$, are difficult to collect.