Search Results for author: Biao Gong

Found 12 papers, 4 papers with code

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos

1 code implementation • 25 Dec 2023 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Zhiwu Qing, Biao Gong, Yingya Zhang, Yujun Shen, Changxin Gao, Nong Sang

Following such a pipeline, we study the effect of doubling the scale of the training set (i.e., video-only WebVid10M) with some randomly collected text-free videos and are encouraged to observe the performance improvement (FID from 9.67 to 8.19 and FVD from 484 to 441), demonstrating the scalability of our approach.

Ranked #6 on Text-to-Video Generation on MSR-VTT

Text-to-Image Generation Text-to-Video Generation +2

Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following

no code implementations • 28 Nov 2023 • Yutong Feng, Biao Gong, Di Chen, Yujun Shen, Yu Liu, Jingren Zhou

Existing text-to-image (T2I) diffusion models usually struggle in interpreting complex prompts, especially those with quantity, object-attribute binding, and multi-subject descriptions.

Attribute Denoising +1

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

no code implementations • 27 Nov 2023 • Siteng Huang, Biao Gong, Yutong Feng, Xi Chen, Yuqian Fu, Yu Liu, Donglin Wang

Experimental results show that existing subject-driven customization methods fail to learn the representative characteristics of actions and struggle in decoupling actions from context features, including appearance.

Text-to-Image Generation

Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation

no code implementations • 27 Nov 2023 • Biao Gong, Siteng Huang, Yutong Feng, Shiwei Zhang, Yuyuan Li, Yu Liu

To align the generated image with layout instructions, we present a training-free layout calibration system SimM that intervenes in the generative process on the fly during inference time.

Text-to-Image Generation

Logic Diffusion for Knowledge Graph Reasoning

no code implementations • 6 Jun 2023 • Xiaoying Xie, Biao Gong, Yiliang Lv, Zhen Han, Guoshuai Zhao, Xueming Qian

Most recent works focus on answering first-order logical queries to explore knowledge graph reasoning via multi-hop logic predictions.

Selective and Collaborative Influence Function for Efficient Recommendation Unlearning

no code implementations • 20 Apr 2023 • Yuyuan Li, Chaochao Chen, Xiaolin Zheng, Yizhao Zhang, Biao Gong, Jun Wang

In this paper, we first identify two main disadvantages of directly applying existing unlearning methods in the context of recommendation, i.e., (i) unsatisfactory efficiency for large-scale recommendation models and (ii) destruction of collaboration across users and items.

Recommendation Systems

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

1 code implementation • 27 Mar 2023 • Siteng Huang, Biao Gong, Yutong Feng, Min Zhang, Yiliang Lv, Donglin Wang

Recent compositional zero-shot learning (CZSL) methods adapt pre-trained vision-language models (VLMs) by constructing trainable prompts only for composed state-object pairs.

Compositional Zero-Shot Learning Object

Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos

1 code implementation • ICCV 2023 • Yulin Pan, Xiangteng He, Biao Gong, Yiliang Lv, Yujun Shen, Yuxin Peng, Deli Zhao

Video temporal grounding aims to pinpoint a video segment that matches the query description.

ViM: Vision Middleware for Unified Downstream Transferring

no code implementations • ICCV 2023 • Yutong Feng, Biao Gong, Jianwen Jiang, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou

ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone.

UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

no code implementations • 14 Feb 2023 • Biao Gong, Xiaoying Xie, Yutong Feng, Yiliang Lv, Yujun Shen, Deli Zhao

This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data.

Common Sense Reasoning

VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval

1 code implementation • CVPR 2023 • Siteng Huang, Biao Gong, Yulin Pan, Jianwen Jiang, Yiliang Lv, Yuyuan Li, Donglin Wang

Many recent studies leverage the pre-trained CLIP for text-video cross-modal retrieval by tuning the backbone with additional heavy modules, which not only imposes a huge computational burden with many more parameters, but also leads to forgetting of the knowledge in upstream models.

Cross-Modal Retrieval Retrieval +1

Deep Multi-View Enhancement Hashing for Image Retrieval

no code implementations • 1 Feb 2020 • Chenggang Yan, Biao Gong, Yuxuan Wei, Yue Gao

Therefore, we try to introduce the multi-view deep neural network into the hash learning field, and design an efficient and innovative retrieval model, which has achieved a significant improvement in retrieval performance.

Image Retrieval Retrieval
