Search Results for author: Guohao Sun

Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval

Correspondingly, a single text embedding may not be expressive enough to capture the video embedding and support retrieval.

Recent advances in vision-language models have shown notable generalization across vision-language tasks after visual instruction tuning.
