no code implementations • EMNLP 2020 • Nayu Liu, Xian Sun, Hongfeng Yu, Wenkai Zhang, Guangluan Xu
Multimodal summarization for open-domain videos is an emerging task, aiming to generate a summary from multisource information (video, audio, transcript).
1 code implementation • 14 Sep 2022 • Zhiqiang Yuan, Wenkai Zhang, Chongyang Li, Zhaoying Pan, Yongqiang Mao, Jialiang Chen, Shouke Li, Hongqi Wang, Xian Sun
Finally, we analyze the SeLo performance of RS cross-modal retrieval models in detail, explore the impact of different variables on this task, and provide a complete benchmark for the SeLo task.
no code implementations • 24 Apr 2022 • Feifei Xu, Shanlin Zhou, Xinpeng Wang, Yunpu Ma, Wenkai Zhang, Zhisong Li
To merge these two forms of knowledge into the dialogue effectively, we design a dynamic virtual knowledge selector and a controller that help to enrich and expand knowledge space.
1 code implementation • 21 Apr 2022 • Zhiqiang Yuan, Wenkai Zhang, Kun Fu, Xuan Li, Chubo Deng, Hongqi Wang, Xian Sun
Our model adapts to multi-scale feature inputs, favors multi-source retrieval methods, and can dynamically filter redundant features.
Ranked #8 on Cross-Modal Retrieval on RSITMD
1 code implementation • 21 Apr 2022 • Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Xuee Rong, Zhengyuan Zhang, Hongqi Wang, Kun Fu, Xian Sun
In this article, we first propose a novel RSCTIR framework based on global and local information (GaLR), and design a multi-level information dynamic fusion (MIDF) module to efficaciously integrate features of different levels.
Ranked #6 on Cross-Modal Retrieval on RSITMD
no code implementations • 19 Jul 2021 • Zizhang Wu, Wenkai Zhang, Jizheng Wang, Man Wang, Yuanzhu Gan, Xinchao Gou, Muqing Fang, Jing Song
The 3D visual perception for vehicles with the surround-view fisheye camera system is a critical and challenging task for low-cost urban autonomous driving.
1 code implementation • 17 Jun 2021 • Wenkai Zhang, Hongyu Lin, Xianpei Han, Le Sun, Huidan Liu, Zhicheng Wei, Nicholas Jing Yuan
Specifically, during neural network training, we naturally model the noise samples in each batch following a hypergeometric distribution parameterized by the noise-rate.
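As a rough illustration of the distribution named above, and not the paper's actual training procedure, the sketch below computes the standard hypergeometric probability of seeing exactly k noisy samples in one batch. It assumes a batch is drawn without replacement from a dataset in which a fixed fraction (the noise rate) of samples is mislabeled; the function and parameter names are hypothetical.

```python
from math import comb


def noisy_count_pmf(k, dataset_size, noise_rate, batch_size):
    """P(exactly k noisy samples in a batch drawn without replacement).

    Assumption: round(noise_rate * dataset_size) samples are noisy.
    """
    num_noisy = round(noise_rate * dataset_size)
    num_clean = dataset_size - num_noisy
    if k < 0 or k > batch_size:
        return 0.0
    # math.comb returns 0 when k exceeds n, so impossible counts get probability 0.
    ways = comb(num_noisy, k) * comb(num_clean, batch_size - k)
    return ways / comb(dataset_size, batch_size)


def expected_noisy(dataset_size, noise_rate, batch_size):
    """Mean number of noisy samples per batch under the same assumption."""
    return sum(
        k * noisy_count_pmf(k, dataset_size, noise_rate, batch_size)
        for k in range(batch_size + 1)
    )
```

Under these assumptions the expected count is the batch size multiplied by the noise rate, which is the quantity a noise-rate parameter would control.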
1 code implementation • ACL 2021 • Wenkai Zhang, Hongyu Lin, Xianpei Han, Le Sun
Distant supervision tackles the data bottleneck in NER by automatically generating training instances via dictionary matching.
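Dictionary matching of this kind can be sketched as greedy longest-match BIO tagging. This is a minimal illustration under assumed conventions (lowercased, space-joined dictionary keys, a hypothetical `max_len` cap on span length), not the authors' pipeline.

```python
def distant_label(tokens, dictionary, max_len=5):
    """Tag tokens with BIO labels by greedy longest match against a dictionary.

    `dictionary` maps lowercased, space-joined entity strings to a type label.
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + length]).lower()
            if span in dictionary:
                label = dictionary[span]
                tags[i] = "B-" + label
                for j in range(i + 1, i + length):
                    tags[j] = "I-" + label
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return tags
```

With a dictionary that contains "new york" but not "new york city", the token "City" in "New York City" is left untagged, which shows how incomplete dictionaries produce partial or missing labels in distantly supervised data.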
no code implementations • 30 Mar 2021 • Zizhang Wu, Man Wang, Jason Wang, Wenkai Zhang, Muqing Fang, Tianhao Xu
It's worth noting that the owner-member relationship between wheels and vehicles makes a significant contribution to the 3D perception of vehicles, especially in the embedded environment.