1 code implementation • Findings (NAACL) 2022 • Jinpeng Hu, He Zhao, Dan Guo, Xiang Wan, Tsung-Hui Chang
In doing so, label information contained in the embedding vectors can be effectively transferred to the target domain, and the Bi-LSTM can further model label relationships across domains via a pre-train-then-fine-tune setting.
Cross-Domain Named Entity Recognition • Named Entity Recognition +2
2 code implementations • 16 Apr 2024 • Bin Ren, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, Wei Zhai, Renjing Pei, Jiaming Guo, Songcen Xu, Yang Cao, ZhengJun Zha, Yan Wang, Yi Liu, Qing Wang, Gang Zhang, Liou Zhang, Shijie Zhao, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Xin Liu, Min Yan, Menghan Zhou, Yiqiang Yan, Yixuan Liu, Wensong Chan, Dehua Tang, Dong Zhou, Li Wang, Lu Tian, Barsoum Emad, Bohan Jia, Junbo Qiao, Yunshuai Zhou, Yun Zhang, Wei Li, Shaohui Lin, Shenglong Zhou, Binbin Chen, Jincheng Liao, Suiyi Zhao, Zhao Zhang, Bo wang, Yan Luo, Yanyan Wei, Feng Li, Mingshen Wang, Yawei Li, Jinhan Guan, Dehua Hu, Jiawei Yu, Qisheng Xu, Tao Sun, Long Lan, Kele Xu, Xin Lin, Jingtong Yue, Lehan Yang, Shiyi Du, Lu Qi, Chao Ren, Zeyu Han, YuHan Wang, Chaolin Chen, Haobo Li, Mingjun Zheng, Zhongbao Yang, Lianhong Song, Xingzhuo Yan, Minghan Fu, Jingyi Zhang, Baiang Li, Qi Zhu, Xiaogang Xu, Dan Guo, Chunle Guo, Jiadi Chen, Huanhuan Long, Chunjiang Duanmu, Xiaoyan Lei, Jie Liu, Weilin Jia, Weifeng Cao, Wenlong Zhang, Yanyu Mao, Ruilong Guo, Nihao Zhang, Qian Wang, Manoj Pandey, Maksym Chernozhukov, Giang Le, Shuli Cheng, Hongyuan Wang, Ziyan Wei, Qingting Tang, Liejun Wang, Yongming Li, Yanhui Guo, Hao Xu, Akram Khatami-Rizi, Ahmad Mahmoudi-Aznaveh, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi
In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking.
1 code implementation • 21 Mar 2024 • Jingjing Hu, Dan Guo, Kun Li, Zhan Si, Xun Yang, Xiaojun Chang, Meng Wang
Inspired by the activity-silent and persistent activity mechanisms in human visual perception biology, we design a Unified Static and Dynamic Network (UniSDNet), to learn the semantic association between the video and text/audio queries in a cross-modal environment for efficient video grounding.
1 code implementation • 17 Mar 2024 • Jing Zhang, Liang Zheng, Dan Guo, Meng Wang
This paper develops small vision-language models to understand visual art: given an artwork, the model aims to identify its emotion category and explain this prediction in natural language.
2 code implementations • 12 Mar 2024 • Fei Wang, Dan Guo, Kun Li, Zhun Zhong, Meng Wang
To this end, we present FD4MM, a new paradigm of Frequency Decoupling for Motion Magnification with a Multi-level Isomorphic Architecture to capture multi-level high-frequency details and a stable low-frequency structure (motion field) in video space.
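The frequency decoupling idea above can be illustrated with a minimal sketch (not the FD4MM implementation itself, which operates on learned multi-level features): a frame is split into a stable low-frequency structure via local averaging and a high-frequency residual carrying fine details, and the two components can then be processed separately.

```python
import numpy as np

def frequency_decouple(frame, k=5):
    """Toy frequency decoupling: split a (H, W) frame into a
    low-frequency structure (k x k local mean) and a high-frequency
    residual (details). By construction, low + high == frame."""
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")
    low = np.zeros_like(frame, dtype=float)
    for dy in range(k):
        for dx in range(k):
            low += padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    low /= k * k          # local mean = low-frequency structure
    high = frame - low    # residual = high-frequency details
    return low, high
```

A magnification method can then amplify `high` (where subtle motion lives) while keeping `low` stable, before recombining.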
1 code implementation • 8 Mar 2024 • Dan Guo, Kun Li, Bin Hu, Yan Zhang, Meng Wang
It offers insights into the feelings and intentions of individuals and is important for human-oriented applications such as emotion recognition and psychological assessment.
Ranked #1 on Micro-Action Recognition on MA-52
1 code implementation • 20 Dec 2023 • Zhangbin Li, Dan Guo, Jinxing Zhou, Jing Zhang, Meng Wang
These selected pairs are constrained to have larger similarity values than the mismatched pairs.
Audio-Visual Question Answering (AVQA) +4
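The pair constraint described above can be sketched as a hinge-style ranking loss (a minimal illustration, not the paper's exact objective; the function and index arguments are hypothetical): the cosine similarity of each selected matched pair must exceed that of a mismatched pair by a margin.

```python
import numpy as np

def cosine(a, b):
    """Row-wise cosine similarity between two (N, d) arrays."""
    return np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))

def pair_ranking_loss(audio, visual, pos_idx, neg_idx, margin=0.2):
    """Hinge loss pushing the similarity of selected (matched)
    audio-visual pairs above that of mismatched pairs by `margin`."""
    pos_sim = cosine(audio[pos_idx], visual[pos_idx])  # matched pairs
    neg_sim = cosine(audio[pos_idx], visual[neg_idx])  # mismatched pairs
    return np.mean(np.maximum(0.0, margin - pos_sim + neg_sim))
```

When matched pairs are already separated from mismatched ones by the margin, the loss is zero and the constraint is satisfied.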
1 code implementation • 7 Dec 2023 • Fei Wang, Dan Guo, Kun Li, Meng Wang
Then, we introduce a novel dynamic filter that eliminates noise cues and preserves critical features in the motion magnification and amplification generation phases.
no code implementations • 13 Oct 2023 • Sheng Zhou, Dan Guo, Jia Li, Xun Yang, Meng Wang
The associations between these repetitive objects are superfluous for answer reasoning; (2) two spatially distant OCR tokens detected in the image frequently have weak semantic dependencies for answer reasoning; and (3) the co-existence of nearby objects and tokens may be indicative of important visual cues for predicting answers.
no code implementations • 12 Sep 2023 • Jiaxiu Li, Kun Li, Jia Li, Guoliang Chen, Dan Guo, Meng Wang
Compared with the general video grounding task, MTVG focuses on meticulous actions and changes on the face.
no code implementations • 25 Aug 2023 • Jia Li, Wei Qian, Kun Li, Qi Li, Dan Guo, Meng Wang
Specifically, we achieve results of 0.8492 and 0.8439 for MuSe-Personalisation in terms of arousal and valence CCC.
1 code implementation • 15 Aug 2023 • Wei Qian, Dan Guo, Kun Li, Xilan Tian, Meng Wang
Specifically, the proposed Dual-TL uses a Spatial TokenLearner (S-TL) to explore associations in different facial ROIs, which promises the rPPG prediction far away from noisy ROI disturbances.
no code implementations • 11 Aug 2023 • Kun Li, Dan Guo, Meng Wang
First, we employed a shared feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality.
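The bidirectional co-attention step can be sketched as follows (a minimal numpy illustration under assumed shapes, not the paper's implementation): both directions share one affinity matrix between video frames and query tokens, softmax-normalized along opposite axes.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(video, query):
    """Cross-modal co-attention over features from a shared encoder.
    video: (T, d) frame features; query: (L, d) token features.
    Returns query-attended video features and video-attended query features."""
    sim = video @ query.T                  # (T, L) affinity matrix
    v2q = softmax(sim, axis=1) @ query     # video-to-query attention: (T, d)
    q2v = softmax(sim.T, axis=1) @ video   # query-to-video attention: (L, d)
    return v2q, q2v
```

Each output row is a convex combination of the other modality's features, so discriminative frames and tokens reinforce each other.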
no code implementations • 11 Aug 2023 • Yen Nhi Truong Vu, Dan Guo, Ahmed Taha, Jason Su, Thomas Paul Matthews
Deep-learning-based object detection methods show promise for improving screening mammography, but high rates of false positives can hinder their effectiveness in clinical practice.
no code implementations • 3 Aug 2023 • Kun Li, Dan Guo, Guoliang Chen, Feiyang Liu, Meng Wang
In this paper, we present the solution of our team HFUT-VUT for the MultiMediate Grand Challenge 2023 at ACM Multimedia 2023.
1 code implementation • 20 Jul 2023 • Kun Li, Dan Guo, Guoliang Chen, Xinge Peng, Meng Wang
In this paper, we briefly introduce the solution of our team HFUT-VUT for the Micro-gesture Classification track of the MiGA challenge at IJCAI 2023.
Ranked #1 on Micro-gesture Recognition on iMiGUE
no code implementations • 4 Mar 2023 • Jinxing Zhou, Dan Guo, Yiran Zhong, Meng Wang
We perform extensive experiments on the LLP dataset and demonstrate that our method can generate high-quality segment-level pseudo labels with the help of our newly proposed loss and the label denoising strategy.
1 code implementation • 30 Jan 2023 • Jinxing Zhou, Xuyang Shen, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong
To deal with these problems, we propose a new baseline method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.
no code implementations • TMM 2022 • Zhao Xie, Jiansong Chen, Kewei Wu, Dan Guo, Richang Hong
In the global aggregation module, global prior knowledge is learned by aggregating the visual feature sequence of the video into a global vector.
Ranked #62 on Action Recognition on Something-Something V2
1 code implementation • 18 Nov 2022 • Jinxing Zhou, Dan Guo, Meng Wang
Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs).
1 code implementation • 14 Oct 2022 • Kang Liu, Feng Xue, Dan Guo, Le Wu, Shujie Li, Richang Hong
This paper aims at solving the mismatch problem between MFE and UIM, so as to generate high-quality embedding representations and better model multimodal user preferences.
1 code implementation • 10 Oct 2022 • Kang Liu, Feng Xue, Xiangnan He, Dan Guo, Richang Hong
In this work, we propose to model multi-grained popularity features and jointly learn them together with high-order connectivity, to match the differentiation of user preferences exhibited in popularity features.
no code implementations • 22 Jul 2022 • Jia Li, Jiantao Nie, Dan Guo, Richang Hong, Meng Wang
Here, we regard an expressive face as the comprehensive result of a set of facial muscle movements on one's poker face (i.e., emotionless face), inspired by the Facial Action Coding System.
Ranked #5 on Facial Expression Recognition (FER) on FER+
1 code implementation • 11 Jul 2022 • Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong
To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.
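One way to picture "injecting audio semantics as guidance" is a per-pixel gating scheme, sketched below as a toy stand-in (the function, gating form, and shapes are assumptions for illustration, not the paper's module): each spatial location in each frame is modulated by how strongly it responds to that frame's audio embedding.

```python
import numpy as np

def audio_visual_interaction(visual, audio):
    """Toy temporal pixel-wise audio-visual interaction.
    visual: (T, H, W, d) frame feature maps; audio: (T, d) per-frame embeddings.
    Each pixel's dot-product response to its frame's audio cue is squashed
    to a [0, 1] gate, and the gated audio embedding is added as guidance."""
    T, H, W, d = visual.shape
    a = audio[:, None, None, :]                       # broadcast to (T, 1, 1, d)
    score = (visual * a).sum(-1, keepdims=True) / np.sqrt(d)
    gate = 1.0 / (1.0 + np.exp(-score))               # per-pixel relevance in (0, 1)
    return visual + gate * a                          # audio-guided visual features
```

The guided features keep the visual shape, so a segmentation head can consume them unchanged while sounding regions receive a stronger audio-aligned signal.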
no code implementations • 14 Sep 2020 • Yijue Wang, Jieren Deng, Dan Guo, Chenghong Wang, Xianrui Meng, Hang Liu, Caiwen Ding, Sanguthevar Rajasekaran
Distributed learning, such as federated or collaborative learning, enables model training on decentralized user data by collecting only local gradients, so that data is processed close to its source for privacy.
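The gradient-sharing setup described above can be sketched in a few lines (a generic federated-averaging illustration, not this paper's protocol; the function name is hypothetical): clients compute gradients on local data, and only those gradients reach the server, which averages them into one update.

```python
import numpy as np

def federated_round(global_w, local_grads, lr=0.1):
    """One server round: average the clients' local gradients
    (raw data never leaves the clients) and apply a gradient step."""
    avg_grad = np.mean(local_grads, axis=0)  # aggregate across clients
    return global_w - lr * avg_grad          # updated global weights
```

This is precisely the interface gradient-leakage attacks target: even though only `local_grads` is transmitted, it can still reveal information about the underlying data.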
no code implementations • 24 Jun 2020 • Dan Guo, Yang Wang, Peipei Song, Meng Wang
Unsupervised image captioning with no annotations is an emerging challenge in computer vision, where existing approaches usually adopt GAN (Generative Adversarial Network) models.
1 code implementation • CVPR 2020 • Dan Guo, Hui Wang, Hanwang Zhang, Zheng-Jun Zha, Meng Wang
Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts.
Ranked #12 on Visual Dialog on VisDial v0.9 val