1 code implementation • Findings (NAACL) 2022 • Jinpeng Hu, He Zhao, Dan Guo, Xiang Wan, Tsung-Hui Chang
In doing so, label information contained in the embedding vectors can be effectively transferred to the target domain, and the Bi-LSTM can further model label relationships across domains via a pre-train-then-fine-tune setting.
Cross-Domain Named Entity Recognition • Named Entity Recognition +2
2 code implementations • 16 Apr 2024 • Bin Ren, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, Wei Zhai, Renjing Pei, Jiaming Guo, Songcen Xu, Yang Cao, ZhengJun Zha, Yan Wang, Yi Liu, Qing Wang, Gang Zhang, Liou Zhang, Shijie Zhao, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Xin Liu, Min Yan, Menghan Zhou, Yiqiang Yan, Yixuan Liu, Wensong Chan, Dehua Tang, Dong Zhou, Li Wang, Lu Tian, Barsoum Emad, Bohan Jia, Junbo Qiao, Yunshuai Zhou, Yun Zhang, Wei Li, Shaohui Lin, Shenglong Zhou, Binbin Chen, Jincheng Liao, Suiyi Zhao, Zhao Zhang, Bo wang, Yan Luo, Yanyan Wei, Feng Li, Mingshen Wang, Yawei Li, Jinhan Guan, Dehua Hu, Jiawei Yu, Qisheng Xu, Tao Sun, Long Lan, Kele Xu, Xin Lin, Jingtong Yue, Lehan Yang, Shiyi Du, Lu Qi, Chao Ren, Zeyu Han, YuHan Wang, Chaolin Chen, Haobo Li, Mingjun Zheng, Zhongbao Yang, Lianhong Song, Xingzhuo Yan, Minghan Fu, Jingyi Zhang, Baiang Li, Qi Zhu, Xiaogang Xu, Dan Guo, Chunle Guo, Jiadi Chen, Huanhuan Long, Chunjiang Duanmu, Xiaoyan Lei, Jie Liu, Weilin Jia, Weifeng Cao, Wenlong Zhang, Yanyu Mao, Ruilong Guo, Nihao Zhang, Qian Wang, Manoj Pandey, Maksym Chernozhukov, Giang Le, Shuli Cheng, Hongyuan Wang, Ziyan Wei, Qingting Tang, Liejun Wang, Yongming Li, Yanhui Guo, Hao Xu, Akram Khatami-Rizi, Ahmad Mahmoudi-Aznaveh, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi
In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking.
1 code implementation • 21 Mar 2024 • Jingjing Hu, Dan Guo, Kun Li, Zhan Si, Xun Yang, Xiaojun Chang, Meng Wang
Inspired by the activity-silent and persistent activity mechanisms in human visual perception biology, we design a Unified Static and Dynamic Network (UniSDNet), to learn the semantic association between the video and text/audio queries in a cross-modal environment for efficient video grounding.
1 code implementation • 17 Mar 2024 • Jing Zhang, Liang Zheng, Dan Guo, Meng Wang
This paper develops small vision-language models to understand visual art: given an artwork, the model aims to identify its emotion category and explain this prediction in natural language.
2 code implementations • 12 Mar 2024 • Fei Wang, Dan Guo, Kun Li, Zhun Zhong, Meng Wang
To this end, we present FD4MM, a new paradigm of Frequency Decoupling for Motion Magnification with a Multi-level Isomorphic Architecture to capture multi-level high-frequency details and a stable low-frequency structure (motion field) in video space.
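The frequency decoupling idea above can be illustrated with a minimal sketch (not the FD4MM implementation itself, which operates on learned multi-level features): a frame is split into a stable low-frequency structure via local averaging and a high-frequency residual carrying fine details, and the two components can then be processed separately.

```python
import numpy as np

def frequency_decouple(frame, k=5):
    """Toy frequency decoupling: split a (H, W) frame into a
    low-frequency structure (k x k local mean) and a high-frequency
    residual (details). By construction, low + high == frame."""
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")
    low = np.zeros_like(frame, dtype=float)
    for dy in range(k):
        for dx in range(k):
            low += padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    low /= k * k          # local mean = low-frequency structure
    high = frame - low    # residual = high-frequency details
    return low, high
```

A magnification method can then amplify `high` (where subtle motion lives) while keeping `low` stable, before recombining.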
1 code implementation • 8 Mar 2024 • Dan Guo, Kun Li, Bin Hu, Yan Zhang, Meng Wang
It offers insights into the feelings and intentions of individuals and is important for human-oriented applications such as emotion recognition and psychological assessment.
Ranked #1 on Micro-Action Recognition on MA-52
1 code implementation • 20 Dec 2023 • Zhangbin Li, Dan Guo, Jinxing Zhou, Jing Zhang, Meng Wang
These selected pairs are constrained to have larger similarity values than the mismatched pairs.
Audio-Visual Question Answering (AVQA) +4
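The pair constraint described above can be sketched as a hinge-style ranking loss (a minimal illustration, not the paper's exact objective; the function and index arguments are hypothetical): the cosine similarity of each selected matched pair must exceed that of a mismatched pair by a margin.

```python
import numpy as np

def cosine(a, b):
    """Row-wise cosine similarity between two (N, d) arrays."""
    return np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))

def pair_ranking_loss(audio, visual, pos_idx, neg_idx, margin=0.2):
    """Hinge loss pushing the similarity of selected (matched)
    audio-visual pairs above that of mismatched pairs by `margin`."""
    pos_sim = cosine(audio[pos_idx], visual[pos_idx])  # matched pairs
    neg_sim = cosine(audio[pos_idx], visual[neg_idx])  # mismatched pairs
    return np.mean(np.maximum(0.0, margin - pos_sim + neg_sim))
```

When matched pairs are already separated from mismatched ones by the margin, the loss is zero and the constraint is satisfied.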
1 code implementation • 7 Dec 2023 • Fei Wang, Dan Guo, Kun Li, Meng Wang
Then, we introduce a novel dynamic filter that eliminates noise cues and preserves critical features in the motion magnification and amplification generation phases.
no code implementations • 13 Oct 2023 • Sheng Zhou, Dan Guo, Jia Li, Xun Yang, Meng Wang
The associations between these repetitive objects are superfluous for answer reasoning; (2) two spatially distant OCR tokens detected in the image frequently have weak semantic dependencies for answer reasoning; and (3) the co-existence of nearby objects and tokens may be indicative of important visual cues for predicting answers.
no code implementations • 12 Sep 2023 • Jiaxiu Li, Kun Li, Jia Li, Guoliang Chen, Dan Guo, Meng Wang
Compared with the general video grounding task, MTVG focuses on meticulous actions and changes on the face.
no code implementations • 25 Aug 2023 • Jia Li, Wei Qian, Kun Li, Qi Li, Dan Guo, Meng Wang
Specifically, we achieve results of 0.8492 and 0.8439 for MuSe-Personalisation in terms of arousal and valence CCC.
1 code implementation • 15 Aug 2023 • Wei Qian, Dan Guo, Kun Li, Xilan Tian, Meng Wang
Specifically, the proposed Dual-TL uses a Spatial TokenLearner (S-TL) to explore associations in different facial ROIs, which promises the rPPG prediction far away from noisy ROI disturbances.
no code implementations • 11 Aug 2023 • Kun Li, Dan Guo, Meng Wang
First, we employed a shared feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality.
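The bidirectional co-attention step can be sketched as follows (a minimal numpy illustration under assumed shapes, not the paper's implementation): both directions share one affinity matrix between video frames and query tokens, softmax-normalized along opposite axes.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(video, query):
    """Cross-modal co-attention over features from a shared encoder.
    video: (T, d) frame features; query: (L, d) token features.
    Returns query-attended video features and video-attended query features."""
    sim = video @ query.T                  # (T, L) affinity matrix
    v2q = softmax(sim, axis=1) @ query     # video-to-query attention: (T, d)
    q2v = softmax(sim.T, axis=1) @ video   # query-to-video attention: (L, d)
    return v2q, q2v
```

Each output row is a convex combination of the other modality's features, so discriminative frames and tokens reinforce each other.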
no code implementations • 11 Aug 2023 • Yen Nhi Truong Vu, Dan Guo, Ahmed Taha, Jason Su, Thomas Paul Matthews
Deep-learning-based object detection methods show promise for improving screening mammography, but high rates of false positives can hinder their effectiveness in clinical practice.
no code implementations • 3 Aug 2023 • Kun Li, Dan Guo, Guoliang Chen, Feiyang Liu, Meng Wang
In this paper, we present the solution of our team HFUT-VUT for the MultiMediate Grand Challenge 2023 at ACM Multimedia 2023.
1 code implementation • 20 Jul 2023 • Kun Li, Dan Guo, Guoliang Chen, Xinge Peng, Meng Wang
In this paper, we briefly introduce the solution of our team HFUT-VUT for the Micro-gesture Classification track of the MiGA challenge at IJCAI 2023.
Ranked #1 on Micro-gesture Recognition on iMiGUE
no code implementations • 4 Mar 2023 • Jinxing Zhou, Dan Guo, Yiran Zhong, Meng Wang
We perform extensive experiments on the LLP dataset and demonstrate that our method can generate high-quality segment-level pseudo labels with the help of our newly proposed loss and the label denoising strategy.
1 code implementation • 30 Jan 2023 • Jinxing Zhou, Xuyang Shen, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong
To deal with these problems, we propose a new baseline method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.
no code implementations • TMM 2022 • Zhao Xie, Jiansong Chen, Kewei Wu, Dan Guo, Richang Hong
In the global aggregation module, global prior knowledge is learned by aggregating the visual feature sequence of the video into a global vector.
Ranked #62 on Action Recognition on Something-Something V2
1 code implementation • 18 Nov 2022 • Jinxing Zhou, Dan Guo, Meng Wang
Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs).
1 code implementation • 14 Oct 2022 • Kang Liu, Feng Xue, Dan Guo, Le Wu, Shujie Li, Richang Hong
This paper aims at solving the mismatch problem between MFE and UIM, so as to generate high-quality embedding representations and better model multimodal user preferences.
1 code implementation • 10 Oct 2022 • Kang Liu, Feng Xue, Xiangnan He, Dan Guo, Richang Hong
In this work, we propose to model multi-grained popularity features and jointly learn them together with high-order connectivity, to match the differentiation of user preferences exhibited in popularity features.
no code implementations • 22 Jul 2022 • Jia Li, Jiantao Nie, Dan Guo, Richang Hong, Meng Wang
Here, we regard an expressive face as the comprehensive result of a set of facial muscle movements on one's poker face (i.e., emotionless face), inspired by the Facial Action Coding System.
Ranked #5 on Facial Expression Recognition (FER) on FER+
1 code implementation • 11 Jul 2022 • Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong
To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.
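One way to picture "injecting audio semantics as guidance" is a per-pixel gating scheme, sketched below as a toy stand-in (the function, gating form, and shapes are assumptions for illustration, not the paper's module): each spatial location in each frame is modulated by how strongly it responds to that frame's audio embedding.

```python
import numpy as np

def audio_visual_interaction(visual, audio):
    """Toy temporal pixel-wise audio-visual interaction.
    visual: (T, H, W, d) frame feature maps; audio: (T, d) per-frame embeddings.
    Each pixel's dot-product response to its frame's audio cue is squashed
    to a [0, 1] gate, and the gated audio embedding is added as guidance."""
    T, H, W, d = visual.shape
    a = audio[:, None, None, :]                       # broadcast to (T, 1, 1, d)
    score = (visual * a).sum(-1, keepdims=True) / np.sqrt(d)
    gate = 1.0 / (1.0 + np.exp(-score))               # per-pixel relevance in (0, 1)
    return visual + gate * a                          # audio-guided visual features
```

The guided features keep the visual shape, so a segmentation head can consume them unchanged while sounding regions receive a stronger audio-aligned signal.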
no code implementations • 14 Sep 2020 • Yijue Wang, Jieren Deng, Dan Guo, Chenghong Wang, Xianrui Meng, Hang Liu, Caiwen Ding, Sanguthevar Rajasekaran
Distributed learning, such as federated or collaborative learning, enables model training on decentralized user data by collecting only local gradients, so that data is processed close to its source for privacy.
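The gradient-sharing setup described above can be sketched in a few lines (a generic federated-averaging illustration, not this paper's protocol; the function name is hypothetical): clients compute gradients on local data, and only those gradients reach the server, which averages them into one update.

```python
import numpy as np

def federated_round(global_w, local_grads, lr=0.1):
    """One server round: average the clients' local gradients
    (raw data never leaves the clients) and apply a gradient step."""
    avg_grad = np.mean(local_grads, axis=0)  # aggregate across clients
    return global_w - lr * avg_grad          # updated global weights
```

This is precisely the interface gradient-leakage attacks target: even though only `local_grads` is transmitted, it can still reveal information about the underlying data.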
no code implementations • 24 Jun 2020 • Dan Guo, Yang Wang, Peipei Song, Meng Wang
Unsupervised image captioning with no annotations is an emerging challenge in computer vision, where existing approaches usually adopt GAN (Generative Adversarial Network) models.
1 code implementation • CVPR 2020 • Dan Guo, Hui Wang, Hanwang Zhang, Zheng-Jun Zha, Meng Wang
Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts.
Ranked #12 on Visual Dialog on VisDial v0.9 val