no code implementations • Findings (EMNLP) 2021 • Yiming Wang, Ximing Li, Xiaotang Zhou, Jihong Ouyang
Short text has become an increasingly popular form of text data, e.g., Twitter posts, news titles, and product reviews.
1 code implementation • COLING 2022 • Yiming Wang, Qianren Mao, Junnan Liu, Weifeng Jiang, Hongdong Zhu, JianXin Li
Labeling large amounts of extractive summarization data is often prohibitively expensive due to time, financial, and expertise constraints, which poses great challenges to incorporating summarization systems in practical applications.
no code implementations • 6 May 2024 • Yanxi Chen, Chunxiao Li, Xinyang Dai, Jinhuan Li, Weiyu Sun, Yiming Wang, Renyuan Zhang, Tinghe Zhang, Bo Wang
Multi-label learning (MLL) requires comprehensive multi-semantic annotations that are hard to fully obtain, thus often resulting in missing-label scenarios.
no code implementations • 25 Apr 2024 • Xiang He, Weiye Song, Yiming Wang, Fabio Poiesi, Ji Yi, Manishi Desai, Quanqing Xu, Kongzheng Yang, Yi Wan
Automatic retinal layer segmentation with medical images, such as optical coherence tomography (OCT) images, serves as an important tool for diagnosing ophthalmic diseases.
1 code implementation • 16 Apr 2024 • Alessandro Conti, Enrico Fini, Massimiliano Mancini, Paolo Rota, Yiming Wang, Elisa Ricci
To address VIC, we propose Category Search from External Databases (CaSED), a training-free method that leverages a pre-trained vision-language model and an external database.
1 code implementation • 8 Apr 2024 • Benedetta Liberatori, Alessandro Conti, Paolo Rota, Yiming Wang, Elisa Ricci
To this aim, we introduce a novel method that performs Test-Time adaptation for Temporal Action Localization (T3AL).
no code implementations • 1 Apr 2024 • Luca Zanella, Willi Menapace, Massimiliano Mancini, Yiming Wang, Elisa Ricci
Video anomaly detection (VAD) aims to temporally locate abnormal events in a video.
no code implementations • 15 Mar 2024 • Francesco Taioli, Stefano Rosa, Alberto Castellini, Lorenzo Natale, Alessio Del Bue, Alessandro Farinelli, Marco Cristani, Yiming Wang
Moreover, we formally define the task of Instruction Error Detection and Localization, and establish an evaluation protocol on top of our benchmark dataset.
no code implementations • 28 Jan 2024 • Zisen Kong, Zhiqiang Fu, Dongxia Chang, Yiming Wang, Yao Zhao
We jointly optimize the construction of the latent consistent anchor graph and the feature transformation to generate a discriminative anchor graph.
1 code implementation • 18 Jan 2024 • Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, Tian Xia, Lizhen Xu, Binglin Zhou, Fangqi Li, Zhuosheng Zhang, Rui Wang, Gongshen Liu
We introduce R-Judge, a benchmark crafted to evaluate the proficiency of LLMs in judging and identifying safety risks given agent interaction records.
no code implementations • 6 Dec 2023 • Luigi Riz, Cristiano Saltori, Yiming Wang, Elisa Ricci, Fabio Poiesi
Firstly, it introduces the novel task of NCD for point cloud semantic segmentation.
1 code implementation • 5 Dec 2023 • Victor G. Turrisi da Costa, Nicola Dall'Asen, Yiming Wang, Nicu Sebe, Elisa Ricci
Few-shot image classification aims to learn an image classifier using only a small set of labeled examples per class.
no code implementations • 4 Dec 2023 • Nicola Dall'Asen, Willi Menapace, Elia Peruzzo, Enver Sangineto, Yiming Wang, Elisa Ricci
The process of painting fosters creativity and rational planning.
1 code implementation • 4 Dec 2023 • Guofeng Mei, Luigi Riz, Yiming Wang, Fabio Poiesi
Zero-shot 3D point cloud understanding can be achieved via 2D Vision-Language Models (VLMs).
no code implementations • 23 Nov 2023 • Yiming Wang, Yu Lin, Xiaodong Zeng, Guannan Zhang
To our knowledge, our proposed framework is the first efficient and privacy-preserving LLM solution in the literature.
no code implementations • 23 Nov 2023 • Yiming Wang, Yuxuan Song, Minkai Xu, Rui Wang, Hao Zhou, WeiYing Ma
Our key innovation is to develop a multi-stage diffusion process.
no code implementations • 20 Nov 2023 • Yiming Wang, Yu Lin, Xiaodong Zeng, Guannan Zhang
Further investigation into weight update matrices of MultiLoRA exhibits reduced dependency on top singular vectors and more democratic unitary transform contributions.
1 code implementation • 20 Nov 2023 • Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, Zhiwei He, Yiming Wang, Mark Gerstein, Rui Wang, Gongshen Liu, Hai Zhao
Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrably evidenced by their formidable empirical performance across a spectrum of complex reasoning tasks.
no code implementations • 19 Oct 2023 • Yiming Wang, Qian Huang, Bin Tang, Huashan Sun, Xing Li
In addition, most approaches ignore the spatial and channel redundancy.
1 code implementation • 4 Oct 2023 • Luca Zanella, Benedetta Liberatori, Willi Menapace, Fabio Poiesi, Yiming Wang, Elisa Ricci
We tackle the complex problem of detecting and recognising anomalies in surveillance videos at the frame level, utilising only video-level supervision.
no code implementations • 3 Oct 2023 • Yiming Wang, Jinyu Li
In this paper, we aim to reduce model size by reparameterizing model weights across Transformer encoder layers and assuming a special weight composition and structure.
no code implementations • 13 Sep 2023 • Weide Liu, Zhonghua Wu, Yiming Wang, Henghui Ding, Fayao Liu, Jie Lin, Guosheng Lin
In this work, we tackle the challenging problem of long-tailed image recognition.
no code implementations • 14 Aug 2023 • Runyu Jiao, Yi Wan, Fabio Poiesi, Yiming Wang
The increasing popularity of compact and inexpensive cameras, e.g., dash cameras, body cameras, and cameras equipped on robots, has sparked a growing interest in detecting anomalies within dynamic scenes recorded by moving cameras.
1 code implementation • 28 Jul 2023 • Youjie Zhou, Guofeng Mei, Yiming Wang, Fabio Poiesi, Yi Wan
This paper presents an investigation into the estimation of optical and scene flow using RGBD information in scenarios where the RGB modality is affected by noise or captured in dark environments.
1 code implementation • 30 Jun 2023 • Yiming Wang, Zhuosheng Zhang, Pei Zhang, Baosong Yang, Rui Wang
Neural-symbolic methods have demonstrated efficiency in enhancing the reasoning abilities of large language models (LLMs).
1 code implementation • 11 Jun 2023 • Yuguang Yang, Yiming Wang, Shupeng Geng, Runqi Wang, Yimi Wang, Sheng Wu, Baochang Zhang
The emergence of cross-modal foundation models has introduced numerous approaches grounded in text-image retrieval.
1 code implementation • NeurIPS 2023 • Alessandro Conti, Enrico Fini, Massimiliano Mancini, Paolo Rota, Yiming Wang, Elisa Ricci
We thus formalize a novel task, termed as Vocabulary-free Image Classification (VIC), where we aim to assign to an input image a class that resides in an unconstrained language-induced semantic space, without the prerequisite of a known vocabulary.
1 code implementation • 24 May 2023 • Błażej Leporowski, Arian Bakhtiarnia, Nicole Bonnici, Adrian Muscat, Luca Zanella, Yiming Wang, Alexandros Iosifidis
We introduce the first audio-visual dataset for traffic anomaly detection taken from real-world scenes, called MAVAD, with a diverse range of weather and illumination conditions.
1 code implementation • 22 May 2023 • Yiming Wang, Zhuosheng Zhang, Rui Wang
Further, we propose a Summary Chain-of-Thought (SumCoT) technique to elicit LLMs to generate summaries step by step, which helps them integrate more fine-grained details of source documents into the final summaries that correlate with the human writing mindset.
1 code implementation • 20 Mar 2023 • Francesco Giuliari, Gianluca Scarpellini, Stuart James, Yiming Wang, Alessio Del Bue
We present Positional Diffusion, a plug-and-play graph formulation with Diffusion Probabilistic Models to address positional reasoning.
no code implementations • 2 Jan 2023 • Seyed S. Mohammadi, Nuno F. Duarte, Dimitris Dimou, Yiming Wang, Matteo Taiana, Pietro Morerio, Atabak Dehban, Plinio Moreno, Alexandre Bernardino, Alessio Del Bue, Jose Santos-Victor
However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses.
1 code implementation • ICCV 2023 • Ze Yang, Ruibo Li, Evan Ling, Chi Zhang, Yiming Wang, Dezhao Huang, Keng Teck Ma, Minhoe Hur, Guosheng Lin
To address this issue, we propose a new label-guided knowledge distillation (LGKD) loss, where the old model output is expanded and transplanted (with the guidance of the ground truth label) to form a semantically appropriate class correspondence with the new model output.
Ranked #1 on Continual Semantic Segmentation on ScanNet
no code implementations • 17 Dec 2022 • Baode Gao, Guangpeng Zhan, Hanzhang Wang, Yiming Wang, Shengxin Zhu
Accurate prediction of users' responses to items is one of the main aims of many computational advising applications.
1 code implementation • ICCV 2023 • Yiming Wang, Qin Han, Marc Habermann, Kostas Daniilidis, Christian Theobalt, Lingjie Liu
Recent methods for neural surface representation and rendering, for example NeuS, have demonstrated the remarkably high-quality reconstruction of static scenes.
1 code implementation • 5 Dec 2022 • Ce Zheng, Yiming Wang, Baobao Chang
Such methods usually model role classification as naive multi-class classification and treat arguments individually, which neglects label semantics and interactions between arguments and thus hinders the performance and generalization of models.
1 code implementation • 18 Nov 2022 • Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu
We introduce DS-1000, a code generation benchmark with a thousand data science problems spanning seven Python libraries, such as NumPy and Pandas.
no code implementations • 10 Nov 2022 • Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang
In this paper, we investigate SSL for streaming multi-talker speech recognition, which generates transcriptions of overlapping speakers in a streaming fashion.
no code implementations • 8 Nov 2022 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Jie Wen, Yao Zhao
Multi-view representation learning has developed rapidly over the past decades and has been applied in many fields.
no code implementations • 1 Nov 2022 • Francesco Giuliari, Geri Skenderi, Marco Cristani, Alessio Del Bue, Yiming Wang
With the proposed graph-based scene representation, we estimate the unknown position of the target object using a Graph Neural Network that implements a novel attentional message passing mechanism.
no code implementations • 1 Nov 2022 • Mengdie Wang, Liyuan Shang, Suyun Zhao, Yiming Wang, Hong Chen, Cuiping Li, XiZhao Wang
Accordingly, the query results, guided by oracles with distinctive demands, may drive the OCC's clustering results in a desired orientation.
1 code implementation • 20 Oct 2022 • Giulio Mattolin, Luca Zanella, Elisa Ricci, Yiming Wang
Unsupervised Domain Adaptation (UDA) for object detection aims to adapt a model trained on a source domain to detect instances from a new target domain for which annotations are not available.
no code implementations • 16 Oct 2022 • Ruchao Fan, Yiming Wang, Yashesh Gaur, Jinyu Li
We examine CTCBERT on IDs from HuBERT Iter1, HuBERT Iter2, and PBERT.
1 code implementation • 11 Oct 2022 • Alessandro Conti, Paolo Rota, Yiming Wang, Elisa Ricci
Automatically understanding emotions from visual data is a fundamental task for human behaviour understanding.
Cross-Domain Facial Expression Recognition, Facial Expression Recognition (FER) +2
no code implementations • 25 Aug 2022 • Yiming Wang, Qingzhe Gao, Libin Liu, Lingjie Liu, Christian Theobalt, Baoquan Chen
The learned representation can be used to synthesize novel view images of an arbitrary person from a sparse set of cameras, and further animate them with the user's pose control.
1 code implementation • 9 Jul 2022 • Bin Ren, Hao Tang, Yiming Wang, Xia Li, Wei Wang, Nicu Sebe
For semantic-guided cross-view image translation, it is crucial to learn where to sample pixels from the source view image and where to reallocate them guided by the target view semantic map, especially when there is little overlap or drastic view difference between the source and target images.
no code implementations • 21 Jun 2022 • Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei
Recently, masked prediction pre-training has seen remarkable progress in self-supervised learning (SSL) for speech recognition.
Automatic Speech Recognition (ASR) +3
no code implementations • 2 Jun 2022 • Weide Liu, Zhonghua Wu, Yiming Wang, Henghui Ding, Fayao Liu, Jie Lin, Guosheng Lin
Previous long-tailed recognition methods commonly focus on the data augmentation or re-balancing strategy of the tail classes to give more attention to tail classes during the model training.
Ranked #9 on Long-tail Learning on CIFAR-10-LT (ρ=10)
1 code implementation • CVPR 2022 • Francesco Giuliari, Geri Skenderi, Marco Cristani, Yiming Wang, Alessio Del Bue
The SCG is used to estimate the unknown position of the target object in two steps: first, we feed the SCG into a novel Proximity Prediction Network, a graph neural network that uses attention to perform distance prediction between the node representing the target object and the nodes representing the observed objects in the SCG; second, we propose a Localisation Module based on circular intersection to estimate the object position using all the predicted pairwise distances in order to be independent of any reference system.
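The second step described above, recovering a position from predicted pairwise distances, can be illustrated with a minimal sketch. This is not the paper's Localisation Module: it swaps in a standard linearized least-squares trilateration in 2D, and it assumes the observed objects have known coordinates in some common frame. The function name `locate_from_distances` and the sample coordinates are hypothetical.

```python
import math


def locate_from_distances(anchors, distances):
    """Estimate a 2D position from distances to known anchor points.

    Illustrative least-squares trilateration (an assumption, not the
    method of the listed paper). Subtracting the first circle equation
    from the others gives linear equations in (x, y):
        2(xi - x0) x + 2(yi - y0) y
            = d0^2 - di^2 + xi^2 - x0^2 + yi^2 - y0^2
    which are solved through the 2x2 normal equations.
    """
    if len(anchors) != len(distances) or len(anchors) < 3:
        raise ValueError("need at least three anchors with one distance each")
    (x0, y0), d0 = anchors[0], distances[0]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = d0 ** 2 - di ** 2 + xi ** 2 - x0 ** 2 + yi ** 2 - y0 ** 2
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab ** 2
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    # Cramer's rule on the normal equations.
    x = (s_ac * s_bb - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y


# Hypothetical observed objects and a hypothetical target at (1, 1).
observed = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
target = (1.0, 1.0)
predicted = [math.hypot(target[0] - ox, target[1] - oy) for ox, oy in observed]
estimate = locate_from_distances(observed, predicted)
```

With noisy distance predictions the same normal equations return the least-squares compromise between the circles rather than an exact intersection.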
no code implementations • 7 Mar 2022 • Lizong Zhang, Yiming Wang, Bei Hui, Xiujian Zhang, Sijuan Liu, Shuxin Feng
Specifically, behavior recognition may even rely more on the modeling of temporal information containing short-range and long-range motions; this contrasts with computer vision tasks involving images that focus on the understanding of spatial information.
no code implementations • 1 Mar 2022 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Jie Wen, Yao Zhao
In this paper, we propose an augmentation-free graph contrastive learning framework, namely ACTIVE, to solve the problem of partial multi-view clustering.
1 code implementation • 10 Dec 2021 • Nicola Dall'Asen, Yiming Wang, Hao Tang, Luca Zanella, Elisa Ricci
To maintain the geometric attributes of the source face, i.e., the facial pose and expression, and to promote more natural face generation, we propose to exploit a Bipartite Graph to explicitly model the relations between the facial landmarks of the source identity and those of the condition identity through a deep model.
no code implementations • 1 Dec 2021 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Yao Zhao
In this paper, we consider the problem of multi-view clustering on incomplete views.
1 code implementation • 31 Oct 2021 • Youjie Zhou, Yiming Wang, Fabio Poiesi, Qi Qin, Yi Wan
We compare our L3D-based loop closure approach with recent approaches on LiDAR data and achieve state-of-the-art loop closure detection accuracy.
no code implementations • 28 Oct 2021 • Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang
The reconstruction module is used for auxiliary learning to improve the noise robustness of the learned representation and thus is not required during inference.
Automatic Speech Recognition (ASR) +8
no code implementations • 11 Oct 2021 • Yiming Wang, Jinyu Li, Heming Wang, Yao Qian, Chengyi Wang, Yu Wu
In this paper we propose wav2vec-Switch, a method to encode noise robustness into contextualized representations of speech via contrastive learning.
Automatic Speech Recognition (ASR) +7
1 code implementation • IEEE International Conference on Image Processing 2021 • Seyed Saber Mohammadi, Yiming Wang, Alessio Del Bue
We address 3D shape classification with partial point cloud inputs captured from multiple viewpoints around the object.
Ranked #2 on 3D Point Cloud Classification on ModelNet40
no code implementations • CVPR 2021 • Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang
This paper presents a novel, simple yet robust self-representation method, i.e., Double Low-Rank Representation with Projection Distance penalty (DLRRPD), for clustering.
no code implementations • 11 May 2021 • Yiming Wang, Dongxia Chang, Zhiqiang Fu, Yao Zhao
Specifically, a multiple graph auto-encoder (M-GAE) is designed to flexibly encode the complementary information of multi-view data using a multi-graph attention fusion encoder.
no code implementations • 30 Apr 2021 • Yiming Wang, Dongxia Chang, Zhiqian Fu, Yao Zhao
This paper is the first attempt to employ a graph pooling technique for node clustering, and we propose a novel dual graph embedding network (DGEN), which is designed as a two-step graph encoder connected by a graph pooling layer to learn the graph embedding.
no code implementations • 26 Apr 2021 • Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang
In this paper, a novel unsupervised low-rank representation model, i.e., Auto-weighted Low-Rank Representation (ALRR), is proposed to construct a more favorable similarity graph (SG) for clustering.
no code implementations • 8 Feb 2021 • Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur
Modern wake word detection systems usually rely on neural networks for acoustic modeling.
no code implementations • 28 Dec 2020 • Yiming Wang, Lingchao Guo, Zhaoming Lu, Xiangming Wen, Shuang Zhou, Wanyu Meng
To reconstruct 3D poses of people who move throughout the space rather than a fixed point, we fuse the amplitude and phase into Channel State Information (CSI) images which can provide both pose and position information.
no code implementations • 22 Dec 2020 • Shuang Zhou, Lingchao Guo, Zhaoming Lu, Xiangming Wen, Wei Zheng, Yiming Wang
Existing papers achieve good results when constructing the images of subjects who are in the prior training samples.
1 code implementation • ECCV 2020 • Yiming Wang, Alessio Del Bue
In this work we address the problem of autonomous 3D exploration of an unknown indoor environment using a depth camera.
1 code implementation • 3 Nov 2020 • Maya Aghaei, Matteo Bustreo, Yiming Wang, Gianluca Bailo, Pietro Morerio, Alessio Del Bue
In this work, we address the problem of estimating the so-called "Social Distancing" given a single uncalibrated image in unconstrained scenarios.
no code implementations • 17 Sep 2020 • Yiming Wang, Francesco Giuliari, Riccardo Berra, Alberto Castellini, Alessio Del Bue, Alessandro Farinelli, Marco Cristani, Francesco Setti
Our POMP method uses as input the current pose of an agent (e.g., a robot) and an RGB-D frame.
1 code implementation • 20 May 2020 • Yiwen Shao, Yiming Wang, Daniel Povey, Sanjeev Khudanpur
We present PyChain, a fully parallelized PyTorch implementation of end-to-end lattice-free maximum mutual information (LF-MMI) training for the so-called chain models in the Kaldi automatic speech recognition (ASR) toolkit.
Automatic Speech Recognition (ASR) +1
1 code implementation • 17 May 2020 • Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur
Always-on spoken language interfaces, e.g., personal digital assistants, rely on a wake word to start processing spoken input.
1 code implementation • 18 Sep 2019 • Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, Sanjeev Khudanpur
We present Espresso, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit fairseq.
Ranked #1 on Speech Recognition on Hub5'00 CallHome
Automatic Speech Recognition (ASR) +6
no code implementations • 6 Feb 2019 • Yiming Wang, Xing Fan, I-Fan Chen, Yuzong Liu, Tongfei Chen, Björn Hoffmeister
The anchored segment refers to the wake-up word part of an audio stream, which contains valuable speaker information that can be used to suppress interfering speech and background noise.
no code implementations • WS 2018 • Chao Bei, Hao Zong, Yiming Wang, Baoyong Fan, Shiqi Li, Conghu Yuan
The submitted system focuses on data clearing and techniques to build a competitive model for this task.
1 code implementation • Interspeech 2018 • Daniel Povey, Gaofeng Cheng, Yiming Wang, Ke Li, Hainan Xu, Mahsa Yarmohammadi, Sanjeev Khudanpur
Time Delay Neural Networks (TDNNs), also known as one-dimensional Convolutional Neural Networks (1-d CNNs), are an efficient and well-performing neural network architecture for speech recognition.
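The equivalence between a TDNN layer and a one-dimensional convolution over time can be sketched as follows. This is a minimal pure-Python illustration, not code from Kaldi or from the listed paper; the function name `tdnn_layer`, the weight layout, and the toy inputs are assumptions made for the example.

```python
def tdnn_layer(frames, weights, bias, context):
    """Apply one TDNN layer, i.e. a 1-d convolution along the time axis.

    frames:  list of T input feature vectors, each of length D_in
    context: time offsets spliced together, e.g. [-1, 0, 1]
    weights: weights[k][o][i] for offset index k, output unit o, input dim i
    bias:    list of D_out biases
    Returns outputs only for time steps where every offset is in range
    (no padding), followed by a ReLU nonlinearity.
    """
    num_frames = len(frames)
    first = max(0, -min(context))
    last = num_frames - 1 - max(0, max(context))
    outputs = []
    for t in range(first, last + 1):
        out = []
        for o in range(len(bias)):
            total = bias[o]
            for k, offset in enumerate(context):
                frame = frames[t + offset]
                total += sum(w * x for w, x in zip(weights[k][o], frame))
            out.append(max(0.0, total))  # ReLU
        outputs.append(out)
    return outputs


# Toy input: four frames of 1-dimensional features, all-ones weights,
# so each output is the sum of a frame and its two neighbours.
toy_frames = [[1.0], [2.0], [3.0], [4.0]]
toy_weights = [[[1.0]], [[1.0]], [[1.0]]]
toy_output = tdnn_layer(toy_frames, toy_weights, [0.0], [-1, 0, 1])
```

Widening the offsets, for example to `[-3, 0, 3]`, gives the dilated (subsampled) context that TDNN layers commonly use while keeping the same number of weights.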
no code implementations • ICASSP 2018 • Hainan Xu, Ke Li, Yiming Wang, Jian Wang, Shiyin Kang, Xie Chen, Daniel Povey, Sanjeev Khudanpur
In this paper we describe an extension of the Kaldi software toolkit to support neural-based language modeling, intended for use in automatic speech recognition (ASR) and related tasks.
Ranked #36 on Speech Recognition on LibriSpeech test-other (using extra training data)
Automatic Speech Recognition (ASR) +2
no code implementations • 9 Apr 2018 • Zhehuai Chen, Justin Luitjens, Hainan Xu, Yiming Wang, Daniel Povey, Sanjeev Khudanpur
We describe initial work on an extension of the Kaldi toolkit that supports weighted finite-state transducer (WFST) decoding on Graphics Processing Units (GPUs).
no code implementations • INTERSPEECH 2016 • Daniel Povey, Vijayaditya Peddinti, Daniel Galvez, Pegah Ghahrmani, Vimal Manohar, Xingyu Na, Yiming Wang, Sanjeev Khudanpur
Models trained with LF-MMI provide a relative word error rate reduction of ∼11.5% over those trained with the cross-entropy objective function, and ∼8% over those trained with the cross-entropy and sMBR objective functions.
Ranked #4 on Speech Recognition on WSJ eval92
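For readers unfamiliar with the metric quoted above, a relative word error rate (WER) reduction is the drop in WER expressed as a fraction of the baseline WER. The sketch below shows the arithmetic only; the WER values are hypothetical and are not results from the listed paper.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, chosen only to illustrate the arithmetic.
baseline = 10.0  # WER (%) of an assumed cross-entropy baseline
improved = 8.85  # WER (%) of an assumed LF-MMI system
reduction = relative_reduction(baseline, improved)
print(f"relative WER reduction: {reduction:.1%}")
```

Note that a relative reduction is different from the absolute difference in WER, which in this hypothetical case would be 1.15 percentage points.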
no code implementations • NeurIPS 2014 • Tuo Zhao, Mo Yu, Yiming Wang, Raman Arora, Han Liu
When the regularization function is block separable, we can solve the minimization problems in a randomized block coordinate descent (RBCD) manner.
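A minimal sketch of randomized block coordinate descent with a block-separable regularizer is given below. It is not the algorithm from the listed paper: it assumes the lasso objective (1/2n)||y - Xw||^2 + lam * ||w||_1, uses blocks of a single coordinate, and minimizes each block exactly by soft-thresholding. All names (`rbcd_lasso`, `soft_threshold`, `lasso_objective`) and the toy data are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Proximal operator of t * |.| applied to the scalar z."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_objective(X, y, w, lam):
    """(1/2n) * ||y - Xw||^2 + lam * ||w||_1."""
    n = len(y)
    resid = [y[i] - sum(xij * wj for xij, wj in zip(X[i], w)) for i in range(n)]
    return sum(r * r for r in resid) / (2 * n) + lam * sum(abs(v) for v in w)


def rbcd_lasso(X, y, lam, n_iters=2000, seed=0):
    """Randomized coordinate descent for the lasso (one coordinate per block)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    resid = list(y)  # residual y - Xw, valid because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iters):
        j = rng.randrange(d)  # pick a block uniformly at random
        if col_sq[j] == 0.0:
            continue
        # Correlation of column j with the residual that excludes w[j].
        rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
        new_wj = soft_threshold(rho, lam) / col_sq[j]
        delta = new_wj - w[j]
        if delta != 0.0:
            for i in range(n):
                resid[i] -= X[i][j] * delta
            w[j] = new_wj
    return w


# Toy problem with orthogonal columns (hypothetical data).
X_demo = [[1.0, 0.0], [0.0, 1.0]]
y_demo = [3.0, 0.1]
w_demo = rbcd_lasso(X_demo, y_demo, lam=0.2)
```

Because the L1 penalty splits into one term per coordinate, each block update only needs the residual and one column of X, which is what makes the per-block subproblem cheap to solve.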
no code implementations • LREC 2014 • Liang Tian, Derek F. Wong, Lidia S. Chao, Paulo Quaresma, Francisco Oliveira, Yi Lu, Shuo Li, Yiming Wang, Long-Yue Wang
This paper describes the acquisition of large-scale, high-quality parallel corpora for English and Chinese.