1 code implementation • Findings (EMNLP) 2021 • Boer Lyu, Lu Chen, Kai Yu
Sememes are defined as the atomic units to describe the semantic meaning of concepts.
no code implementations • IWSLT (ACL) 2022 • Qinpei Zhu, Renshou Wu, Guangfeng Liu, Xinyu Zhu, Xingyu Chen, Yang Zhou, Qingliang Miao, Rui Wang, Kai Yu
This paper describes AISP-SJTU’s submissions for the IWSLT 2022 Simultaneous Translation task.
no code implementations • 30 Apr 2024 • Hankun Wang, Chenpeng Du, Yiwei Guo, Shuai Wang, Xie Chen, Kai Yu
We call the attention maps of those heads Alignment-Emerged Attention Maps (AEAMs).
no code implementations • 23 Apr 2024 • Sen Liu, Yiwei Guo, Xie Chen, Kai Yu
While acoustic expressiveness has long been studied in expressive text-to-speech (ETTS), the inherent expressiveness in text lacks sufficient attention, especially for ETTS of artistic works.
no code implementations • 9 Apr 2024 • Yiwei Guo, Chenrun Wang, Yifan Yang, Hankun Wang, Ziyang Ma, Chenpeng Du, Shuai Wang, Hanzheng Li, Shuai Fan, Hui Zhang, Xie Chen, Kai Yu
Discrete speech tokens have been more and more popular in multiple speech processing fields, including automatic speech recognition (ASR), text-to-speech (TTS) and singing voice synthesis (SVS).
Automatic Speech Recognition (ASR) +2
2 code implementations • 8 Apr 2024 • Matteo Zecchin, Kai Yu, Osvaldo Simeone
In this work, we demonstrate that ICL can also be used to tackle the problem of multi-user equalization in cell-free MIMO systems with limited fronthaul capacity.
2 code implementations • 6 Apr 2024 • Hongchuan Zeng, Hongshen Xu, Lu Chen, Kai Yu
MBS overcomes the English-centric limitations of existing methods by sampling calibration data from various languages proportionally to the language distribution of the model training datasets.
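The proportional sampling idea can be sketched as follows. This is a minimal illustration only; `proportional_sample`, the language pools, and the distribution values are hypothetical, not the paper's actual implementation:

```python
import random

def proportional_sample(pools, lang_dist, n, seed=0):
    """Draw n calibration examples, allocating per-language counts in
    proportion to the training data's language distribution."""
    rng = random.Random(seed)
    total = sum(lang_dist.values())
    sample = []
    for lang, weight in lang_dist.items():
        k = round(n * weight / total)          # per-language quota
        sample.extend(rng.sample(pools[lang], min(k, len(pools[lang]))))
    return sample

# Hypothetical per-language candidate pools and training distribution.
pools = {"en": [f"en-{i}" for i in range(100)],
         "zh": [f"zh-{i}" for i in range(100)],
         "fr": [f"fr-{i}" for i in range(100)]}
dist = {"en": 0.6, "zh": 0.3, "fr": 0.1}
calib = proportional_sample(pools, dist, n=20)
```

With the distribution above, the 20-example calibration set splits 12/6/2 across English, Chinese, and French, mirroring the assumed training mix.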
no code implementations • 27 Mar 2024 • Hongshen Xu, Zichen Zhu, Situo Zhang, Da Ma, Shuai Fan, Lu Chen, Kai Yu
Large Language Models (LLMs) often generate erroneous outputs, known as hallucinations, due to their limitations in discerning questions beyond their knowledge scope.
no code implementations • 20 Mar 2024 • Yu Xi, Hao Li, Baochen Yang, Haoyu Li, Hainan Xu, Kai Yu
Designing an efficient keyword spotting (KWS) system that delivers exceptional performance on resource-constrained edge devices has long been a subject of significant attention.
no code implementations • 5 Mar 2024 • Yutong Li, Lu Chen, Aiwei Liu, Kai Yu, Lijie Wen
In this work, we first focus on the independent literature summarization step and introduce ChatCite, an LLM agent with human-workflow guidance for comparative literature summaries.
1 code implementation • 28 Feb 2024 • Hongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu Chen, Kai Yu
Previous work on spoken language understanding (SLU) mainly focuses on single-intent settings, where each input utterance merely contains one user intent.
1 code implementation • 28 Feb 2024 • Hongshen Xu, Lu Chen, Zihan Zhao, Da Ma, Ruisheng Cao, Zichen Zhu, Kai Yu
Additionally, we propose several pre-training tasks to model the interaction among text, structure, and image modalities effectively.
no code implementations • 22 Feb 2024 • Yiming Ai, Zhiwei He, Ziyin Zhang, Wenhong Zhu, Hongkun Hao, Kai Yu, Lingjun Chen, Rui Wang
In this study, we investigate the reliability of Large Language Models (LLMs) in professing human-like personality traits through responses to personality questionnaires.
no code implementations • 5 Feb 2024 • Zichen Zhu, Yang Xu, Lu Chen, Jingkai Yang, Yichuan Ma, Yiming Sun, Hailin Wen, Jiaqi Liu, Jinyu Cai, Yingzi Ma, Situo Zhang, Zihan Zhao, Liangtai Sun, Kai Yu
Rapid progress in multimodal large language models (MLLMs) highlights the need to introduce challenging yet realistic benchmarks to the academic community, while existing benchmarks primarily focus on understanding simple natural images and short context.
no code implementations • 26 Jan 2024 • Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Hongshen Xu, Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, Xin Chen, Kai Yu
To this end, we develop ChemDFM, the first LLM towards CGI.
no code implementations • 25 Jan 2024 • Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, Shuai Wang, Hui Zhang, Xie Chen, Kai Yu
Recent TTS models with decoder-only Transformer architecture, such as SPEAR-TTS and VALL-E, achieve impressive naturalness and demonstrate the ability for zero-shot adaptation given a speech prompt.
no code implementations • 12 Jan 2024 • Yu Xi, Baochen Yang, Hao Li, Jiaqi Guo, Kai Yu
Furthermore, experiments on the continuous speech dataset LibriSpeech demonstrate that, by incorporating audio discrimination, CLAD achieves significant performance gain over CL without audio discrimination.
no code implementations • 28 Dec 2023 • Biwen Lei, Kai Yu, Mengyang Feng, Miaomiao Cui, Xuansong Xie
Extensive experiments demonstrate that the proposed framework achieves excellent results in both domain adaptation and text-to-avatar tasks, outperforming existing methods in terms of generation quality and efficiency.
no code implementations • 14 Dec 2023 • Junjie Li, Yiwei Guo, Xie Chen, Kai Yu
Zero-shot voice conversion (VC) aims to transfer the source speaker timbre to arbitrary unseen target speaker timbre, while keeping the linguistic content unchanged.
no code implementations • 8 Dec 2023 • Mengyang Feng, Jinlin Liu, Kai Yu, Yuan Yao, Zheng Hui, Xiefan Guo, Xianhui Lin, Haolan Xue, Chen Shi, Xiaowen Li, Aojie Li, Xiaoyang Kang, Biwen Lei, Miaomiao Cui, Peiran Ren, Xuansong Xie
In this paper, we present DreaMoving, a diffusion-based controllable video generation framework to produce high-quality customized human videos.
no code implementations • 22 Nov 2023 • Kai Yu, Jinlin Liu, Mengyang Feng, Miaomiao Cui, Xuansong Xie
After the progressive training, the LoRA learns the 3D information of the generated object and eventually turns to an object-level 3D prior.
1 code implementation • 10 Nov 2023 • Matteo Zecchin, Kai Yu, Osvaldo Simeone
In ICL, a decision on a new input is made via a direct mapping of the input and of a few examples from the given task, serving as the task's context, to the output variable.
no code implementations • 3 Nov 2023 • Tao Liu, Chenpeng Du, Shuai Fan, Feilong Chen, Kai Yu
Our rigorous experiments comprehensively highlight that our ground-breaking approach outpaces existing methods with considerable margins and delivers seamless, intelligible videos in person-generic and multilingual scenarios.
no code implementations • 2 Nov 2023 • Hanglei Zhang, Yiwei Guo, Sen Liu, Xie Chen, Kai Yu
The LLM selects the best-matching style references from annotated utterances based on external style prompts, which can be raw input text or natural language style descriptions.
no code implementations • 28 Oct 2023 • Ruisheng Cao, Hanchong Zhang, Hongshen Xu, Jieyu Li, Da Ma, Lu Chen, Kai Yu
Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema.
1 code implementation • 26 Oct 2023 • Hanchong Zhang, Ruisheng Cao, Lu Chen, Hongshen Xu, Kai Yu
Recently Large Language Models (LLMs) have been proven to have strong abilities in various domains and tasks.
1 code implementation • 14 Sep 2023 • Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen
Self-supervised learning (SSL) proficiency in speech-related tasks has driven research into utilizing discrete tokens for speech tasks like recognition and translation, which offer lower storage requirements and great potential to employ natural language processing techniques.
no code implementations • 10 Sep 2023 • Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu
Although diffusion models in text-to-speech have become a popular choice due to their strong generative ability, the intrinsic complexity of sampling from diffusion models harms their efficiency.
1 code implementation • 25 Aug 2023 • Liangtai Sun, Yang Han, Zihan Zhao, Da Ma, Zhennan Shen, Baocai Chen, Lu Chen, Kai Yu
This design suffers from data leakage problem and lacks the evaluation of subjective Q/A ability.
1 code implementation • ICCV 2023 • Chun-Mei Feng, Kai Yu, Yong Liu, Salman Khan, Wangmeng Zuo
In this paper, we focus on a particular setting of learning adaptive prompts on the fly for each test sample from an unseen new domain, which is known as test-time prompt tuning (TPT).
1 code implementation • ICCV 2023 • Chun-Mei Feng, Kai Yu, Nian Liu, Xinxing Xu, Salman Khan, Wangmeng Zuo
However, the performance of the global model is often hampered by non-i.i.d. data.
no code implementations • 25 Jun 2023 • Sen Liu, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu
Although high-fidelity speech can be obtained for intralingual speech synthesis, cross-lingual text-to-speech (CTTS) is still far from satisfactory, as it is difficult to accurately retain the speaker timbres (i.e., speaker similarity) and eliminate the accents from their first language (i.e., nativeness).
no code implementations • 16 Jun 2023 • Hanxue Zhang, Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu
Automated audio captioning (AAC) is an important cross-modality translation task, aiming at generating descriptions for audio clips.
no code implementations • 14 Jun 2023 • Zheng Liang, Zheshu Song, Ziyang Ma, Chenpeng Du, Kai Yu, Xie Chen
Recently, end-to-end (E2E) automatic speech recognition (ASR) models have made great strides and exhibit excellent performance in general speech recognition.
Automatic Speech Recognition (ASR) +5
1 code implementation • NeurIPS 2023 • Danyang Zhang, Lu Chen, Situo Zhang, Hongshen Xu, Zihan Zhao, Kai Yu
By equipping the LLM with a long-term experience memory, REMEMBERER is capable of exploiting experiences from past episodes even for different task goals, outperforming an LLM-based agent with fixed exemplars or a transient working memory.
1 code implementation • 25 May 2023 • Hanchong Zhang, Jieyu Li, Lu Chen, Ruisheng Cao, Yunyan Zhang, Yu Huang, Yefeng Zheng, Kai Yu
Furthermore, we present CSS, a large-scale CrosS-Schema Chinese text-to-SQL dataset, to carry on corresponding studies.
1 code implementation • 23 May 2023 • Yiming Ai, Zhiwei He, Kai Yu, Rui Wang
Tense inconsistency frequently occurs in machine translation.
1 code implementation • NeurIPS 2023 • Guangyan Chen, Meiling Wang, Yi Yang, Kai Yu, Li Yuan, Yufeng Yue
Large language models (LLMs) based on the generative pre-training transformer (GPT) have demonstrated remarkable effectiveness across a diverse range of downstream tasks.
Ranked #3 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)
1 code implementation • 14 May 2023 • Danyang Zhang, Hongshen Xu, Zihan Zhao, Lu Chen, Ruisheng Cao, Kai Yu
A GUI task set based on WikiHow app is collected on Mobile-Env to form a benchmark covering a range of GUI interaction capabilities.
no code implementations • 23 Apr 2023 • Zhijun Liu, Yiwei Guo, Kai Yu
In this work, we present DiffVoice, a novel text-to-speech model based on latent diffusion.
no code implementations • 30 Mar 2023 • Chenpeng Du, Qi Chen, Xie Chen, Kai Yu
Additionally, we propose a novel method for generating continuous video frames with the DDIM image decoder trained on individual frames, eliminating the need for modelling the joint distribution of consecutive frames directly.
no code implementations • 30 Jan 2023 • Meng Wang, Kai Yu, Chun-Mei Feng, Yiming Qian, Ke Zou, Lianyu Wang, Rick Siow Mong Goh, Yong Liu, Huazhu Fu
To the best of our knowledge, our proposed RFedDis is the first work to develop an FL approach based on evidential uncertainty combined with feature disentangling, which enhances the performance and reliability of FL in non-IID domain features.
no code implementations • 12 Jan 2023 • Jieyu Li, Lu Chen, Ruisheng Cao, Su Zhu, Hongshen Xu, Zhi Chen, Hanchong Zhang, Kai Yu
Exploring the generalization of a text-to-SQL parser is essential for a system to adapt automatically to real-world databases.
1 code implementation • 5 Dec 2022 • Luofang Jiao, Kai Yu, Yunting Xu, Tianqi Zhang, Haibo Zhou, Xuemin Shen
The uplink (UL)/downlink (DL) decoupled access has been emerging as a novel access architecture to improve the performance gains in cellular networks.
Spectral Efficiency Analysis of Uplink-Downlink Decoupled Access in C-V2X Networks
no code implementations • 1 Dec 2022 • Meng Wang, Kai Yu, Chun-Mei Feng, Ke Zou, Yanyu Xu, Qingquan Meng, Rick Siow Mong Goh, Yong Liu, Huazhu Fu
Specifically, aiming to improve the model's ability to learn the complex pathological features of retinal edema lesions in OCT images, we develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a newly designed multi-scale transformer module.
no code implementations • 17 Nov 2022 • Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu
Specifically, instead of being guided with a one-hot vector for the specified emotion, EmoDiff is guided with a soft label where the values of the specified emotion and Neutral are set to $\alpha$ and $1-\alpha$ respectively.
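Constructing the soft label described above can be sketched as follows. The helper name and emotion inventory are illustrative; only the $\alpha$ / $1-\alpha$ weighting comes from the abstract:

```python
def soft_emotion_label(emotions, target, alpha):
    """Soft guidance label: weight `alpha` on the target emotion and
    1 - alpha on Neutral, zero elsewhere (vs. a one-hot vector)."""
    assert "Neutral" in emotions and target in emotions
    label = {e: 0.0 for e in emotions}
    label[target] = alpha
    label["Neutral"] += 1.0 - alpha   # += keeps target="Neutral" valid
    return [label[e] for e in emotions]

# Hypothetical emotion set; weight 0.7 on Happy, 0.3 on Neutral.
emotions = ["Neutral", "Happy", "Sad", "Angry"]
vec = soft_emotion_label(emotions, target="Happy", alpha=0.7)
```

Setting `alpha=1.0` recovers the usual one-hot guidance, so the soft label generalizes rather than replaces it.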
2 code implementations • 8 Nov 2022 • Tao Liu, Kai Yu
DER is the primary metric for evaluating diarization performance, but it faces a dilemma: the errors in short utterances or segments tend to be overwhelmed by longer ones.
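The dilemma can be made concrete with the standard time-weighted DER definition (a minimal sketch; the function name and the toy durations are illustrative):

```python
def der(miss, false_alarm, confusion, total_speech):
    """Diarization Error Rate: time-weighted sum of missed speech,
    false-alarm speech and speaker-confusion time, divided by the
    total reference speech time."""
    return (miss + false_alarm + confusion) / total_speech

# Two utterances: a 60 s segment labelled perfectly, and a 2 s segment
# attributed entirely to the wrong speaker. The short utterance is 100%
# wrong, yet time weighting makes it nearly invisible in the overall DER.
rate = der(miss=0.0, false_alarm=0.0, confusion=2.0, total_speech=62.0)
```

Here `rate` is about 0.032, i.e. roughly 3% DER despite one utterance being completely misattributed, which is exactly the short-vs-long imbalance the abstract points at.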
no code implementations • 10 Sep 2022 • Zhi Chen, Yuncong Liu, Lu Chen, Su Zhu, Mengyue Wu, Kai Yu
The second phase is to fine-tune the pretrained model on the TOD data.
no code implementations • 25 May 2022 • Zhi Chen, Jijia Bao, Lu Chen, Yuncong Liu, Da Ma, Bei Chen, Mengyue Wu, Su Zhu, Xin Dong, Fujiang Ge, Qingliang Miao, Jian-Guang Lou, Kai Yu
In this work, we aim to build a unified dialogue foundation model (DFM) which can be used to solve massive diverse dialogue tasks.
no code implementations • 24 May 2022 • Binwei Yao, Chao Shi, Likai Zou, Lingfeng Dai, Mengyue Wu, Lu Chen, Zhen Wang, Kai Yu
In a depression-diagnosis-directed clinical session, doctors initiate a conversation with ample emotional support that guides the patients to expose their symptoms based on clinical diagnosis criteria.
no code implementations • 23 May 2022 • Liangtai Sun, Xingyu Chen, Lu Chen, Tianle Dai, Zichen Zhu, Kai Yu
However, this API-based architecture greatly limits the information-searching capability of intelligent assistants and may even lead to task failure if TOD-specific APIs are not available or the task is too complicated to be executed by the provided APIs.
1 code implementation • NAACL 2022 • Zihan Zhao, Lu Chen, Ruisheng Cao, Hongshen Xu, Xingyu Chen, Kai Yu
Recently, the structural reading comprehension (SRC) task on web pages has attracted increasing research interests.
no code implementations • 29 Apr 2022 • Wen Wu, Mengyue Wu, Kai Yu
Automatic depression detection has attracted an increasing amount of attention but remains a challenging task.
no code implementations • SIGDIAL (ACL) 2022 • Zhi Chen, Lu Chen, Bei Chen, Libo Qin, Yuncong Liu, Su Zhu, Jian-Guang Lou, Kai Yu
With the development of pre-trained language models, remarkable success has been witnessed in dialogue understanding (DU).
no code implementations • 2 Apr 2022 • Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu
The mainstream neural text-to-speech (TTS) pipeline is a cascade system, including an acoustic model (AM) that predicts acoustic features from the input transcript and a vocoder that generates the waveform according to the given acoustic features.
no code implementations • 25 Mar 2022 • Siyu Lou, Xuenan Xu, Mengyue Wu, Kai Yu
Using pre-trained audio features and a descriptor-based aggregation method, we build our contextual audio-text retrieval system.
no code implementations • 15 Feb 2022 • Yiwei Guo, Chenpeng Du, Kai Yu
Although word-level prosody modeling in neural text-to-speech (TTS) has been investigated in recent research for diverse speech synthesis, it is still challenging to control speech synthesis manually without a specific reference.
no code implementations • 9 Dec 2021 • Su Zhu, Lu Chen, Ruisheng Cao, Zhi Chen, Qingliang Miao, Kai Yu
In this paper, we propose to improve prototypical networks with vector projection distance and abstract triangular Conditional Random Field (CRF) for the few-shot NLU.
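One plausible reading of the vector projection distance is the scalar projection of an utterance embedding onto each label prototype's direction; the sketch below is an assumption for illustration (the paper's exact formulation may differ, e.g. by including bias terms), and all names are hypothetical:

```python
import math

def projection_score(x, proto):
    """Scalar projection of embedding x onto the prototype direction:
    (x . proto) / ||proto||.  Larger score = closer to the prototype."""
    norm = math.sqrt(sum(p * p for p in proto))
    return sum(a * b for a, b in zip(x, proto)) / norm

def classify(x, prototypes):
    """Assign x to the label whose prototype gives the highest score."""
    return max(prototypes, key=lambda c: projection_score(x, prototypes[c]))

# Toy 2-D prototypes for two intents (purely illustrative).
protos = {"weather": [2.0, 0.0], "music": [0.0, 1.0]}
label = classify([1.0, 0.2], protos)
```

Unlike Euclidean distance, the projection score is invariant to the prototype's length along its own direction, which is one motivation sometimes given for projection-based metrics in few-shot classification.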
1 code implementation • 3 Sep 2021 • Chun-Mei Feng, Yunlu Yan, Kai Yu, Yong Xu, Ling Shao, Huazhu Fu
Our SANet can explore high-intensity and low-intensity regions in the "forward" and "reverse" directions with the help of the auxiliary contrast, while learning clearer anatomical structure and edge information for the SR of a target-contrast MR image.
1 code implementation • DCASE Challenge 2021 • Xuenan Xu, Zeyu Xie, Mengyue Wu, Kai Yu
This report proposes an audio captioning system for the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 challenge task Task 6.
Ranked #2 on Audio captioning on Clotho (using extra training data)
no code implementations • Findings (ACL) 2021 • Zhi Chen, Lu Chen, Hanqi Li, Ruisheng Cao, Da Ma, Mengyue Wu, Kai Yu
A dual learning approach is also proposed for the utterance rewrite model to address the data sparsity problem.
1 code implementation • ACL 2021 • Ruisheng Cao, Lu Chen, Zhi Chen, Yanbin Zhao, Su Zhu, Kai Yu
This work aims to tackle the challenging heterogeneous graph encoding problem in the text-to-SQL task.
no code implementations • NAACL 2021 • Zhi Chen, Lu Chen, Yanbin Zhao, Ruisheng Cao, Zihan Xu, Su Zhu, Kai Yu
Given a database schema, Text-to-SQL aims to translate a natural language question into the corresponding SQL query.
no code implementations • 4 Mar 2021 • Kai Yu, Gong-De Guo, Song Lin
In this paper, we present a quantum algorithm and a quantum circuit to efficiently perform linear discriminant analysis (LDA) for dimensionality reduction.
Dimensionality Reduction · Quantum Physics
1 code implementation • 25 Feb 2021 • Boer Lyu, Lu Chen, Su Zhu, Kai Yu
Additionally, we adopt the word lattice graph as input to maintain multi-granularity information.
2 code implementations • 1 Feb 2021 • Chenpeng Du, Kai Yu
Generating natural speech with diverse and smooth prosody pattern is a challenging task.
1 code implementation • EMNLP 2021 • Xingyu Chen, Zihan Zhao, Lu Chen, Danyang Zhang, Jiabao Ji, Ao Luo, Yuxuan Xiong, Kai Yu
In this paper, we introduce the task of structural reading comprehension (SRC) on web.
1 code implementation • 19 Jan 2021 • Heinrich Dinkel, Mengyue Wu, Kai Yu
Our model outperforms other approaches on the DCASE2018 and URBAN-SED datasets without requiring prior duration knowledge.
Data Augmentation · Sound Event Detection · Sound · Audio and Speech Processing
no code implementations • 17 Jan 2021 • Jinye Peng, Jiaxin Wang, Jun Wang, Erlei Zhang, Qunxi Zhang, Yongqin Zhang, Xianlin Peng, Kai Yu
For the fine extraction stage, we design a new multiscale U-Net (MSU-Net) to effectively remove disease noise and refine the sketch.
no code implementations • 17 Jan 2021 • YingJie Xu, Kai Yu, Li Li, Xianfu Lei, Li Hao, Cheng-Xiang Wang
As a potential development direction of future transportation, vacuum tube ultra-high-speed train (UHST) wireless communication systems exhibit channel characteristics that differ from existing high-speed train (HST) scenarios.
no code implementations • 14 Oct 2020 • Zihan Zhao, Yuncong Liu, Lu Chen, Qi Liu, Rao Ma, Kai Yu
Recently, pre-trained language models like BERT have shown promising performance on multiple natural language processing tasks.
no code implementations • 22 Sep 2020 • Zhi Chen, Lu Chen, Zihan Xu, Yanbin Zhao, Su Zhu, Kai Yu
In dialogue systems, a dialogue state tracker aims to accurately find a compact representation of the current dialogue status, based on the entire dialogue history.
no code implementations • 22 Sep 2020 • Zhi Chen, Xiaoyuan Liu, Lu Chen, Kai Yu
A novel ComNet is proposed to model the structure of a hierarchical agent.
no code implementations • 22 Sep 2020 • Zhi Chen, Lu Chen, Xiang Zhou, Kai Yu
To the best of our knowledge, this is the first effort to optimize the DST module within DRL framework for on-line task-oriented spoken dialogue systems.
no code implementations • 22 Sep 2020 • Zhi Chen, Lu Chen, Yanbin Zhao, Su Zhu, Kai Yu
In task-oriented multi-turn dialogue systems, dialogue state refers to a compact representation of the user goal in the context of dialogue history.
no code implementations • 22 Sep 2020 • Zhi Chen, Lu Chen, Xiaoyuan Liu, Kai Yu
The task-oriented spoken dialogue system (SDS) aims to assist a human user in accomplishing a specific task (e.g., hotel booking).
1 code implementation • 21 Sep 2020 • Su Zhu, Ruisheng Cao, Lu Chen, Kai Yu
Few-shot slot tagging becomes appealing for rapid domain transfer and adaptation, motivated by the tremendous development of conversational dialogue systems.
no code implementations • 7 Sep 2020 • Chen Liu, Su Zhu, Lu Chen, Kai Yu
The framework consists of a slot tagging model and a rule-based value error recovery module.
Automatic Speech Recognition (ASR) +3
no code implementations • 31 Jul 2020 • Qi Liu, Tian Tan, Kai Yu
It is concluded that beta stabilizer parameters can reduce sensitivity to the learning rate while achieving almost the same performance on DNNs with the ReLU activation function and on LSTMs.
no code implementations • 31 Jul 2020 • Qi Liu, Yanmin Qian, Kai Yu
For speech recognition rescoring, although the proposed LSTM LM obtains only very slight gains, the new model appears to be strongly complementary to the conventional LSTM LM.
no code implementations • ACL 2020 • Lu Chen, Yanbin Zhao, Boer Lyu, Lesheng Jin, Zhi Chen, Su Zhu, Kai Yu
Chinese short text matching usually employs word sequences rather than character sequences to get better performance.
no code implementations • ACL 2020 • Yanbin Zhao, Lu Chen, Zhi Chen, Ruisheng Cao, Su Zhu, Kai Yu
We also adopt graph attention networks with higher-order neighborhood information to encode the rich structure in AMR graphs.
no code implementations • 5 Jun 2020 • Ashwini Badgujar, Sheng Chen, Andrew Wang, Kai Yu, Paul Intrevado, David Guy Brizan
In this research, we continuously collect data from the RSS feeds of traditional news sources.
1 code implementation • ACL 2020 • Ruisheng Cao, Su Zhu, Chenyu Yang, Chen Liu, Rao Ma, Yanbin Zhao, Lu Chen, Kai Yu
One daunting problem for semantic parsing is the scarcity of annotation.
1 code implementation • 24 May 2020 • Chen Liu, Su Zhu, Zijian Zhao, Ruisheng Cao, Lu Chen, Kai Yu
In this paper, a novel BERT based SLU model (WCN-BERT SLU) is proposed to encode WCNs and the dialogue context jointly.
no code implementations • 30 Apr 2020 • Yanbin Zhao, Lu Chen, Zhi Chen, Kai Yu
When modeling simple and complex sentences with autoencoders, we introduce different types of noise into the training process.
2 code implementations • 26 Apr 2020 • Su Zhu, Ruisheng Cao, Kai Yu
The framework is composed of dual pseudo-labeling and dual learning method, which enables an NLU model to make full use of data (labeled and unlabeled) through a closed-loop of the primal and dual tasks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Su Zhu, Jieyu Li, Lu Chen, Kai Yu
In this paper, a novel context and schema fusion network is proposed to encode the dialogue context and schema graph by using internal and external attention mechanisms.
Ranked #8 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0
Dialogue State Tracking · Multi-domain Dialogue State Tracking
no code implementations • 3 Apr 2020 • Lu Chen, Boer Lyu, Chi Wang, Su Zhu, Bowen Tan, Kai Yu
For multi-domain DST, the data sparsity problem is also a major obstacle due to the increased number of state candidates.
Ranked #12 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1
1 code implementation • 27 Mar 2020 • Heinrich Dinkel, Yefei Chen, Mengyue Wu, Kai Yu
We proposed two GPVAD models, one full (GPV-F), trained on 527 Audioset sound events, and one binary (GPV-B), only distinguishing speech and noise.
Sound · Audio and Speech Processing
no code implementations • 22 Mar 2020 • Su Zhu, Zijian Zhao, Rao Ma, Kai Yu
The proposed approaches are evaluated on three datasets.
1 code implementation • IJCNLP 2019 • Zijian Zhao, Su Zhu, Kai Yu
The atomic templates produce exemplars for fine-grained constituents of semantic representations.
1 code implementation • ACL 2019 • Ruisheng Cao, Su Zhu, Chen Liu, Jieyu Li, Kai Yu
Semantic parsing converts natural language queries into structured logical forms.
no code implementations • 18 Jun 2019 • Xu Xiang, Shuai Wang, Houjun Huang, Yanmin Qian, Kai Yu
The proposed approach can achieve state-of-the-art performance, with 25%~30% equal error rate (EER) reduction on both tasks compared to strong baselines using cross-entropy loss with softmax, obtaining 2.238% EER on the VoxCeleb1 test set and 2.761% EER on the SITW core-core test set, respectively.
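EER, the metric reported above, is the operating point where the false-accept and false-reject rates are equal. A simple threshold-scan approximation can be sketched as follows (illustrative only, not an official scoring tool; names and toy scores are hypothetical):

```python
def eer(genuine, impostor):
    """Approximate Equal Error Rate: scan thresholds over all observed
    scores and return the point where the false-accept rate (impostors
    accepted) and false-reject rate (genuines rejected) are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]

# Toy verification scores: higher = more likely the same speaker.
genuine = [0.9, 0.8, 0.3]
impostor = [0.7, 0.2, 0.1]
rate = eer(genuine, impostor)
```

With these toy scores the two error rates cross at 1/3: at threshold 0.7 one of three impostors is accepted and one of three genuine trials is rejected.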
1 code implementation • 31 May 2019 • Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu
Captioning has attracted much attention in image and video understanding while a small amount of work examines audio captioning.
no code implementations • 27 May 2019 • Lu Chen, Zhi Chen, Bowen Tan, Sishan Long, Milica Gasic, Kai Yu
Experiments show that AgentGraph models significantly outperform traditional reinforcement learning approaches on most of the 18 tasks of the PyDial benchmark.
1 code implementation • 9 Apr 2019 • Zijian Zhao, Su Zhu, Kai Yu
In the paper, we focus on spoken language understanding from unaligned data whose annotation is a set of act-slot-value triples.
1 code implementation • 8 Apr 2019 • Heinrich Dinkel, Mengyue Wu, Kai Yu
Previous text-based depression detection is commonly based on large user-generated data.
1 code implementation • 8 Apr 2019 • Heinrich Dinkel, Kai Yu
Task 4 of the Dcase2018 challenge demonstrated that substantially more research is needed for a real-world application of sound event detection.
Sound · Audio and Speech Processing
1 code implementation • 25 Feb 2019 • Mengyue Wu, Heinrich Dinkel, Kai Yu
A baseline encoder-decoder model is provided for both English and Mandarin.
no code implementations • 5 Nov 2018 • Xuankai Chang, Yanmin Qian, Kai Yu, Shinji Watanabe
The experiments demonstrate that the proposed methods can improve the performance of the end-to-end model in separating the overlapping speech and recognizing the separated streams.
Automatic Speech Recognition (ASR) +2
1 code implementation • EMNLP 2018 • Liliang Ren, Kaige Xie, Lu Chen, Kai Yu
Dialogue state tracking is the core part of a spoken dialogue system.
no code implementations • 2 Aug 2018 • Zhehuai Chen, Yanmin Qian, Kai Yu
The few studies on sequence discriminative training for KWS are limited to fixed-vocabulary or LVCSR-based methods and have not been compared to state-of-the-art deep learning based KWS approaches.
no code implementations • COLING 2018 • Lu Chen, Bowen Tan, Sishan Long, Kai Yu
The proposed structured deep reinforcement learning is based on graph neural networks (GNN), which consists of some sub-networks, each one for a node on a directed graph.
no code implementations • WS 2018 • Kaige Xie, Cheng Chang, Liliang Ren, Lu Chen, Kai Yu
Dialogue state tracking (DST), when formulated as a supervised learning problem, relies on labelled data.
no code implementations • NAACL 2018 • Xuan Liu, Di Cao, Kai Yu
Although excellent performance is obtained for large vocabulary tasks, tremendous memory consumption prohibits the use of LSTM LM in low-resource devices.
Automatic Speech Recognition (ASR) +2
no code implementations • 3 Mar 2018 • Zhehuai Chen, Qi Liu, Hao Li, Kai Yu
Finally, modules are integrated into an acousticsto-word model (A2W) and jointly optimized using acoustic data to retain the advantage of sequence modeling.
Automatic Speech Recognition (ASR) +2
no code implementations • EMNLP 2017 • Cheng Chang, Runzhe Yang, Lu Chen, Xiang Zhou, Kai Yu
The key to building an evolvable dialogue system in real-world scenarios is to ensure an affordable on-line dialogue policy learning, which requires the on-line learning process to be safe, efficient and economical.
no code implementations • EMNLP 2017 • Lu Chen, Xiang Zhou, Cheng Chang, Runzhe Yang, Kai Yu
Hand-crafted rules and reinforcement learning (RL) are two popular choices to obtain dialogue policy.
no code implementations • WS 2018 • Su Zhu, Kai Yu
Concept definition is important in language understanding (LU) adaptation since literal definition difference can easily lead to data sparsity even if different data sets are actually semantically correlated.
no code implementations • EACL 2017 • Lu Chen, Runzhe Yang, Cheng Chang, Zihao Ye, Xiang Zhou, Kai Yu
On-line dialogue policy learning is the key for building evolvable conversational agent in real world scenarios.
no code implementations • 29 Nov 2016 • Kai Yu, Yang Zhou, Da Li, Zhang Zhang, Kaiqi Huang
Visual surveillance systems have become one of the largest data sources of Big Visual Data in real world.
no code implementations • 17 Nov 2016 • Kai Yu, Biao Leng, Zhang Zhang, Dangwei Li, Kaiqi Huang
Based on GoogLeNet, firstly, a set of mid-level attribute features are discovered by newly designed detection layers, where a max-pooling based weakly-supervised object detection technique is used to train these layers with only image-level labels, without the need for bounding box annotations of pedestrian attributes.
no code implementations • 6 Aug 2016 • Su Zhu, Kai Yu
This paper investigates the framework of encoder-decoder with attention for sequence labelling based spoken language understanding.
no code implementations • ICCV 2015 • Shangxuan Tian, Yifeng Pan, Chang Huang, Shijian Lu, Kai Yu, Chew Lim Tan
With character candidates detected by cascade boosting, the min-cost flow network model integrates the last three sequential steps into a single process which solves the error accumulation problem at both character level and text line level effectively.
1 code implementation • 19 Feb 2016 • Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu
We propose to train a bi-directional neural network language model (NNLM) with noise contrastive estimation (NCE).
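The NCE objective for one training position can be sketched as below: the observed word should be classified as "data" against k sampled noise words. For brevity this sketch assumes a uniform noise distribution, so log(k·q(w)) collapses to one constant; all names are illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def nce_loss(data_logit, noise_logits, log_kq):
    """NCE loss for one position.  `data_logit` is the unnormalised model
    log-score of the observed word, `noise_logits` the scores of the k
    sampled noise words, and `log_kq` = log(k * q(word)) under the
    (assumed uniform) noise distribution.  sigmoid(s - log_kq) is the
    posterior probability that a word came from the data."""
    loss = -math.log(sigmoid(data_logit - log_kq))        # true word -> "data"
    for z in noise_logits:
        loss -= math.log(1.0 - sigmoid(z - log_kq))       # noise word -> "noise"
    return loss
```

Because the loss only needs unnormalised scores, training avoids the full softmax over the vocabulary, which is the usual motivation for NCE in language modelling.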
25 code implementations • 9 Aug 2015 • Zhiheng Huang, Wei Xu, Kai Yu
It can also use sentence level tag information thanks to a CRF layer.
Ranked #1 on Named Entity Recognition (NER) on FindVehicle
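The sentence-level tag information contributed by a CRF layer comes from decoding with learned transition scores rather than picking each tag independently. A tiny Viterbi sketch (illustrative, not the paper's implementation; tags, scores, and names are toy values):

```python
def viterbi(emissions, transitions, tags):
    """Viterbi decoding: choose the tag sequence maximising per-token
    emission scores plus pairwise transition scores, so each tag choice
    depends on its neighbours (missing transitions default to 0)."""
    score = {t: emissions[0][t] for t in tags}
    back = []
    for i in range(1, len(emissions)):
        new, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: score[p] + transitions.get((p, t), 0.0))
            new[t] = score[prev] + transitions.get((prev, t), 0.0) + emissions[i][t]
            ptr[t] = prev
        score, back = new, back + [ptr]
    last = max(tags, key=score.get)
    path = [last]
    for ptr in reversed(back):          # follow back-pointers
        path.append(ptr[path[-1]])
    return path[::-1]

# Two tokens: the second token's emissions alone favour "O", but a strong
# B->I transition bonus flips the sequence-level decision to "I".
emissions = [{"B": 1.0, "I": 0.0, "O": 0.5},
             {"B": 0.0, "I": 0.9, "O": 1.0}]
transitions = {("B", "I"): 2.0, ("O", "I"): -5.0}
path = viterbi(emissions, transitions, ["B", "I", "O"])
```

A per-token argmax would emit `["B", "O"]`; the transition scores are what let the decoder use sentence-level tag dependencies.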
no code implementations • 14 Jul 2015 • Kai Sun, Qizhe Xie, Kai Yu
Dialogue state tracking (DST) is a process to estimate the distribution of the dialogue states as a dialogue progresses.
no code implementations • CVPR 2015 • Jiajun Wu, Yinan Yu, Chang Huang, Kai Yu
The recent development in learning deep representations has demonstrated its wide applications in traditional vision tasks like classification and detection.
no code implementations • NeurIPS 2014 • Mu Li, David G. Andersen, Alexander J. Smola, Kai Yu
This paper describes a third-generation parameter server framework for distributed machine learning.
no code implementations • 26 Sep 2013 • Krishnakumar Balasubramanian, Kai Yu, Tong Zhang
The traditional convex formulation employs the group Lasso relaxation to achieve joint sparsity across tasks.
1 code implementation • 25 Dec 2012 • Chang Huang, Shenghuo Zhu, Kai Yu
Learning Mahalanobis distance metrics in a high-dimensional feature space is very difficult, especially when structural sparsity and low rank are enforced to improve computational efficiency in the testing phase.
no code implementations • NeurIPS 2010 • Yuanqing Lin, Tong Zhang, Shenghuo Zhu, Kai Yu
This paper proposes a principled extension of the traditional single-layer flat sparse coding scheme, where a two-layer coding scheme is derived based on theoretical analysis of nonlinear functional approximation that extends recent results for local coordinate coding.
no code implementations • NeurIPS 2009 • Kai Yu, Tong Zhang, Yihong Gong
This paper introduces a new method for semi-supervised learning on high dimensional nonlinear manifolds, which includes a phase of unsupervised basis learning and a phase of supervised function learning.
no code implementations • NeurIPS 2008 • Shenghuo Zhu, Kai Yu, Yihong Gong
Stochastic relational models provide a rich family of choices for learning and predicting dyadic data between two sets of entities.
no code implementations • NeurIPS 2008 • Kai Yu, Wei Xu, Yihong Gong
In this paper we focus on training deep neural networks for visual recognition tasks.
no code implementations • NeurIPS 2007 • Shenghuo Zhu, Kai Yu, Yihong Gong
It is becoming increasingly important to learn from a partially-observed random matrix and predict its missing elements.
no code implementations • NeurIPS 2007 • Kai Yu, Wei Chu
In this paper we develop a Gaussian process (GP) framework to model a collection of reciprocal random variables defined on the \emph{edges} of a network.