no code implementations • 11 Apr 2024 • Arushi Goel, Zhifeng Kong, Rafael Valle, Bryan Catanzaro
Existing datasets for audio understanding primarily focus on single-turn interactions (i.e., audio captioning, audio question answering) for describing audio in natural language, thus limiting understanding audio via interactive dialogue.
no code implementations • 2 Feb 2024 • Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro
Augmenting large language models (LLMs) to understand audio -- including non-speech sounds and non-verbal speech -- is critically important for diverse real-world applications of LLMs.
no code implementations • 12 Sep 2023 • Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro
In this work, we present CleanUNet 2, a speech denoising model that combines the advantages of waveform denoiser and spectrogram denoiser and achieves the best of both worlds.
no code implementations • 18 May 2023 • Zhifeng Kong, Kamalika Chaudhuri
Deep generative models are known to produce undesirable samples such as harmful content.
no code implementations • 7 Mar 2023 • Zhifeng Kong, Amrita Roy Chowdhury, Kamalika Chaudhuri
Given a machine learning model, a data point and some auxiliary information, the goal of an MI attack is to determine whether the data point was used to train the model.
no code implementations • 29 Jun 2022 • Zhifeng Kong, Scott Alfeld
Using this framework, we introduce a fast method for approximate data deletion and a statistical test for estimating whether or not training points have been deleted.
no code implementations • 29 Jun 2022 • Zhifeng Kong, Kamalika Chaudhuri
Large pre-trained generative models are known to occasionally output undesirable samples, which undermines their trustworthiness.
1 code implementation • 15 Feb 2022 • Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro
In this work, we present CleanUNet, a causal speech denoising model on the raw waveform.
1 code implementation • ICLR 2022 • Zhaoyang Lyu, Zhifeng Kong, Xudong Xu, Liang Pan, Dahua Lin
The RFNet refines the coarse output of the CGNet and further improves quality of the completed point cloud.
1 code implementation • ICML Workshop INNF 2021 • Zhifeng Kong, Wei Ping
In this work, we propose FastDPM, a unified framework for fast sampling in diffusion probabilistic models.
1 code implementation • NeurIPS 2021 • Zhifeng Kong, Kamalika Chaudhuri
Instance-based interpretation methods have been widely studied for supervised learning methods as they help explain how black box neural networks predict.
no code implementations • ICML Workshop INNF 2021 • Zhifeng Kong, Kamalika Chaudhuri
Normalizing flows are a class of flexible deep generative models that offer easy likelihood computation.
no code implementations • ICLR Workshop GTRL 2021 • Zhangyu Wang, Lantian Xu, Zhifeng Kong, Weilong Wang, Xuyu Peng, Enyang Zheng
Hyperbolic embeddings are a class of representation learning methods that offer competitive performances when data can be abstracted as a tree-like graph.
11 code implementations • ICLR 2021 • Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro
In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional and unconditional waveform generation.
Ranked #2 on Speech Synthesis on LJSpeech
no code implementations • 31 May 2020 • Zhifeng Kong, Kamalika Chaudhuri
Normalizing flows have received a great deal of recent attention as they allow flexible generative modeling as well as easy likelihood computation.
1 code implementation • 2 Dec 2019 • Zhaoyang Lyu, Ching-Yun Ko, Zhifeng Kong, Ngai Wong, Dahua Lin, Luca Daniel
We draw inspiration from such work and further demonstrate the optimality of deterministic CROWN (Zhang et al. 2018) solutions in a given linear programming problem under mild constraints.
1 code implementation • 19 Nov 2017 • Zhifeng Kong
In this paper, we extend the convergence analysis of the dynamics of two-layered bias-free networks with one ReLU output.
no code implementations • 22 Oct 2017 • Qi Lyu, Zhifeng Kong, Chao Shen, Tianwei Yue
This paper presents a novel user authentication system through wrist-worn devices by analyzing the interaction behavior with users, which is both accurate and efficient for future usage.
no code implementations • 27 Sep 2017 • Zhifeng Kong, Shuo Ding
In this paper we introduce a new structure to Generative Adversarial Networks by adding an inverse transformation unit behind the generator.