no code implementations • 3 Apr 2024 • Weidi Luo, Siyuan Ma, Xiaogeng Liu, Xiaoyu Guo, Chaowei Xiao
With the rapid advancements in Multimodal Large Language Models (MLLMs), securing these models against malicious inputs while aligning them with human values has emerged as a critical challenge.
1 code implementation • 27 Dec 2023 • Zhengjia Wang, Danding Wang, Qiang Sheng, Juan Cao, Silong Su, Yifan Sun, Beizhe Hu, Siyuan Ma
Amid disruptive changes in the media economy and the proliferation of alternative news media outlets, news intent has progressively deviated from the ethical standards that serve the public interest.
no code implementations • 4 Apr 2023 • Yiheng Liu, Tianle Han, Siyuan Ma, Jiayue Zhang, Yuanyuan Yang, Jiaming Tian, Hao He, Antong Li, Mengshen He, Zhengliang Liu, Zihao Wu, Lin Zhao, Dajiang Zhu, Xiang Li, Ning Qiang, Dingang Shen, Tianming Liu, Bao Ge
This paper presents a comprehensive survey of ChatGPT-related (GPT-3.5 and GPT-4) research, covering state-of-the-art large language models (LLMs) from the GPT series and their prospective applications across diverse domains.
no code implementations • 6 Dec 2022 • Soroosh Mariooryad, Matt Shannon, Siyuan Ma, Tom Bagby, David Kao, Daisy Stanton, Eric Battenberg, RJ Skerry-Ryan
We present a noisy channel generative model of two sequences, for example text and speech, which enables uncovering the association between the two modalities when limited paired data is available.
2 code implementations • 26 Mar 2021 • Minghao Liu, Shengqi Ren, Siyuan Ma, Jiahui Jiao, Yizhou Chen, Zhiguang Wang, Wei Song
In this work, we explore a simple extension of current Transformer networks with gating, named Gated Transformer Networks (GTN), for the multivariate time series classification problem.
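The abstract only says the extension adds gating to a Transformer. As an illustration of the general idea, the sketch below shows one common way a learned softmax gate can merge two feature streams; the pooled tower outputs `h_step` and `h_chan`, the gate matrix `W_g`, and all sizes are hypothetical stand-ins, not the paper's architecture.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
h = 16  # hidden size (illustrative)

# Stand-ins for pooled outputs of two encoder towers, e.g. one attending
# over time steps and one over channels (tower internals omitted).
h_step = rng.normal(size=h)   # step-wise encoding
h_chan = rng.normal(size=h)   # channel-wise encoding

# Learned gate: a linear layer scores the two towers, softmax turns the
# scores into mixing weights, and the merged feature is the weighted concat.
W_g = rng.normal(size=(2 * h, 2)) * 0.1
g = softmax(np.concatenate([h_step, h_chan]) @ W_g)
merged = np.concatenate([g[0] * h_step, g[1] * h_chan])
print(merged.shape)  # (32,)
```

The softmax gate keeps the mixing weights positive and summing to one, so neither tower's contribution can be silently discarded during early training.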
2 code implementations • 28 Dec 2018 • Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal
This connection between the performance and the structure of machine learning models delineates the limits of classical analyses, and has implications for both the theory and practice of machine learning.
no code implementations • 6 Nov 2018 • Like Hui, Siyuan Ma, Mikhail Belkin
We apply a fast kernel method for mask-based single-channel speech enhancement.
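As a rough illustration of mask estimation with a kernel method: the toy sketch below fits a kernel ridge regressor from synthetic "noisy feature" vectors to mask targets in [0, 1]. The feature construction, the Gaussian kernel, the ridge solve, and the synthetic mask definition are all assumptions for illustration, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-ins for per-frame log-spectral features of noisy speech;
# targets play the role of ideal-ratio-mask values in [0, 1].
n_train, n_test, d = 300, 50, 16
X = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
w = rng.normal(size=d)
mask = 1 / (1 + np.exp(-X @ w))            # synthetic mask targets
mask_test = 1 / (1 + np.exp(-X_test @ w))

def gauss_kernel(A, B, sigma=4.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# Kernel ridge regression: solve (K + reg*I) alpha = mask,
# then predict masks for unseen frames and clip into [0, 1].
K = gauss_kernel(X, X)
alpha = np.linalg.solve(K + 1e-2 * np.eye(n_train), mask)
pred = np.clip(gauss_kernel(X_test, X) @ alpha, 0.0, 1.0)
print("mean abs error:", np.abs(pred - mask_test).mean())
```

In an actual enhancement system the predicted mask would be applied multiplicatively to the noisy spectrogram before resynthesis.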
no code implementations • 6 Nov 2018 • Raef Bassily, Mikhail Belkin, Siyuan Ma
Large over-parametrized models learned via stochastic gradient descent (SGD) methods have become a key element in modern machine learning.
2 code implementations • 15 Jun 2018 • Siyuan Ma, Mikhail Belkin
In this paper we develop the first analytical framework that extends linear scaling to match the parallel computing capacity of a resource.
no code implementations • ICML 2018 • Mikhail Belkin, Siyuan Ma, Soumik Mandal
Certain key phenomena of deep learning are manifested similarly in kernel methods in the modern "overfitted" regime.
no code implementations • ICML 2018 • Siyuan Ma, Raef Bassily, Mikhail Belkin
We show that there is a critical batch size $m^*$ such that: (a) SGD iteration with mini-batch size $m\leq m^*$ is nearly equivalent to $m$ iterations of mini-batch size $1$ (\emph{linear scaling regime}).
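The near-equivalence in (a) can be checked numerically to first order in the step size: one mini-batch step of size $m$ with learning rate $m\eta$ tracks $m$ sequential steps of batch size $1$ with learning rate $\eta$. A minimal NumPy sketch on a synthetic least-squares problem (data, sizes, and step size are all illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 64, 5, 8
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

eta = 1e-3  # small step size: linear scaling holds to first order in eta

def grad(w, idx):
    # gradient of the mean squared error over the mini-batch idx
    Xb, yb = X[idx], y[idx]
    return Xb.T @ (Xb @ w - yb) / len(idx)

idx = rng.choice(n, size=m, replace=False)

# (a) one SGD step with mini-batch size m and learning rate m*eta
w_batch = np.zeros(d) - m * eta * grad(np.zeros(d), idx)

# (b) m sequential SGD steps of batch size 1 with learning rate eta
w_seq = np.zeros(d)
for i in idx:
    w_seq = w_seq - eta * grad(w_seq, np.array([i]))

# the two parameter vectors agree up to O(eta^2) terms
print(np.linalg.norm(w_batch - w_seq))
```

The gap between the two updates comes only from the sequential steps evaluating gradients at slightly moved iterates, which is second order in $\eta$; this is why the regime breaks down once $m$ exceeds the critical batch size and larger steps are needed to benefit from the batch.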
1 code implementation • NeurIPS 2017 • Siyuan Ma, Mikhail Belkin
An analysis based on the spectral properties of the kernel demonstrates that only a vanishingly small portion of the function space is reachable after a polynomial number of gradient descent iterations.
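The mechanism can be seen on a toy kernel: with step size $\eta$, gradient descent shrinks the residual along the eigendirection with eigenvalue $\lambda_i$ by a factor $(1 - \eta\lambda_i)^t$ after $t$ steps, so directions with small $\lambda_i$ remain out of reach for any polynomial iteration budget. A hedged NumPy sketch (the Gaussian kernel, bandwidth, and iteration budget are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(-1, 1, n))
# Gaussian kernel matrix; its eigenvalues decay near-exponentially
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.2 ** 2))
lam, U = np.linalg.eigh(K)
lam, U = lam[::-1], U[:, ::-1]   # sort eigenvalues in descending order

eta = 1.0 / lam[0]               # stable step size for gradient descent on K
t = 1000                         # iteration budget

# After t gradient-descent steps on the kernel least-squares objective,
# the residual along eigendirection i shrinks by (1 - eta*lam_i)^t.
residual_factor = (1 - eta * lam) ** t
reachable = residual_factor < 0.5   # directions mostly fit after t steps
print(reachable.sum(), "of", n, "eigendirections reachable after", t, "steps")
```

Because the eigenvalues decay so fast, only the leading handful of directions satisfy $\eta\lambda_i t \gtrsim 1$; fitting the remaining directions would require exponentially many iterations, which is the vanishing reachable portion the abstract describes.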