Search Results for author: Hongkang Li

Found 9 papers, 0 papers with code

How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance

no code implementations • 12 Mar 2024 • Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen

Despite algorithmic efforts to improve the minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive.

Binary Classification • Image Classification

Training Nonlinear Transformers for Efficient In-Context Learning: A Theoretical Learning and Generalization Analysis

no code implementations • 23 Feb 2024 • Hongkang Li, Meng Wang, Songtao Lu, Xiaodong Cui, Pin-Yu Chen

Despite the empirical success, the mechanics of how to train a Transformer to achieve ICL and the corresponding ICL capacity remain mostly elusive due to the technical challenges of analyzing the nonconvex training problems resulting from the nonlinear self-attention and nonlinear activation in Transformers.

Binary Classification • In-Context Learning

On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration

no code implementations • 24 Oct 2023 • Shuai Zhang, Hongkang Li, Meng Wang, Miao Liu, Pin-Yu Chen, Songtao Lu, Sijia Liu, Keerthiram Murugesan, Subhajit Chaudhury

This paper provides the first theoretical convergence and sample complexity analysis of the practical setting of DQNs with an $\epsilon$-greedy policy.

Q-Learning

How Can Context Help? Exploring Joint Retrieval of Passage and Personalized Context

no code implementations • 26 Aug 2023 • Hui Wan, Hongkang Li, Songtao Lu, Xiaodong Cui, Marina Danilevsky

The integration of external personalized context information into document-grounded conversational systems has significant potential business value, but has not been well-studied.

Passage Retrieval • Retrieval

Enhancing Graph Transformers with Hierarchical Distance Structural Encoding

no code implementations • 22 Aug 2023 • Yuankai Luo, Hongkang Li, Lei Shi, Xiao-Ming Wu

Empirically, we demonstrate that graph transformers with HDSE excel in graph classification and regression on 7 graph-level datasets, and in node classification on 11 large-scale graphs, including those with up to a billion nodes.

Graph Classification • Node Classification

A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity

no code implementations • 12 Feb 2023 • Hongkang Li, Meng Wang, Sijia Liu, Pin-Yu Chen

Based on a data model characterizing both label-relevant and label-irrelevant tokens, this paper provides the first theoretical analysis of training a shallow ViT, i.e., one self-attention layer followed by a two-layer perceptron, for a classification task.

Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling

no code implementations • 7 Jul 2022 • Hongkang Li, Meng Wang, Sijia Liu, Pin-Yu Chen, JinJun Xiong

Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data.

Node Classification

Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data

no code implementations • 7 Jul 2022 • Hongkang Li, Shuai Zhang, Meng Wang

In addition, for the first time, this paper characterizes the impact of the input distributions on the sample complexity and the learning rate.

Learning One-hidden-layer Neural Networks on Gaussian Mixture Models with Guaranteed Generalizability

no code implementations • 1 Jan 2021 • Hongkang Li, Shuai Zhang, Meng Wang

Instead of following the conventional and restrictive assumption in the literature that the input features follow the standard Gaussian distribution, this paper, for the first time, analyzes a more general and practical scenario in which the input features follow a Gaussian mixture model of a finite number of Gaussian distributions with various means and variances.

Binary Classification
