no code implementations • 23 May 2024 • Huy Nguyen, Pedram Akbarian, Trang Pham, Trang Nguyen, Shujian Zhang, Nhat Ho
The cosine router in sparse Mixture of Experts (MoE) has recently emerged as an attractive alternative to the conventional linear router.
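As a rough illustration of the difference (a minimal sketch, not the authors' code; the expert count, dimension, top-k, and temperature below are invented for the example), a linear router scores experts by raw inner products with the token representation, while a cosine router scores them by cosine similarity between L2-normalized token and expert embeddings, sharpened by a temperature:

```python
import numpy as np

rng = np.random.default_rng(0)
d, num_experts, top_k, tau = 16, 8, 2, 0.07  # hidden dim, experts, active experts, temperature

W = rng.normal(size=(num_experts, d))   # expert embeddings / router weights
x = rng.normal(size=(d,))               # one token representation

# Linear router: raw inner products between the token and the expert embeddings.
linear_scores = W @ x

# Cosine router: inner products of L2-normalized vectors, scaled by a temperature.
x_hat = x / np.linalg.norm(x)
W_hat = W / np.linalg.norm(W, axis=1, keepdims=True)
cosine_scores = (W_hat @ x_hat) / tau

def topk_softmax(scores, k):
    """Keep the k largest scores, softmax over them, zero out the rest."""
    idx = np.argpartition(scores, -k)[-k:]
    gates = np.zeros_like(scores)
    z = np.exp(scores[idx] - scores[idx].max())
    gates[idx] = z / z.sum()
    return gates

print("linear gates:", topk_softmax(linear_scores, top_k))
print("cosine gates:", topk_softmax(cosine_scores, top_k))
```

Normalization makes the cosine scores insensitive to the magnitudes of the token and expert vectors, which is the property that distinguishes this router from the linear one.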
no code implementations • 25 Jan 2024 • Huy Nguyen, Pedram Akbarian, Nhat Ho
We demonstrate that, owing to interactions between the temperature and the other model parameters through a set of partial differential equations, the convergence rates of parameter estimation are slower than any polynomial rate and can be as slow as $\mathcal{O}(1/\log(n))$, where $n$ denotes the sample size.
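For a sense of scale (a back-of-the-envelope illustration, not a result from the paper), compare the sample sizes needed to reach a target error $\epsilon$ under the parametric rate and under a logarithmic rate, ignoring constants and logarithmic factors:

```latex
n^{-1/2} \le \epsilon \;\Longleftrightarrow\; n \gtrsim \epsilon^{-2},
\qquad
\frac{1}{\log n} \le \epsilon \;\Longleftrightarrow\; n \gtrsim e^{1/\epsilon}.
```

For $\epsilon = 0.1$ this is $n \gtrsim 10^{2}$ versus $n \gtrsim e^{10} \approx 2.2 \times 10^{4}$, and the gap grows super-polynomially as $\epsilon$ shrinks.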
no code implementations • 22 Oct 2023 • Huy Nguyen, Pedram Akbarian, TrungTin Nguyen, Nhat Ho
The mixture-of-experts (MoE) model incorporates the power of multiple submodels via gating functions to achieve greater performance in numerous regression and classification applications.
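A canonical instance of this setup in the regression setting, written here with illustrative notation rather than the paper's exact symbols, is the softmax-gated Gaussian mixture of experts density:

```latex
f_{G}(y \mid x) \;=\; \sum_{j=1}^{k}
\frac{\exp\bigl(\beta_{1j}^{\top} x + \beta_{0j}\bigr)}
     {\sum_{\ell=1}^{k} \exp\bigl(\beta_{1\ell}^{\top} x + \beta_{0\ell}\bigr)}
\, \mathcal{N}\!\bigl(y \mid a_{j}^{\top} x + b_{j},\, \sigma_{j}^{2}\bigr),
```

where the softmax weights form the gating function and each Gaussian component plays the role of one expert (submodel).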
no code implementations • 25 Sep 2023 • Huy Nguyen, Pedram Akbarian, Fanqi Yan, Nhat Ho
When the true number of experts $k_{\ast}$ is known, we demonstrate that the convergence rates of density estimation and parameter estimation are both parametric in the sample size.
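"Parametric" here means that, up to logarithmic factors, the rates match the familiar $1/\sqrt{n}$ behavior; a typical statement of this kind (notation assumed, not quoted from the paper) reads:

```latex
h\bigl(f_{\widehat{G}_n}(\cdot \mid X),\, f_{G_{*}}(\cdot \mid X)\bigr)
\;=\; \mathcal{O}_{P}\!\Bigl(\sqrt{\tfrac{\log n}{n}}\Bigr),
```

where $h$ denotes the Hellinger distance, $G_{*}$ the true mixing measure, and $\widehat{G}_n$ an estimator fitted with $k_{\ast}$ experts.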