Search Results for author: Rouzbeh Ghaderi

Found 1 paper, 0 papers with code

On the Optimization and Generalization of Multi-head Attention

no code implementations • 19 Oct 2023 • Puneesh Deora, Rouzbeh Ghaderi, Hossein Taheri, Christos Thrampoulidis

Finally, we demonstrate that these conditions are satisfied for a simple tokenized-mixture model.
