no code implementations • ICML 2020 • Aadirupa Saha, Pierre Gaillard, Michal Valko
The best existing efficient (i.e., polynomial-time) algorithms for this problem only guarantee an $O(T^{2/3})$ upper bound on the regret.
no code implementations • 29 Feb 2024 • Aadirupa Saha, Pierre Gaillard
In this paper, we design efficient algorithms for the problem of regret minimization in assortment selection with \emph{Plackett-Luce} (PL) based user choices.
no code implementations • 23 Feb 2024 • Julien Zhou, Pierre Gaillard, Thibaud Rahier, Houssam Zenati, Julyan Arbel
We address the problem of stochastic combinatorial semi-bandits, where a player can select from $P$ subsets of a set containing $d$ base items.
no code implementations • 7 Feb 2024 • Camila Fernandez, Pierre Gaillard, Joseph de Vilmarest, Olivier Wintenberger
We introduce an online mathematical framework for survival analysis, allowing real time adaptation to dynamic environments and censored data.
no code implementations • 30 Nov 2023 • Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane
Many machine learning tasks can be solved by minimizing a convex function of an occupancy measure over the policies that generate them.
no code implementations • 14 Sep 2023 • Pierre Gaillard, Sébastien Gerchinovitz, Étienne de Montbrun
We prove that GreedyBox achieves an optimal sample complexity for any function $f$, up to logarithmic factors.
1 code implementation • 23 Feb 2023 • Houssam Zenati, Eustache Diemert, Matthieu Martin, Julien Mairal, Pierre Gaillard
Counterfactual Risk Minimization (CRM) is a framework for dealing with the logged bandit feedback problem, where the goal is to improve a logging policy using offline data.
no code implementations • 16 Feb 2023 • Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane
Integrating renewable energy into the power grid while balancing supply and demand is a complex issue, given its intermittent nature.
no code implementations • 26 Oct 2022 • Pierre Gaillard, Aadirupa Saha, Soham Dan
We address the problem of \emph{Internal Regret} in \emph{Sleeping Bandits} in the fully adversarial setup, draw connections between the existing notions of sleeping regret in the multi-armed bandits (MAB) literature, and analyze their implications. Our first contribution is to propose the new notion of \emph{Internal Regret} for sleeping MAB.
no code implementations • 14 Feb 2022 • Aadirupa Saha, Pierre Gaillard
We study the problem of $K$-armed dueling bandits in both stochastic and adversarial environments, where the goal of the learner is to aggregate information through relative preferences between pairs of decision points queried in an online sequential manner.
1 code implementation • 11 Feb 2022 • Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard
While standard methods require $O(CT^3)$ complexity, where $T$ is the horizon and the constant $C$ is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems.
no code implementations • NeurIPS 2021 • Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Hadrien Hendrikx, Pierre Gaillard, Laurent Massoulié, Adrien Taylor
We introduce the ``continuized'' Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.
no code implementations • NeurIPS 2021 • Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet
In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions.
no code implementations • NeurIPS 2021 • Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi
Mixability has been shown to be a powerful tool to obtain algorithms with optimal regret.
no code implementations • NeurIPS 2021 • Aadirupa Saha, Pierre Gaillard
The goal is to find an optimal `no-regret' policy that can identify the best available item at each round, as opposed to the standard `fixed best-arm regret objective' of dueling bandits.
1 code implementation • 10 Jun 2021 • Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien Taylor
We introduce the continuized Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.
no code implementations • 11 Feb 2021 • Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Adrien Taylor
We introduce the "continuized" Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.
Distributed, Parallel, and Cluster Computing • Optimization and Control
no code implementations • 6 Feb 2021 • Oleksandr Zadorozhnyi, Pierre Gaillard, Sébastien Gerchinovitz, Alessandro Rudi
In this work, we investigate a variant of the online kernelized ridge regression algorithm in the setting of $d$-dimensional adversarial nonparametric regression.
no code implementations • 13 Nov 2020 • Anant Raj, Pierre Gaillard, Christophe Saad
To the best of our knowledge, this work is the first extension of non-stationary online regression to non-stationary kernel regression.
no code implementations • NeurIPS 2020 • Raphaël Berthier, Francis Bach, Pierre Gaillard
In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle \theta_*, \Phi(U) \rangle$ between the random output $Y$ and the random feature vector $\Phi(U)$, a potentially non-linear transformation of the inputs $U$.
1 code implementation • 22 Apr 2020 • Houssam Zenati, Alberto Bietti, Matthieu Martin, Eustache Diemert, Pierre Gaillard, Julien Mairal
Counterfactual reasoning from logged data has become increasingly important for many applications such as web advertising or healthcare.
no code implementations • 14 Apr 2020 • Aadirupa Saha, Pierre Gaillard, Michal Valko
We then study the most general version of the problem, where at each round the available sets are generated from some unknown arbitrary distribution (i.e., without the independence assumption), and propose an efficient algorithm with an $O(\sqrt{2^K T})$ regret guarantee.
no code implementations • 18 Mar 2020 • Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi
We study online logistic regression and consider the regret with respect to the $\ell_2$-ball of radius $B$.
1 code implementation • 13 Mar 2020 • Camila Fernandez, Chung Shue Chen, Pierre Gaillard, Alonso Silva
In this paper, we make an experimental comparison of semi-parametric (Cox proportional hazards model, Aalen's additive regression model), parametric (Weibull AFT model), and machine learning models (Random Survival Forest, Gradient Boosting with Cox Proportional Hazards Loss, DeepSurv) through the concordance index on two different datasets (PBC and GBCSG2).
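The concordance index used for this comparison can be illustrated with a minimal sketch. It follows the usual Harrell-style definition (the fraction of comparable pairs that a risk score orders correctly) and is not taken from the paper's code; the function name `concordance_index` and its arguments are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index: share of comparable pairs ranked correctly.

    times:       observed time for each subject (event or censoring time)
    events:      1 if the event was observed, 0 if the subject was censored
    risk_scores: model output, higher meaning a shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # correctly ordered
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties count for half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and 0 to a fully reversed ranking.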
1 code implementation • NeurIPS 2019 • Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi
For $d$-dimensional inputs, we provide a (close to) optimal regret of order $O((\log n)^{d+1})$ with per-round time complexity and space complexity $O((\log n)^{2d})$.
no code implementations • 28 Jan 2019 • Margaux Brégère, Pierre Gaillard, Yannig Goude, Gilles Stoltz
We propose a contextual-bandit approach for demand side management by offering price incentives.
no code implementations • 29 May 2018 • Pierre Gaillard, Sébastien Gerchinovitz, Malo Huard, Gilles Stoltz
In the case of sequentially revealed features, we also derive an asymptotic regret bound of $d B^2 \ln T$ for any individual sequence of features and bounded observations.
no code implementations • NeurIPS 2018 • Pierre Gaillard, Olivier Wintenberger
setting, we establish new risk bounds that are adaptive to the sparsity of the problem and to the regularity of the risk (ranging from a rate $1/\sqrt{T}$ for general convex risk to $1/T$ for strongly convex risk).
1 code implementation • 22 May 2018 • Raphaël Berthier, Francis Bach, Pierre Gaillard
We develop a method solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication network set-up, the dimension of the space in which the agents live.
no code implementations • 27 Feb 2017 • Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz
We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information.
no code implementations • 26 Feb 2015 • Pierre Gaillard, Sébastien Gerchinovitz
We consider the problem of online nonparametric regression with arbitrary deterministic sequences.
no code implementations • 7 May 2014 • Pierre Gaillard, Paul Baudin
We study online prediction of bounded stationary ergodic processes.
no code implementations • 10 Feb 2014 • Pierre Gaillard, Gilles Stoltz, Tim van Erven
We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates.
no code implementations • NeurIPS 2012 • Nicolò Cesa-Bianchi, Pierre Gaillard, Gabor Lugosi, Gilles Stoltz
Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension.
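For orientation, the basic mirror descent step with an entropic regularizer on the probability simplex reduces to a multiplicative (exponentiated-gradient) update. The sketch below shows only that standard step, not the shifting-regret variants analyzed in the paper; the function name `entropic_md_step` and the step size `eta` are illustrative assumptions.

```python
import math


def entropic_md_step(weights, loss_gradient, eta):
    """One entropic mirror descent step on the probability simplex.

    Each weight is multiplied by exp(-eta * gradient) and the result is
    renormalized so that the weights sum to one again.
    """
    unnormalized = [w * math.exp(-eta * g) for w, g in zip(weights, loss_gradient)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Usage: start from uniform weights over 3 coordinates and take one step.
w = entropic_md_step([1 / 3, 1 / 3, 1 / 3], [1.0, 0.0, 0.5], eta=0.1)
```

Coordinates with a larger gradient lose weight, and the iterate always remains a probability vector.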