no code implementations • 4 Jul 2023 • Sarah Sachs, Tim van Erven, Liam Hodgkinson, Rajiv Khanna, Umut Simsekli
Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms.
1 code implementation • 1 Jun 2023 • Hidde Fokkema, Damien Garreau, Tim van Erven
Algorithmic recourse provides explanations that help users overturn an unfavorable decision by a machine learning system.
no code implementations • 6 Mar 2023 • Sarah Sachs, Hédi Hadiji, Tim van Erven, Cristóbal Guzmán
In the fully adversarial case our bounds gracefully deteriorate to match the minimax regret.
no code implementations • 14 Sep 2022 • Thom Neuteboom, Tim van Erven
Hence, we provide a new algorithm, Squint-CE, which is suitable for a changing environment and preserves the properties of Squint.
no code implementations • 31 May 2022 • Hidde Fokkema, Rianne de Heide, Tim van Erven
Finally, we strengthen our impossibility result for the restricted case where users are only able to change a single attribute of $x$, by providing an exact characterization of the functions $f$ to which impossibility applies.
no code implementations • 15 Feb 2022 • Sarah Sachs, Hédi Hadiji, Tim van Erven, Cristóbal Guzmán
In the fully i.i.d. case, our bounds match the rates one would expect from results in stochastic acceleration, and in the fully adversarial case they gracefully deteriorate to match the minimax regret.
no code implementations • 11 Feb 2022 • Jack J. Mayo, Hédi Hadiji, Tim van Erven
We follow up on this observation by showing that there is in fact never a price to pay for adaptivity if we specialise to any of the other common supervised online learning losses: our results cover log loss, (linear and non-parametric) logistic regression, square loss prediction, and (linear and non-parametric) least-squares regression.
no code implementations • 5 Jul 2021 • Tim van Erven, Sarah Sachs, Wouter M. Koolen, Wojciech Kotłowski
If the outliers are chosen adversarially, we show that a simple filtering strategy on extreme gradients incurs $O(k)$ additive overhead compared to the usual regret bounds, and that this is unimprovable; hence $k$ needs to be sublinear in the number of rounds for the regret to remain sublinear.
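As a concrete illustration, here is a minimal sketch of projected online gradient descent with such a filtering step; the fixed norm threshold and the step-size schedule are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def filtered_ogd(grad_fn, dim, rounds, radius=1.0, norm_bound=1.0):
    """Projected online gradient descent that skips extreme gradients.

    Rounds whose gradient norm exceeds norm_bound (an assumed known bound
    for inlier gradients) are filtered out entirely, so each outlier can
    spoil at most one round -- in line with an O(k) additive overhead.
    """
    w = np.zeros(dim)
    accepted = 0
    for t in range(1, rounds + 1):
        g = grad_fn(w, t)
        if np.linalg.norm(g) > norm_bound:
            continue                      # filter the extreme gradient
        accepted += 1
        eta = radius / np.sqrt(accepted)  # step size over accepted rounds
        w -= eta * g
        norm = np.linalg.norm(w)
        if norm > radius:                 # project back onto the domain
            w *= radius / norm
    return w

# Illustrative run: quadratic losses with an occasional huge outlier gradient.
rng = np.random.default_rng(0)
target = np.full(3, 0.5)
w_final = filtered_ogd(
    lambda w, t: 100 * rng.normal(size=3) if t % 50 == 0 else 2 * (w - target),
    dim=3, rounds=500, radius=1.0, norm_bound=6.0)
```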
no code implementations • 15 Feb 2021 • Dirk van der Hoeven, Hédi Hadiji, Tim van Erven
Each round, an adversary first activates one of the agents to issue a prediction and provides a corresponding gradient, and then the agents are allowed to send a $b$-bit message to their neighbors in the graph.
no code implementations • 12 Feb 2021 • Tim van Erven, Wouter M. Koolen, Dirk van der Hoeven
We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including not only exp-concave and strongly convex functions but also various types of stochastic and non-stochastic functions without any curvature.
no code implementations • 14 Jun 2020 • Georgios Vlassopoulos, Tim van Erven, Henry Brighton, Vlado Menkovski
We address this by introducing a new benchmark data set with artificially generated Iris images, and showing that we can recover the latent attributes that locally determine the class.
no code implementations • 27 Feb 2019 • Zakaria Mhammedi, Wouter M. Koolen, Tim van Erven
For MetaGrad, we further improve the computational efficiency of handling constraints on the domain of prediction, and we remove the need to specify the number of rounds in advance.
no code implementations • 21 Feb 2018 • Dirk van der Hoeven, Tim van Erven, Wojciech Kotłowski
A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods.
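To make that progression concrete, here is a minimal sketch of how Online Mirror Descent generalizes Online Gradient Descent: one template yields OGD under the Euclidean mirror map and the exponentiated-gradient update on the simplex under the negative-entropy mirror map (the loss sequence and learning rate below are illustrative assumptions):

```python
import numpy as np

def ogd_step(w, g, eta):
    """Euclidean mirror map: the template reduces to plain OGD."""
    return w - eta * g

def omd_simplex_step(w, g, eta):
    """Negative-entropy mirror map on the probability simplex:
    the template becomes the exponentiated gradient update."""
    w_new = w * np.exp(-eta * g)  # mirror step in the dual space
    return w_new / w_new.sum()    # Bregman projection back to the simplex

# Illustrative run with linear losses <g_t, w> and an ad hoc learning rate.
rng = np.random.default_rng(0)
w = np.ones(4) / 4
for g in rng.normal(size=(100, 4)):
    w = omd_simplex_step(w, g, eta=0.1)
```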
no code implementations • NeurIPS 2016 • Wouter M. Koolen, Peter Grünwald, Tim van Erven
We consider online learning algorithms that guarantee worst-case regret rates in adversarial environments (so they can be deployed safely and will perform robustly), yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance).
1 code implementation • NeurIPS 2016 • Tim van Erven, Wouter M. Koolen
In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions.
no code implementations • 9 Jul 2015 • Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson
For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning.
no code implementations • 27 Feb 2015 • Wouter M. Koolen, Tim van Erven
We aim to design strategies for sequential decision making that adjust to the difficulty of the learning problem.
no code implementations • NeurIPS 2014 • Wouter M. Koolen, Tim van Erven, Peter Grünwald
Most standard algorithms for prediction with expert advice depend on a parameter called the learning rate.
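For concreteness, here is a minimal sketch of the exponential weights (Hedge) update, where the fixed learning rate eta is exactly the parameter in question; the loss matrix and the tuning below are illustrative assumptions:

```python
import numpy as np

def hedge(expert_losses, eta):
    """Exponential weights over experts with a fixed learning rate eta.

    expert_losses: array of shape (rounds, n_experts) with losses in [0, 1].
    Returns the sequence of weight vectors used at each round.
    """
    rounds, n = expert_losses.shape
    cum_loss = np.zeros(n)
    weights_history = []
    for t in range(rounds):
        w = np.exp(-eta * cum_loss)
        w /= w.sum()                  # normalized weights over experts
        weights_history.append(w)
        cum_loss += expert_losses[t]  # update cumulative losses
    return weights_history

# The classical safe tuning eta = sqrt(8 ln(n) / T) requires knowing the
# horizon T in advance -- the kind of dependence such papers aim to remove.
T, n = 1000, 10
rng = np.random.default_rng(1)
losses = rng.uniform(size=(T, n))
weights = hedge(losses, eta=np.sqrt(8 * np.log(n) / T))
```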
no code implementations • 7 May 2014 • Tim van Erven
When I first encountered PAC-Bayesian concentration inequalities they seemed to me to be rather disconnected from good old-fashioned results like Hoeffding's and Bernstein's inequalities.
no code implementations • 10 Feb 2014 • Pierre Gaillard, Gilles Stoltz, Tim van Erven
We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates.
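As a rough illustration of the mechanism, here is a minimal sketch of a Prod-style update with one learning rate per expert; the fixed rates and the loss range are simplifying assumptions, not the paper's exact tuning:

```python
import numpy as np

def prod_multiple_rates(expert_losses, etas):
    """Prod-style weights with a separate fixed learning rate per expert.

    expert_losses: shape (rounds, n_experts), losses assumed in [0, 1].
    etas: per-expert rates, assumed <= 1/2 so the weights stay positive.
    """
    rounds, n = expert_losses.shape
    w = np.ones(n)
    alg_losses = []
    for t in range(rounds):
        p = etas * w / (etas * w).sum()    # prediction weights
        loss_t = p @ expert_losses[t]      # the algorithm's own loss
        alg_losses.append(loss_t)
        # Multiplicative Prod update on the instantaneous regret:
        w *= 1 + etas * (loss_t - expert_losses[t])
    return np.array(alg_losses)

# Illustrative run; in the paper the rates adapt to each expert's deviations.
rng = np.random.default_rng(0)
losses = rng.uniform(size=(1000, 5))
alg_losses = prod_multiple_rates(losses, etas=np.full(5, 0.1))
```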
no code implementations • 12 Jun 2012 • Tim van Erven, Peter Harremoës
Rényi divergence is related to Rényi entropy much like Kullback-Leibler divergence is related to Shannon's entropy, and comes up in many settings.
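For reference, the standard definition the paper works with, stated here for discrete distributions $P$ and $Q$ (the continuous case replaces the sum by an integral):

```latex
% Rényi divergence of order \alpha, for \alpha \in (0,1) \cup (1,\infty):
D_\alpha(P \,\|\, Q) \;=\; \frac{1}{\alpha - 1}
  \ln \sum_x P(x)^{\alpha}\, Q(x)^{1-\alpha},
% with Kullback-Leibler divergence recovered as the limiting case \alpha \to 1:
\lim_{\alpha \to 1} D_\alpha(P \,\|\, Q)
  \;=\; \sum_x P(x) \ln \frac{P(x)}{Q(x)} \;=\; D(P \,\|\, Q).
```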