no code implementations • 7 May 2024 • Sanath Kumar Krishnamurthy, Susan Athey, Emma Brunskill
However, for a class of estimate-estimand-error tuples, nontrivial high probability upper bounds on the maximum error often require class complexity as input, limiting the practicality of such methods and often resulting in loose bounds.
no code implementations • 1 Feb 2023 • Sanath Kumar Krishnamurthy, Shrey Modi, Tanmay Gangwani, Sumeet Katariya, Branislav Kveton, Anshuka Rangi
We consider the finite-horizon offline reinforcement learning (RL) setting, and are motivated by the challenge of learning the policy at any step h in dynamic programming (DP) algorithms.
no code implementations • 22 Nov 2022 • Susan Athey, Undral Byambadalai, Vitor Hadad, Sanath Kumar Krishnamurthy, Weiwen Leung, Joseph Jay Williams
We design and implement an adaptive experiment (a "contextual bandit") to learn a targeted treatment assignment policy, where the goal is to use a participant's survey responses to determine which charity to expose them to in a donation solicitation.
no code implementations • 30 Mar 2022 • Aldo Gael Carranza, Sanath Kumar Krishnamurthy, Susan Athey
Contextual bandit algorithms often estimate reward models to inform decision-making.
no code implementations • 11 Jun 2021 • Sanath Kumar Krishnamurthy, Adrienne Margaret Propp, Susan Athey
Our algorithm is based on a novel misspecification test, and our analysis demonstrates the benefits of using model selection for reward estimation.
no code implementations • 26 Feb 2021 • Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey
Computationally efficient contextual bandits are often based on estimating a predictive model of rewards given contexts and arms using past data.
no code implementations • 25 Oct 2020 • Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey
When realizability does not hold, our algorithm ensures the same guarantees on regret achieved by realizability-based algorithms under realizability, up to an additive term that accounts for the misspecification error.
no code implementations • 23 Feb 2020 • Sanath Kumar Krishnamurthy, Susan Athey
We consider a variant of the contextual bandit problem.
no code implementations • 21 Nov 2017 • Siddharth Barman, Arpita Biswas, Sanath Kumar Krishnamurthy, Y. Narahari
We also establish the existence of approximate GMMS allocations under additive valuations, and develop a polynomial-time algorithm to find such allocations.