When
February 18, 2025 | 3:15 pm – 4:30 pm
Where
613 Kern Building
Bo Zhou from Virginia Tech will present "Valid Post-Contextual Bandit Inference."
Abstract
“We establish an asymptotic framework for the statistical analysis of the stochastic contextual multi-armed bandit problem (CMAB), which is widely employed in adaptively randomized experiments across various fields. While algorithms for maximizing rewards, or equivalently, minimizing regret have received considerable attention, our focus centers on statistical inference with adaptively collected data under the CMAB model. To this end we derive the limit experiment (in the Le Cam sense) for CMAB. This limit experiment is nonstandard and, applying Girsanov’s theorem, we obtain a structural representation in terms of stochastic differential equations. This structural representation allows us to easily study size and power properties of commonly used tests to evaluate a single arm and to compare arms. We study classical t-tests, Adaptively Weighted tests and Inverse Propensity Weighted tests. We show that, when comparing both arms, validity of these tests requires the sampling scheme to be translation invariant in a way we make precise. We propose translation-invariant versions of Thompson, tempered greedy and tempered Upper Confidence Bound sampling. Simulation results corroborate our asymptotic analysis.”