SPEAKER: Raymond Wong, Associate Professor, Department of Statistics, Texas A&M University
TITLE: "Balancing Weights for Offline Reinforcement Learning"
ABSTRACT: Offline policy evaluation is considered a fundamental and challenging problem in reinforcement learning. In this talk, I will focus on estimating the value of a target policy from pre-collected data generated by a possibly different policy, under the framework of infinite-horizon Markov decision processes. I will discuss a novel estimator that uses approximately projected state-action balancing weights for policy value estimation. These weights are motivated by the marginal importance sampling method in reinforcement learning and the covariate balancing idea in causal inference. The corresponding asymptotic convergence results will be presented. Our results scale with both the number of trajectories and the number of decision points in each trajectory. As such, consistency can still be achieved with a limited number of subjects when the number of decision points diverges.
Faculty web page: https://raymondkww.github.io/
Seminar Date/Time: Thursday, May 25, 2023, at 4:10pm
Refreshments: 3:#0pm, MSB 1147 (Courtyard)