Monday, March 27, 2023, 2:00 p.m. to 3:00 p.m.
About this Event
3620 South Vermont Avenue, Los Angeles, CA 90089
Yu-Jui Huang, University of Colorado Boulder
In person or via Zoom Meeting: https://usc.zoom.us/j/93185392412?pwd=MEdxLzFCMTFKU2pZOURXY1dFd1J2dz09
Meeting ID: 931 8539 2412
Passcode: 117947
Abstract: For a general entropy-regularized stochastic control problem on an infinite horizon, we prove that a policy improvement algorithm (PIA) converges to an optimal relaxed control. Contrary to the standard stochastic control literature, classical Hölder estimates of value functions do not ensure the convergence of the PIA, due to the added entropy-regularizing term. To circumvent this, we carry out a delicate estimation by moving back and forth between appropriate Hölder and Sobolev spaces. This requires new Sobolev estimates designed specifically for the purpose of policy improvement and a nontrivial technique to contain the entropy growth. Ultimately, we obtain a uniform Hölder bound for the sequence of value functions generated by the PIA, thereby achieving the desired convergence result. Characterization of the optimal value function as the unique solution to an exploratory Hamilton-Jacobi-Bellman equation comes as a by-product.
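For readers skimming the abstract, the following is a minimal sketch of the objects involved, written in the notation commonly used in the exploratory-control literature; the drift b, diffusion sigma, reward r, discount rate rho > 0, and entropy temperature lambda > 0 are illustrative assumptions, not taken from the talk. The entropy-regularized value function optimizes over relaxed controls pi(. | x), i.e., densities on the action space A:

% Entropy-regularized objective (illustrative notation, not from the talk):
% expected discounted reward plus a differential-entropy bonus that rewards
% exploration, with temperature lambda controlling the bonus.
\[
  v(x) \;=\; \sup_{\pi}\,
  \mathbb{E}\!\left[\int_0^{\infty} e^{-\rho t}
    \int_A \Big( r(X_t,a) - \lambda \ln \pi(a \mid X_t) \Big)\,
    \pi(a \mid X_t)\,da\; dt \right].
\]
% One step of the policy improvement algorithm under these assumptions:
% given the value function v_n of the current policy, the improved policy
% is the Gibbs (softmax) measure built from the Hamiltonian of v_n;
% v_{n+1} is then the value function of this new policy.
\[
  \pi_{n+1}(a \mid x) \;\propto\;
  \exp\!\Big( \tfrac{1}{\lambda}\Big[
    b(x,a)\cdot \nabla v_n(x)
    + \tfrac{1}{2}\,\mathrm{tr}\!\big(\sigma\sigma^{\top}(x,a)\,\nabla^{2} v_n(x)\big)
    + r(x,a) \Big] \Big).
\]
% Maximizing a linear functional plus entropy over densities yields a
% log-sum-exp (Gibbs variational principle), so in this setting the
% exploratory Hamilton-Jacobi-Bellman equation the abstract refers to
% takes the form
\[
  \rho\, v(x) \;=\; \lambda \ln \int_A
  \exp\!\Big( \tfrac{1}{\lambda}\Big[
    b(x,a)\cdot \nabla v(x)
    + \tfrac{1}{2}\,\mathrm{tr}\!\big(\sigma\sigma^{\top}(x,a)\,\nabla^{2} v(x)\big)
    + r(x,a) \Big] \Big)\, da.
\]

The regularity difficulty described in the abstract can be read off this sketch: the Gibbs density pi_{n+1} depends on the first and second derivatives of v_n, so each improvement step can degrade regularity unless the entropy growth is controlled uniformly along the iteration.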