Abstract: The linear-quadratic (LQ) framework is widely studied in stochastic control, game theory, and mean-field analysis because of its simple structure, tractable solutions, and power as a local approximation to nonlinear control problems. In this talk, we discuss several theoretical results on the policy gradient (PG) method, a popular reinforcement learning algorithm, for LQ problems in which agents are assumed to have limited information about the stochastic system. In the single-agent setting, we explain how the PG method is guaranteed to learn the globally optimal policy. In the multi-agent setting, we show that a (modified) PG method can guide agents to the Nash equilibrium solution, provided there is a certain level of noise in the system. The noise can come either from the underlying dynamics or from carefully designed exploration by the agents. Finally, when the number of agents goes to infinity, we propose an exploration scheme with entropy regularization that helps each individual agent explore the unknown system as well as the behavior of other agents. The proposed scheme is shown to speed up and stabilize the learning procedure. The numerical performance of PG methods is demonstrated with two examples: the optimal execution problem in the single-agent setting and the institutional negotiation/bargaining problem in the multi-agent setting. This talk is based on several projects with Xin Guo (UC Berkeley), Ben Hambly (U of Oxford), Huining Yang (U of Oxford), and Thaleia Zariphopoulou (UT Austin).
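
To give a feel for the single-agent claim, here is a minimal sketch of gradient descent on the feedback gain of a scalar LQ regulator. Everything in it is an assumption made for illustration: the parameters `a`, `b`, `q`, `r`, the infinite-horizon noiseless cost, and above all the use of the exact gradient of a known model. The talk concerns agents with limited information, where the gradient would have to be estimated from sampled trajectories; this sketch only shows that descent on the gain reaches the Riccati-optimal gain in this toy case.

```python
# Exact gradient descent on the gain of a scalar LQ regulator (toy illustration).
# Assumed dynamics: x_{t+1} = a*x_t + b*u_t, linear policy u_t = -k*x_t, E[x_0^2] = 1.
a, b, q, r = 0.9, 0.5, 1.0, 1.0  # hypothetical model and cost parameters


def cost(k):
    """Cost sum_t (q*x_t^2 + r*u_t^2) for a stabilizing gain k (|a - b*k| < 1)."""
    c = a - b * k                      # closed-loop multiplier
    return (q + r * k * k) / (1.0 - c * c)


def grad(k):
    """Derivative of cost(k) with respect to k, by the quotient rule."""
    c = a - b * k
    n = q + r * k * k                  # numerator of the cost
    d = 1.0 - c * c                    # denominator of the cost
    return (2.0 * r * k * d - n * 2.0 * c * b) / (d * d)


def policy_gradient(k0=0.0, step=0.01, iters=5000):
    """Plain gradient descent on the gain, started from a stabilizing k0."""
    k = k0
    for _ in range(iters):
        k -= step * grad(k)
    return k


def riccati_gain(iters=2000):
    """Reference optimal gain from fixed-point iteration of the scalar Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)
```

Calling `policy_gradient()` and `riccati_gain()` and comparing the two gains is one way to check the sketch; the step size and starting gain were chosen so that the iterates stay inside the stabilizing region for these particular parameters.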

Abstract: There is a long history of networked dynamical systems that model the spread of opinions over social networks, with the graph Laplacian playing a lead role. One of the difficulties in modeling opinion dynamics is the presence of polarization: not everyone comes to consensus. This talk will describe joint work with Jakob Hansen introducing a new model for opinion dynamics using sheaves of vector spaces over social networks. The graph Laplacian is enriched to a Hodge Laplacian, and the resulting dynamics on discourse sheaves can lead to some very interesting and perhaps more realistic outcomes. Additional work with Hans Riess extending the theory will be hinted at. The talk requires no background in sheaf theory and is suitable for graduate students in the mathematical sciences.
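
For readers unfamiliar with the classical baseline, the following sketch simulates the standard graph-Laplacian dynamics dx/dt = -Lx with a forward Euler step. It is not the sheaf model from the talk: the four-person path graph, the starting opinions, the step size, and the function names are all hypothetical choices made for illustration. On a connected graph these dynamics drive every opinion to the average, which is exactly the forced consensus that the abstract identifies as a modeling difficulty.

```python
# Classical consensus dynamics dx/dt = -L x on a graph, discretized by forward Euler.
# This is the baseline model only; the sheaf-theoretic model is not implemented here.


def laplacian(n, edges):
    """Graph Laplacian (degree matrix minus adjacency matrix) as a list of rows."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def diffuse(x, L, step=0.1, iters=500):
    """Repeat x <- x - step * L x; step must be small enough for stability."""
    x = list(x)
    n = len(x)
    for _ in range(iters):
        Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] - step * Lx[i] for i in range(n)]
    return x


# Hypothetical example: four people in a line, each holding a scalar opinion.
edges = [(0, 1), (1, 2), (2, 3)]
opinions = [4.0, 0.0, 2.0, -2.0]
final = diffuse(opinions, laplacian(4, edges))
```

In this example every entry of `final` ends up close to the average of the starting opinions, so no polarized state survives.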

Title: TBA

Abstract: TBA

Abstract: Consider the steady solution $Ae_1$ to the incompressible Euler equation in the periodic tunnel $\Omega=[0,1]\times \mathbb T^2$. Consider now the family of solutions $U_\nu$ to the associated Navier-Stokes equation with no-slip condition on the flat boundaries, for small viscosities $\nu=1/ Re$ and initial values close in $L^2$ to $Ae_1$. Under a conditional assumption on the energy dissipation close to the boundary, Kato showed in 1984 that $U_\nu$ converges to $Ae_1$ when the viscosity converges to 0 and the initial value converges to $A e_1$. It is still unknown whether this inviscid limit holds unconditionally. In fact, the convex integration method predicts the possibility of a layer separation. It produces solutions to the Euler equation with initial value $Ae_1$, but with layer separation energy at time $T$ up to: $$\|U(T)-Ae_1\|^2_{L^2}\equiv A^3T.$$

In this work, we prove that in the double limit for the inviscid asymptotic $\bar{U}$, where the Reynolds number $Re$ converges to infinity and the initial value of $U_{\nu}$ converges to $Ae_1$ in $L^2$, the energy of layer separation cannot be more than:

$$\| \bar{U}(T)-Ae_1\|^2_{L^2}\lesssim A^3T.$$

In particular, this shows that, even if the limit is not unique, the shear flow pattern is observable up to time $1/A$. This provides a notion of stability despite the possible non-uniqueness of the limit predicted by the convex integration theory. The result relies on a new boundary vorticity estimate for the Navier-Stokes equation. This new estimate, inspired by previous work on higher regularity estimates for Navier-Stokes, provides a non-linear control scalable through the inviscid limit.
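
One way to read the time scale $1/A$ is the following rough consistency check. It is a sketch under two assumptions not stated in the abstract: $\mathbb T^2$ is normalized to unit area, so that $|\Omega|=1$, and $C$ denotes the unspecified constant implicit in $\lesssim$. The shear flow itself has energy

$$\|Ae_1\|^2_{L^2(\Omega)} = A^2\,|\Omega| = A^2,$$

so dividing the bound on the layer separation energy by this quantity gives

$$\frac{\|\bar{U}(T)-Ae_1\|^2_{L^2}}{\|Ae_1\|^2_{L^2}} \le C\,A\,T.$$

The relative energy of the layer separation therefore stays small as long as $T \ll 1/(CA)$, and it can become comparable to the energy of the shear flow only at times of order $1/A$.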