Policy optimization (PO) is a key ingredient of modern reinforcement learning (RL), and can be used for efficient design of optimal controllers. In control design, certain constraints are generally enforced on the policies to be implemented, such as stability, robustness, and/or safety of the closed-loop system. Hence, PO is by its nature a constrained optimization problem in most cases; it is also nonconvex, and analysis of its global convergence is generally very challenging. Compounding the challenge, some of the safety-critical constraints, such as closed-loop stability or the H-infinity (H∞) norm constraint that guarantees system robustness, can be difficult to enforce on the controller while it is being learned as the PO methods proceed. We have recently overcome this difficulty for a special class of such problems, which I will discuss in this presentation, while also placing this in a broader context.
Specifically, I will introduce the problem of PO for H2 optimal control with a guarantee of robustness according to the H∞ criterion, for both continuous- and discrete-time linear systems. I will argue, with justification, that despite the nonconvexity of the problem, PO methods can enjoy the global convergence property. More importantly, I will show that the iterates of two specific PO methods (namely, natural policy gradient and Gauss-Newton) automatically preserve the H∞ norm constraint (i.e., the robustness) during iterations, thus enjoying what we refer to as the “implicit regularization” property. Furthermore, under certain conditions, convergence to the globally optimal policies features globally sub-linear and locally super-linear rates. Due to the inherent connection of this optimal robust control model to risk-sensitive optimal control and linear quadratic (LQ) dynamic games, these results also apply, as a byproduct and with some adjustments, to these settings. The latter, in particular, entails PO with two agents, and the order in which the updates are carried out becomes a challenging issue, which I will also discuss. The talk will conclude with some informative simulations, and a brief discussion of extensions to the model-free framework and associated sample complexity analyses.
(Based on joint work with Kaiqing Zhang and Bin Hu, UIUC)
Tamer Başar received the B.S.E.E. degree from Robert College, Istanbul, in 1969, and the M.S., M.Phil., and Ph.D. degrees in engineering and applied science from Yale University, in 1970, 1971 and 1972, respectively. After stints at Harvard University, Marmara Research Institute (Gebze, Turkey), and Bogaziçi University (Istanbul), he joined the University of Illinois at Urbana-Champaign (UIUC) in 1981, where he is currently Swanlund Endowed Chair Emeritus and Center for Advanced Study (CAS) Professor Emeritus of Electrical and Computer Engineering, with additional affiliations with the Coordinated Science Laboratory, Information Trust Institute, and Mechanical Science and Engineering. At Illinois, during the period 2014-2020, he was the Director of the Center for Advanced Study; during 2018, he was Interim Dean of the College of Engineering; and during 2008-2010, he was Interim Director of the Beckman Institute for Advanced Science and Technology. He spent sabbatical years at Twente University of Technology (the Netherlands; 1978-79), and INRIA (France; 1987-88, 1994-95).
Dr. Başar has authored or co-authored around 1,000 publications in the general areas of optimal, robust, and adaptive control; large-scale and decentralized systems and control; dynamic games; stochastic control; estimation theory; stochastic processes; information theory; communication systems and networks; social networks; security and trust; and mathematical economics.
He is a member of the National Academy of Engineering (of the USA), and holds memberships in several scientific organizations, among which are SIAM, SEDC (Society for Economic Dynamics and Control), ISDG (International Society of Dynamic Games), GTS (Game Theory Society), European Academy of Sciences, and IEEE (Institute of Electrical and Electronics Engineers). He was elected a Fellow of IEEE in 1983, and has served its Control Systems Society in various capacities, among which are: Past President (2001), President (2000), President-Elect (1999), Vice-President for Financial Affairs (1998), Vice-President for Publications (1997), the Editor for Technical Notes and Correspondence for its Transactions on Automatic Control (1992-1994), and as the general chairman (1992) and program chairman (1989) of its flagship conference (Conference on Decision and Control). He has also been active in IFAC (International Federation of Automatic Control) in various capacities, more recently as Chair of Publications Managing Board (2017-2020, 2020-2023), Chair of Publications Committee (2014-2017), member of the IFAC Council (2011-2014), and Editor-in-Chief of its flagship journal Automatica (2004-2014). During the period 1990-1994, he was the President of the International Society of Dynamic Games (ISDG), and is currently the Series Editor of the Annals of ISDG (published by Birkhäuser), the Series Editor of Systems & Control: Foundations and Applications (published by Birkhäuser), the Series Editor of Static and Dynamic Game Theory: Foundations and Applications, an Editor of SpringerBriefs in Electronic and Computer Engineering: Control, Automation and Robotics, and Honorary Editor of Applied and Computational Mathematics. He is also on the editorial and advisory boards of a number of other international journals.
He was the President of the American Automatic Control Council (2010-2011), its Past President in 2012-2013, and Chair of its Awards Committee (2017-2019). Among some of the honors and awards he has received are (in reverse chronological order): Wilbur Cross Medal from Yale University, New Haven, Connecticut (2021); Honorary Doctorate from KTH Royal Institute of Technology, Stockholm, Sweden (2019); Honorary Professorship from Shandong University, Jinan, China (2019); IFAC Advisor (2017–for indefinite term); IEEE Control Systems (Technical Field) Award (2014); Honorary Chair Professorship from Tsinghua University, Beijing, China (2014); Honorary Doctorate (Doctor Honoris Causa) from Bogaziçi University, Istanbul (2012); SIAM Fellow (2012); Honorary Doctorate from the National Academy of Sciences of Azerbaijan (2011); Isaacs Award of ISDG (2010); Honorary Professorship from Northeastern University, Shenyang, China (2008); Swanlund Endowed Chair at UIUC (2007); Honorary Doctorate (Doctor Honoris Causa) from Doguş University, Istanbul (2007); Richard E. Bellman Control Heritage Award of the American Automatic Control Council (2006); Giorgio Quazza Medal of IFAC (2005); Outstanding Service Award of IFAC (2005); IFAC Fellow (2005); Center for Advanced Study Professorship at UIUC (2005); Hendrik W. Bode Lecture Prize of the IEEE Control Systems Society (2004); Tau Beta Pi Daniel C. Drucker Eminent Faculty Award of the College of Engineering of UIUC (2004); election to the National Academy of Engineering (of the USA) (2000); IEEE Millennium Medal (2000); Fredric G. and Elizabeth H. Nearing Distinguished Professorship at UIUC (1998); Axelby Outstanding Paper Award (1995) and Distinguished Member Award (1993) of the IEEE Control Systems Society; and Medal of Science of Turkey (1993).