Data-driven Dynamic Multi-objective Optimal Control: An Aspiration-satisfying Reinforcement Learning Approach
This paper presents an iterative data-driven algorithm for solving dynamic multi-objective (MO) optimal control problems arising in the control of nonlinear continuous-time systems. It is first shown that the Hamiltonian functional corresponding to each objective can be leveraged to compare the performance of admissible policies. Hamiltonian inequalities are then introduced whose satisfaction guarantees that the objectives' aspirations are met. An aspiration-satisfying dynamic optimization framework is then presented that optimizes the main objective while satisfying the aspirations of the other objectives. The relation to the satisficing (good enough) decision-making framework is shown. A Sum-of-Squares (SOS) based iterative algorithm is developed to solve the formulated aspiration-satisfying MO optimization problem. To obviate the requirement of complete knowledge of the system dynamics, a data-driven satisficing reinforcement learning approach is proposed that solves the SOS optimization problem in real time using only system trajectory data measured over a time interval. Finally, two simulation examples are provided to show the effectiveness of the proposed algorithm.
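
To make the role of the Hamiltonian inequalities concrete, the following is a minimal sketch of how such a formulation is commonly written. The notation is assumed for illustration and is not taken from the paper: an input-affine system with drift $f$ and input map $g$, running costs $r_i$, candidate value functions $V_i$, and aspiration levels $a_j$ are all hypothetical symbols.

```latex
% Assumed setting (illustrative notation, not the paper's):
% input-affine dynamics and N cost functionals
\dot{x} = f(x) + g(x)\,u, \qquad
J_i(x_0, u) = \int_0^{\infty} r_i\bigl(x(t), u(t)\bigr)\,dt, \quad i = 1, \dots, N .

% Aspiration-satisfying problem: optimize the main objective J_1
% while keeping every other objective within its aspiration level a_j
\min_{u} \; J_1(x_0, u)
\quad \text{subject to} \quad
J_j(x_0, u) \le a_j, \quad j = 2, \dots, N .

% Hamiltonian of objective i for a candidate value function V_i
H_i(V_i, u)(x) = r_i(x, u) + \nabla V_i(x)^{\top}\bigl(f(x) + g(x)\,u\bigr) .

% If V_j \ge 0 and H_j(V_j, u)(x) \le 0 along the closed-loop trajectory,
% integrating over [0, T] gives
\int_0^{T} r_j\,dt \;\le\; V_j(x_0) - V_j\bigl(x(T)\bigr) \;\le\; V_j(x_0),
% so letting T \to \infty yields J_j(x_0, u) \le V_j(x_0).
```

Under these assumptions, requiring $V_j(x_0) \le a_j$ together with $H_j(V_j, u) \le 0$ is sufficient for objective $j$ to meet its aspiration. When $f$, $g$, $r_i$, and $V_i$ are polynomial, inequalities of this kind can be relaxed to SOS constraints, which is one plausible reading of how the SOS-based iteration arises.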