LQR for Continuous Systems

Posted Jun 13, 2024 Updated Mar 10, 2025

By Guang-Ze Yang

Using the Hamilton-Jacobi-Bellman (HJB) equation, we derive the linear quadratic regulator (LQR) for continuous-time systems.

Continuous systems

Consider a continuous-time linear system described by

\[\dot{x}_{(t)}=f(x_{(t)},u_{(t)},t)=A_{(t)}x_{(t)}+B_{(t)}u_{(t)}\]

where $x_{(t)}\in\mathbb{R}^n$ and $u_{(t)}\in\mathbb{R}^p$ are the system state and input, and $A_{(t)}\in\mathbb{R}^{n\times n}$ and $B_{(t)}\in\mathbb{R}^{n\times p}$ are the state matrix and input matrix of the system.
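As a minimal sketch of what this model describes, the following forward-Euler simulation integrates $\dot x = Ax + Bu$ for a double integrator with a constant input. The helper names `mat_vec` and `simulate`, the example matrices, and the step size are illustrative assumptions and not part of the derivation.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def simulate(A, B, x0, u_of_t, t_f, dt):
    """Forward-Euler integration of x' = A x + B u(t) from t = 0 to t_f."""
    x = list(x0)
    t = 0.0
    for _ in range(int(round(t_f / dt))):
        Ax = mat_vec(A, x)
        Bu = mat_vec(B, u_of_t(t))
        x = [x_i + dt * (ax_i + bu_i) for x_i, ax_i, bu_i in zip(x, Ax, Bu)]
        t += dt
    return x


# Hypothetical example: double integrator (position, velocity) driven by acceleration.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
x_final = simulate(A, B, x0=[0.0, 0.0], u_of_t=lambda t: [1.0], t_f=1.0, dt=1e-3)
```

With a unit acceleration for one second, the velocity should end near 1 and the position near 0.5, up to the Euler discretization error.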

Cost function

As in the discrete-time case, we define the quadratic cost function as

\[J=h(x_{(t_{f})},t_{f})+\int_{0}^{t_{f}}g(x_{(\tau)},u_{(\tau)},\tau)\mathrm{d}\tau\]

where $t_f$ is the terminal time and

\[\begin{aligned}&h(x_{(t_{f})},t_{f})=\frac{1}{2}x_{(t_{f})}^{T}Sx_{(t_{f})},\\&g(x_{(t)},u_{(t)},t)=\frac{1}{2}(x_{(t)}^{T}Q_{(t)}x_{(t)}+u_{(t)}^{T}R_{(t)}u_{(t)}).\end{aligned}\]

$S,Q_{(t)}\in\mathbb{R}^{n\times n}$ are symmetric positive semidefinite matrices that weight the terminal cost and the running state cost. $R_{(t)}\in\mathbb{R}^{p\times p}$ is a symmetric positive definite matrix that weights the input cost.
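To make the cost concrete, here is a small sketch that evaluates $J$ numerically for a scalar system $\dot x = ax + bu$ under a given feedback policy. The scalar restriction, the function name `quadratic_cost`, the Euler and rectangle-rule discretization, and the example values are all assumptions chosen for illustration.

```python
def quadratic_cost(a, b, q, r, s, x0, policy, t_f, dt=1e-3):
    """Approximate J = 0.5*s*x(t_f)^2 + integral of 0.5*(q*x^2 + r*u^2) dt.

    The state follows x' = a*x + b*u, integrated with forward Euler,
    and the integral is approximated by a rectangle rule.
    """
    x, J = x0, 0.0
    for k in range(int(round(t_f / dt))):
        u = policy(x, k * dt)
        J += 0.5 * (q * x * x + r * u * u) * dt  # running cost g
        x += dt * (a * x + b * u)                # state update
    return J + 0.5 * s * x * x                   # terminal cost h


# Hypothetical example: integrator x' = u with policy u = -x, q = r = 1, s = 0.
J_example = quadratic_cost(a=0.0, b=1.0, q=1.0, r=1.0, s=0.0,
                           x0=1.0, policy=lambda x, t: -x, t_f=5.0)
```

For this example the closed-loop state is $x_{(t)}=e^{-t}$, so the exact cost is $\frac{1}{2}(1-e^{-10})\approx 0.5$, and the numerical value should land close to that.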

The cost from any time $0\le t\le t_f$ to the terminal time is defined as

\[J_{t\to t_{f}} (x_{(t)},t)=h(x_{(t_{f})},t_{f})+\int_{t}^{t_{f}}g(x_{(\tau)},u_{(\tau)},\tau)\mathrm{d}\tau\]

Likewise, the cost for $t+\Delta t\to t_f$ is

\[\begin{aligned} J_{t+\Delta t\to t_f}(x_{(t+\Delta t)},t+\Delta t,u_{(\tau)}) &= h(x_{(t_{f})},t_{f})+\int_{t+\Delta t}^{t_{f}}g(x_{(\tau)},u_{(\tau)},\tau)\mathrm{d}\tau \\&=\frac{1}{2} x_{(t_{f})}^{T}Sx_{(t_{f})}+\frac{1}{2}\int_{t+\Delta t}^{t_{f}}(x_{(\tau)}^{T}Q_{(\tau)}x_{(\tau)}+u_{(\tau)}^{T}R_{(\tau)}u_{(\tau)})\mathrm{d}\tau \end{aligned}\]

Fact: the optimal cost $J_{t\to t_f}^\ast\left(x_{(t)}\right)$ is quadratic in the state,

\[J_{t\to t_f}^\ast(x_{(t)})=\frac{1}{2}x_{(t)}^TP_{(t)}x_{(t)}\]

where $P_{(t)}=P_{(t)}^T\ge 0$.

Then the optimal cost for $t+\Delta t\to t_f$ can also be written, to first order in $\Delta t$, as

\[\begin{aligned} J^\ast_{t+\Delta t\to t_f}(x_{(t+\Delta t)})&=\frac{1}{2}x_{(t+\Delta t)}^TP_{(t+\Delta t)}x_{(t+\Delta t)} \\&\approx\frac{1}{2}\left(x_{(t)}+\Delta t(Ax_{(t)}+Bu_{(t)})\right)^T(P_{(t)}+\Delta t \dot P_{(t)} )\left(x_{(t)}+\Delta t(Ax_{(t)}+Bu_{(t)})\right) \end{aligned}\]

The running cost over the short interval $t\to t+\Delta t$ is

\[\begin{aligned} J_{t\to t+\Delta t}&=\int_{t}^{t+\Delta t}g(x_{(\tau)},u_{(\tau)},\tau)\mathrm{d}\tau \\&=\frac{1}{2}\int_{t}^{t+\Delta t}(x_{(\tau)}^{T}Q_{(\tau)}x_{(\tau)}+u_{(\tau)}^{T}R_{(\tau)}u_{(\tau)})\mathrm{d}\tau \\&\approx\frac{1}{2}\Delta t(x_{(t)}^{T}Q_{(t)}x_{(t)}+u_{(t)}^{T}R_{(t)}u_{(t)}) \end{aligned}\]

By the principle of optimality, the optimal cost for $t\to t_f$ is the minimum over $u_{(t)}$ of the sum of these two parts. Keeping terms up to first order in $\Delta t$,

\[\begin{aligned} J_{t\to t_f}^\ast(x_{(t)})&=\min_{u_{(t)}}\left[J_{t\to t+\Delta t}+J^\ast_{t+\Delta t\to t_f}(x_{(t+\Delta t)})\right]\\&\approx\min_{u_{(t)}}\frac{1}{2}\left[ x_{(t)}^TP_{(t)}x_{(t)}+\Delta t\left(x_{(t)}^TQ_{(t)}x_{(t)}+u_{(t)}^TR_{(t)}u_{(t)} \right.\right. \\&\qquad\left.\left. +(Ax_{(t)}+Bu_{(t)})^TP_{(t)}x_{(t)}+x_{(t)}^TP_{(t)}(Ax_{(t)}+Bu_{(t)})+x_{(t)}^T\dot P_{(t)}x_{(t)}\right) \right] \end{aligned}\]

To minimize the cost, we set the derivative of the bracketed expression with respect to $u_{(t)}$ to zero, $\frac{\partial J_{t\to t_f}(x_{(t)})}{\partial u_{(t)}}=0$. Using the symmetry of $P_{(t)}$ and $R_{(t)}$, this gives

\[\Delta t \left(u_{(t)}^TR_{(t)}+x_{(t)}^TP_{(t)}B\right)=0\]

Since $R_{(t)}$ is positive definite and therefore invertible, the optimal input is

\[u^\ast_{(t)}=-R_{(t)}^{-1}B^TP_{(t)}x_{(t)}\]

This is a linear state feedback law:

\[u^\ast_{(t)}=K_{(t)}x_{(t)}, \; K_{(t)}=-R_{(t)}^{-1}B^TP_{(t)}\]
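As a quick sanity check of the stationarity condition, the sketch below evaluates the part of the cost that depends on the input, $\frac{1}{2}u^TRu+x^TPBu$, for a single-input system and confirms that $u^\ast=-R^{-1}B^TPx$ gives a lower value than nearby inputs. The function names and the numbers for `P`, `B`, `r`, and `x` are hypothetical values chosen for illustration.

```python
def bt_p_x(B, P, x):
    """Compute the scalar B^T P x for a single-input system (B is a list)."""
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(B[i] * Px[i] for i in range(n))


def optimal_input(r, B, P, x):
    """u* = -(1/r) * B^T P x, the single-input form of -R^{-1} B^T P x."""
    return -bt_p_x(B, P, x) / r


def u_dependent_cost(u, r, B, P, x):
    """Terms of the cost that depend on u: 0.5*r*u^2 + (x^T P B)*u."""
    return 0.5 * r * u * u + bt_p_x(B, P, x) * u


# Hypothetical data: symmetric P, single input, scalar weight r > 0.
P = [[2.0, 1.0], [1.0, 2.0]]
B = [0.0, 1.0]
r = 0.5
x = [1.0, -1.0]

u_star = optimal_input(r, B, P, x)
f_star = u_dependent_cost(u_star, r, B, P, x)
f_left = u_dependent_cost(u_star - 0.1, r, B, P, x)
f_right = u_dependent_cost(u_star + 0.1, r, B, P, x)
```

Because $R$ is positive definite, the expression is strictly convex in $u$, so the stationary point is the unique minimizer.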

HJB equation

Substituting the optimal input into the cost for $t\to t_f$ gives, to first order in $\Delta t$,

\[\begin{aligned} J^\ast_{t\to t_f}(x_{(t)})&\approx\frac{1}{2}\Delta t\left(x_{(t)}^TQ_{(t)}x_{(t)}+u_{(t)}^{\ast T}R_{(t)}u^\ast_{(t)}\right) +\frac{1}{2}\left(x_{(t)}+\Delta t(Ax_{(t)}+Bu^\ast_{(t)})\right)^T(P_{(t)}+\Delta t \dot P_{(t)} )\left(x_{(t)}+\Delta t(Ax_{(t)}+Bu^\ast_{(t)})\right) \end{aligned}\]

Since $J_{t\to t_f}^\ast(x_{(t)})=\frac{1}{2}x_{(t)}^TP_{(t)}x_{(t)}$, we have the following equation:

\[\frac{1}{2}x_{(t)}^TP_{(t)}x_{(t)}=\frac{1}{2}\Delta t\left(x_{(t)}^TQ_{(t)}x_{(t)}+u_{(t)}^{\ast T}R_{(t)}u^\ast_{(t)}\right) +\frac{1}{2}\left(x_{(t)}+\Delta t(Ax_{(t)}+Bu^\ast_{(t)})\right)^T(P_{(t)}+\Delta t \dot P_{(t)} )\left(x_{(t)}+\Delta t(Ax_{(t)}+Bu^\ast_{(t)})\right)\]

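Expanding the right-hand side, cancelling $\frac{1}{2}x_{(t)}^TP_{(t)}x_{(t)}$ on both sides, dividing by $\Delta t$, letting $\Delta t\to 0$, and substituting $u^\ast_{(t)}=-R_{(t)}^{-1}B^TP_{(t)}x_{(t)}$, we obtain a quadratic form in $x_{(t)}$ that must vanish for every $x_{(t)}$. The equation therefore simplifies to

\[-\dot P_{(t)}=A^TP_{(t)}+P_{(t)}A-P_{(t)}BR_{(t)}^{-1}B^TP_{(t)}+Q_{(t)}\]

which is the Riccati differential equation for the LQR problem. It is solved backward in time from the terminal condition $P_{(t_f)}=S$, which follows from $J^\ast_{t_f\to t_f}(x_{(t_f)})=h(x_{(t_f)},t_f)$.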
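The sketch below integrates the scalar form of this Riccati equation, $-\dot p = 2ap-\frac{b^2}{r}p^2+q$, backward in time with a simple Euler step. It starts from $p_{(t_f)}=s$, which follows from matching $J^\ast$ at the terminal time with the terminal cost. The scalar restriction, the function name `riccati_backward`, and the example values are assumptions for illustration.

```python
import math


def riccati_backward(a, b, q, r, s, t_f, dt=1e-3):
    """Integrate -dp/dt = 2*a*p - (b**2/r)*p**2 + q backward from p(t_f) = s.

    Returns the samples [p(0), p(dt), ..., p(t_f)].
    """
    p = s
    samples = [p]
    for _ in range(int(round(t_f / dt))):
        minus_p_dot = 2.0 * a * p - (b * b / r) * p * p + q
        p = p + dt * minus_p_dot  # step from time t to time t - dt
        samples.append(p)
    samples.reverse()             # reorder so that index 0 is t = 0
    return samples


# Hypothetical example: a = 0, b = q = r = 1, s = 0, where p(t) = tanh(t_f - t).
p_traj = riccati_backward(a=0.0, b=1.0, q=1.0, r=1.0, s=0.0, t_f=5.0)
p_at_zero = p_traj[0]
exact_at_zero = math.tanh(5.0)
```

For this example the exact solution is $p_{(t)}=\tanh(t_f-t)$, which flattens out near 1 far from the terminal time; the time-varying gain is then $K_{(t)}=-\frac{b}{r}p_{(t)}$.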

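In addition, for a linear time-invariant system with constant $Q$ and $R$, under the usual stabilizability and detectability assumptions, $P_{(t)}$ approaches a constant matrix $P$ as the horizon $t_f\to\infty$, so $\dot P_{(t)}=0$.

Then we have the algebraic Riccati equation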

\[0=A^TP+PA-PBR^{-1}B^TP+Q\]
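As a worked example (chosen here for illustration, not taken from the derivation above), consider the double integrator with $A=\begin{bmatrix}0&1\\0&0\end{bmatrix}$, $B=\begin{bmatrix}0\\1\end{bmatrix}$, $Q=I$, and $R=1$. Writing $P=\begin{bmatrix}p_1&p_2\\p_2&p_3\end{bmatrix}$, the algebraic Riccati equation becomes

\[A^TP+PA-PBR^{-1}B^TP+Q=\begin{bmatrix}1-p_2^2 & p_1-p_2p_3\\ p_1-p_2p_3 & 2p_2-p_3^2+1\end{bmatrix}=0\]

The first entry gives $p_2=\pm 1$. Only $p_2=1$ leaves a real solution of the last entry, $p_3^2=3$, and positive semidefiniteness selects $p_3=\sqrt{3}$. The off-diagonal entry then gives $p_1=p_2p_3=\sqrt{3}$, so

\[P=\begin{bmatrix}\sqrt{3}&1\\1&\sqrt{3}\end{bmatrix},\qquad K=-R^{-1}B^TP=-\begin{bmatrix}1&\sqrt{3}\end{bmatrix}\]

The closed-loop matrix $A+BK=\begin{bmatrix}0&1\\-1&-\sqrt{3}\end{bmatrix}$ has the characteristic polynomial $s^2+\sqrt{3}s+1$, whose roots both have negative real parts.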

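Using the terminal condition $P_{(t_f)}=S$ for the differential equation, or the algebraic equation in the infinite-horizon case, we solve for $P_{(t)}$. The optimal input then follows from the feedback law $u^\ast_{(t)}=-R_{(t)}^{-1}B^TP_{(t)}x_{(t)}$.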
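The following sketch runs this procedure end to end for a scalar, time-invariant system: it takes the positive root of the scalar algebraic Riccati equation $0=2ap-\frac{b^2}{r}p^2+q$, forms the gain, and simulates the closed loop with forward Euler. The function name `scalar_lqr` and the example numbers are assumptions for illustration.

```python
import math


def scalar_lqr(a, b, q, r):
    """Positive root of 0 = 2*a*p - (b**2/r)*p**2 + q and the gain k = -b*p/r."""
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = -b * p / r
    return p, k


# Hypothetical plant x' = x + u, which is unstable in open loop because a > 0.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p, k = scalar_lqr(a, b, q, r)

residual = 2.0 * a * p - (b * b / r) * p * p + q  # should be about zero
closed_loop_pole = a + b * k                      # equals -sqrt(a^2 + b^2*q/r)

# Simulate the closed loop u = k*x for 5 seconds with forward Euler.
x, dt = 1.0, 1e-3
for _ in range(5000):
    u = k * x
    x += dt * (a * x + b * u)
x_after_5s = x
```

Here the closed-loop pole is $-\sqrt{a^2+b^2q/r}$, which is negative for any $q>0$ and $b\neq 0$, so the state decays even though the open-loop system is unstable.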



This post is licensed under CC BY 4.0 by the author.
