Linear Quadratic Regulator for Continuous-Time Systems

The continuous-time Linear Quadratic Regulator (LQR) extends optimal control theory to systems described by differential equations. This document presents the complete derivation using the Hamilton-Jacobi-Bellman (HJB) equation approach.

Problem Formulation

System Dynamics

Consider a continuous-time linear time-varying system:

\dot{x}(t) = f(x(t), u(t), t) = A(t)x(t) + B(t)u(t)

Where:

  • x(t) \in \mathbb{R}^n: state vector at time t
  • u(t) \in \mathbb{R}^p: control input at time t
  • A(t) \in \mathbb{R}^{n \times n}: system (dynamics) matrix
  • B(t) \in \mathbb{R}^{n \times p}: input matrix
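
As a concrete (time-invariant) instance of this form, here is a minimal NumPy sketch; the double-integrator matrices are illustrative, not taken from the text above:

import numpy as np

# Illustrative LTI instance: a unit-mass double integrator with
# position/velocity states and a force input.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])   # n = 2 states
B = np.array([[0.0],
              [1.0]])        # p = 1 input

x = np.array([1.0, -0.5])    # example state
u = np.array([0.2])          # example input
x_dot = A @ x + B @ u        # \dot{x} = Ax + Bu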

Quadratic Performance Function

The objective is to minimize the quadratic cost function:

J = h(x(t_f), t_f) + \int_0^{t_f} g(x(\tau), u(\tau), \tau) \, d\tau

Where:

h(x(t_f), t_f) = \frac{1}{2} x(t_f)^T S x(t_f)

g(x(t), u(t), t) = \frac{1}{2} [x(t)^T Q(t) x(t) + u(t)^T R(t) u(t)]

And t_f is the terminal (final) time.

Weight Matrices

Terminal weight matrix:

  • S \in \mathbb{R}^{n \times n}: symmetric positive semi-definite

State weight matrix:

  • Q(t) \in \mathbb{R}^{n \times n}: symmetric positive semi-definite

Input weight matrix:

  • R(t) \in \mathbb{R}^{p \times p}: symmetric positive definite

Matrix Requirements

  • S, Q(t): positive semi-definite (\geq 0)
  • R(t): positive definite (> 0)

These conditions ensure the optimization problem is well-posed and has a unique solution.
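
These definiteness conditions are easy to test numerically via eigenvalues; a small sketch (the helper names is_psd and is_pd are ours, not from the text):

import numpy as np

def is_psd(M, tol=1e-9):
    """Symmetric positive semi-definite: M = M^T and all eigenvalues >= 0."""
    return np.allclose(M, M.T) and np.all(np.linalg.eigvalsh(M) >= -tol)

def is_pd(M, tol=1e-9):
    """Symmetric positive definite: M = M^T and all eigenvalues > 0."""
    return np.allclose(M, M.T) and np.all(np.linalg.eigvalsh(M) > tol)

Q = np.eye(2)          # positive definite, hence also semi-definite
S = np.zeros((2, 2))   # semi-definite but not definite
R = np.array([[1.0]])  # must be strictly positive definite

assert is_psd(Q) and is_psd(S) and is_pd(R)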

Cost-to-Go Analysis

General Cost-to-Go Function

The cost-to-go from any time t to the final time t_f is:

J_{t \to t_f}(x(t), t) = h(x(t_f), t_f) + \int_t^{t_f} g(x(\tau), u(\tau), \tau) \, d\tau

Incremental Cost Analysis

For a small time increment \Delta t, the cost-to-go can be decomposed as:

J_{t \to t_f}(x(t)) = J_{t \to t+\Delta t} + J_{t+\Delta t \to t_f}(x(t+\Delta t))

Cost for the interval [t, t+\Delta t]:

J_{t \to t+\Delta t} = \frac{1}{2}\int_t^{t+\Delta t} [x(\tau)^T Q(\tau) x(\tau) + u(\tau)^T R(\tau) u(\tau)] \, d\tau \approx \frac{1}{2}\Delta t \, [x(t)^T Q(t) x(t) + u(t)^T R(t) u(t)]

Cost for the interval [t+\Delta t, t_f]:

J_{t+\Delta t \to t_f}(x(t+\Delta t)) = \frac{1}{2} x(t+\Delta t)^T P(t+\Delta t) x(t+\Delta t)

Optimal Cost-to-Go Structure

Key Insight

The optimal cost-to-go for the LQR problem has a quadratic form:

J_{t \to t_f}^*(x(t)) = \frac{1}{2} x(t)^T P(t) x(t)

where P(t) = P(t)^T > 0 is the cost-to-go matrix.

State Evolution

The state at time t+\Delta t can be approximated as:

x(t+\Delta t) \approx x(t) + \Delta t \, \dot{x}(t) = x(t) + \Delta t \, [A(t)x(t) + B(t)u(t)]

Cost-to-Go Matrix Evolution

The cost-to-go matrix also evolves over time:

P(t+\Delta t) \approx P(t) + \Delta t \, \dot{P}(t)

Optimization Using HJB Equation

Complete Cost Expression

Substituting the state evolution and matrix evolution:

\begin{aligned}
J_{t \to t_f}^*(x(t)) &= \frac{1}{2}\Delta t \, [x(t)^T Q(t) x(t) + u(t)^T R(t) u(t)] \\
&\quad + \frac{1}{2}[x(t) + \Delta t(A(t)x(t) + B(t)u(t))]^T \\
&\qquad \times [P(t) + \Delta t \dot{P}(t)] \\
&\qquad \times [x(t) + \Delta t(A(t)x(t) + B(t)u(t))]
\end{aligned}

Expanding the Quadratic Form

After algebraic manipulation (ignoring second-order terms in \Delta t):

\begin{aligned}
J_{t \to t_f}^*(x(t)) &= \frac{1}{2} x(t)^T P(t) x(t) \\
&\quad + \frac{1}{2}\Delta t \, \big[ x(t)^T Q(t) x(t) + u(t)^T R(t) u(t) + x(t)^T \dot{P}(t) x(t) \\
&\qquad + 2 x(t)^T P(t) A(t) x(t) + 2 x(t)^T P(t) B(t) u(t) \big]
\end{aligned}

Finding the Optimal Control

To minimize the cost, we take the derivative with respect to u(t) and set it to zero:

\frac{\partial J_{t \to t_f}^*(x(t))}{\partial u(t)} = 0

This gives:

\Delta t \, [u(t)^T R(t) + x(t)^T P(t) B(t)] = 0

Optimal control law:

u^*(t) = -R(t)^{-1} B(t)^T P(t) x(t)

Feedback gain:

K(t) = -R(t)^{-1} B(t)^T P(t)

So the control law is:

u^*(t) = K(t) x(t)
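
Numerically, the gain is best obtained with a linear solve rather than an explicit matrix inverse; a minimal sketch (the P matrix here is an illustrative stand-in for the Riccati solution at some time t):

import numpy as np

def lqr_gain(B, R, P):
    """K = -R^{-1} B^T P, computed via a linear solve (avoids forming R^{-1})."""
    return -np.linalg.solve(R, B.T @ P)

B = np.array([[0.0], [1.0]])
R = np.array([[1.0]])
P = np.array([[2.0, 1.0],
              [1.0, 1.0]])        # stand-in for P(t)

K = lqr_gain(B, R, P)
x = np.array([0.5, -0.2])
u = K @ x                         # u*(t) = K(t) x(t)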

Riccati Differential Equation

Derivation

Since the optimal cost must satisfy J_{t \to t_f}^*(x(t)) = \frac{1}{2} x(t)^T P(t) x(t) for all time, substituting the optimal control back into the cost expression and equating coefficients yields:

-\dot{P}(t) = A(t)^T P(t) + P(t) A(t) - P(t) B(t) R(t)^{-1} B(t)^T P(t) + Q(t)

This is the continuous-time Riccati differential equation.
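
To make the coefficient-matching step explicit (time arguments suppressed for brevity): substituting u^* = -R^{-1} B^T P x into the bracketed \Delta t term of the expanded cost, and using u^{*T} R u^* = x^T P B R^{-1} B^T P x, 2x^T P B u^* = -2 x^T P B R^{-1} B^T P x, and 2x^T P A x = x^T (A^T P + P A) x (by symmetry of P), the bracket must vanish for J^* to keep its quadratic form:

\begin{aligned}
0 &= x^T Q x + x^T P B R^{-1} B^T P x + x^T \dot{P} x + x^T (A^T P + P A) x - 2 x^T P B R^{-1} B^T P x \\
&= x^T \big[ \dot{P} + A^T P + P A - P B R^{-1} B^T P + Q \big] x
\end{aligned}

Since this must hold for every state x, the bracketed matrix is zero, which is exactly the Riccati differential equation above.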

Boundary Condition

The Riccati equation is solved backward in time with the terminal condition:

P(t_f) = S

Alternative Forms

The Riccati equation can also be written as:

\dot{P}(t) + A(t)^T P(t) + P(t) A(t) - P(t) B(t) R(t)^{-1} B(t)^T P(t) + Q(t) = 0

Or in terms of the closed-loop matrix A_{closed}(t) = A(t) + B(t)K(t):

\dot{P}(t) + A_{closed}(t)^T P(t) + P(t) A_{closed}(t) + K(t)^T R(t) K(t) + Q(t) = 0
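
The two forms are algebraically identical once K(t) = -R^{-1} B^T P(t) is substituted; here is a quick numerical sanity check with random matrices (a sketch, not part of the original derivation):

import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, p))
Q = np.eye(n)
R = np.eye(p)
M = rng.standard_normal((n, n))
P = M @ M.T + np.eye(n)                     # random symmetric positive definite P

K = -np.linalg.solve(R, B.T @ P)
A_closed = A + B @ K

standard = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
closed   = A_closed.T @ P + P @ A_closed + K.T @ R @ K + Q

assert np.allclose(standard, closed)        # both Riccati forms agree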

Time-Invariant Case

Steady-State Solution

For Linear Time-Invariant (LTI) systems, where A(t) = A and B(t) = B are constant, consider the infinite-horizon case (t_f \to \infty):

The Riccati matrix converges to a constant:

\lim_{t_f \to \infty} P(t) = P_\infty

Setting \dot{P}(t) = 0, we obtain the continuous-time algebraic Riccati equation (CARE):

0 = A^T P_\infty + P_\infty A - P_\infty B R^{-1} B^T P_\infty + Q

Optimal Feedback Gain

The steady-state optimal feedback gain is:

K_\infty = -R^{-1} B^T P_\infty
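
SciPy ships a CARE solver, so the steady-state solution and gain can be checked directly; a quick sketch using the same illustrative system as the Implementation section below:

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P_inf = solve_continuous_are(A, B, Q, R)
K_inf = -np.linalg.solve(R, B.T @ P_inf)

# The CARE residual should vanish (up to numerical error)
residual = A.T @ P_inf + P_inf @ A - P_inf @ B @ np.linalg.solve(R, B.T @ P_inf) + Q
print(np.max(np.abs(residual)))             # ~1e-15
print(np.linalg.eigvals(A + B @ K_inf))     # closed-loop poles, all in the left half-plane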

Algorithm Summary

Finite-Horizon Solution

Backward Integration:

  1. Initialize: P(t_f) = S
  2. Integrate backward: solve the Riccati differential equation from t_f to 0
  3. Compute gains: K(t) = -R(t)^{-1} B(t)^T P(t) for all t

Forward Implementation:

  1. Given: initial state x(0)
  2. For t \in [0, t_f]: apply the control u(t) = K(t) x(t)

Infinite-Horizon Solution

Offline Computation:

  1. Solve CARE: find P_\infty satisfying the algebraic Riccati equation
  2. Compute gain: K_\infty = -R^{-1} B^T P_\infty

Online Implementation:

  1. Apply the constant gain: u(t) = K_\infty x(t) for all t \geq 0

Implementation

MATLAB Implementation

function [K, P] = lqr_continuous(A, B, Q, R, S, tf)
% Continuous-time finite-horizon LQR
% Inputs:  A, B (system matrices), Q, R, S (weight matrices), tf (final time)
% Outputs: K (feedback gains), P (cost-to-go matrices), ordered in forward time

    % Integrate backward in time from tf to 0
    tspan = [tf, 0];

    % Terminal condition P(tf) = S, vectorized for the ODE solver
    P0 = reshape(S, [], 1);

    % Solve the Riccati differential equation backward
    [t, P_vec] = ode45(@(t, P) riccati_ode(t, P, A, B, Q, R), tspan, P0);

    % Reshape the solution into matrices and compute the gains
    n = size(A, 1);
    P = zeros(n, n, length(t));
    K = zeros(size(B, 2), n, length(t));

    for i = 1:length(t)
        P(:, :, i) = reshape(P_vec(i, :), n, n);
        K(:, :, i) = -R \ (B' * P(:, :, i));    % K(t) = -R^{-1} B' P(t)
    end

    % ode45 returned samples from tf down to 0; reverse to forward time order
    t = flipud(t);
    P = flip(P, 3);
    K = flip(K, 3);
end

function dPdt = riccati_ode(t, P_vec, A, B, Q, R)
% Riccati differential equation: dP/dt = -(A'P + PA - P B R^{-1} B' P + Q)
    n = size(A, 1);
    P = reshape(P_vec, n, n);

    dP = -(A' * P + P * A - P * B * (R \ (B' * P)) + Q);
    dPdt = reshape(dP, [], 1);
end

Python Implementation

import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

def lqr_continuous(A, B, Q, R, S=None, tf=None):
    """Continuous-time LQR.

    Infinite horizon (tf is None): returns (K, P) from the CARE.
    Finite horizon: returns (K_traj, P_matrices, t_eval) in forward time order.
    """
    n, p = A.shape[0], B.shape[1]

    if tf is None:
        # Infinite-horizon case: solve the CARE
        P = solve_continuous_are(A, B, Q, R)
        K = -np.linalg.solve(R, B.T @ P)
        return K, P
    else:
        # Finite-horizon case: integrate the Riccati ODE backward from tf
        if S is None:
            S = np.zeros_like(Q)

        def riccati_ode(t, P_vec):
            P = P_vec.reshape(n, n)
            dP = -(A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q)
            return dP.flatten()

        # Solve backward from tf to 0 with terminal condition P(tf) = S
        sol = solve_ivp(riccati_ode, [tf, 0], S.flatten(), dense_output=True)

        # Evaluate the dense solution on a forward-time grid
        t_eval = np.linspace(0, tf, 100)
        P_traj = sol.sol(t_eval).T

        # Compute the feedback gains K(t) = -R^{-1} B^T P(t)
        K_traj = np.zeros((len(t_eval), p, n))
        P_matrices = np.zeros((len(t_eval), n, n))

        for i, P_vec in enumerate(P_traj):
            P_mat = P_vec.reshape(n, n)
            P_matrices[i] = P_mat
            K_traj[i] = -np.linalg.solve(R, B.T @ P_mat)

        return K_traj, P_matrices, t_eval

# Example usage for infinite-horizon LQR
A = np.array([[0, 1], [-2, -3]])
B = np.array([[0], [1]])
Q = np.eye(2)
R = np.array([[1]])

K, P = lqr_continuous(A, B, Q, R)
print(f"Optimal gain: {K}")
print(f"Cost matrix: {P}")

Properties and Analysis

1. Stability

For the infinite-horizon case, if (A, B) is controllable and (A, Q^{1/2}) is observable (both are simple rank tests; see the sketch below), then:

  • The CARE has a unique positive definite solution P_\infty
  • The closed-loop system \dot{x} = (A + BK_\infty)x is asymptotically stable
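
A hedged sketch of the rank tests (the matrices are the same illustrative example used elsewhere in this document):

import numpy as np
from scipy.linalg import sqrtm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
n = A.shape[0]

# Controllability matrix [B, AB, ..., A^{n-1}B]
ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

# Observability matrix of (A, Q^{1/2})
Q_half = np.real(sqrtm(Q))
obsv = np.vstack([Q_half @ np.linalg.matrix_power(A, i) for i in range(n)])

print(np.linalg.matrix_rank(ctrb) == n)   # (A, B) controllable?
print(np.linalg.matrix_rank(obsv) == n)   # (A, Q^{1/2}) observable?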

2. Optimality

The LQR controller minimizes the quadratic cost function subject to the linear system dynamics.

3. Robustness

Continuous-time LQR controllers have excellent robustness properties:

  • Gain margin: [1/2, \infty), i.e., the loop tolerates any gain increase and up to a factor-of-two (about 6 dB) gain reduction
  • Phase margin: \geq 60°
  • Return difference: \sigma_{\min}(I + L(j\omega)) \geq 1 for all \omega, where L(s) = -K_\infty (sI - A)^{-1} B is the loop transfer function (Kalman's inequality, for R = \rho I; numerically verified in the sketch below)
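
The return-difference bound can be checked on a frequency grid; a sketch with R = I (the guarantee assumes R is a scaled identity, and the minus sign appears because this document's convention is u = Kx with K = -R^{-1}B^T P):

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

P = solve_continuous_are(A, B, Q, R)
K = -np.linalg.solve(R, B.T @ P)
I = np.eye(B.shape[1])

# Loop transfer L(s) = -K (sI - A)^{-1} B, evaluated on the imaginary axis
sigmas = []
for w in np.logspace(-2, 3, 400):
    L = -K @ np.linalg.solve(1j * w * np.eye(2) - A, B)
    sigmas.append(np.linalg.svd(I + L, compute_uv=False).min())

print(min(sigmas))   # Kalman's inequality predicts >= 1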

4. Frequency Domain Interpretation

The LQR solution can be interpreted as shaping the open-loop transfer function to achieve desired closed-loop characteristics.

Design Guidelines

Weight Matrix Selection

State weights (Q):

  • Diagonal elements penalize deviations in corresponding states
  • Larger values → tighter regulation
  • Can be interpreted as the inverse of the allowable state variance (see the Bryson's rule sketch below)

Input weights (R):

  • Diagonal elements penalize control effort
  • Larger values → more conservative control
  • Trade-off between performance and control effort

Terminal weights (S):

  • Often chosen as the steady-state solution: S = P_\infty
  • Can be zero for regulation problems
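
A common starting point consistent with the "inverse of allowable variance" interpretation above is Bryson's rule: set each diagonal weight to the inverse square of the largest acceptable excursion. A sketch with made-up limits:

import numpy as np

# Bryson's rule: Q_ii = 1 / x_{i,max}^2, R_jj = 1 / u_{j,max}^2
x_max = np.array([0.1, 1.0])    # illustrative max state excursions
u_max = np.array([5.0])         # illustrative max control effort

Q = np.diag(1.0 / x_max**2)
R = np.diag(1.0 / u_max**2)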

Tuning Process

  1. Start simple: Use identity matrices scaled appropriately
  2. Analyze response: Check settling time, overshoot, control effort
  3. Iterative design: Adjust weights based on performance requirements
  4. Physical constraints: Consider actuator limitations and system physics

Extensions and Applications

1. LQG (Linear Quadratic Gaussian)

Combines LQR with Kalman filter for systems with process noise and measurement noise.

2. Tracking Problems

Modify the cost function to track reference trajectories:

g(x, u, t) = \frac{1}{2}[(x - x_{ref})^T Q (x - x_{ref}) + (u - u_{ref})^T R (u - u_{ref})]
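
For a constant reference, one standard realization (not spelled out in the original text) picks an equilibrium pair satisfying A x_{ref} + B u_{ref} = 0 and applies the shifted law u = u_{ref} + K(x - x_{ref}); a minimal sketch:

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
P = solve_continuous_are(A, B, np.eye(2), np.eye(1))
K = -np.linalg.solve(np.eye(1), B.T @ P)

x_ref = np.array([1.0, 0.0])   # illustrative setpoint
# Solve A x_ref + B u_ref = 0 for the feedforward input (least squares)
u_ref = np.linalg.lstsq(B, -A @ x_ref, rcond=None)[0]

def tracking_control(x):
    """Shifted LQR law: regulates the deviation from the reference to zero."""
    return u_ref + K @ (x - x_ref)

print(tracking_control(np.array([0.0, 0.0])))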

3. Constrained LQR

Handle input and state constraints using:

  • Model Predictive Control (MPC)
  • Barrier functions
  • Penalty methods

4. Robust LQR

Design controllers that maintain performance under:

  • Model uncertainties
  • Parameter variations
  • Disturbances

Comparison: Discrete vs. Continuous

| Aspect | Discrete-Time | Continuous-Time |
| --- | --- | --- |
| System | x_{k+1} = A x_k + B u_k | \dot{x} = Ax + Bu |
| Cost | \sum_{k=0}^{N-1} (\cdots) | \int_0^{t_f} (\cdots) \, dt |
| Riccati | Difference equation | Differential equation |
| DARE/CARE | P = A^T P A + Q - A^T P B (R + B^T P B)^{-1} B^T P A | 0 = A^T P + P A - P B R^{-1} B^T P + Q |
| Solution | Backward recursion | Backward integration |
| Implementation | Direct | Requires numerical integration |
