NowhereLog

Linear Quadratic Gaussian Control

June 30, 2023

This post gives a gentle introduction to control theory and linear quadratic Gaussian (LQG) control.

Introduction

The subject of control theory is a dynamical system. Let $\mathbf{x} \in \R^n$ denote the state of the system; then

$$\dot{\mathbf{x}} = f(\mathbf{x})$$

where $f: \R^n \to \R^n$ is a function that describes the system. For example, a moving particle in a field may be described by its position and velocity, a 2-dimensional state space, and the physics of the system determines $f$. In general, determining $f$ is a hard physics problem which will not be the focus of this post. We will consider only linear systems, for which $f$ is a linear function, so we can simply write the system of equations as

$$\dot{\mathbf{x}} = A\mathbf{x}$$

where $A \in \R^{n \times n}$. In reality, $f$ is almost never linear, so we linearize the system around its fixed points, i.e., the points where $f(\mathbf{x}) = 0$, so that near a fixed point we can approximate the system by a linear one with $A = \frac{Df}{D\mathbf{x}}$, the Jacobian of $f$ evaluated at that point.
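
To make the linearization step concrete, here is a minimal sketch that computes $A$ as a finite-difference Jacobian at a fixed point (numpy assumed; the pendulum dynamics, parameter values, and step size are illustrative choices, not from the post):

```python
import numpy as np

def f(x):
    """Nonlinear dynamics of a damped pendulum, x = (theta, theta_dot)."""
    g, l, d = 9.8, 1.0, 0.1  # gravity, length, damping (illustrative values)
    theta, theta_dot = x
    return np.array([theta_dot, -(g / l) * np.sin(theta) - d * theta_dot])

def jacobian(f, x0, eps=1e-6):
    """Central finite-difference approximation of Df/Dx at x0."""
    n = len(x0)
    J = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        J[:, i] = (f(x0 + dx) - f(x0 - dx)) / (2 * eps)
    return J

# The upright position (theta = pi, theta_dot = 0) is a fixed point: f(x) = 0.
# The Jacobian there gives the A matrix of the local linear approximation.
x_fixed = np.array([np.pi, 0.0])
A = jacobian(f, x_fixed)
print(A)  # approximately [[0, 1], [9.8, -0.1]] -- an unstable fixed point
```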

To control the system, we assume that we have $k$ control "knobs" which affect the system linearly,

$$\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}$$

where $B \in \R^{n\times k}$ and $\mathbf{u} \in \R^k$. Here $B$ is completely determined by the physics of the control knobs, and it is fixed once the system is manufactured.

For linear closed-loop feedback control, we assume $\mathbf{u} = -K\mathbf{x}$, where $K \in \R^{k\times n}$ is the control law. Therefore, the system of equations becomes

$$\dot{\mathbf{x}} = (A-BK)\mathbf{x}$$

The solution to this linear system of first-order ODEs is

$$\mathbf{x}(t) = Te^{Dt}T^{-1}\mathbf{x}(0)$$

where $TDT^{-1} = A-BK$ is the spectral decomposition, with $D$ diagonal. The stability of the system is determined by the eigenvalues of $D$: the system is stable when every eigenvalue has negative real part.
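
To make the stability criterion concrete, here is a minimal sketch that checks the eigenvalues of the closed-loop matrix $A - BK$ (numpy assumed; the system and gain values are arbitrary illustrative choices):

```python
import numpy as np

# An illustrative unstable open-loop system and a hand-picked feedback gain.
A = np.array([[0.0, 1.0],
              [2.0, 0.0]])   # open-loop eigenvalues +/- sqrt(2): unstable
B = np.array([[0.0],
              [1.0]])
K = np.array([[12.0, 7.0]])

# The closed-loop system x_dot = (A - BK) x is stable when every
# eigenvalue of A - BK has negative real part.
eigs = np.linalg.eigvals(A - B @ K)
print("closed-loop eigenvalues:", eigs)   # -2 and -5
print("stable:", bool(np.all(eigs.real < 0)))
```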

Linear Quadratic Gaussian

Linear Quadratic Regulator

The Linear Quadratic Regulator (LQR) assumes a quadratic cost of the form

$$(\mathbf{x}-\bar{\mathbf{x}})^T Q (\mathbf{x}-\bar{\mathbf{x}}) + \mathbf{u}^T R \mathbf{u}$$

where $Q \in \R^{n \times n}$ and $R \in \R^{k \times k}$ are weight matrices, and $\bar{\mathbf{x}}$ is the desired state.

Figure 1. The inverted cart-pole problem.

For example, consider an inverted cart-pole problem, where the state is $\mathbf{x} = (x, \dot{x}, \theta, \dot{\theta})$. Consider the desired state $\bar{\mathbf{x}} = (0,0,\pi,0)$, which is also a fixed point to linearize around. Assume there is no friction; then, with cart mass $M$, pendulum mass $m$, pendulum length $l$, and gravitational acceleration $g$,

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -\frac{1}{M} & -\frac{mg}{M} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -\frac{1}{Ml} & -\frac{(m+M)g}{Ml} & 0 \end{bmatrix} \quad\quad B = \begin{bmatrix} 0 \\ \frac{1}{M} \\ 0 \\ \frac{1}{Ml} \end{bmatrix}$$

We can define the cost matrices as

$$Q = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 10 & 1 \\ 0 & 0 & 0 & 100 \end{bmatrix} \quad\quad R = 0.1$$

The Linear Quadratic Regulator finds the optimal control law $K$ such that

$$J = \int_0^\infty \left[ (\mathbf{x}-\bar{\mathbf{x}})^T Q (\mathbf{x}-\bar{\mathbf{x}}) + \mathbf{u}^T R \mathbf{u} \right] dt$$

is minimized as long as the system is controllable.
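
In practice, the optimal $K$ is obtained by solving a continuous-time algebraic Riccati equation. Here is a minimal sketch for the cart-pole example using scipy; the numerical values of $M$, $m$, $l$, and $g$ are illustrative choices, and $Q$ is taken to be diagonal for simplicity:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative cart-pole parameters (cart mass, pendulum mass, length, gravity).
M, m, l, g = 5.0, 1.0, 2.0, 9.8

# Linearized dynamics around the upright fixed point, as given above.
A = np.array([[0, 1,        0,                0],
              [0, -1/M,     -m*g/M,           0],
              [0, 0,        0,                1],
              [0, -1/(M*l), -(m+M)*g/(M*l),   0]])
B = np.array([[0], [1/M], [0], [1/(M*l)]])

# Cost matrices (diagonal Q, scalar R, in the spirit of the example above).
Q = np.diag([1.0, 1.0, 10.0, 100.0])
R = np.array([[0.1]])

# Solve the Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0,
# then form the optimal gain K = R^{-1} B^T P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

print("K =", K)
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```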

Kalman Filter

In reality, we often cannot measure the full state of the system. The Kalman Filter allows us to measure only part of the state and, together with the control input, recover an estimate of the full state. Suppose the measured values are $\mathbf{y} = C \mathbf{x}$, where $C$ is a matrix and $\mathbf{y}$ is a lower-dimensional vector. The Kalman Filter solves for the optimal filter gain $K_f$ to estimate the state $\hat{\mathbf{x}}$ from $\mathbf{y}$, such that the estimate is updated according to the following ODE:

$$\dot{\hat{\mathbf{x}}} = A \hat{\mathbf{x}} + B \mathbf{u} + K_f (\mathbf{y}-\hat{\mathbf{y}})$$

where $\hat{\mathbf{y}} = C \hat{\mathbf{x}}$. Intuitively, the Kalman Filter is a closed-loop system for state estimation. Substituting in the actual dynamics

$$\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}$$

we have

$$\dot{\mathbf{x}} - \dot{\hat{\mathbf{x}}} = A (\mathbf{x}-\hat{\mathbf{x}}) - K_f (\mathbf{y}-\hat{\mathbf{y}}) = A (\mathbf{x}-\hat{\mathbf{x}}) - K_f C (\mathbf{x}-\hat{\mathbf{x}})$$

Writing $\epsilon = \mathbf{x} - \hat{\mathbf{x}}$, we have

$$\dot{\epsilon} = (A - K_f C) \epsilon$$

Now, similarly to LQR, we can solve for an optimal $K_f$.
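
Concretely, the optimal $K_f$ solves a Riccati equation dual to the LQR one, with $A \to A^T$ and $B \to C^T$, and with the covariances of the disturbance and sensor noise (described in the next paragraph) playing the roles of $Q$ and $R$. Below is a minimal sketch, reusing the illustrative cart-pole $A$ from above and assuming we measure only the cart position; the covariance values are made up for illustration:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Same illustrative cart-pole A as in the LQR sketch above.
M, m, l, g = 5.0, 1.0, 2.0, 9.8
A = np.array([[0, 1,        0,                0],
              [0, -1/M,     -m*g/M,           0],
              [0, 0,        0,                1],
              [0, -1/(M*l), -(m+M)*g/(M*l),   0]])

# Measure only the cart position x out of (x, x_dot, theta, theta_dot).
C = np.array([[1.0, 0.0, 0.0, 0.0]])

# Illustrative disturbance and sensor-noise covariances.
Vd = 0.1 * np.eye(4)
Vn = np.array([[1.0]])

# Dual Riccati equation: swap A -> A^T and B -> C^T, use (Vd, Vn) as (Q, R).
P = solve_continuous_are(A.T, C.T, Vd, Vn)
Kf = P @ C.T @ np.linalg.inv(Vn)

print("Kf =", Kf)
print("estimator eigenvalues:", np.linalg.eigvals(A - Kf @ C))
```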

The crucial reason for using the Kalman Filter is that in reality there is a disturbance $W_d$ to our system and noise $W_n$ in our sensor measurements,

$$\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u} + W_d$$
$$\mathbf{y} = C\mathbf{x} + W_n$$

The Linear Quadratic Gaussian (LQG) controller assumes that $W_d$ and $W_n$ are Gaussian in order to find the optimal $K$ and $K_f$; the control is then applied to the Kalman estimate, $\mathbf{u} = -K\hat{\mathbf{x}}$.
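
Putting the two pieces together, here is a minimal LQG sketch (same illustrative cart-pole parameters and covariances as above, Gaussian disturbance and noise sampled with numpy, and crude forward-Euler integration) that feeds the Kalman estimate into the LQR control law:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative cart-pole system, measuring only the cart position.
M, m, l, g = 5.0, 1.0, 2.0, 9.8
A = np.array([[0, 1,        0,                0],
              [0, -1/M,     -m*g/M,           0],
              [0, 0,        0,                1],
              [0, -1/(M*l), -(m+M)*g/(M*l),   0]])
B = np.array([[0], [1/M], [0], [1/(M*l)]])
C = np.array([[1.0, 0.0, 0.0, 0.0]])

# LQR gain K and Kalman gain Kf from their respective Riccati equations.
Q, R = np.diag([1.0, 1.0, 10.0, 100.0]), np.array([[0.1]])
Vd, Vn = 0.1 * np.eye(4), np.array([[1.0]])
K = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))
Kf = solve_continuous_are(A.T, C.T, Vd, Vn) @ C.T @ np.linalg.inv(Vn)

# Simulate the noisy closed loop; the state is the deviation from x_bar.
rng = np.random.default_rng(0)
dt, steps = 0.001, 20000
x = np.array([0.0, 0.0, 0.2, 0.0])   # pole starts 0.2 rad away from upright
x_hat = np.zeros(4)                  # estimator starts with no knowledge
for _ in range(steps):
    u = -K @ x_hat                                # control from the estimate
    y = C @ x + rng.normal(0.0, 0.01, 1)          # noisy measurement
    x = x + dt * (A @ x + B @ u + rng.normal(0.0, 0.01, 4))
    x_hat = x_hat + dt * (A @ x_hat + B @ u + Kf @ (y - C @ x_hat))

print("final state deviation:", x)
print("final estimation error:", x - x_hat)
```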


