NowhereLog

Linear Quadratic Gaussian Control

June 30, 2023

This post gives a gentle introduction to control theory and linear quadratic Gaussian (LQG) control.

Introduction

The subject of control theory is a dynamical system. Let $\mathbf{x} \in \R^n$ denote the state of the system; then

$$\dot{\mathbf{x}} = f(\mathbf{x})$$

where $f: \R^n \to \R^n$ is a function that describes the system. For example, a moving particle in a field may be described by its position and velocity, a 2-dimensional state space, and the physics of the system determines $f$. In general, determining $f$ is a hard physics problem which will not be the focus of this post. We will consider only linear systems, for which $f$ is a linear function, so we can simply write the system of equations as

$$\dot{\mathbf{x}} = A\mathbf{x}$$

where $A \in \R^{n \times n}$. In reality, $f$ is almost never linear, so we linearize the system around its fixed points, i.e., the points where $f(\mathbf{x}) = 0$, so that near a fixed point we can approximate the system by a linear one with $A = \frac{Df}{D\mathbf{x}}$, the Jacobian of $f$ evaluated at that point.
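
To make the linearization step concrete, here is a minimal sketch that computes $A$ as a finite-difference Jacobian at a fixed point (numpy assumed; the pendulum dynamics, parameter values, and step size are illustrative choices, not from the post):

```python
import numpy as np

def f(x):
    """Nonlinear dynamics of a damped pendulum, x = (theta, theta_dot)."""
    g, l, d = 9.8, 1.0, 0.1  # gravity, length, damping (illustrative values)
    theta, theta_dot = x
    return np.array([theta_dot, -(g / l) * np.sin(theta) - d * theta_dot])

def jacobian(f, x0, eps=1e-6):
    """Central finite-difference approximation of Df/Dx at x0."""
    n = len(x0)
    J = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        J[:, i] = (f(x0 + dx) - f(x0 - dx)) / (2 * eps)
    return J

# The upright position (theta = pi, theta_dot = 0) is a fixed point: f(x) = 0.
# The Jacobian there gives the A matrix of the local linear approximation.
x_fixed = np.array([np.pi, 0.0])
A = jacobian(f, x_fixed)
print(A)  # approximately [[0, 1], [9.8, -0.1]] -- an unstable fixed point
```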

To control the system, we assume that we have $k$ control "knobs" which affect the system linearly,

$$\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}$$

where $B \in \R^{n\times k}$ and $\mathbf{u} \in \R^k$. Here $B$ is completely determined by the physics of the control knobs, and it is fixed once the system is manufactured.

For linear closed-loop feedback control, we assume $\mathbf{u} = -K\mathbf{x}$, where $K \in \R^{k\times n}$ is the control law. Therefore, the system of equations becomes

$$\dot{\mathbf{x}} = (A-BK)\mathbf{x}$$

The solution to this linear system of first-order ODEs is

$$\mathbf{x}(t) = Te^{Dt}T^{-1}\mathbf{x}(0)$$

where $TDT^{-1} = A-BK$ is the spectral decomposition, with $D$ diagonal. The stability of the system is determined by the eigenvalues of $D$: the system is stable when every eigenvalue has negative real part.
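
To make the stability criterion concrete, here is a minimal sketch that checks the eigenvalues of the closed-loop matrix $A - BK$ (numpy assumed; the system and gain values are arbitrary illustrative choices):

```python
import numpy as np

# An illustrative unstable open-loop system and a hand-picked feedback gain.
A = np.array([[0.0, 1.0],
              [2.0, 0.0]])   # open-loop eigenvalues +/- sqrt(2): unstable
B = np.array([[0.0],
              [1.0]])
K = np.array([[12.0, 7.0]])

# The closed-loop system x_dot = (A - BK) x is stable when every
# eigenvalue of A - BK has negative real part.
eigs = np.linalg.eigvals(A - B @ K)
print("closed-loop eigenvalues:", eigs)   # -2 and -5
print("stable:", bool(np.all(eigs.real < 0)))
```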

Linear Quadratic Gaussian

Linear Quadratic Regulator

The Linear Quadratic Regulator (LQR) assumes a quadratic cost of the form

$$(\mathbf{x}-\bar{\mathbf{x}})^T Q (\mathbf{x}-\bar{\mathbf{x}}) + \mathbf{u}^T R \mathbf{u}$$

where $Q \in \R^{n \times n}$ and $R \in \R^{k \times k}$ are weight matrices, and $\bar{\mathbf{x}}$ is the desired state.

Figure 1. The inverted cart-pole problem.

For example, consider an inverted cart-pole problem, where the state is $\mathbf{x} = (x, \dot{x}, \theta, \dot{\theta})$. Consider the desired state $\bar{\mathbf{x}} = (0,0,\pi,0)$, which is also a fixed point to linearize around. Assume there is no friction; then, with cart mass $M$, pendulum mass $m$, pendulum length $l$, and gravitational acceleration $g$,

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -\frac{1}{M} & -\frac{mg}{M} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -\frac{1}{Ml} & -\frac{(m+M)g}{Ml} & 0 \end{bmatrix} \quad\quad B = \begin{bmatrix} 0 \\ \frac{1}{M} \\ 0 \\ \frac{1}{Ml} \end{bmatrix}$$

We can define the cost matrices as

$$Q = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 10 & 1 \\ 0 & 0 & 0 & 100 \end{bmatrix} \quad\quad R = 0.1$$

The Linear Quadratic Regulator finds the optimal control law $K$ such that

$$J = \int_0^\infty \left[ (\mathbf{x}-\bar{\mathbf{x}})^T Q (\mathbf{x}-\bar{\mathbf{x}}) + \mathbf{u}^T R \mathbf{u} \right] dt$$

is minimized as long as the system is controllable.
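
In practice, the optimal $K$ is obtained by solving a continuous-time algebraic Riccati equation. Here is a minimal sketch for the cart-pole example using scipy; the numerical values of $M$, $m$, $l$, and $g$ are illustrative choices, and $Q$ is taken to be diagonal for simplicity:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative cart-pole parameters (cart mass, pendulum mass, length, gravity).
M, m, l, g = 5.0, 1.0, 2.0, 9.8

# Linearized dynamics around the upright fixed point, as given above.
A = np.array([[0, 1,        0,                0],
              [0, -1/M,     -m*g/M,           0],
              [0, 0,        0,                1],
              [0, -1/(M*l), -(m+M)*g/(M*l),   0]])
B = np.array([[0], [1/M], [0], [1/(M*l)]])

# Cost matrices (diagonal Q, scalar R, in the spirit of the example above).
Q = np.diag([1.0, 1.0, 10.0, 100.0])
R = np.array([[0.1]])

# Solve the Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0,
# then form the optimal gain K = R^{-1} B^T P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

print("K =", K)
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```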

Kalman Filter

In reality, we often cannot measure the full state of the system. The Kalman Filter allows us to measure only part of the state and, together with the control input, recover an estimate of the full state. Suppose the measured values are $\mathbf{y} = C \mathbf{x}$, where $C$ is a matrix and $\mathbf{y}$ is a lower-dimensional vector. The Kalman Filter solves for the optimal filter gain $K_f$ to estimate the state $\hat{\mathbf{x}}$ from $\mathbf{y}$, such that the estimate is updated according to the following ODE:

$$\dot{\hat{\mathbf{x}}} = A \hat{\mathbf{x}} + B \mathbf{u} + K_f (\mathbf{y}-\hat{\mathbf{y}})$$

where $\hat{\mathbf{y}} = C \hat{\mathbf{x}}$. Intuitively, the Kalman Filter is a closed-loop system for state estimation. Substituting in the actual dynamics

$$\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}$$

we have

$$\dot{\mathbf{x}} - \dot{\hat{\mathbf{x}}} = A (\mathbf{x}-\hat{\mathbf{x}}) - K_f (\mathbf{y}-\hat{\mathbf{y}}) = A (\mathbf{x}-\hat{\mathbf{x}}) - K_f C (\mathbf{x}-\hat{\mathbf{x}})$$

Writing $\epsilon = \mathbf{x} - \hat{\mathbf{x}}$, we have

$$\dot{\epsilon} = (A - K_f C) \epsilon$$

Now, similarly to LQR, we can solve for an optimal $K_f$.
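
Concretely, the optimal $K_f$ solves a Riccati equation dual to the LQR one, with $A \to A^T$ and $B \to C^T$, and with the covariances of the disturbance and sensor noise (described in the next paragraph) playing the roles of $Q$ and $R$. Below is a minimal sketch, reusing the illustrative cart-pole $A$ from above and assuming we measure only the cart position; the covariance values are made up for illustration:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Same illustrative cart-pole A as in the LQR sketch above.
M, m, l, g = 5.0, 1.0, 2.0, 9.8
A = np.array([[0, 1,        0,                0],
              [0, -1/M,     -m*g/M,           0],
              [0, 0,        0,                1],
              [0, -1/(M*l), -(m+M)*g/(M*l),   0]])

# Measure only the cart position x out of (x, x_dot, theta, theta_dot).
C = np.array([[1.0, 0.0, 0.0, 0.0]])

# Illustrative disturbance and sensor-noise covariances.
Vd = 0.1 * np.eye(4)
Vn = np.array([[1.0]])

# Dual Riccati equation: swap A -> A^T and B -> C^T, use (Vd, Vn) as (Q, R).
P = solve_continuous_are(A.T, C.T, Vd, Vn)
Kf = P @ C.T @ np.linalg.inv(Vn)

print("Kf =", Kf)
print("estimator eigenvalues:", np.linalg.eigvals(A - Kf @ C))
```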

The crucial reason for using the Kalman Filter is that in reality there is a disturbance $W_d$ to our system and noise $W_n$ in our sensor measurements,

$$\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u} + W_d$$
$$\mathbf{y} = C\mathbf{x} + W_n$$

The Linear Quadratic Gaussian (LQG) controller assumes that $W_d$ and $W_n$ are Gaussian in order to find the optimal $K$ and $K_f$; the control is then applied to the Kalman estimate, $\mathbf{u} = -K\hat{\mathbf{x}}$.
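
Putting the two pieces together, here is a minimal LQG sketch (same illustrative cart-pole parameters and covariances as above, Gaussian disturbance and noise sampled with numpy, and crude forward-Euler integration) that feeds the Kalman estimate into the LQR control law:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative cart-pole system, measuring only the cart position.
M, m, l, g = 5.0, 1.0, 2.0, 9.8
A = np.array([[0, 1,        0,                0],
              [0, -1/M,     -m*g/M,           0],
              [0, 0,        0,                1],
              [0, -1/(M*l), -(m+M)*g/(M*l),   0]])
B = np.array([[0], [1/M], [0], [1/(M*l)]])
C = np.array([[1.0, 0.0, 0.0, 0.0]])

# LQR gain K and Kalman gain Kf from their respective Riccati equations.
Q, R = np.diag([1.0, 1.0, 10.0, 100.0]), np.array([[0.1]])
Vd, Vn = 0.1 * np.eye(4), np.array([[1.0]])
K = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))
Kf = solve_continuous_are(A.T, C.T, Vd, Vn) @ C.T @ np.linalg.inv(Vn)

# Simulate the noisy closed loop; the state is the deviation from x_bar.
rng = np.random.default_rng(0)
dt, steps = 0.001, 20000
x = np.array([0.0, 0.0, 0.2, 0.0])   # pole starts 0.2 rad away from upright
x_hat = np.zeros(4)                  # estimator starts with no knowledge
for _ in range(steps):
    u = -K @ x_hat                                # control from the estimate
    y = C @ x + rng.normal(0.0, 0.01, 1)          # noisy measurement
    x = x + dt * (A @ x + B @ u + rng.normal(0.0, 0.01, 4))
    x_hat = x_hat + dt * (A @ x_hat + B @ u + Kf @ (y - C @ x_hat))

print("final state deviation:", x)
print("final estimation error:", x - x_hat)
```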


