Inverse Optimal Filtering of Linear Distributed Parameter Systems

A constructive method is developed to design inverse optimal filters to estimate the states of a class of linear distributed parameter systems (DPSs) based on the calculus of variations approach. Inverse optimality guarantees that the cost functional to be minimized is meaningful in the sense that the symmetric and positive definite weighting kernel matrix on the states is chosen after the filter design instead of being specified at the start of the filter design. Inverse optimal design enables the Riccati nonlinear partial differential equation (PDE) to be simplified to a Bernoulli PDE, which can be solved analytically. The filter design is based on a new Green matrix formula, a new unique and bounded solution of a linear PDE, and an analytical solution of a Bernoulli PDE. The inverse optimal filter design is first developed for the case where the measurements are available over the entire spatial domain, and is then extended to the practical case where only a finite number of measurements is available.


Introduction
In many physical processes, the dynamical system whose state one wishes to estimate is described by PDEs; examples include chemical reactors, heat exchangers, transmission lines, vibrating beams, and electrical, optical, or acoustic waves. All of these are DPSs, which may be subject to bounded disturbances, and the problem of estimating the state from noisy observations is an important engineering problem. The optimal filter design techniques for these systems can be roughly classified into two main approaches.
The first approach to this problem, referred to as the modal approach (see, for example, [19], [15], [17], [6]), is to obtain an approximate lumped parameter model of a DPS and then to apply well-known finite-dimensional techniques [1], [8], [10]. The modal approach can only observe a finite number of modes of a DPS and has a significant drawback: computing appropriate gain matrices is difficult.
The second approach applies semigroup theory to represent PDEs as ODEs in Hilbert spaces. From there, the classical optimal filtering results are extended to infinite-dimensional systems [22], [3], [11], [2]. This approach eventually results in operator Riccati equations, which have similarities to the results presented here. The operator Riccati equations are nonlinear and are to be solved backward in time. These operator Riccati equations are equivalent to Riccati nonlinear PDEs in Euclidean space. Solutions to the Riccati nonlinear PDEs are extremely rare and are only available for some very simple systems, such as those in [14], [20]; approximate solutions by eigenfunction expansion are given in [18], [21]. The eigenfunction expansion is usually appropriate for linear PDEs but not for nonlinear PDEs such as the Riccati nonlinear PDEs. Moreover, there is no formal proof of the existence and uniqueness of the symmetric and positive definite solution of these Riccati nonlinear PDEs for a sufficiently large class of linear DPSs covering important practical processes.
Difficulties arising in solving the Riccati nonlinear PDEs and the two-point boundary value problems that result from the classical design of optimal filters for distributed parameter systems motivate the inverse optimal filter design approach of this paper. The difference between the direct and the inverse optimal filtering problems is that the former designs a filter that minimizes a given cost, while the latter seeks a filter that minimizes a "meaningful" cost functional, which is part of the solution of the Bernoulli PDE. The development of the inverse optimal filter design in this paper is related to the development of inverse optimal controllers for systems governed by nonlinear ODEs in [12], [9]. The inverse approach in [12], [9] uses a control Lyapunov function, obtained by solving a stabilization problem, that solves a Hamilton-Jacobi-Bellman equation associated with a meaningful cost.
The rest of the paper is organized as follows. The inverse optimal filtering problem is formulated in Section 2. In Section 3, we derive preliminary results that will be used in the filter design. The calculus of variations approach is utilized to derive the inverse optimal filters in Section 4. Section 5 presents an extension of the filter designed in Section 4 to the practical case where only a finite number of measurements is available. Proofs of the results are given in the appendices.
Notations: For an r × r positive definite matrix A(x, y), the notation A+(x, y) denotes its generalized inverse such that ∫D A(x, y)A+(y, x′)dy = Iδ(x − x′), with I being the r × r identity matrix and δ(x − x′) being the Dirac delta function of (x − x′). For two vectors a and b of the same size, the notation ⟨a, b⟩ denotes their inner product. For a matrix operator Ax, the notation A*x denotes its adjoint.
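As an illustrative sanity check on this notation (an added example, not from the original text): when the kernel is concentrated on the diagonal, the generalized inverse reduces to the ordinary matrix inverse.

```latex
% If A(x,y) = \bar{A}(x)\,\delta(x-y) with \bar{A}(x) invertible for each x,
% then A^{+}(x,y) = \bar{A}^{-1}(x)\,\delta(x-y), since
\int_{D} \bar{A}(x)\,\delta(x-y)\;\bar{A}^{-1}(y)\,\delta(y-x')\,dy
  = \bar{A}(x)\,\bar{A}^{-1}(x)\,\delta(x-x') = I\,\delta(x-x') .
```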

Problem formulation
Let D be an open bounded set in Euclidean n-space E^n with piecewise smooth boundary S, and let t denote time defined on an interval T = [t0, tf] with tf > t0. In this paper, we consider a class of DPSs governed by the linear PDE (1), where χ(x, t) is the r-dimensional state vector; wd(x, t) and w0(x) are r-dimensional bounded disturbance vectors distributed over the interior; and wb(ξ, t) is an r-dimensional bounded disturbance vector distributed over the boundary.
We assume in addition that the md-dimensional measurement vector zd(x, t) distributed over the interior and the mb-dimensional measurement vector zb(ξ, t) distributed over the boundary are available in the form (2), defined for all ξ ∈ S and t ∈ T, where εd(x, t) and εb(ξ, t) are the md-dimensional and mb-dimensional vectors of bounded measurement disturbances, respectively. In this paper, we impose the following assumption.

Assumption 2.1
1. The matrix operators Ax and βξ are given by (3), where the Aij(x, t), Bi(x, t), C(x, t), and F(ξ, t) are r × r matrices, nξ is the outward normal to the boundary S at the point ξ ∈ S, and (nξ, xi) is the angle between the outward normal nξ and the xi-axis.
2. There exist symmetric and positive definite matrices Q+d(x, y, t) and Q+b(ξ, α, t) such that the matrices Q̃d(x, y, t) and Q̃b(ξ, α, t) defined by (5) are bounded, symmetric, and positive definite.
Filter Design Objective 2.1

Subject to the constraints defined by (1), design an estimate of χ(x, t) so as to minimize the cost functional J defined in (10), where the functional L in (11) involves symmetric and positive definite weighting matrices.

Preliminaries
In this section, we derive a matrix Green's formula, a formula for the differentiation of the generalized inverse of a matrix, the unique and bounded solution of a linear PDE, and an analytical solution of a Bernoulli nonlinear PDE. These results will be used in the filter design.

Matrix Green's Formula
Lemma 3.1 Consider the matrix differential operators Ax and βξ defined in (3).
where dx = dx1 dx2 ... dxn and dSξ is the surface element of S at the point ξ. The adjoint operators A*x and β*ξ are given by (13) and (14), in which the notation (·)|x=ξ has been used.
Proof. See Appendix A.

Time derivative of the generalized inverse of a matrix
Lemma 3.2 Let A+(x, y, t) denote the generalized inverse of a matrix A(x, y, t), i.e., the kernel satisfying (15). The time derivative of A+(x, y, t) is then given by (16).
Proof. See Appendix B.
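The computation behind Lemma 3.2 can be sketched as follows (a reconstruction, since the displayed equations did not survive extraction); it is the kernel analogue of the matrix identity d(A^{-1})/dt = -A^{-1}\dot{A}A^{-1}.

```latex
% Differentiate the defining identity \int_D A(x,y,t)\,A^{+}(y,x',t)\,dy = I\,\delta(x-x'):
\int_D \frac{\partial A(x,y,t)}{\partial t}\,A^{+}(y,x',t)\,dy
  + \int_D A(x,y,t)\,\frac{\partial A^{+}(y,x',t)}{\partial t}\,dy = 0 .
% Multiply on the left by A^{+}(x'',x,t), integrate over x \in D, and use
% \int_D A^{+}(x'',x,t)\,A(x,y,t)\,dx = I\,\delta(x''-y) to isolate the derivative:
\frac{\partial A^{+}(x'',x',t)}{\partial t}
  = -\int_D\!\!\int_D A^{+}(x'',x,t)\,
      \frac{\partial A(x,y,t)}{\partial t}\,A^{+}(y,x',t)\,dx\,dy .
```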

Existence and uniqueness of the solution of a PDE
Lemma 3.3 Consider the matrix linear PDE (17), where Āx and β̄ξ are defined in (9), and R(x, y, t) and N0(x, y) are bounded and symmetric matrices. Then there exists a unique bounded solution N(x, y, t), given by (18), where the Green function G(x, t; x′, t0) satisfies (19).
Proof. See Appendix C.
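Since the displayed equations were lost in extraction, the natural variation-of-constants form of the solution (18) is sketched below as an explicitly labeled reconstruction: the Green function propagates the initial kernel N0 and accumulates the forcing R.

```latex
N(x,y,t) = \int_D\!\!\int_D G(x,t;x',t_0)\,N_0(x',y')\,G^{T}(y,t;y',t_0)\,dx'\,dy'
  + \int_{t_0}^{t}\!\!\int_D\!\!\int_D G(x,t;x',\tau)\,R(x',y',\tau)\,
      G^{T}(y,t;y',\tau)\,dx'\,dy'\,d\tau .
```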

Analytical solution of a Bernoulli PDE
Lemma 3.4 Consider the Bernoulli nonlinear PDE (20), where R̃d(x, y, t), R̃b(ξ, γ, t), and P0(x, y) are symmetric and positive definite matrices, and Āx and β̄ξ are defined in (9). Then there exists a unique bounded solution P(x, y, t) of (20), given by (21), where M(x, y, t) is defined in (22) and the Green function satisfies (23).
Proof. See Appendix D.
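To make the Bernoulli structure concrete, here is a scalar toy analogue (an added illustration; the equation p′ = 2ap − rp², the coefficients a, r, and both function names are assumptions, not the paper's matrix PDE). The substitution q = 1/p linearizes the equation, mirroring how passing to the generalized inverse linearizes (20):

```python
import math

def bernoulli_analytic(p0, a, r, t):
    """Closed-form solution of the scalar Bernoulli equation p' = 2*a*p - r*p**2.

    Substituting q = 1/p gives the linear equation q' = -2*a*q + r,
    solved explicitly with an integrating factor.
    """
    q0 = 1.0 / p0
    q_inf = r / (2.0 * a)  # equilibrium of the linearized equation
    q = q_inf + (q0 - q_inf) * math.exp(-2.0 * a * t)
    return 1.0 / q

def bernoulli_euler(p0, a, r, t, steps=200000):
    """Forward-Euler integration of the nonlinear equation, for comparison."""
    p = p0
    dt = t / steps
    for _ in range(steps):
        p += dt * (2.0 * a * p - r * p * p)
    return p
```

Comparing the closed form against brute-force Euler integration confirms that the substitution solves the nonlinear equation exactly.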

Distributed measurements
We reformulate the filter design problem as an optimal control problem to avoid statistical assumptions on the disturbance vectors wd(x, t), w0(x), wb(ξ, t), εd(x, t), and εb(ξ, t), and to avoid a probabilistic treatment. As such, we desire to minimize the cost functional J defined in (10) with respect to χ(x, t), ud(x, t), and ub(ξ, t), where we recast the functional L as in (24), subject to the constraints (25). To remove these constraints, we introduce the Lagrange multiplier λ(w, t) with w ∈ D or w ∈ S and construct the extended functional J1. Thus, the problem of minimizing the cost functional (10) with L given in (24) subject to the constraints (25) is equivalent to that of minimizing the cost functional J1 without any constraints.

Deriving the weak variation δJ1 with respect to χ(x, t), λ(x, t), ud(x, t), and ub(ξ, t) and setting it to zero (see Appendix E) results in the necessary conditions (28), referred to as the set of Euler-Lagrange equations for the cost functional J1 to be minimized.

The whole system (28) is coupled and does not consist of two initial-value problems but constitutes a single two-point boundary value (TPBV) problem on the time interval T. Thus, we will convert the TPBV problem to an initial value problem. We define (29), where χ(x, t) and χ(ξ, t) are the state vectors of the system (28) (not of the original system (1)). Substituting (29) into (28) results in (30), where Q̃d(x, y, t) and Q̃b(ξ, α, t) are defined in (5).

The set of equations (30) suggests that we choose (32), where Ω(x, t) is a vector function to be determined later; Ω(x, t) is introduced because χ(x, t) and λ(x, t) are not independent. Substituting (32) into (30) results in (33).

We now convert (33) and (31) to an initial value problem. The third equation of (31) suggests that we introduce the coordinate transformation (34), where P(x, y, t) is a symmetric and positive semidefinite kernel to be determined so that λ(x, t) satisfies the last four equations of (33).
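For orientation, the TPBV-to-IVP conversion used here mirrors the classical sweep method for finite-dimensional LQ problems (recalled as a generic analogue; A, B, Q, R below are not the paper's operators): postulating a linear relation between the multiplier and the state decouples the two-point system into a matrix Riccati ODE.

```latex
% Finite-dimensional Euler--Lagrange TPBV system:
\dot{x} = Ax - BR^{-1}B^{T}\lambda, \qquad
\dot{\lambda} = -Qx - A^{T}\lambda, \qquad
x(t_0)\ \text{given}, \quad \lambda(t_f) = 0 .
% The sweep substitution \lambda(t) = S(t)\,x(t) turns the coupled system into
\dot{S} = -SA - A^{T}S + SBR^{-1}B^{T}S - Q, \qquad S(t_f) = 0 ,
% which is integrated from a single end condition rather than two coupled ones.
```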

Determination of P(x, y, t0)
The fifth equation of (33) and (34) suggest that we choose the initial condition for P(x, y, t) as in (35). It is seen from (35) that λ(x, t) satisfies the initial condition specified by the fifth equation of (33). At t = tf, substituting λ(x, tf) = 0 (see the last equation of (33)) into (34) shows that the estimate equals χ(x, tf), i.e., the estimate of χ(x, t) is exact at t = tf.

Determination of βξP(ξ, y, t)
We need to determine the condition on βξP(ξ, y, t) so that the second equation of (33) is satisfied. As such, substituting ξ for x in (34) and applying the operator βξ to the resulting equation yields (36). Next, applying the formula (7) to the term ∫S Rb(ξ, α, t)λ(α, t)dSα in the second equation of (33) yields (37). Substituting (37) into the second equation of (33) gives (38). Comparing (38) with (36) suggests that we choose the condition on βξP(ξ, y, t) as in (39).

Determination of P(x, y, t)
We now determine P(x, y, t) so that λ(x, t) satisfies (33) and (31). Differentiating both sides of (34) along the solutions of (33) and (31) gives (40), where we have used (34). Taking the inner product of both sides of (40) with λ(x, t) and integrating over D yields (41).

Applying Lemma 3.1 to the second integral term on the right-hand side of (41) gives (42). Substituting the third equation of (33) and (39) into (42) gives (43). Applying the formula (7) to the second term on the right-hand side of (43) yields (44). On the other hand, substituting (34) into the right-hand side of the first equation of (33), then taking the inner product with λ(x, t) and integrating over D, yields (45).

Since λ(x, t) is a nontrivial solution of the third equation of (33), comparing (45) with the substitution of (44) into (41) suggests that we choose the term Ω(x, t) as in (46), where the matrices Ld(x, y, t) and Lb(ξ, t) are defined in Item 3) of Assumption 2.1.
The observer is obtained by substituting Ω(x, t) defined in (46) into (32), which gives (49), where Āx and β̄ξ are defined in (9). Note that (49) is an initial value problem.

Observer error dynamics and stability analysis

4.2.1 Observer error dynamics
Define the observer error (51), where χ(x, t) and χ(ξ, t) are now the state vectors of the original system (1) (not of the system (28)). Differentiating both sides of the first equation of (51) with respect to t along the solutions of the observer system (49), the original system (1), and the measurement system (2) yields (52).

Lyapunov stability analysis
To analyze stability of the observer error system (52) at the origin, we consider the Lyapunov functional candidate (53), built from the quadratic form ⟨χe(x, t), P+(x, y, t)χe(y, t)⟩, where P(x, y, t) is the symmetric and positive definite solution of the third equation of (52). A calculation shows that differentiating both sides of (53) with respect to time t results in (54), whose weighting kernel is symmetric and positive definite. Since the disturbances are square integrable, i.e., wd(x, t) ∈ L2(D × T), εd(x, t) ∈ L2(D × T), and εb(ξ, t) ∈ L2(S × T), and the matrices P+(x, y, t), Q̃d(x, y, t), Q̃b(ξ, α, t), and R̃d(x, y, t) are symmetric and positive definite, we conclude from (53) and (54) that the observer error vector χe(x, t) exponentially converges in the L2 norm to a ball centered at the origin. The radius of the ball can be made arbitrarily small by choosing Q̃d(x, y, t), Q̃b(ξ, α, t), and R̃d(x, y, t) sufficiently large. Moreover, convergence of χe(x, t) also implies that of χe(ξ, t), as can be seen by applying the formula (7) to (53).
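The convergence claim follows the standard comparison argument, sketched here with generic constants c > 0 and d ≥ 0 standing in for the bounds obtained from (54) and the L2 bounds on the disturbances (these constants are illustrative, not derived above):

```latex
V(t) = \frac{1}{2}\int_D\!\!\int_D \big\langle \chi_e(x,t),\,
         P^{+}(x,y,t)\,\chi_e(y,t) \big\rangle\,dx\,dy, \qquad
\dot{V}(t) \le -c\,V(t) + d
\;\Longrightarrow\;
V(t) \le e^{-c(t-t_0)}\,V(t_0) + \frac{d}{c}\Big(1 - e^{-c(t-t_0)}\Big),
```

so V(t) decays exponentially to the ball of radius d/c, which shrinks as the weighting kernels grow.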

Discrete measurements
We now address the problem of Md measurements taken at discrete locations x1, x2, ..., xMd over D and Mb measurements taken at discrete locations ξ1, ξ2, ..., ξMb over S. As such, the measurement equation (2) is changed to (56). Accordingly, the cost functional L defined in (11) is changed to (58). Following the same procedure as in the previous section with the cost functional L given in (58) yields the observer (59), where P(x, y, t) is generated by a Bernoulli PDE whose quadratic term involves sums of the form P(x, yi, t)Q̃d(yi, yj, t)P(yj, y, t). This observer has a structure very similar to that of the observer (49) and (50), except that the distributed measurements are replaced by the spatially discrete ones.
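Schematically (a reconstruction of the lost displayed equations, shown for the interior measurements only), the passage from distributed to discrete measurements replaces spatial integrals of the measurement-weighting kernels by finite sums over the measurement locations:

```latex
\int_D\!\!\int_D P(x,y,t)\,\widetilde{Q}_d(y,y',t)\,P(y',x',t)\,dy\,dy'
\;\longrightarrow\;
\sum_{i=1}^{M_d}\sum_{j=1}^{M_d} P(x,y_i,t)\,\widetilde{Q}_d(y_i,y_j,t)\,P(y_j,x',t),
```

with an analogous replacement on the boundary using the locations ξ1, ..., ξMb.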
The observer error dynamics are obtained by differentiating both sides of the first equation of (51) with respect to t along the solutions of the observer system (59), the original system (1), and the measurement system (56), resulting in (61) with the initial and boundary conditions χe(x, t0) = w0(x) and βξχe(ξ, t) = wb(ξ, t).

Lyapunov stability analysis
To analyze stability of the observer error system (61) at the origin, we again consider the Lyapunov functional candidate (53). Differentiating both sides of (53) with respect to time t along the solutions of (61) yields (62), whose right-hand side contains the sums of quadratic terms ⟨χe(xi, t), Q̃d(xi, yj, t)χe(yj, t)⟩ and ⟨χe(ξi, t), Q̃b(ξi, αj, t)χe(αj, t)⟩ with negative signs. Convergence of χe(x, t) and χe(ξ, t) to a ball centered at the origin can be analyzed using the same arguments as in Subsection 4.2.2, noting that one should apply the formula (7) to the last integral term on the right-hand side of (62).

Conclusions
The calculus of variations approach was used to develop a constructive method for designing inverse optimal filters for a class of linear distributed parameter systems. The systems can contain disturbances distributed over the interior and the boundary. Moreover, both spatially continuous and discrete measurements corrupted by disturbances were addressed. The key contribution of the paper is the introduction of the inverse optimality concept, which relaxes the difficulties of solving the Riccati nonlinear PDEs. The novel fundamental results on linear and nonlinear PDEs developed in this paper, e.g., the Green matrix formula, the derivation of the unique and bounded solution of a linear PDE, and the analytical solution of a Bernoulli PDE, can be further used to solve other filter design problems for distributed parameter systems and to improve the performance of controlling practical distributed parameter systems such as the marine riser systems in [4], [5], [16].

B Proof of Lemma 3.2
Differentiating both sides of (15) with respect to t gives (70). Multiplying both sides of (70) by A+(x, x′, t) and integrating over D then yields the formula of Lemma 3.2.

C Proof of Lemma 3.3
C.1 Verification of the solution N (x, y, t)

C.1.1 Verification of the initial condition and boundary condition
From (18) and the second equation of (19), the initial condition is verified. Similarly, from (18) and the third equation of (19), the boundary condition is verified.

C.1.2 Verification that N(x, y, t) satisfies the first equation of (17)
Differentiating both sides of (18) with respect to t and using the first two equations of (19) shows that the solution N(x, y, t) satisfies the first equation of (17).

D Proof of Lemma 3.4
To prove Lemma 3.4, we first verify that P(x, y, t) given in (21) satisfies the nonlinear partial differential equation (20). Then, we prove that the solution P(x, y, t) given in (21) is unique and bounded.

D.1 Verification of the solution P (x, y, t)
We proceed by proving that P(x, y, t) satisfies the initial condition, i.e., the second equation in (20); the boundary condition, i.e., the third equation in (20); and the first equation of (20).
First, to prove that P(x, y, t) satisfies the initial condition in (20): from (22) and (23), we have (77); substituting (77) into the third equation of (21) at t = t0 yields the initial condition. Second, to prove that P(x, y, t) satisfies the boundary condition in (20): applying the third equation of (23) to the third equation of (21) yields the boundary condition. Third, to prove that P(x, y, t) satisfies the first equation of (20), we differentiate both sides of the third equation of (21) with respect to t, using the first equation of (23), to obtain (81). Applying Lemma 3.2 to (81) and using the expression of M(x, y, t) defined in (22) then gives the first equation of (20).

D.2 Proof of uniqueness of P(x, y, t)
It is seen from (20) that P(x, y, t) = 0 for all (x, y) ∈ D × D and t ≥ t0 is a trivial solution. Moreover, since P(x, y, t0) = P0(x, y) is symmetric and positive definite, we focus on a non-zero solution of (20). As such, we let P+(x, y, t) be the generalized inverse of P(x, y, t). An application of Lemma 3.2 gives the derivative of P+(x, y, t) with respect to t along the solutions of (20):

E Derivation of (28)
Let us introduce an Euler or weak variation δχ(x, t) = εϱ(x, t), where ϱ(x, t) is an arbitrary twice continuously differentiable function. The necessary conditions for the cost functional J1 to be minimized are found by setting the weak variation δJ1 of the functional J1 equal to zero. We now assume that the end points t0 and tf are not fixed, so that when the parameter ε varies, so do the end points, i.e., t0 = t0(ε) and tf = tf(ε). The weak variation δJ1 is then given by
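For reference, the standard first-variation formula with moving end points, which the above computation instantiates for J1, reads as follows (stated in a generic scalar form):

```latex
% For J(\varepsilon) = \int_{t_0(\varepsilon)}^{t_f(\varepsilon)}
%   L\big(\chi + \varepsilon\varrho,\ \dot{\chi} + \varepsilon\dot{\varrho},\ t\big)\,dt :
\delta J = \int_{t_0}^{t_f}
    \Big( \frac{\partial L}{\partial \chi}
        - \frac{d}{dt}\frac{\partial L}{\partial \dot{\chi}} \Big)\varrho\,dt
  + \Big[ \frac{\partial L}{\partial \dot{\chi}}\,\varrho \Big]_{t_0}^{t_f}
  + L\big|_{t_f}\,\delta t_f - L\big|_{t_0}\,\delta t_0 .
```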