· Inverse Probleme unter advektivem und di usivem Transport von Sto en Masterarbeit -...

Technische Universitat Munchen

Department of Mathematics

Master’s Thesis

Inverse problems governed byconvective-diffusive contaminant transport

by Jakob Ameres

Supervisor: Prof. Dr. Michael Ulbrich

Advisor: Dipl.-Math. Christian Bohm

Submission Date: May 31, 2013

Inverse Probleme unter advektivem unddiffusivem Transport von Stoffen

Masterarbeit- Zusammenfassung -

Inverse Probleme gehoren zu einem schnell wachsenden Bereich der Mathematik, nichtzuletzt wegen ihrer breiten Anwendungsmoglichkeiten in anderen Wissenschaften und derIndustrie. In dieser Arbeit wird ein Anwendungsfall betrachtet, der sich primar auf diePhysik von Advektion und Diffusion stutzt. Hierbei wird die Lokalisierung von Gaswolkenund Quellen als Optimierungsproblem formuliert. Anhand von synthetisierten Messdatensollen so Anfangszustand und rechte Seite (Kontrolle) der Advektions- und Diffusionsgle-ichung rekonstruiert werden.Nach einer kurzen physikalischen Einfuhrung betrachtet man das Problem im Funktio-nenraum. Hierbei ist ein Minimierungsproblem mit besagter partiellen Differentialgle-ichung als Nebenbedingung zu losen. Die gesuchte Losung ist eine Verteilungsfunktion derGaskonzentration im Ort und im Fall einer Quelle auch der Zeit. Im reduzierten Problem,wird die Nebenbedingung implizit durch einen Losungsoperator ersetzt, wodurch man eineunrestringiertes Minimierungsproblem erhalt, welches einfachere Handhabung verspricht.Wichtige Ergebnisse sind die Existenz und Eindeutigkeitsaussagen fur die Losung desOptimierungsproblems, sowie notwendige Optimalitatsbedingungen erster Ordnung. Re-duziert man die gesuchten Informationen auf wenige Parameter, wie Ort und Große einerGaswolke, so lasst sich die Verteilunsfunktion endlichdimensional parametrisieren.Die Ortsdiskretisierung der Advektions- und Diffusionsgleichung wird mit linearen finitenElementen realisiert, wahrend man fur die Zeitdiskretisierung auf das implizite Euler Ver-fahren zuruckgreift. Dadurch erhalt man ein diskretisiertes Minimierungsproblem, welchesim unparametrisierten Fall konvex quadratisch ist. Dieses kann bei geeigneter TikhonovRegularisierung mit Hilfe eines globalisierten Newton Verfahrens gelost werden.Fur numerische Experimente werden mit Hilfe einer Simulation Messdaten synthetisiertund verschiedene Szenarien in zwei und drei Raumdimensionen vorgestellt. Diese reichenvon der Rekonstruktion eines unparametrisierten Anfangszustands bis zur parametrisiertenNachverfolgung des Pfades einer sich bewegenden Gasquelle. Durch die hierbeivorgenommenen Anpassungen und insbesondere durch die neu eingefuhrte Parametrisierungeroffnet sich eine Moglichkeit, die schlecht-Gestelltheit des Problems zu umgehen.Des weiteren beschaftigt sich ein kurzer Ausblick mit der Verbesserung der sparsityeiner Losung, d.h. dem Erhohen der Anzahl der Null-Eintrage in der diskretisiertenVerteilunsfunktion durch zusatzliche `1-Penalisierung.

Acknowledgments

I want to thank my advisor, Christian Bohm, Dipl.-Math., for bringing the topic to myattention. I felt exceptionally well-supported and I am grateful for all the encouragementalong with the enlightening and inspiring discussions.

Contents

1 Introduction 11.1 The Physics of Advection and Diffusion . . . . . . . . . . . . . . . . . . . . 21.2 Diffusion Coefficent κ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.3 Solutions and Kernel Functions . . . . . . . . . . . . . . . . . . . . . . . . 51.4 Formulating the Reconstruction Problem . . . . . . . . . . . . . . . . . . . 61.5 Velocity Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2 The Functional Analytic Setup and its Theory 122.1 The Solution Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.2 The Reduced Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182.3 Adjoint Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202.4 Second Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242.5 The Finite Dimensional Parametrization . . . . . . . . . . . . . . . . . . . 26

3 Discretization 303.1 Finite Element Space Discretization . . . . . . . . . . . . . . . . . . . . . . 303.2 Implicit Euler Time Discretization . . . . . . . . . . . . . . . . . . . . . . . 353.3 Discrete Cost Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363.4 Constant Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393.5 Time Dependent Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393.6 Discretization of the Parameterized Problem . . . . . . . . . . . . . . . . . 42

4 Numerical Tests 454.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464.2 Stationary Incompressible Navier Stokes . . . . . . . . . . . . . . . . . . . 474.3 The Synthesizing of Measurement Data yδ . . . . . . . . . . . . . . . . . . 474.4 Unparameterized Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

4.4.1 Globalized Newton Method . . . . . . . . . . . . . . . . . . . . . . 494.4.2 Initial Clouds u0 with us ≡ 0 . . . . . . . . . . . . . . . . . . . . . . 504.4.3 Constant Source us ≡ const. . . . . . . . . . . . . . . . . . . . . . . 534.4.4 Time Dependent Source us . . . . . . . . . . . . . . . . . . . . . . . 57

4.5 Parametrization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614.5.1 Initial Condition u0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 614.5.2 Multiple Initial Sources in u0 . . . . . . . . . . . . . . . . . . . . . 634.5.3 Linear Moving Source us of Constant Intensity over Time . . . . . . 664.5.4 General Parametrized us . . . . . . . . . . . . . . . . . . . . . . . . 67

4.6 Outlook on Sparsity (`1-penalization) . . . . . . . . . . . . . . . . . . . . 694.6.1 SNF and FXP-QA . . . . . . . . . . . . . . . . . . . . . . . . . . . 704.6.2 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

4.7 Extension Into Three Dimensions . . . . . . . . . . . . . . . . . . . . . . . 78

5 Conclusions and Outlook 84

1 INTRODUCTION 1

1 Introduction

Inverse problems are a fast growing field in Mathematics given, their broad spectrumof applications in the sciences and many industries. The main emphasis of this thesisrevolves around such an application, involving the physics of advection and diffusion.The localizing of the origin of gaseous contaminants is set up as an inverse problem, whichis solved by the apparatus of nonlinear optimization.Let us consider the following experiment from a physicist’s point of view before we delveinto the details of its mathematical treatment.Suppose a closed compartment with several objects inside is filled with air. Numeroussmall fans installed in the compartment’s walls give precise control over the air flowsalong the walls. Also some of the objects are equipped with instruments that measuregas concentrations at their edges. After enough time of constant air flow from the walls,the wind field inside the compartment will reach an equilibrium state. In this equilibriumstate, neither wind speed nor direction will change over time. Only the laws of physicsgovern the characteristics of this equilibrium wind field. Suppose there is a leak fromwhich Methane (CH4) is emitted in a concentrated bulk - a gas cloud. Let nature take itscourse and after we let some time pass, we measure the concentration of the contaminantwith the previously installed measuring devices. After a while the gas will be heavilydiffused and its concentration in the reservoir’s air will be constant almost everywhere.Then it is time to stop measuring.With the acquired data and some knowledge about the general set-up i.e. the contain-ment’s shape and the fans’ operation profile, we want to pinpoint the origin of the leakand how long contaminant has been dispersed in the room.So in a more concrete case one wants to reconstruct a gas source consistent to the mea-surement data, while obeying the laws of advection and diffusion. This involves twoincorporated problems, which a mathematician calls “inverse to each other”. Hence wespeak from an inverse problem.The first section gives a brief overview about the physical context of diffusion and itsmathematical derivation. Additionally the steady state Navier Stokes equations are in-troduced for calculating the air flow in form of a velocity field. Then the inverse problemis set up as a minimization problem. In Section 2 the optimization problem is set up in aninfinite-dimensional setting. We present results relating to the existence and uniqueness ofthe optimization problem’s solution and first order optimality conditions and additionallya finite dimensional parametrization of the infinite dimensional minimization problem.The following Section 3 introduces linear finite elements for space discretization and animplicit Euler scheme for time discretization. Section 4 concludes with numerical testsin form of application studies along with suitable Algorithms. More specifically, we syn-thesize adequate measurement data and construct the velocity field of the contaminants’bulk motion. We present two-dimensional examples in increasing complexity from simpleinitial gas clouds to gas sources following nonlinear paths.We also give an outlook on sparsity enhancing `1-penalization along with two specializedAlgorithms adapted on a selected scenario.In the end, we finish with an experiment in three dimensions. But before we start offwith mathematics, we have a closer look at the physical context.

1 INTRODUCTION 2

1.1 The Physics of Advection and Diffusion

For better understanding we want to derive the advective diffusive transport equation.Diffusion describes the process of the intermixing of different substances into each otherby random thermal motion on a molecular level. This leads to a change of the substances’macroscopic distribution. Let y(x, t) [mol

m3 ] be the density of a certain substance at acertain point (x, t) ∈ Ω × I, Ω ∈ Rn, n = 2, 3, I = [0, T ], T > 0. For every diffusionprocess, whether hetero or mutual, the diffusivity κ [m

2

s] describes how fast the considered

contaminant emerges into its environment. Since diffusion is based on microscopic thermalmovement the diffusion coefficient κ increases with higher temperature. By Fick’s firstlaw from [4, p. 166] the current of a contaminant through a surface S ⊂ Rn−1 of size |S|is given as

Φ = −κ |S| ∇y[mol

s

]respect.

[kg

s

]So the current Φ is proportional to the negative concentration gradient. This law is ratherwritten as flux1 than a current

j =Φi

|S|= −κ∇y,

where Φi denotes the i-the component of Φ. With y being the time dependent contaminantdensity the overall amount of the substance in a volume Ω ∈ Rn at time t is∫

Ω

y(x, t) dx.

With the flux j and the normal vector ~n of the surface S ⊂ Rn−1,∫S

j(ξ, t) · ~n(ξ) dS(ξ)

is the overall current per time through S. Since over time the amount of contaminant ina volume Ω is only changed by the current through the border ∂Ω we know

d

dt

∫Ω

y(x, t) dx = −∫∂Ω

j(ξ, t) · ~n(ξ) dS(ξ) for all t ∈ [0,∞], Ω ⊂ Rn. (1.1)

Note that ~n is pointing outwards the domain Ω and therefore we need the minus signsince contaminant leaving Ω reduces the whole amount. With the Theorem of Gauss andFick’s first law∫

∂Ω

j(ξ, t) · ~n(ξ) dS(ξ) =

∫Ω

div(j) dx =

∫Ω

div(−κ∇y)︸︷︷︸=−κ4y

dx.

Since Ω was free to choose we can loose the integral and write equation (1.1) as thediffusion equation

yt(x, t)− κ4 y(x, t) = 0. (1.2)

1density of the current

1 INTRODUCTION 3

This describes the non-advective flow of the contaminant, which is the flow induced bydiffusion. The advective flow depends on a velocity-field ~v(x) which describes the trans-portation direction of a contaminant. In the following we add the advection to the diffusionequation (1.2). Again by the conservation law, the only thing changing the amount ofcontaminant in Ω is the flow through the margin ∂Ω∫

∂Ω

(y(x, t)~v(x)) · ~n(ξ) dS(ξ) =

∫Ω

div(y(x, t)~v(x)) dx

=

∫Ω

∇y(x, t) · ~v(x) + y(x, t) div(~v(x)) dx

(1.3)

which is rewritten with Gauss’ theorem. Supposing a divergence free velocity field i.e.div(~v(x)) = 0, we can integrate (1.3) in (1.2) and receive the advection-diffusion equation

yt(x, t)− κ4 y(x, t) +∇y(x, t) · ~v(x) = 0. (1.4)

By introducing a right-hand side us(x, t) to equation (1.4) one models a contaminantsource with us(x, t) > 0 and a contaminant sink with us(x, t) < 0. Note that we disregardeventual advection arising from sources and sinks as we suppose they are too small. Withthe initial contaminant distribution u0(x) we complete (1.4) to

yt(x, t)− κ4 y(x, t) +∇y(x, t) · ~v(x) = us(x, t)

y(x, 0) = u0(x) for all x ∈ Ω, t ∈ R+0 .

(1.5)

1.2 Diffusion Coefficent κ

Next we give examples for the calculation of the diffusion coefficient κ, which was untilnow just a proportionality constant in Fick’s first law. Here Cussler [9, pp. 115-159] givesa good overview over simple calculation techniques. In the following we present somenumbers and laws. Table 1 shows some examples for diffusion coefficients in gases at oneatmosphere pressure and some solutes in water at 25. Based on a detailed analysis of the

Gas Pair Temperature [ °K ] [] Diffusion coefficient κ [ cm2

s]

Air-CH4 298.2 25.05 0.196Air-Ethyl 273.0 -0.150 0.102Air-H2 282.0 8.850 0.710Air-H20 289.1 15.95 0.282

298.2 25.05 0.260312.6 39.45 0.277333.2 60.05 0.305

Air-2-propanol 299.1 25.85 0.099Solute in WaterAmmonia 298.15 25 1.64 · 10−5

Ethanol 298.15 25 0.84 · 10−5

Hemoglobin 298.15 25 0.069 · 10−5

Table 1: Various diffusion coefficients for gases and liquids. Data from [9].

1 INTRODUCTION 4

molecular motion in dilute gases the Chapman-Enskog Theory, developed independentlyby Chapman and Enskog in 1970, leads to the equation

κ =1.86 · 10−3 T

32

(1M1

+ 1M2

) 12

ρ σ212 Ω

, σ12 = σ1 + σ2,

where T is the temperature in Kelvin, ρ the pressure in atmospheres and Mi are the twospecies’ molecular weights. The collision diameter given in Angstroms σ12 is calculatedwith σ1, σ2 for the two species with data provided in Table 2. The calculation of thecollision integral Ω is a bit more laborious. First take for the two gases ε1

kB, ε2kB

from Table2. Then Ω is a dimensionless function of values

ε12

kBT=

1

T

√ε1

kB

ε2

kB.

Here, Table 3 provides the mapping Ω : ε12

kBT7→ Ω( ε12

kBT).

Substance σ [A] εkB

[K]

He Helium 2.551 10.2Air Air 3.711 78.6CH3OH Methanol 3.626 481.8CH4 Methane 3.758 148.6CO2 Carbon dioxide 3.941 195.2C2H4 Ethylene 4.163 224.7

Table 2: Lennard-Jones potential parameters found from viscosities. Data from [21].

ε12

kBTΩ ε12

kBTΩ ε12

kBTΩ

0.30 2.662 1.65 1.153 4.0 0.88360.40 2.318 1.75 1.128 4.2 0.87400.50 2.066 1.85 1.105 4.4 0.86520.60 1.877 1.95 1.084 4.6 0.85680.70 1.729 2.1 1.057 4.8 0.84920.80 1.612 2.3 1.026 5.0 0.84220.90 1.517 2.5 0.9996 7 0.78961.00 1.439 2.7 0.9770 9 0.75561.10 1.375 2.9 0.9576 20 0.66401.30 1.273 3.3 0.9256 60 0.55961.50 1.198 3.7 0.8998 100 0.51301.60 1.167 3.9 0.8888 300 0.4360

Table 3: The collision integral Ω. Data from [21].

But all in all this method underlies an error of up to eight percenti.e. for the diffusion coefficient calculated according to the Chapman-Enskog TheoryκChapman−Enskog ∈ [0.92κactual, 1.08κactual].

1 INTRODUCTION 5

1.3 Solutions and Kernel Functions

To broaden our comprehension of the diffusion equation (1.2), also called the heat equa-tion, we take a quick glance on the shape of its solution for the spatial unbounded caseΩ = Rn. We begin with solving explicitly the initial value (or Cauchy) problem

yt(x, t)− κ4 y(x, t) = 0

y(x, 0) = u0(x) for all x ∈ Rn, t ∈ R+0 .

(1.6)

In Problem (1.6) we suppose u0 ∈ C(Rn)∩L∞(Rn) and a non-negative initial state u0 ≥ 0,because gas concentrations are by definition non-negative. Next in analog to [13, p. 46]we define the fundamental solution.

Definition 1.1 The function

Φ(x, t) :=

1√(4πκ)n

e−xT x4κt for x ∈ Rn, t > 0

0 for x ∈ Rn, t < 0

is called the fundamental solution of the diffusion equation.

Note that Φ(x, t) solves the heat equation for x ∈ Rn and t > 0. Therefore the singularityat t = 0 is excluded. Via the convolution of u0 with Φ we set

y(x, t) :=

∫Rn

Φ(x− z, t)u0(z) dz =1√

(4πκ)n

∫Rne−

(x−z)T (x−z)4κt u0(z) dz x ∈ Rn, t > 0.

(1.7)Then by THEOREM 1 [13, pp. 47] y is infinitely often differentiable,i.e. y ∈ C∞(Rn × (0,∞)) and y solves the diffusion equation (1.6) in the sense that

yt(x, t)− κ4 y(x, t) = 0

andlim

(x, t)→ (x0, 0)x ∈ Rn, t > 0

y(x, t) = u0(x0) for each point x0 ∈ Rn.

In the case that u0 is bounded, continuous, u0 ≥ 0, u0 6≡ 0 we know from equation (1.7)that y(x, t) > 0 for all points x ∈ Rn and times t > 0. This shows consistency with thephysical background. Above results reveal that diffusion can be seen as a mollifier. Thismeans that an initial value u0 is mollified by the Gaussian bell Φ. With increasing time t,this bell becomes flatter, leading to more intense mollification and resulting in a smootherdistribution of gas.With this in mind a suitable Ansatz for u0 is

u0(x, t) =

I 1√

(4πκ)ne−

σ4

(xT x) for x ∈ Rn, t > 0

0 for x ∈ Rn, t < 0(1.8)

with the maximum intensity I > 0 and the curve’s sharpness σ > 0. In Section 2.5 thesefindings are used to model real situations more accurately.

1 INTRODUCTION 6

1.4 Formulating the Reconstruction Problem

Before we start formulating the reconstruction problem over an infinite dimensional func-tion space, we have to introduce some notations. First begin with some scalar products.Let vT · w denote the euclidean scalar product between two vectors v, w ∈ Rn. We pro-ceed with the Lebesgue spaces. First we suppose Ω ⊂ Bn where Bn is the σ-algebra of allLebesgue measurable sets on Rn.

Definition 1.2 For a Lebesgue measurable u : Ω→ R we define the following norms.

‖u‖Lp(Ω) :=

(∫Ω

| u(x)|p dx)1/p

for all p ∈ [1,∞)

‖u‖L∞(Ω) := ess supx∈Ω

|u(x)|(

= supx∈Ω|u(x)| for u ∈ C(Ω)

)Here Ω denotes the closure of Ω. Then we define the Lebesgue spaces

Lp(Ω) =p : Ω→ R | p Lebesgue measurable, ‖u‖Lp(Ω) <∞

for 1 ≤ p ≤ ∞. By identifying elements p1, p2 ∈ Lp(Ω) as equal when p1 = p2 almosteverywhere, one defines the space of these equivalence classes as Lp(Ω).For p = 2 the space Lp(Ω) becomes a Hilbert Space with the scalarproduct

〈u1, u2〉L2(Ω) :=

(∫Ω

u1u2 dx

)1/2

for u1, u2 ∈ L2(Ω).

With this in mind we also define the Sobolev spaces Hk(Ω), k ∈ N0 andtheir norm ‖ · ‖Hk(Ω).

Hk(Ω) :=u ∈ L2(Ω) | u has weak derivatives Dαu ∈ L2(Ω) for all |α| ≤ 2

‖u‖Hk(Ω) :=

∑|α|≤k

‖Dαu‖2L2(Ω)

1/2

The corresponding scalarproduct is

〈u1, u2〉Hk(Ω) =

∑|α|≤k

〈Dαu1, Dαu2〉L2(Ω)

1/2

.

We define for a separable Banach space X equipped with norm ‖·‖X and a time dependentLebesgue space for a time interval [0, T ], T > 0, where we have the definition of Bochnerintegrable functions in mind [22, pp. 37-38]. First we set the norm

‖u‖Lp(0,T ;X) :=

(∫ T

0

‖u(t)‖pX dt

)1/p

.

We define then a Banach space valued Lebesgue space as

Lp(0, T ;X) =: u : [0, T ]→ X | u is strongly measurable and ‖u‖Lp(0,T ;X) <∞.

1 INTRODUCTION 7

For a Hilbert space V define

W (0, T ;L2(Ω), V ) := u | u ∈ L2(0, T ;L2(Ω)), ut ∈ L2(0, T, V ∗),

equipped with norm

‖y‖W (0,T ;L2(Ω),V ) =√‖y‖L2(0,T ;V ) + ‖yt‖L2(0,T ;V ∗).

Suppose X is a Banach space with dual space X∗ and x∗ ∈ X∗, x ∈ X, then the dualpairing is defined as

〈x∗, x〉X∗,X = x∗(x).

Suppose Ω ⊂ Rn, n = 2, 3 is open and bounded and has a Lipschitz boundary. Let yδ

represent the measurements on Γm over time I = (T1, T ) ⊂ I, T1 > 0, where Γm ⊂ ∂Ωis a small part of the margin of Ω and I is the time interval I = (0, T ). With this inmind, the space D := L2(T1, T ;L2(Γm)) is set up for the measurements and one supposesyδ ∈ D. For the control space we set U := Us×U0 = L2(0, T ;L2(Ω))×L2(Ω) and supposeu = (us, u0) ∈ U . Let ~v ∈ C1(Ω,Rn) be a velocity field on Ω satisfying div(~v) = 0.A contaminant concentration is always non-negative, so we define the admissible spaceYad = y ∈ Y | y ≥ 0, where we set for now Y = y | y : Rn × [0, T ]→ R. In analog weonly search for contaminant sources, not for sinks and therefore set Uad = u ∈ U : u ≥ 0.We can then formulate a first instance of the problem of finding initial distributions andcontaminant sources matching the given measurements yδ as the following minimizationproblem, constrained by a partial differential equation.

miny,u

J (y, u) :=1

2

∥∥y − yδ∥∥2

D+α

2‖u‖2

U s.t. y ∈ Yad, u ∈ Uad and E(y, u) = 0, (1.9)

where the strong form E(y, u) = 0 is given as

yt − κ∆y + ~v · ∇y = 0 on Ω× [0, T ],

y(0)− u0 = 0 on Ω,

κ∇y · ~n = 0 on ∂Ω× [0, T ].

Or in the more generalized case when including time dependent sources

yt − κ∆y + ~v · ∇y − us = 0 on Ω× [0, T ],

y(0)− u0 = 0 on Ω,

κ∇y · ~n = 0 on ∂Ω× [0, T ].

Mainly one wants to minimize the misfit, defined as the residual norm over the space D∥∥y − yδ∥∥2

D:=

∫ T

T1

∥∥y − yδ∥∥2

L2(∂Ω)dt =

∫ T

T1

∫∂Ω

(y(x, t)− yδ(x, t)

)2dx dt. (1.10)

A minimization problem only consisting of (1.10) and the state equation E(y, u) = 0corresponds to Problem (1.9) for α = 0. So why add the term ‖u‖2

U? Without thisterm the minimization of the misfit, i.e. Problem (1.9) for α = 0, yields in an ill-posedproblem. Here we call a problem ill-posed, when it does not fulfill Hadamard’s definitionof well-posedness [12, p. 31-32]:

1 INTRODUCTION 8

(I) Existence: For all admissible data, a solution exists.

(II) Uniqueness: For all admissible data, the solution is unique.

(III) Stability: The solution depends continuously on the data.

This means a problem is ill-posed in case one of the properties postulated by Hadamarddoes not hold. Of course one can always make a problem well-posed by tailoring theadmissible data or other specifications in a way, that the problem fulfills Hadamard’sdefinition. So why is our Problem 1.9 for α = 0 ill-posed? We can assume that (I) isfulfilled, since measurement data can always and mostly has to be processed and treatedproperly before use. Most certainly (II) is violated after discretization, but also beforewhen only focusing on the convex but not strictly convex residual norm (1.10). Alsoafter discretization the number of measurement points forms only a fractional amountof the number of nodes to reconstruct. Numerical problems are caused by the obviousviolation of (III). A partial cure for this is the introduction of regularization terms suchas α‖u‖2

U . With these one applies additional knowledge about a solution’s propertieswhen solving the problem. So one can recover and retrieve additional information fromthe measurements, which otherwise would have been lost. Later we will see that α‖u‖2

U

for α > 0 ensures uniqueness of a obtained solution, which is the reason why it is soimportant to have proper regularization.

1.5 Velocity Field

Till now we don’t know how exactly the velocity field ~v in the advective part of theadvection diffusion equation (1.5) is obtained. So for the case ~v isn’t provided we haveto calculate it on our own by using information about the setting. We suppose an in-compressible flow in the containment which is modeled by the steady-state Navier-Stokesequation. This means we suppose that the velocity field is in an equilibrium state anddoes not change over time. The incompressible model can be applied for air at roomtemperature and velocities way below supersonic speed. In the incompressible model onesearches for a vector field v : Ω→ Rn and a scalar field q : Ω→ R, which are satisfyingthe steady Navier-Stokes equation (1.11)-(1.13)

− 1

Re4 ~v +∇q + ~v · ∇~v = f on Ω, (1.11)

∇ · ~v = 0 on Ω, (1.12)

~v = g on ∂Ω. (1.13)

The Reynolds number Re describes the turbulent behavior of the liquid, and is set toRe = 100. Since it is just a suitable scaling for the Navier Stokes equation, it can beobtained for different applications by (1.14).

Re =ρUL

µ(1.14)

Here U is a characteristic velocity scale, L is a characteristic length scale, ρ is the density ofthe fluid and µ is its dynamic viscosity. In the above equations (1.11)-(1.13) the following

1 INTRODUCTION 9

definitions are used. Here the multidimensional Laplace operator arises from the standardLaplacian through component-wise application. Here we consider suitable v, w : Ω→ R2.

4v :=

∑n

j=1∂2

∂x2jv1

...∑nj=1

∂2

∂x2jvn

, v · ∇w :=

∑n

j=1 v1∂∂xjw1

...∑nj=1 v2

∂∂xjwn

With the Laplace operator defined, we give the definition of the divergence for a velocityfield ~v ∈ C1(Ω,Rn).

div(~v) := ∇ · ~v =n∑j=1

∂

∂xj~vj

For an introduction and overview of mathematical fluid mechanics refer to Lions’ books[24] which covers the incompressible and compressible models. Here, we are going to useresults from Temam [31]. To proof existence of a solution for the weak form of (1.11)-(1.13) we need additional assumptions on Ω to meet Temam’s requirements. From nowon we suppose that the boundary of Ω, denoted as ∂Ω, is of class C2. For more definitionssee [31, p. 2]. But note that a locally Lipschitz boundary of Ω gives us only class C1. Alsowe suppose boundary values g of the following form

g = curl h, h ∈ H2(Ω), (1.15)

where the curl for a function h : Ω→ R, h ∈ H2(Ω) is with Ω ⊂ R2 defined as

curl h =

(−∂x2h∂x1h

).

For Ω ⊂ R3 and a function h : Ω→ R3, h ∈ H2(Ω)3 define

curl h = ∇× h =

∂x2h3 − ∂x3h2

∂x3h1 − ∂x1h3

∂x1h2 − ∂x2h1

.

With f ∈ H−1(Ω) the conditions h ∈ H2(Ω), ∂xih ∈ Ln(Ω), h ∈ L∞(Ω) from [31, p. 173]reduce to h ∈ H2(Ω) because of the Sobolev Imbedding theorems 2. For this we need theweak form of (1.11)-(1.13). Since the vector field ~v we are searching for, is a vector valuedfunction ~v : Ω → Rn we need for the variational formulation a test function ϕ : Ω → Rn

in the same dimensional setting. For this we define several scalar products.

〈ϕ, ψ〉L2(Ω)n :=n∑i=1

〈ϕi, ψi〉L2(Ω) =n∑i=1

∫Ω

ϕiψi dx (1.16)

With ϕ : Ω→ Rn the gradient ∇ϕ is defined as

∇ϕ :=

∂∂x1ϕ1 . . . ∂

∂xnϕ1

. . . . . .∂∂x1ϕn . . . ∂

∂xnϕn

2see Remark 1.5 (i) [31, p. 178]

1 INTRODUCTION 10

In (1.17) the definition in (1.16) is adapted for ∇ϕ,∇ψ.

〈∇ϕ,∇ψ〉L2(Ω)n×n :=n∑i=1

〈∇ϕi,∇ψi〉L2(Ω)n =n∑i=1

∫Ω

n∑j=1

∂ϕi∂xj

∂ψi∂xj

dx (1.17)

The weak formulation of (1.11) and (1.13) with respect to the test function ϕ : Ω→ Rn,ϕ ∈ H1(Ω) is

1

Re〈∇~v,∇ϕ〉L2(Ω)n×n + 〈∇q, ϕ〉L2(Ω)n + 〈(~v · ∇)~v, ϕ〉L2(Ω)n =

1

Re〈∇~v · ~n, ϕ〉L2(∂Ω). (1.18)

Integration by parts then yields for (1.18) the following

1

Re〈∇~v,∇ϕ〉L2(Ω)n×n − 〈q · ∇, ϕ〉L2(Ω)n + 〈(~v · ∇)~v, ϕ〉L2(Ω)n

=1

Re〈∇~v · ~n, ϕ〉L2(∂Ω)n − 〈qϕ, ~n〉L2(∂Ω)n .

⇔ 1

Re〈∇~v,∇ϕ〉L2(Ω)n×n − 〈q, div(ϕ)〉L2(Ω) + 〈(~v · ∇)~v, ϕ〉L2(Ω)n

=1

Re〈∇~v · ~n, ϕ〉L2(∂Ω)n − 〈qϕ, ~n〉L2(∂Ω)n

With a second test function ρ ∈ H1(Ω) we obtain the weak form of (1.12) as

div(~v) · ρ = 0 for all ρ ∈ H1(Ω).

All in all, the weak form of the steady Navier-Stokes equation is then given as∫Ω

1

Re

n∑i=1

n∑j=1

∂xj~vi∂xjϕi − q div(ϕ) +n∑i=1

ϕi

n∑j=1

~vj∂xi~vj + div(~v)ρ dx = 0

for all ϕi ∈ H1(Ω), i = 1, . . . , n and for all ρ ∈ H1(Ω).

(1.19)

In the following we cite Theorem 1.5. from [31, p. 173] which gives us weak existenceresults for n = 2, 3.

Theorem 1.3 (Existence for non-homogeneous steady Navier-Stokes) Under theabove hypotheses there exists at least one ~v ∈ H1(Ω), and a distribution q on Ω, such that(1.19) holds.

Flow driven by side walls We consider Ω ⊂ R2 to be of circular shape

Ω =

x ∈ R2 |

∥∥∥∥x− (0.50.5

)∥∥∥∥ < 0.5

.

Then for the clockwise spinning flow arising by acceleration of the air at the box’s sidewalls up to speed ϑ > 0, one can set

h(x) = −ϑ(x1 − 2x1x2 + x2).

1 INTRODUCTION 11

For bounded Ω we have obviously h ∈ H2(Ω) and

g(x) = ϑ

(−2x1 + 12x2 − 1

)which fulfills (1.15).When speaking of a clockwise spinning or driven flow, we refer only to the flow verynear the containment’s border. This is the area where we have definite control over theflow. With this knowledge one is then able to calculate the resulting flow in the wholecontainment.

For the later theory we need ~v ∈ H1(Ω)n and div(~v) = 0 in the weak sense. So we supposethat the Assumption 1.4 holds for all later theory.

Assumption 1.4 Suppose that of the following cases for n = 2, 3 holds.

i) The domain Ω is of class C2 and ~v : Ω→ Rn is a weak solution of the Navier Stokesequation (1.11)-(1.13) with ~v ∈ H1(Ω).

ii) The domain Ω is of class C1 and ~v ∈ H1(Ω) describes a divergence free, div(~v) = 0,transportation field on Ω.

2 THE FUNCTIONAL ANALYTIC SETUP AND ITS THEORY 12

2 The Functional Analytic Setup and its Theory

The following section gives a more precise description of the optimization problem (1.9)introduced in the previous Section 1.4. A setup of suitable Banach spaces is chosen toreceive existence and uniqueness of a solution for a modified weak version of the stateequation E(y, u) = 0. Conforming to this setting a reformulation of the minimizationproblem (1.9) is introduced, along with a reduced version. After that, various statementsregarding existence, uniqueness and optimality conditions for a solution of the problemand it’s reduced version are presented. But first start off with solving the state equation.

2.1 The Solution Operator

Set I = (0, T ), T > 0 and ΩT = Ω× I. Additionally suppose thatΓ0 := clx ∈ ∂Ω | ~v(x) · ~n(x) 6= 0, where as always ~n describes the outer normal of ∂Ω.In other words Γ0 ⊂ ∂Ω is the part of the Ω’s margin where the wind ~v points partiallyinwards into the domain or away from the domain. Since this model requires, that nocontaminant enters the domain Ω from the outside or in some way leaves it, one wantsy = 0 on Γ0. In the following set Γ1 := ∂Ω\Γ0, which means ∂Ω = Γ0 ∪ Γ1.

yt − κ∆y + ~v · ∇y = us on Ω× [0, T ], (2.1)

y(0, x) = u0 on Ω, (2.2)

κ∇y · ~n = 0 on Γ1 × [0, T ], (2.3)

y = 0 on Γ0 × [0, T ]. (2.4)

Remark We assumed that div ~v = 0, so we can write (2.1) as yt−∇ · (κ∇y+~v · y) = ussince ∇ · (κ∇y) = κ∆y and ∇ · (~v · y) = (∇ · ~v)︸︷︷︸

=0

y + ~v · ∇y.

We define

V := ϕ ∈ H1(Ω) | ϕ(x) = 0, for almost all x ∈ Γ0 ⊂ H1(Ω) (2.5)

so we can drag the Dirichlet condition (2.4) into the space, in which we search for ourweak solution for problem (2.1)-(2.4).

Remark Since V is a subspace of H1(Ω) we can continue using the H1 scalar product andnorm. If the velocity field is, like in [27] driven by the side walls, one gets ~v(x) · ~n(x) = 0,which yields in Γ0 = ∅. Then we can proceed with the familiar setting V = H1(Ω).

Now we need a space, where we can search for a weak solution of (2.1)-(2.4) and thereforewe state the following definition.

Definition 2.1 (State and Control Space) For the Hilbert space V defined in (2.5),the state space Y is defined in (2.6) with the corresponding scalar product in (2.7).

Y := W (0, T ;L2(Ω), V ) := y | y ∈ L2(0, T ;L2(Ω)), yt ∈ L2(0, T ;V ∗) (2.6)

〈y1, y2〉Y,Y :=

∫ T

0

∫Ω

y1y2 dx dt for all y1, y2 ∈ Y (2.7)


In the following we define the control space U as a product of two Hilbert spaces Us andU0, i.e. U := Us×U0. With the corresponding inner products of Us and U0 given as 〈·, ·〉Usand 〈·, ·〉U0 one defines the inner product of U as

〈u, u〉U := 〈us, us〉Us + 〈u0, u0〉U0 for all u =

(usu0

)∈ U.

Then the norm of U is defined as

‖u‖2U = 〈u, u〉U for all u ∈ U.

In Definition 2.1 the control space U was only defined as a Hilbert space and not explicitlyspecified. To actually control the right hand side of the weak form of (2.1)-(2.4) we haveto introduce an additional mapping.We suppose that we have bounded linear operators

Bs ∈ L(Us, L

2(0, T ;V ∗))

and B0 ∈ L(U0, L

2(Ω)).

With these the operator B is defined as

B :=

(Bs

B0

), B ∈ L

(U, L2(0, T ;V ∗)× L2(Ω)

).

The following theory does not depend on a specific choice of the Hilbert space U , but onlyon the suitable operator B as supposed above. Next, the weak form of (2.2) writes as∫

Ω

yt(t)ϕ dx+ κ

∫Ω

∇y∇ϕ dx− κ∫∂Ω

(∇y · ~n)ϕ dx︸︷︷︸=0

+

∫Ω

~v · ∇yϕ dx =

∫Ω

(Bsus)(t)ϕ dx ∀ϕ ∈ V

⇔∫

Ω

yt(t)ϕ dx+ κ〈∇y,∇ϕ〉L2(Ω) + 〈~v · ∇y, ϕ〉L2(Ω) = 〈(Bsus)(t), ϕ〉L2(Ω) ∀ϕ ∈ V,

for a.a. t ∈ I. (2.8)

Y as defined in (2.6) is a Hilbert space. One then defines the bilinear form a : V ×V → R,which describes the weak form of the advective-diffusive operator as

a(ϕ1, ϕ2; t) := κ〈∇ϕ1,∇ϕ2〉L2(Ω) + 〈~v · ∇ϕ1, ϕ2〉L2(Ω) for all ϕ1, ϕ2 ∈ V, t ≥ 0. (2.9)

Applying the framework given in Definition 2.1 and the bilinear form (2.9) to (2.8) yieldsthe well defined weak form of (2.1)∫

Ω

yt(t)ϕ dx+ a(y, ϕ; t) = 〈(Bsus)(t), ϕ〉L2(Ω) ∀ϕ ∈ V, for a.a. t ∈ [0, T ].


In the context of the setting in [22] we additionally set H = L2(Ω) and leave V as definedin (2.5). We formulate our problem in the context of an Abstract Parabolic EvolutionProblem as in ([22, p. 43]).

〈yt(t), ϕ〉V ∗,V + a(y(t), ϕ; t) = 〈(Bsus)(t), ϕ〉V ∗,V ∀ϕ ∈ V and a.a. t ∈ [0, T ] (2.10)

〈y(0), ϕ〉L2(Ω) = 〈B0u0, ϕ〉L2(Ω) ∀ϕ ∈ V (2.11)

From Theorem 1.14 [22, p. 22] we have the following embeddings. For our setting n ≤ 3.

H1(Ω) → L6(Ω)

∀1 ≤ s ≤ 6 : ∃ Cn ≥ 0 ‖ϕ‖Ls(Ω) ≤ Cn‖ϕ‖H1(Ω)

(2.12)

We use the Trace Theorem from [13, p. 258] and get a linear operator Υ : H1(Ω)→ L2(∂Ω)with

‖Υϕ‖L2(∂Ω) ≤ C‖ϕ‖H1(Ω) for all ϕ ∈ H1(Ω). (2.13)

The next Lemma introduces the Gelfand triple and states that the problem (2.10)-(2.11)fulfills the Assumptions 1.34 from ([22, p. 43]).

Lemma 2.2 With V as defined in (2.5), H = L2(Ω) and a given by (2.9) the followingholds

i) V → H → V ∗ is a Gelfand triple where H, V are separable Hilbert spaces.

ii) a(·, ·; t) : V × V → R is for almost all t ∈ (0, T ) a bilinear form and there areα, β > 0 and γ ≥ 0 with

|a(ϕ, ψ; t)| ≤ α‖ϕ‖V ‖ψ‖V ∀ϕ, ψ ∈ V and a.a. t ∈ (0, T ) (2.14)

a(ϕ, ϕ; t) ≥ β‖ϕ‖2V − γ‖ϕ‖2

H ∀ϕ ∈ V and a.a. t ∈ (0, T ) (2.15)

Proof i) In our setting we have to show

V → L2(Ω) → V ∗.

Since Ω ⊂ Rn we know L2(Ω) to be a separable Hilbert Space. For this see Corollary2.18 and Theorem 2.21 in [1, pp. 31-32]. Also V ⊂ H1(Ω) = W 1,2(Ω) is a Hilbertspace according to Theorem 3.6 in Adam’s book [1, p. 61]. From definition weknow the continues embedding j : H1(Ω) → L2(Ω). So for every ϕ ∈ L2(Ω) onecan define a functional

φ : V 3 ψ 7−→ 〈j(ψ), ϕ〉L2(Ω)

on V and identify the mapping ϕ 7→ φ with j∗ : L2(Ω) −→ V ∗ which gives us thelatter continuous embedding and results in

j j∗

V −→ L2(Ω) −→ V ∗ .


ii) For all calculations remember ‖ϕ‖V = ‖ϕ‖H1(Ω). Our problem is governed by alinear PDE thus it is obvious that a(φ, ψ; t) is bilinear. Using Holder’s Inequalityfrom [13, pp. 622-623] with 1

4+ 1

2+ 1

4= 1 we get for ~v ∈ H1(Ω), ϕ, ψ ∈ V and the

embedding (2.12) the following estimate.

|〈~v · ∇ϕ, ψ〉L2(Ω)| ≤ ‖~v‖L4(Ω)n‖∇ϕ‖L2(Ω)‖ψ‖L4(Ω) ≤ C2‖~v‖H1(Ω)n‖∇ϕ‖L2(Ω)‖ψ‖H1(Ω)

≤ C2‖~v‖H1(Ω)n‖ϕ‖H1(Ω)‖ψ‖H1(Ω) < ∞

At first we show that (2.14) holds, where we use ‖∇ϕ‖L2(Ω) ≤ ‖ϕ‖H1(Ω).

|a(ϕ, ψ; t)| =|κ〈∇ϕ,∇ψ〉L2(Ω) + 〈~v · ∇ϕ, ψ〉L2(Ω)|≤ κ ‖∇ϕ‖L2(Ω)︸︷︷︸

≤‖ϕ‖H1(Ω)

‖∇ψ‖L2(Ω)︸︷︷︸≤‖ψ‖H1(Ω)

+C2‖~v‖H1(Ω)n‖ϕ‖H1(Ω)‖ψ‖H1(Ω)

≤ ‖ϕ‖H1(Ω)‖ψ‖H1(Ω)(κ+ C2‖~v‖H1(Ω)n)

So we get for α = (κ+ C2‖~v‖H1(Ω)n) > 0 equation (2.14) satisfied. First denote

a(ϕ, ϕ; t) = κ〈∇ϕ,∇ϕ〉L2(Ω) + 〈~v · ∇ϕ, ϕ〉L2(Ω)

= κ‖ϕ‖2H1(Ω) − κ‖ϕ‖2

L2(Ω) + 〈~v · ∇ϕ, ϕ〉L2(Ω).(2.16)

So we focus on the last term 〈~v · ∇ϕ, ϕ〉L2(Ω). Since ~v, ϕ ∈ H1(Ω) we can use theWeak Theorem of Gauss3 and perform integration by parts utilizing ∇1

2(ϕ)2 = ϕ∇ϕ

and div(~v) = 0.

〈~v · ∇ϕ, ϕ〉L2(Ω) =

∫Ω

~v(x) · (ϕ(x)∇ϕ(x)) dx =

∫∂Ω

1

2ϕ2~v · ~n dS −

∫Ω

1

2ϕ2 div(~v)︸︷︷︸

=0

dx

=

∫∂Ω

ϕ2︸︷︷︸=0 on Γ0

~v · ~n dS =1

2

∫∂Ω\Γ0

ϕ2 ~v · ~n︸︷︷︸=0 on Γ1=∂Ω\Γ0

dS

= 0

(2.17)

Now we use (2.17) in equation (2.16) and get here

a(ϕ, ϕ; t) = κ‖ϕ‖2H1(Ω) − κ‖ϕ‖2

L2(Ω).

Then for β = κ and γ = κ ≥ 0 equation (2.15) is fulfilled.The mapping t 7→ a(ϕ, ψ; t) is of course measurable because in our case a(ϕ, ψ; t) isnot time-dependent.

Remark In the case ~v doesn’t provide the above needed conditions, especially that y = 0on Γ0 one comes to the following estimate.

〈~v · ∇ϕ+ ϕ,~v · ∇ϕ+ ϕ〉L2(Ω) ≥ 0

⇒ ‖~v · ∇ϕ‖2L2(Ω) + 2 · 〈~v · ∇ϕ, ϕ〉L2(Ω) + ‖ϕ‖2

L2(Ω) ≥ 0

3cf. A6.8 Schwacher Gauß’scher Satz [2, p. 282]


⇒ 〈~v · ∇ϕ, ϕ〉L2(Ω) ≥ −1

2(‖~v‖2

L∞(Ω)n‖ϕ‖2L2(Ω) + ‖∇ϕ‖2

L2(Ω))

= −1

2‖~v‖2

L∞(Ω)n‖ϕ‖2H1(Ω) +

1

2(‖~v‖2

L∞(Ω)n − 1)‖ϕ‖2L2(Ω)

this yields with

a(ϕ, ϕ; t) ≥(κ− 1

2‖~v‖2

L∞(Ω)n

)‖ϕ‖2

H1(Ω) −(κ+ 1− 1

2‖~v‖2

L∞(Ω)n

)‖ϕ‖2

L2(Ω)

So, for diffusion dominated systems one can get with a high enough κ, that(κ− 1

2‖~v‖2

L∞(Ω)n) ≥ 0 and thus the coercivity of a(ϕ1, ϕ2; t).

Remark When we set H = H1(Ω) instead of H = L2(Ω) the results of Lemma 2.2 followas well, without the restriction on ~v in Γ0.

Theorem 2.3 (Energy Estimate) The problem (2.10)-(2.11) has at most one solutiony ∈ W (0, T ;L2(Ω), V ) and satisfies the energy estimate

‖y(t)‖2L2(Ω)+‖y‖2

L2(0,t;V )+‖yt‖2L2(0,t;V ∗) ≤ C

(‖B0u0‖2

L2(Ω) + ‖Bsus‖2L2(0,t;V ∗)

)∀t ∈ (0, T ],

where C > 0 depends only on β and γ from Lemma 2.2.

Proof We use Theorem 1.35 from [22, p. 44]. Its assumptions ‘Assumptions 1.34”,[22, p. 43] hold by Lemma 2.2.

Theorem 2.4 (Existence and Uniqueness) The problem (2.10)-(2.11) has an uniquesolution

y ∈ Y = W(0, T ;L2(Ω), V

).

Proof By Lemma 2.2 we have all necessary conditions for Theorem 1.37 [22, p. 46]which gives us the unique solution for the Abstract Parabolic Evolution Problemrepresented in equations (2.10)-(2.11).

We define now the space Z and recall the definition of the space Y .

Y = W (0, T ;L2(Ω), V ), Z := Zs × Z0 = L2(0, T ;V )× L2(Ω)

We know that Z,D are Hilbert spaces and Y, Z are Banach spaces, so we have the samesetting as in [22, pp. 52-53]. Note that since V and L2(Ω) are Hilbert spaces we get forthe dual space of Z

Z∗ = L2(0, T ;V ∗)× L2(Ω)∗.

This means that Bs ∈ L(Us, Z∗s ) and B0 ∈ L(U0, Z

∗0) and thus B ∈ L(U,Z∗). In this

setting we obtain an well defined operator for the advective diffusive equation in thefollowing Lemma.

Lemma 2.5 The operator A ∈ L(Y, Z∗) with

A : W (0, T ;L2(Ω), V ) −→ L2(0, T ;V ∗)× L2(Ω)∗ = Z∗

Ay : ϕ =

(ϕsϕ0

)7→(〈yt, ϕs〉V ∗,V + κ〈∇y,∇ϕs〉L2(Ω) + 〈∇y · ~v, ϕs〉L2(Ω)

〈y(0), ϕ0〉L2(Ω)

)is well defined and has a bounded inverse.


Proof This result follows directly from the energy estimate in Theorem 2.3 and theexistence and uniqueness Theorem 2.4.

With the definition of the operator A in Lemma 2.5 we are able to rewrite Problem(1.9) with weak constraints. As in Section 1.4 we define the set of admitted solutionsYad := y ∈ Y | y ≥ 0 ⊂ Y as the set of all positive solutions in Y , because we assumedthat a contaminant concentration is positive.To compare a solution y ∈ Y with measurements yδ ∈ D one introduces the trace operatorQ ∈ L(Y,D) as the projection from Y onto D. To allow fine tuning of the regularizationwe add positive definite and symmetric bilinear forms regγ : U ×U → R, with parameters0 < γ ∈ Λ combined in the additional regularization term

reg : U × U → R, reg =∑γ∈Λ

γregγ

to the cost functional and set α > 0 as a default value. In the following we proceed withthe functional analytical treatment of the minimization problem (2.18) without controlconstraints.

min(y,u)∈Y×U

J(y, u) :=1

2

∥∥Qy − yδ∥∥2

D+α

2‖u‖2

U +1

2reg(u, u)

subject to e(y, u) := Ay −Bu = 0, y ∈ Yad(2.18)

Problem (2.18) is state constrained, which adds additional difficulty to the solving process.Next check some conditions by the following Lemma to meet the Assumptions 1.42 of[22, p. 53].

Lemma 2.6 For our Problem the following holds

i) U is convex and closed.

ii) Yad ⊂ Y is convex and closed, and Problem (2.18) has a feasible point y ∈ Yad.

iii) A ∈ L(Y, Z∗) has a bounded inverse.

Proof i) The space U is a convex, nonempty, and closed Hilbert space by definition.

ii) The space Yad := y ∈ Y | y ≥ 0 is convex and closed by definition. Problem(2.18) has the feasible point (y, u) = 0. The space Yad := y ∈ Y | y ≥ 0 is convexand closed by definition.

iii) Is the main conclusion of Lemma 2.5 which treats the bounded inverse of A.

Now the stage is set for the first important theorem regarding problem (2.18).

Theorem 2.7 With all given assumptions, the problem (2.18) has an unique solution(y, u).


Proof Since the additional bilinear form reg doesn’t change the general setting, we canuse Theorem 1.43 from [22, p. 53]. Its assumptions Assumptions 1.42, [22, p. 53] holdby Lemma 2.6. Since α > 0 we get uniqueness. Note that for α = 0 and a suitableregularization reg with γ > 0, ∀γ ∈ Λ also yields boundedness of a minimizationsequence, which also leads to the uniqueness of the solution (y, u).

Non-Negativity of Solutions Since the problem is arising from an advection diffusionequation, related to the Fokker Planck equation, a natural property is the positivity of thecontaminant concentration. For positiveness of PDEs refer to [8]. We will be using resultsfrom [11]. One can show under certain assumptions, that the strong form (2.1)-(2.4) hasa non-negative solution. First we define the positive cone

K+ =y : Ω→ R | y ∈ L2(Ω), y ≥ 0 a.a.t ∈ (0, T )

as the set of non-negative functions. With κ > 0 and ~v(x) =

v1...vn

, where we have to

assume that v1, . . . , vn ∈ C1(Rn,R). Also for the control us we assumeus ∈ C1(Rn, R), us ≥ 0, u0 ∈ K+, u0

∣∣δΩ

= 0. Then Assumptions 2.1. and all assumptionsof Theorem 2.3. in [11, pp. 2-4] are fulfilled, which yields y(t) ∈ K+ for all t > 0.

For the following theory on optimality conditions we drop the restriction on the state andset Yad = Y . So far we set up a well-defined optimization problem with weak constraints,and gained proof for existence and uniqueness of the solution. The next section dealswith the constraints and focuses on optimality conditions.

2.2 The Reduced Problem

To make things more handy we can, according to Lemma 2.6, define the following con-tinuous and affine linear solution operator.

U 3 u 7→ y(u) := A−1 (Bu) ∈ Y

Following [22, p. 70], we want to check some conditions on our problem by the followingLemma.

Lemma 2.8 Problem (2.18) fulfills the following assumptions.

i) J : Y × U −→ R and e : Y × U −→ Z∗ are continuously Frechet differentiable andU, Y, Z∗ are Banach spaces.

ii) For all u ∈ U , the state equation e(y, u) = 0 has a unique solution y = y(u) ∈ Y .

iii) ey(y(u), u) ∈ L(Y, Z∗) has a bounded inverse for all u ∈ U .

Proof i) Since D and U are Hilbert Spaces, we can write

J(y, u) =1

2〈Qy − yδ, Qy − yδ〉D +

α

2〈u, u〉U +

1

2reg(u, u).


The objective functional J : Y × U → R as a sum of inner products is obviouslyF-differentiable. Utilizing the linearity of the scalar product one gets the partialderivatives of J and e.

Jy(y, u) ∈ V ∗, Jy(y, u)δy = 〈Qy − yδ, Qδy〉D ∀δy ∈ D,Ju(y, u) ∈ U∗, Ju(y, u)δu = α〈u, δu〉U + reg(u, δu) ∀δu ∈ U

So we get for the linear Operator

e : (y, u) ∈ Y × U 7→(yt + a(y, ·)−Bsus

y(0)−B0u0

)∈ Z∗

the following derivatives.

ey(y, u) ∈ L(Y, Z∗) ey(y, u)δy =

(δyt + a(δy, ·)

δy(0)

)eu(y, u) ∈ L(U,Z∗) eu(y, u)δu = −Bδu =

(−Bsδus−B0δu0

)(2.19)

The operator e : Y × U → Z∗, as a sum of bounded linear and bilinear operatorsis infinitely Frechet differentiable according to Lemma 1.16 [22, p. 92]. Since theadvection diffusion equation is linear we know that

e(k)(y, u) = 0 for all y ∈ Y, u ∈ U, k ≥ 2.

ii) The state equation e(y, u) = 0 is written as Ay = u. From Lemma 2.5 we now thatA has a bounded inverse. This automatically means that we get for every u ∈ Uand thus Bu ∈ Z∗ the unique solution as y = A−1Bu.

iii) We know from before, that ey(y(u), u) = A since e(y, u) is linear in y. So fromLemma 2.5 we know, that ey(y, u) has a bounded inverse for all y ∈ Y, u ∈ U andtherefore for all y = y(u).

It is quite unhandy to handle a minimization problem constrained by a partial differentialequation. To get rid of this dependency one introduces the reduced problem.

Definition 2.9 (Reduced Problem) The reduced objective functional is

J(u) := J(y(u), u) (2.20)

and thus the reduced problem isminu∈U

J(u)

Note that y(u) is well defined, because e(y, u) has an unique solution for each u ∈ U .


2.3 Adjoint Approach

To calculate the derivative of the reduced objective functional (2.20) one defines theLagrange function L : Y × U × Z → R,

L(y, u, p) = J(y, u) + 〈e(y, u), p)〉Z∗,Z

we now insert y = y(u) and get for p ∈ Z

J = J(y(u), u) + 〈e(y(u), u), p)〉Z∗,Z = L(y(u), u, p)

Differentiating this we get

〈J ′(u), δu〉U∗,U = 〈Ly(y(u), u, p), y′(u)δu〉Y ∗,Y + 〈Lu(y(u), u, p), δu〉U∗,U ∀δu ∈ U (2.21)

To loose the term y′(u)s one chooses a p = p(u) such that the first summand vanishes,meaning

Ly(y(u), u, p(u)) = 0,

which unfolds to

〈Ly(y(u), u, p), d〉Y ∗,Y =〈Jy(y(u), u), d〉Y ∗,Y + 〈ey(y(u), u)d, p〉Z∗,Z=〈Jy(y(u), u), d〉Y ∗,Y + 〈ey(y(u), u)∗p, d〉Y ∗,Y = 0 ∀d ∈ Y.

Hence we get the so called adjoint equation as

−〈Jy(y(u), u), d〉Y ∗,Y = 〈ey(y(u), u)∗p, d〉Y ∗,Y ∀d ∈ Y.

Plugging in the results from Lemma 2.8 and with p = (ps, p0)T ∈ Z we obtain the adjointequation (2.22) for our problem.

−〈Qy − yδ, Qd〉D = +〈dt, ps〉Z∗s ,Zs + 〈a(d, ·), ps〉Z∗s ,Zs + 〈p0, d(0)〉L2(Ω) ∀d ∈ Y (2.22)

We now choose p = p(u) ∈ Z as given in (2.22) and get a representation for the gradientof the reduced objective function J .

J ′(u) = 0 + Lu(y(u), u, p(u)) = Ju(y(u), u) + eu(y(u), u)∗p(u)

Again, we plug in prior results, where we variate over δu = (δus, δu0) ∈ U ,

〈Ju(y(u), u), δu〉U∗,U + 〈eu(y(u), u)∗p(u), δu〉U∗,U= α〈u, δu〉U + reg(u, δu)− 〈Bsδus, ps(u)〉Z∗s ,Zs − 〈B0δu0, p0(u)〉Z∗0 ,Z0 ∀δu ∈ U,

which gives us a representation for J ′(u). But this result depends on an adjoint statep = p(u), which solves the adjoint equation (2.22). This state p exists according to thefollowing Lemma.

Lemma 2.10 (Adjoint State) The adjoint equation

−〈Qy − yδ, Qd〉D = 〈dt, ps〉Z∗s ,Zs + 〈a(d, ·), ps〉Z∗s ,Zs + 〈p0, d〉L2(Ω) ∀d ∈ Y

has a solution p ∈ Z, which fulfills with p0 = ps(0)

−〈Qy − yδ, Qd〉D =〈(ps)t, d〉L2(0,T ;L2(Ω)) + κ〈∇ps,∇d〉L2(0,T ;L2(Ω))

− 〈∇ps · ~v, d〉L2(0,T ;L2(Ω)) + 〈ps(T ), d(T )〉L2(Ω) ∀d ∈ Y(2.23)


Proof As first step one uses the definition of the bilinear form a.

−〈Qy − yδ, Qd〉D =

∫ T

0

〈dt(τ), ps(τ)〉V ∗,V + κ〈∇d(τ),∇ps(τ)〉V ∗,V

+ 〈∇d(τ) · ~v, ps(τ)〉V ∗,V dτ + 〈d(0), p0〉L2(Ω) ∀d ∈ Y(2.24)

One can now show with integration by parts4 in time, assuming ps ∈ Y , that∫ T

0

〈dt(τ), ps(τ)〉V ∗,V dτ

= −∫ T

0

〈(ps)t(τ), d(τ)〉V ∗,V dτ + 〈ps(T ), d(T )〉L2(Ω) − 〈ps(0), d(0)〉L2(Ω). (2.25)

Then consider the equation (2.26)

∇ · (ps~v) = ∇ps · ~v + ps div(~v)︸︷︷︸=0

= ∇ps · ~v (2.26)

and use integration by parts [2] in Ω∫ T

0

〈∇d(τ) · ~v, ps(τ)〉V ∗,V dτ =

∫ T

0

∫Ω

(∇d · ~v)ps dx dt =

∫ T

0

∫Ω

∇d · (~vps) dx dt

=

∫ T

0

∫∂Ω

(ps~v · ~n)d dS dt−∫ T

0

∫Ω

∇ · (~vps)d dx dt

=

∫ T

0

∫∂Ω

(ps~v · ~n)d dS dt−∫ T

0

∫Ω

∇ps · ~v d dx dt.

(2.27)

In (2.28) we apply results from equations (2.27) and (2.25) on the adjoint equation (2.24).

−∫ T

T1

∫Γm

(y − yδ)ϕ dx dt =−∫ T

0

∫Ω

(ps)tϕ dx dt+ 〈ps(T ), ϕ(T )〉L2(Ω)

− 〈ps(0), ϕ(0)〉L2(Ω) +

∫ T

0

∫Ω

κ∇ps∇ϕ dx dt

+

∫ T

0

∫∂Ω

(ps~v · ~n)ϕ dS dt−∫ T

0

∫Ω

∇ps · ~vϕ dx dt

+ 〈ϕ(0), p0〉L2(Ω) ∀ϕ ∈ Y

(2.28)

The initial conditions are isolated from (2.28) and put together in (2.29).

〈ϕ(0), p0〉L2(Ω) − 〈ps(0), ϕ(0)〉L2(Ω) = 〈p0 − ps(0), ϕ(0)〉L2(Ω) ∀ϕ ∈ Y (2.29)

Note that (2.29) is equivalent to (2.30), because of the definition of Y .

〈ϕ, p0〉L2(Ω) − 〈ps(0), ϕ〉L2(Ω) = 〈p0 − ps(0), ϕ〉L2(Ω) ∀ϕ ∈ L2(Ω) (2.30)

4see. Theorem 1.32 [22, p. 40]


We now obtain from (2.29) p0 = ps(0). Splitting off (2.29) from (2.28) one gets in (2.31)an equivalent of equation (2.23).

−∫ T

T1

∫Γm

(y − yδ)ϕ dx dt =−∫ T

0

∫Ω

(ps)tϕ dx dt+

∫ T

0

∫Ω

κ∇ps∇ϕ dx dt

−∫ T

0

∫Ω

∇ps · ~vϕ dx dt+

∫ T

0

∫∂Ω

(ps~v · ~n)ϕ dS dt

+

∫Ω

ps(T )ϕ(T ) dx ∀ϕ ∈ Y

(2.31)

The strong form of equation (2.31) is written in the system (2.32)-(2.35).

−(ps)t − κ4 ps −∇ps · ~v = 0 on Ω× [0, T ], (2.32)

ps(T ) = 0 on Ω, (2.33)

(~vps + κ∇ps) · ~n = −(y − yδ) on Γm × [T1, T ], (2.34)

(~vps + κ∇ps) · ~n = 0 on ∂Ω× [0, T ] \ Γm × [T1, T ]. (2.35)

This is nothing else than a backward time advective diffusive equation. To see this weset t = T − t and p(t) = ps(t). So for the first time derivative we get a small change thatmeans p(t)t = pt(T − t)(−1) = −p(t).

pt − κ4 p+∇− ~v · p = 0 on Ω× [0, T ], (2.36)

p(0) = 0 on Ω, (2.37)

(~vp+ κ∇p) · ~n = −(y − yδ) on Γm × [0, T − T1], (2.38)

(~vp+ κ∇p) · ~n = 0 on ∂Ω× [0, T ] \ Γm × [0, T − T1]. (2.39)

Now we just have to proof that the weak form of (2.36)-(2.39) given in (2.40) has asolution.

−∫ T

T1

∫Γm

(y − yδ)ϕ dx dt =

∫ T

0

〈(ps)t(t), ϕ(t)〉V ∗,V dt+

∫ T

0

∫Ω

κ∇ps∇ϕ dx dt

−∫ T

0

∫Ω

∇ps · ~vϕ dx dt+

∫ T

0

∫∂Ω

(ps~v · ~n)ϕ dS dt

+ 〈ps(0), ϕ(0)〉L2(Ω) ∀ϕ ∈ L2(0, T ;V )

(2.40)

For this we utilize the results of Theorem 2.4 for the negated velocity field −~v, whichnow also fulfills the most important property namely Γ0 = x ∈ ∂Ω | − ~v · ~n = 0. Thebilinear form a from (2.9) is now slightly modified with respect to the negated velocityfield −~v which yields

a(ϕ, ψ; t) := κ〈∇ϕ,∇ψ〉L2(Ω) + 〈−~v · ∇ϕ, ψ〉L2(Ω) for all ϕ, ψ ∈ V, t ≥ 0.

It is already known that a is bounded and coercive, which extends to a, because of (2.41).Here the same properties as in (2.17) are used.

〈ϕ(−~v) · ~n, ψ︸︷︷︸=0 on Γ0

〉L2(Γm) = 〈ϕ (−~v) · ~n︸︷︷︸=0 on Γ1⊃(ΓmΓ0)

, ψ〉L2(ΓmΓ0) = 0 (2.41)


To use the same results as in the original proof, we define for the right hand side of (2.10)a functional f ∈ L2(0, T, V ∗), such that

〈f(t), ψ〉V ∗,V := 〈(Qy)(t)− yδ(t), Υmψ〉L2(Γm) ∀ψ ∈ V for a.a. t ∈ [0, T ]. (2.42)

In (2.42) the operator Υm is defined as the trace on Γm, which is well defined in analogto the definition of the trace in (2.13). This means (2.42) is well defined, so we can statethe following abstract parabolic evolution problem. Find ϕ ∈ Y such that∫ T

0

〈ϕt(t), ψ(t)〉V ∗,V dt+

∫ T

0

〈a(ϕ, ψ; t) dt =

∫ T

0

〈f(t), ψ(t)〉V ∗,V dt

∀ψ ∈ L2(0, T ;V )

(2.43)

with the initial conditionϕ(0) = 0. (2.44)

We use the same results as in Theorem 2.4. For the bilinearform a the results fromLemma 2.2 hold as well, which means (2.43)-(2.44) fulfills the Assumptions 1.34 from([22, p. 43]). This allows us to use Theorem 1.37 from [22, p. 46], which yields existenceof a solution ϕ ∈ Y .This applied with the original time scaling yields the existences of an adjoint state p.

In order to find an optimum we check on optimality conditions. So we make use ofTheorem 1.48 [22, p. 71], which gives us optimality conditions of first order in Theorem2.11.

Theorem 2.11 (General First Order Optimality Conditions) If u is a local solu-tion of the reduced Problem 2.9 then u satisfies the variational inequality

u ∈ U and 〈J ′(u), u〉U∗,U = 0 for all u ∈ U.

Proof Follows directly from Theorem 1.48 [22, p. 71].

Corollary 2.12 Let (y, u) ∈ Y × U be an optimal solution of Problem 2.18. Then thereexists an adjoint state p ∈ Z, such that the following optimality conditions hold.State Equation(〈yt(t), ϕs〉V ∗,V + κ〈∇y(t),∇ϕs〉L2(Ω) + 〈∇y(t) · ~v, ϕs〉L2(Ω)

〈y(0), ϕ0〉L2(Ω)

)=

(〈(Bsus)(t), ϕs〉L2(Ω)

〈B0u0, ϕ0〉L2(Ω)

)for all ϕs ∈ V, ϕ0 ∈ L2(Ω) and a.a. t ∈ [0, T ]

Adjoint Equation

− 〈Qy − yδ, Qd〉D = 〈(ps)t, d〉L2(0,T ;L2(Ω)) + κ〈∇ps,∇d〉L2(0,T ;L2(Ω))

− 〈∇ps · ~v, d〉L2(0,T ;L2(Ω)) + 〈ps(T ), d(T )〉L2(Ω) ∀d ∈ Y,

and Control Equation

J ′(u)δu = α〈u, δu〉U + reg(u, δu)

− 〈Bsδus, ps〉Z∗s ,Zs − 〈B0δu0, ps(0)〉Z∗0 ,Z0 = 0 ∀δu ∈ U.


Additionally we give the strong form of the State Equation

yt − κ∆y + ~v · ∇y = us on Ω× [0, T ],

y(0, x) = u0 on Ω,

κ∇y · ~n = 0 on Γ1 × [0, T ],

y = 0 on Γ0 × [0, T ],

and the Adjoint Equation

−(ps)t − κ4 ps −∇ps · ~v = 0 on Ω× [0, T ],

ps(T ) = 0 on Ω,

(~vps + κ∇ps) · ~n = −(y − yδ) on Γm × [T1, T ],

(~vps + κ∇ps) · ~n = 0 on (∂Ω× [0, T ]) \ (Γm × [T1, T ]) ,

p0 = ps(0) on Ω.

For the Control Equation a gradient representation is given as

∇J(usu0

)= reg(u, ·) +

(αus −B∗s psαu0 −B∗0 p0

)= reg(u, ·) +

(αus −B∗s ps

αu0 −B∗0 ps(0)

).

∇J(u) = αu+ reg(u, ·)−B∗p

Proof The state equation is mandatory for the problem, so of course every solution(y, u) of Problem 2.18 fulfills the state equation. The existence of the adjoint statefollows from Lemma 2.10. Then one combines the first order optimality conditions inTheorem 2.11 with the Lagrange Ansatz for the first derivative of the reduced costfunctional J ′, which yields the control equation.

We didn’t specify a control space U yet. Here we suggest the following control space

U = Us × U0 := L2(0, T ;L2(Ω))× L2(Ω)

where we define the inner product in U as

〈u, v〉U,U :=

∫Ω

u0v0 dx+

∫ T

0

∫Ω

usvs dx dt for all u, v ∈ U.

From the Gelfand triple in Lemma 2.2 we obtain the well defined operators

Bs ∈ L(L2(0, T ;L2(Ω)), L2(0, T ;V ∗)

)and B0 ∈ L

(L2(Ω), L2(Ω)∗

)and thus B ∈ L(U,Z∗).

2.4 Second Derivatives

In analog to [22, p. 64] we want to calculate the second derivative of the reduced costfunctional J . Differentiating J ′(u), given in (2.21), in a second direction s2 ∈ U given


s1 ∈ U yields

〈J ′′(u)s1, s2〉U∗,U =〈Ly(y(u), u, p), y′′(u)(s1, s2)〉Y ∗,Y + 〈Lyy(y(u), u, p)y′(u)s2, y′(u)s1〉Y ∗,Y

+ 〈Lyu(y(u), u, p)s2, y′(u)s1〉Y ∗,Y + 〈Luy(y(u), u, p)y′(u)s2, s1〉U∗,U

+ 〈Luu(y(u), u, p)s2, s1〉U∗,U .(2.45)

First, there exist no mixed terms with y and u in the Lagrange equation which yieldsLyu(y, u, p(u)) = 0 and Luy(y, u, p) = 0. This allows us to condense the rather compre-hensive term (2.45) to a more compact version in (2.46). Also according to the productrule, the Ly term stays. One now chooses the adjoint state p = p(u) as before, i.e.Ly(y(u), u, p(u)) = 0, which lets the term containing y′′(u) drop out. At last, in this casethe control is linear, which yields Luu(y, u, p) = αIU .

〈J ′′(u)s1, s2〉U∗,U = 〈Lyy(y(u), u, p(u))y′(u)s2, y′(u)s1〉Y ∗,Y + α〈s2, s1〉U∗,U + reg(s2, s1)

(2.46)With y′(u)∗ describing the adjoint of y′(u) we can write

J ′′ = y′(u)∗Lyy(y(u), u, p(u))y′(u).

We get y(u) by solving e(y(u), u) = 0. From this we get for s ∈ U

d

due(y(u), u)s = ey(y(u), u)y′(u)s+ eu(y(u), u)s = 0.

Using this with the Implicit Function Theorem one gets the sensitivity

y′(u)s =− ey(y(u), u)−1eu(y(u), u)s = −ey(y(u), u)−1(−s).

Since the advective diffusive equation is linear, the linearization ey(y, u)−1 gives us thesame equation again. So it comes down to

y′(u)s = −y(−s) = y(s).

This in mind, we have a short look at the adjoint sensitivity

y′(u)∗s =(−ey(y(u), u)−1eu(y(u), u)

)∗s = −eu(y(u), u)∗ey(y(u), u)−∗s = y(u)∗s,

which is nothing else but the already well known adjoint equation. Here [. . . ]−∗ denotesthe adjoint inverse. Putting this together we get the Hessian as

∇2J(u)s = y(u)∗(y(s)).

Later, this will be applied to solve the reduced optimization problem utilizing a Newtonor Newton-like method. Then the Newton equation (2.47) can be solved without actuallycalculating the full Hessian ∇2J .

∇2J(u)s = −J ′(u)T (2.47)


For whatever particular algorithm (2.47) is used, incidentally Problem 2.9 is quadratic,so when using a Newton method a solution can be acquired with only one Newton step.At this point we gained a comprehensive analytical setup, which provides us with a well-defined optimization problem in an infinite dimensional environment, along with a methodto acquire a solution. In the next step one would expect space and time discretizationof the problem, which then allows numerical experiments. Indeed this is done later inSection 3, but first we want to utilize all the results acquired before, in order to treat amodification of our optimization problem.This modification reduces the infinite dimensional setting to a finite dimensionalparametrization, which still depends on the presented theory.

2.5 The Finite Dimensional Parametrization

A solution of Problem (1.9) has an infinite number of degrees of freedom. But manyapplications don’t take advantage from this setup, as they filter the information an ac-quired solution provides. For example, as u0 describes a density on Ω, one might onlybe interested in areas of high density, because they reveal the position of contaminantclouds. So by examining a solution u0 of the weak problem (2.18), parameters such aslocation, size and intensity of multiple contaminant clouds can be determined.But instead of such a post-processing one should set up a suitable optimization problem,which is directly searching for these parameters, and thus delivering a immediate answerto the questions arising in the case of an application. Then the knowledge about the typeof parameters is additional knowledge about a possible solution, which can alleviate theill-posedness of the original problem.We consider the Inverse Problem (1.9) from beginning but instead of an unknown initialcondition u0 and source us, one wants to take advantage of further knowledge about theshape of the reconstruction u. In the case of one initial probably rotational symmetricgas cloud its intensity and location is more important than its actual shape and distri-bution. As already said, this comes down to few parameters so it is only logical to usea parametrization of u = (us, u0). We choose now a suitable function depending on theparameters a = (a1, ..., ak) ∈ Aad ⊂ A = Rk, k ∈ N

f(a;x, t) =

(fs(a;x, t)f0(a;x)

)with

fs : Rk → Us, f0 : Rk → U0.

and thusf : a ∈ Rk 7−→ f(a) ∈ U.

For the following theory we suppose, that

fs ∈ C2(Rk, Us

)and f0 ∈ C2

(Rk, U0

)in the piecewise continuous sense. In order to be consistent with the previous notationdefine the mapping

u : a ∈ Rk 7→ f(a) ∈ U,


which means we substitute u := u(a) in the previous theory. This gives us a slightlyaltered optimization problem.

miny,u(a)

J (y, u(a), a) :=1

2

∥∥Qy − yδ∥∥2

W+α

2‖u(a)‖2

U +γa2‖a− aref‖2

A

s.t. a ∈ Aad and

yt − κ∆y + ~v · ∇y = fs(a) on Ω× [0, T ],

y(0) = f0(a) on Ω,

κ∇y · ~n = 0 on Γ1 × [0, T ],

y = 0 on Γ0 × [0, T ].

Here we added as regularization the deviation from a reference value aref ∈ A for γa > 0.Because we have already regularization of the parameters, we set in the following α = 0.Since Aad ⊂ Rk and f is supposed to be smooth enough by assumption we are able touse all previous results like existence of a solution and first order optimality conditions.First we write down the Lagrangian approach

L(y, u(a), p) = J(y, u(a), a) + 〈e(y, u(a)), p〉Z∗,Z

and the reduced cost functional

J(u(a), a) := L(y(u(a)), f(a), p) = J(y(u(a)), u(a), a) + 〈e(y(u(a)), u(a)), p〉Z∗,Z .

Recall now the representation of the first derivative J(a). We only modified u = u(a),so all derivatives in direction y stay the same, so the adjoint equation stays unaffected inthe sense that we just have to substitute without thinking about further derivatives

Ly(y(u(a)), u(a), p) = 0 ⇔ ey (y(u(a)), u(a))∗ p = −Jy(y(u(a)), u(a), a).

With this substitution we write the adjoint state p = p(u(a)) as p = p(a). But instead ofjust dealing with eu(y, u) ∈ L(U,Z∗) one has to use now d

dae(y, u(a)) : Rk → Z∗. Using

findings from the proof of Lemma 2.8, in particular (2.19), this yields

d

da〈e(y, u(a)), p〉Z∗,Z = 〈eu(y, u(a))u′(a), p〉Z∗,Z = 〈−u′(a), p〉Z∗,Z .

Plugging in the definitions then results in

d

dae(y, u(a))p = −

(d

da

∫Ω

f0(a)p(0) dx+d

da

∫ T

0

∫Ω

fs(a)p dx dt

)= −

(∫Ω

(d

daf0(a)

)p(0) dx+

∫ T

0

∫Ω

(d

dafs(a)

)p dx dt

).

Since Ju is already known and Ja(y, u, a) = dda

γa2‖a− aref‖2

A = γa(a− aref ), we now have

a representation for ddaJ(u(a), a).

d

daJ(u(a), a) = J ′(u(a))u′(a) + Ja(y, u, a) = Lu(y(u(a)), u(a), p(a))u′(a) + Ja(y, u, a)

= Ju(y(u(a)), u(a))u′(a) + eu(y(u(a)), u(a))u′(a)p(a)


To calculate the Hessian of J(u(a), a) we get in analog

d

da2e(y, u(a))p = −

(∫Ω

(d

da2f0(a)

)p(0) dx+

∫ T

0

∫Ω

(d

da2fs(a)

)p dx dt

)and

Jaa(y, u, a) = γa IRk ,

Luu(y(u(a)), u(a), p(a)) = αIU + 0.

We take again p = p(a) as solution of the adjoint equation, s1, s2 ∈ Rk and get accordingto the previous section(

d

da2J(u(a), a)

)(s1, s2) =〈Lu(y(u(a)), u(a), p(a)), u′′(a)(s1, s2)〉U∗,U

+ 〈Lyy(y(u(a)), a, p(a))y′(u(a))u′(a)s2, y′(u(a))u′(a)s1〉Y ∗,Y

+ 〈Luu(y(u(a)), u(a), p(a))u′(a)s2, u′(a)s1〉U∗,U + γas

T2 IRks1

We know that Lyy(y, u, p) = IY which yields

Lyy(y(u(a)), u(a), p) = (y′(u(a))u′(a))∗Q∗Qy′(u(a))u′(a)

= (u′(a)y′(u(a)))∗Q∗Qy′(u(a))u′(a)

With the assumptions on f the previous theory on existence, uniqueness and optimalityconditions holds as well.Next, possible parametrizations i.e. choices for f are discussed.First we focus on a stationary initial contaminant cloud, that means fs ≡ 0. Possibleparametrizations f0 have to take into account some parameters with restrictions.

Coordinate of the gas cloud x, or for more sources x1, . . . , xd x ∈ R.

Intensity I > 0 of the gas cloud.

Radius or more generally the size represented by σ > 0.

Then one can chose for example a = (σ, I, x) and define the Gaussian bell similar to theresults from the introduction (1.8).

f0(σ, I, x;x) = I · e−12σ(x−x)T ·(x−x) (2.48)

And for a more non-smooth disturbance the positive part of a downward opened parabel.

f0(σ, I, x;x) = max(0, I

(1− σ(x− x)T · (x− x)

) )(2.49)

Whenever referring to derivatives of (2.49), we refer to the piecewise derivative on the setD =

x ∈ Rn : 1

σ≥ (x− x)T · (x− x)

, as for example in (2.50).

∂d

∂If0(σ, I, x;x) =

(1− σ(x− x)T · (x− x)

)for x ∈ D

0 otherwise(2.50)


Remark The parameters σ and I as a contaminant cloud’s properties are connected toeach other by the law of diffusion. Over time the maximum contaminant concentrationrepresented by I will drop, while the contaminant spreads out into the space Ω thusincreasing the size σ. Initially this goes comparably fast. So when it comes to the dis-cretization one expects bigger variety on these two parameters, which does not necessarilymean that the resulting reconstruction is badly affected.

The functions in (2.48) and (2.49) are still time independent. To introduce time depen-dency for a contaminant source, one starts with the following setting for a time dependentlocation. A linear movement of a source starting at x0 ∈ Ω and ending in x1 ∈ Ω has thefollowing representation for the x-coordinate

x(t) = x0 +t

T(x1 − x0) for t ∈ [0, T ].

The linear moving contaminant source is given as

fs(σ, I, x0, x1;x, t) = f0(σ, I, x(t);x) = I · e−12σ(x−x(t))T ·(x−x(t)).

One receives the first derivatives by applying the chain rule

∂x0fs(σ, I, x0, x1;x, t) = ∂xf0(σ, I, x(t);x)

(1− t

T

),

∂x1fs(σ, I, x0, x1;x, t) = ∂xf0(σ, I, x(t);x)

(t

T

).

To generalize this we suppose the parameters a = (σ, I, x) are time dependent,i.e. a(t) : [0, T ]→ A. Then fs is given as

fs(a;x, t) := f0(a(t);x, t) t ∈ [0, T ].

If we assume a(t) ∈ C0 one can linearly interpolate a(t) in points a = [a0, . . . , am] fort ∈ [0, T ] and m ∈ N, m ≥ 2. This is the generalization of the linear movement byintroducing more supporting positions, which yields with the time step 4t := T

m,

a(t) =

ak−1 + (t− (k − 1)4t) ak−ak−1

4t for (k − 1) ≤ t4t < k, k = 1, . . . ,m

am for t = T.

Now provide the piecewise derivatives of first order.

∂

∂akif(a(t), t) =

(∂

∂aif0(a(t), t)

)·

1− t−(k−1)4t

4t for k ≤ t4t < k + 1,

t−(k−1)4t4t for k − 1 ≤ t

4t < k,

0 else .

Note that this linear spline is independent from a later time discretization scheme.

3 DISCRETIZATION 30

3 Discretization

The main problem was analyzed in a functional analytical environment. But only forwell-defined geometries of Ω the advection diffusion equation becomes simple enough,such that analytical solutions can be found, in order to solve the minimization problem.In general, we are in need of numerical methods to acquire solutions, which requiressome form of discretization. For this, the continuous model is transferred to its discretecounterpart. Although we follow the approach first discretize then optimize the resultingdiscretization is consistent to the analytical theory.In the following we introduce space and time discretization and apply these on the prob-lem. Then discretization schemes for various scenarios are presented.

3.1 Finite Element Space Discretization

For space discretization we are using Lagrange finite Elements. The Galerkin approxima-tion of the variational equation

Find y ∈ V a(y, ϕ) = b(ϕ) ∀ϕ ∈ V,

now reads asFind yh ∈ Vh a(yh, ϕ) = b(ϕ) ∀ϕ ∈ Vh,

where Vh is yet to determine. Nevertheless we suppose Vh ⊂ V , which leads to a conform-ing finite element discretization. First we choose a triangulation Th of Ω ⊂ Rn, n = 2, 3satisfying the definition below, which follows Definition 3.19 from [23, p. 114].

Definition 3.1 (Triangulation) A triangulation Th consists of a finite number of sub-sets K of Ω which are called elements. These form a non overlapping decomposition ofthe closed space Ω and fulfill the following properties:

(T1) Every K ∈ Th is closed.

(T2) The interior int(K) is nonempty and a Lipschitz domain for every K ∈ Th.

(T3) Ω =⋃K∈Th K

(T4) The intersection int(K1) ∩ int(K2) is empty for all different K1, K2 ∈ Th.

We consider a decomposition of Ω in actual triangles satisfying

h = max diam(K) | K ∈ Th

which is the all triangles’ longest edge length. Note that such a decomposition can berepresented by nodes and edges forming a grid, which is also called a mesh. Of course thetriangles’ size doesn’t vary too much, which depends in detail on the implemented meshingmethod, which also fulfills Definition 3.1. In the following we give a short description ofthe Finite Element Quadratic Ansatz we are going to use in Section 4. The main goalis to determine a concrete Ansatz space Vh. So for every element K ∈ Th we choose thespace of the Ansatz function on our element to be the space containing polynomials ofsecond order,

PK := v|K | v ∈ Vh = P2(K) for K ∈ Th.

3 DISCRETIZATION 31

This leads to

Vh ⊂ v : Ω→ R | v|K ∈ PK for all K ∈ Th = v : Ω→ R | v|K ∈ P2 for all K ∈ Th

and thus we know Vh ⊂ H1(Ω). Next we want to have a closer look on the triangles K andtheir definition as a regular n-simplex 5. For n = 2 we take each K ∈ Th as a triangle withthe local nodes a1, a2, a3 ∈ Ω, for n = 3 we get a tetrahedron with nodes a1, a2, a3, a4 ∈ Ω.We suppose that (a2 − a1, . . . , an+1 − an) are linear independent, which is implied by Kbeing an actual triangle respectively tetrahedron. Then we add the midpoints of K’s edgesas an+2 = 1

2(a1 +a2), . . . , ad = 1

2(an+1 +a1) where d = 1

2(n+1)(n+2). As example we now

consider one triangle element K ∈ Th with n = 2 and the nodes a1, . . . , ad ∈ K. There wewant to solve the interpolation problem for a given u ∈ V and the arising discretizationu = (ui)i=1,...d = (u(ai))i=1,...d ∈ Rn. We want to find some p ∈ P2(K) for u1, . . . ud ∈ Rn

such thatp(ai) = ui = u(ai) ∀i = 1, . . . , d.

Which is nothing else than finding an interpolation p ∈ P2(K) for u ∈ V on K based onthe nodes a1, . . . , ad. Since dim(P2(K)) = d the problem has an unique solution for eachu ∈ V . We introduce the so-called shape functions N1, . . . , Nd ∈ P2(K) with Ni(aj) = δijfor all i, j = 1, . . . , d. These form a basis of the space P2(K), which enables us to receive

p(x) =d∑i=1

uiNi(x)

as a solution of the interpolation problem. So now we want to calculate these shapefunctions for one element K. Next we take a look at the standard triangle in Fig. 1 and

a1 a2

a3

a6

a4

a5

a1 = (0, 0) a2 = (1, 0)a3 = (0, 1) a4 = (1

2, 0)

a5 = (12, 1

2) a6 = (0, 1

2)

Figure 1: standard triangle in two dimensions with corresponding coordinates of thevertices

set

B :=

a11 . . . a1,n+1...

. . ....

an1 . . . an,n+1

1 . . . 1

. (3.1)

Consider aj as a column of a matrix A, that means aj = (aij)i=1,...,d. For solving the

interpolation problem we define λ(x) : Ω → Rd as λ(x) = B−1

(x1

)for x = (x1, . . . , xn).

This yields with C = (cij) := B−1 in

λi(x) =n∑j=1

ci,jxj + ci,n+1 ∀i = 1, . . . , n+ 1.

5c.f. Definition 3.21 [23, p. 117]

3 DISCRETIZATION 32

According to [23, p. 121] we get now the following nodal functions.

Ni(x) =

λi(x)(2λi(x)− 1) for i = 1, . . . , n+ 1

4λi−(n+1)(x)λi−n(x) for i = n+ 2, . . . , d− 1

4λ1(x)λn+1(x) for i = d

(3.2)

Evaluating (3.1) and (3.2) for n = 2 yields

B =

0 1 00 0 11 1 1

, λ(x) = B−1

(x1

)=

1− x2 − x1

x1

x2

and thus

N =

(2x1 + 2x2 − 1)(x1 + x2 − 1)

x1(2x1 − 1)x2(2x2 − 1)

−4x1(x2 + x2 − 1)4x1x2

−4x2(x1 + x2 − 1)

.

We now have the shape functions for one standard element K, which can also be seen inFig. 2. These functions form a basis of the space P2(K), but we still don’t have a basis ofVh. For that we denote the nodal functions on an element K ∈ Th corresponding to the

Figure 2: the local nodal basis functions on the standard triangle

node aj with NKaj, j = 1, . . . , Nfem where Nfem is the number of nodes in the triangulation

of Ω. Since NKaj

is only defined on K we have to define a global node function ϕj ∈ Vh fora node aj as

ϕj =∑

K: aj∈K

NKaj· 1K for all j = 1, . . . , Nfem.

3 DISCRETIZATION 33

Here 1K is the characteristic function of K. One might wonder why ϕj ∈ Vh ⊂ C0(Ω).Assuming for n = 2 two elements K1, K2 ∈ Th share the same edge E := K1 ∩ K2

with the nodes a1, a2, a3 ∈ E. Then both elements’ nodal functions are consistent witheach other on this edge, NK1

i

∣∣E

= NK2i

∣∣E∀i = 1, . . . , 3, since dim(E) = n − 1 ⇒

dim(P2(E)) ≤ n+ 1. For n = 3 refer also to [5, p. 62]. From this we get ϕj ∈ C0(Ω) andsupp(ϕj) =

⋃K: aj∈K K for all j = 1, . . . , Nfem. We call now ϕ1, . . . , ϕNfem the global

nodal basis of our Ansatz space Vh and thus gain a proper definition for Vh. Accordingto [23, p. 62] the Galerkin method is about finding coefficients u1, . . . , uNfem ∈ R for

uh =

Nfem∑i=1

uiϕi such that a(uh, ϕ) = b(ϕ) ∀ϕ ∈ Vh.

Since we have a basis ϕ1, . . . , ϕNfem of Vh this is equivalent to

Nfem∑i=1

a(ϕi, ϕj)ui = b(ϕj) for all j = 1, . . . , Nfem.

One introduces the so called stiffness matrix Ah = (a(ϕi, ϕj))ij ∈ RNfem×Nfem ,y = (y1, . . . yNfem)T , and qh = (b(ϕi))i ∈ RNfem and solves Ahy = qh to get a solution

yh =∑Nfem

i=1 yiϕi . Note that the matrix Ah is sparse since

a(ϕi, ϕj) = 0 for supp(ϕi ∩ ϕj) = ∅,

which is quite often the case because suppϕi is comparably small to Ω. For a visualizationof suppϕi refer to Fig. 3. For convergence rate estimates refer to [23, pp. 131-146].

Figure 3: Support of a global nodal basis function ϕi.Source: Figure 2.8 [23, p. 64]

Next, introduce the discretization operator Ih : V → Vh as in [23, p. 132] for the nodesa1, . . . , aNfem with the global nodal basis ϕ1, . . . , ϕNfem of the corresponding triangulationTh of Ω.

Ih(u) =

Nfem∑i=1

u(ai)ϕi

3 DISCRETIZATION 34

The evaluation operator of the discretization is defined as

X : V → RNfem , X(u) =(u(a1), . . . , u(aNfem)

)T.

For later usage we now assemble some particular stiffness matrices with respect to thepreviously introduced global nodal basis ϕ1, . . . , ϕNfem . For the bilinear form a(ψ1, ψ2) =∫

Ωψ1ψ2 dx, ψ1, ψ2 ∈ L2(Ω) we denote the arising stiffness matrix with M ∈ RNfem×Nfem

and call it the mass matrix.

M := (mij)i,j=1,...,Nfem, mij := a(ϕi, ϕj) ∀i, j = 1, . . . , Nfem

Similarly we set up the matrix R for the weak Laplace operator (3.3), sometimes alsoreferred to as the stiffness matrix. Note that the weak Laplace operator is well defined,despite the nodal basis functions are only C0, since they are piecewise C1.

a(ψ1, ψ2) :=

∫Ω

∇ψ1 · ∇ψ2 dx, ψ1, ψ2 ∈ H1(Ω),

R := (rij)i,j=1,...,Nfem, rij := a(ϕi, ϕj) ∀i, j = 1, . . . , Nfem

⇒ a(Ih(ψ1), Ih(ψ2)) = X(ψ2)TRX(ψ1) ∀ψ1, ψ2 ∈ V

(3.3)

The trace operator for the measurements yδ on Γm is defined as a matrix Q in (3.4).

a(ψ1, ψ2) :=

∫Γm

ψ1ψ2 dx, ψ1, ψ2 ∈ L2(δΩ)

Q := (qij)i,j=1,...,Nfem, qij := a(ϕi, ϕj) ∀i, j = 1, . . . , Nfem

⇒ a(Ih(ψ1), Ih(ψ2)) = X(ψ2)TQX(ψ1) ∀ψ1, ψ2 ∈ V

(3.4)

The advective diffusive operator K (3.5) for κ > 0 arises from the mass and stiffnessmatrix.

a(ψ1, ψ2) :=

∫Ω

κ∇ψ1 · ∇ψ2 + (~v · ∇ψ1)ψ2 dx, ψ1, ψ2 ∈ V,

K := (kij)i,j=1,...,Nfem, kij := a(ϕi, ϕj) ∀i, j = 1, . . . , Nfem

⇒ a(Ih(ψ1), Ih(ψ2)) = X(ψ2)TKX(ψ1) ∀ψ1, ψ2 ∈ V

(3.5)

Because we have defined some matrices, we have to talk about their algebraic properties.By definition M , Q and R are symmetric. To answer questions regarding convexity oneis also interested in their quadratic forms. The nodal basis functions are all non-negativewhich yields

xTMx > 0 ∀x ∈ RNfem \ 0 positive definite

andxTQx ≥ 0 ∀x ∈ RNfem positive semidefinite . (3.6)

Now that space discretization is introduced, a measure for its quality is needed. Becausewe always want to know how good the discretization is we look at the Peclet number[23, p. 368]. The global Peclet number for our problem is defined as

Pe =‖~v‖∞ diam(Ω)

κ.

3 DISCRETIZATION 35

Depending on κ and the resolution of the triangulation this can range from 100 (diffusiondominated) to 109 (convection dominated). Unfortunately the finite element methodfails in the convection dominated case. For higher Peclet numbers one can stabilize thefinite element discretization with SUPG (streamline upwind Petrov–Galerkin). Since thepresented applications are not convection dominated one sticks with the standard finiteelement method.Before we start with the time discretization we recall from [23, pp. 293-294], that thereare now two different ways how to proceed.The vertical method of lines starts with space discretization. In our case this means firstdiscretizing space with the finite element method and later choosing a time discretization.Then again the horizontal method of lines also called Rothe’s method works the otherway round. First the time variable t is discretized. In a second step, space discretizationis applied.

3.2 Implicit Euler Time Discretization

Time discretization is done by splitting the time interval [0, T ] in NT ∈ N parts. Onesets tn = n · 4t, 4 t := T

NT, n = 0, . . . , NT and receives an equidistant subdivision by

the time points [t0 = 0, t1, . . . , tNT ]. We chose Rothe’s method so the semidiscretizationof y(x, t) ∈ W (0, T, L2(Ω), V ) with respect to the time variable t is y = [y0, . . . , yNT ]where yn ∈ V for all n = 0, . . . , NT and analogously us = [u0

s, . . . , uNTs ], uns ∈ H for all

n = 0, . . . , NT . We have again a look at our well known problem∫Ω

(d

dty

)ϕ dx = −

∫Ω

−κ∇y · ∇ϕ+ ~v · ∇yϕ+ usϕ dx ∀ϕ ∈ V∫Ω

y(0)ϕ dx =

∫Ω

u0ϕ dx ∀ϕ ∈ V.(3.7)

The system (3.7) gives an initial condition for y and defines the time derivative of firstorder. It can be solved in time by the backward or so-called implicit Euler method[28, p. 482]. Given the above discretization the approximation of the time derivative isset as

d

dtyn+1 =

yn+1 − yn

4tfor all n = 0, . . . , NT − 1. (3.8)

After applying (3.8) in (3.7) one gets as result of Rothe’s method a system of differentialequations in space.∫

Ω

(yn+1 − yn

4t

)ϕ dx = −

∫Ω

−κ∇y · ∇ϕ+ ~v · ∇yϕ+ usϕ dx

for all ϕ ∈ V, n = 0, . . . , NT − 1∫Ω

y0ϕ dx =

∫Ω

u0ϕ dx ∀ϕ ∈ V

(3.9)

Since we lack a direct analytical solution of (3.9) we have to discretize space. With thefinite element method in mind, time and space discretization of y(x, t) ∈ W (0, T, L2(Ω), V )is then done by setting yh = [y0

h, . . . , yNTh ] where ynh ∈ Vh for n = 1, . . . , NT . In analog

the corresponding discretization of u0 and us is (u0)h and [(u0s)h, . . . , (u

NTs )h]. This means

3 DISCRETIZATION 36

one discretizes for example the already time-discrete u0s ∈ V in space as (u0

s)h ∈ Vh withrespect to the already known Ansatz space Vh. Applying the space discretization on (3.9)leads to the problem formulation: Find yh such that∫

Ω

yn+1h − ynh4t

ϕh dx =−∫

Ω

κ∇yn+1h · ∇ϕh + ~v · yn+1

h ϕh dx+

∫Ω

(un+1s )hϕh dx∫

Ω

y0hϕh dx =

∫Ω

(u0)hϕh dx for all ϕh ∈ Vh.

From the previous Section 3.1, we know how to solve this, decoded in a linear system.The corresponding vector to yh is denoted by y = [y0 y1 . . . yNT ] ∈ RNT×Nfem . In analogone denotes u0 and us = [u0

s, . . . , uNTs ]. One now writes the first line as

1

4tM(yn+1 − yn) = −Kyn+1 +Mun+1

s

⇔ (M +4tK)yn+1 = Myn +4tMun+1s

⇔ yn+1 = (M +4tK)−1M(yn +4tun+1s )

In order for this to be consistent with respect to the initial values, we suppose for thecase us 6= 0 that u0

s = u0, or in other words the source term at t = 0 should continuouslyarise from the initial state u0. This results in the following linear system

yn+1 = (M +4tK)−1M(yn +4tun+1s ) for all n = 1, . . . , NT (3.10)

y0 = u0.

The system (3.10) will be written in the equation Sy = g, where the right side g is definedasg = [Mu0,4tMu1

s, . . . ,4tMuNTs ]T . The matrix S is called the forward operator and isdefined with L = (M +4tK) as

S :=

M 0 0 . . . 0 0 0−M K 0 . . . 0 0 0

0 −M K . . . 0 0 0...

......

. . ....

......

0 0 0 . . . M 0 00 0 0 . . . −M K 00 0 0 . . . 0 −M K

.

Note that starting with the vertical method of lines one would have come to the sameresults, as long as one chooses the same scheme for time and space respectively.

3.3 Discrete Cost Functional

Next one has to discretize the cost functional. First discretize the space time operator∫ TT1

∫Γmϕ1ϕ2 dx dt, ϕ1, ϕ2 ∈ Y . Hence one has to find an integration scheme for the

time integral∫ T

0. . . dt. Since the interval [0, T ] is split in NT different parts we have

3 DISCRETIZATION 37

NT +1 points for the function values. We use then the same integration scheme as in [27].Let ID denote the identity matrix of size Nfem ×Nfem. The time integration is given bythe diagonal matrix Στ of size (NT + 1)Nfem × (NT + 1)Nfem ,

Στ := 4t

ID. . .

ID

. (3.11)

Further we suppose that the time point where the measuring starts always matches upwith a point of the discretization scheme. That means we have a discretized time pointN1 = T1

TNT with N1 ∈ N0. The discretized space time operator Q is then according to

the midpoint rule defined as

Q :=

0. . .

0∆tQ

. . .

∆tQ

.

For now we suppose the case us ≡ 0, so one only searches for an unknown initial condition.Then of course uns ≡ 0, ∀n = 0, . . . , NT but this implies u0 = 0 which makes no sense. Thisis fixed by leaving u0 untouched and setting g = [Mu0, 0, . . . , 0]T as in [27, p. 11]. Also letyδ be the discretized measurements. With the already known matrices the minimizationproblem, including the cost functional, translates straight forward to

minu0,y

J(u0, y) =1

2(y − yδ)T Q(y − yδ) +

α

2uT0Mu0

s.t. Sy = g.(3.12)

Whereas (3.12) discretizes the reduced problem presented in Section 2.2. In the caseU0 = L2(Ω) and thus only u0 ∈ L2(Ω), (3.12) is not the complete cost functional that isgoing to be used. The term α

2uT0Mu0 corresponds to a first-order Tikhonov regularization,

where we substitute in the following α := γ1 > 0. Here additionally second order Tikhonovregularization will be used, which corresponds to the H1(Ω) seminorm. This translateswith γ2 > 0 to the term γ2

2uT0Ru0. The original problem for α = 0 is ill-conditioned and

regularization like the L2-norm is needed to at least ensure uniqueness. The H1 seminormalso partially recovers information about the shape of u0 by damping oscillations. Thediscretized minimization problem with additional second-order Thikonov regularizationis given in (3.13).

minu0,y

J(u0, y) =1

2(y − yδ)T Q(y − yδ) +

γ1

2uT0Mu0 +

γ2

2uT0Ru0

s.t. Sy = g(3.13)

Here we keep the notation from the infinite dimensional case, where J denotes the originalcost functional and J the reduced one. By setting directly y = S−1g = S−1[Mu0, 0, . . . , 0]T

3 DISCRETIZATION 38

one immediately receives the reduced cost functional (3.14).

minu0

J(u0) =1

2(S−1g − yδ)T Q(S−1g − yδ) +

γ1

2uT0Mu0 +

γ2

2uT0Ru0 (3.14)

To show the consistency between the discretization and the analytical theory, we introducethe Lagrange multiplier p = [p0 p1 . . . pNT ]T and the discrete Lagrangian as in [27, p. 12].

L(y, u0, p) := J(u0) + pT (Sy − g) (3.15)

Differentiating (3.15) with respect to y yields

Ly(y, u0, p) = Q(y − yδ) + STpT = 0. (3.16)

The derivative in direction u0 is the discrete control equation

Lu0(y, u0, p) = γ1Mu0 + γ2Ru0 −Mp0 = 0,

because

∂u0pT (−g) = pT∂u0 [Mu0 0 . . . 0] = pT [M 0 . . . 0] = (−pT0M)T = −Mp0.

Lets have a closer look at the adjoint equation (3.16), especially the use of the matrix ST

as in [27, p. 12]. The last row of (3.16) is

LpNT = −4 tQ(yNT − (yδ)NT ) ⇔ 1

4tMpNT = −NpNT −Q(yNT − (yδ)NT ).

The following rows n = 1, . . . , NT − 1 give this discretization scheme

Lpn −Mpn+1 = −4 tQ(yn − (yδ)n) ⇔ 1

4tM(pn − pn+1) = −Npn −Q(yn − (yδ)n)

which corresponds to a backwards-in-time implicit Euler discretization. To see this moreclearly one inverts time by setting m(n) = N − n and thus

1

4tM(pm+1 − pm) = −Npm+1 −Q(ym+1 − (yδ)m) for all m = 1, . . . , NT − 1.

But from the continuous adjoint equation we remember p(T ) = 0, which in our dis-cretization scheme would mean we have to set pNT = 0. On the other hand this wouldnot match the discrete minimization problem. Since we chose the first discretize thenoptimize approach we let it be as pNT = L−1

(−4 tQ

(yNT − (yδ)NT

))and stay satisfied

with the fact, that for finer time discretization 4t → 0 the discrepancy vanishes withlim4t→0 p

NT = 0 ≡ p(T ). Depending on the quality of the discretization this condition isfulfilled as well and consistent to the infinite dimensional theory in Section 2.2. Also thisensures a consistent discrete adjoint equation. With this in mind the derivatives are

∇J(u0) = L(y, u0, p)

and∇2J(u0) = GT S−TQS−1GT︸︷︷︸

∗

+ γ1M + γ2R (3.17)

3 DISCRETIZATION 39

where G is diagonal matrix such that g = [u0, . . . , 0]T = Gu0.To show that J is a strictly convex quadratic function take a look at the Hessian (3.17).The part (∗) in (3.17) is positive semidefinite, since Q and thus Q are positive semidefinite(3.6) and also S−1GT is a linear operator on both sides of Q. The convexity of J is thenenforced by the positive definite regularization γ1M + γ2R, but only for γ1 or γ2 > 0.This is the analog condition to α > 0 in the infinite dimensional case. The minimizationproblem (3.14) is unrestricted i.e. u0 ∈ RNfem and J is strictly convex, which yields Jto have a unique global minimum. So there exists an unique solution of the discretizedproblem (3.14).

3.4 Constant Source

For the constant source scenario we suppose u0 = 0 and take us(x, t) = us(x) constantover time, which enables us to discretize uns = us, ∀n = 0, . . . , N . Then the right side gis then g = [Mus,4tMus, . . . ,4tMus]

T . Compared to (3.14) the cost functional barelychanges, since we substitute α := γ3 > 0 and add additional second order Tikhonovregularization with γ4 > 0.

minus

J(us) =1

2(y − yδ)T Q(y − yδ) +

γ3

2uTsMus +

γ4

2uTs Rus

s.t. Sy = g

The adjoint equation also stays unaffected. Only the discrete control equation undergoesminor changes

Lus(y, u0, p) = γ3Mu0 + γ4Ru0 −NT∑n=1

4tMpn −Mp0. (3.18)

This can be solved the same way as the reconstruction for the initial value problem, (3.14).

3.5 Time Dependent Source

For the time dependent source scenario we suppose again u0 = 0. For first and sec-ond order Tikhonov regularization we have to find with γ3, γ4 > 0 the correspondingdiscretization for

γ3

∫ T

0

∫Ω

(us)2 dx dt+ γ4

∫ T

0

∫Ω

∇us · ∇us dx dt.

Note that when we write terms involving time and higher spatial derivatives of us, weonly state them to give the reader a better overview over the discretization scheme. Thesestatements are in general not well defined for the control spaces used in Section 2. Withthe already known time discretization scheme the L2 norm is written as a diagonal matrixM and in analog the H1 seminorm corresponds to R

M =

M . . .

M

, R =

R . . .

R

.

3 DISCRETIZATION 40

We then define the time integrals by Mτ := ΣτM and Rτ := Στ R where we use the timeintegration scheme from (3.11). The minimization problem looks again similar to (3.14).

minus

J(us) =1

2(y − yδ)T Q(y − yδ) +

γ3

2uTsMus +

γ4

2uTs Rus

s.t. Sy = g.(3.19)

But compared to the initial source problem, (3.19) is more ill-posed. Comparing the sizesof the vectors to be reconstructed then us is NT -times as big as u0. But still we don’thave more measurements yδ.To partially overcome this, we introduce only for the discretized case additional penaliza-tion. Since the ill-posedness stems from extending the reconstruction onto the time gridthe additional penalization should deal with the behavior of us over time.Assuming that gas sources move and oscillate comparably slowly with respect to the timegrid, the first approach is to penalize the discretized equivalent of the time derivativeof us. In this context one prefers the in-time constant source us from Section 3.4 is asa solution. So we want to add the discretized version of (3.20) to the discretized costfunctional, where γτ > 0 is a suitable penalty parameter to choose.

penτ (us) :=γτ2

∫ T

0

∫Ω

(d

dtus(x, t)

)2

dx dt (3.20)

In order to penalize correctly this can’t be discretized in total analogy to the continuouspart (3.20). The first time derivative of us and its discretization us = [u0

s, . . . , uNTs ] is set

as

(∂tus)0 = 0, (∂tus)

n =1

4t(uns − un−1

s ) for all n = 1, . . .NT (3.21)

since one does not want to punish an initial uprising source by setting (∂tus)0 = u0

s − 0 =u0s. This just leads to forcing the solution to be zero and thus making it harder to find a

solution. The L2-norm of the first time derivative of us∫ T

0

∫Ω

(us)2t dx dt (3.22)

is implemented in a matrix W according above scheme (3.21) with its exceptions.

penτ (us) := usT W us =

NT∑n=1

4t 1

4t(uns − un−1

s )TM1

4t(uns − un−1

s )

=

NT∑n=1

1

4t(uns − un−1

s )TM(uns − un−1s ).

This also equals

penτ (us) =1

4t

NT∑n=1

(uns )TMuns − 2(uns )TMun−1s + (un−1

s )TMun−1s ,

3 DISCRETIZATION 41

which written as a matrix results in

0 0

M −M−M

−M0 −M M

.

The first derivative is

d

duispenτ (us) = γτ

1

4t

−(u1

s − u0s)TM for i = 0,

(2uis − ui−1s − ui+1

s )TM for i = 2, . . . , NT − 1,

(uNTs − uNT−1s )TM for i = NT .

While (3.22) enforces the source us to be constant over time in every point x ∈ Ω, it doessometimes not meet the rather vague criteria of a constant source. When speaking of aconstant source, one is more likely to assume that the average exhaust of contaminant pertime is constant over time. To enforce this, one introduces another penalty parameter.The overall change of the amount of exhaust per time is given as

d

dt

∫Ω

us(x, t) dx. (3.23)

If one wants to penalize every change in exhaust in order to avoid annihilation effectsone has to square (3.23). For this, one penalizes with the discretized counterpart of thepenalty term defined in (3.24) for γm ≥ 0.

penm(us) :=γm2

∫ T

0

d

dt

(∫Ω

us(x, t) dx

)2

dt (3.24)

With the vector 1 = [1 . . . 1]T ∈ RNfem and adapting the derivative scheme from (3.21) adiscretization of (3.24) is given in (3.25).

penm(us) :=γm2

NT∑n=1

4t(

1

4t(1TMuns − 1TMun−1

s

))2

=γm2

NT∑n=1

1

4t(1TM

(uns − un−1

s

))2

(3.25)

The derivatives of (3.25) are given component-wise in (3.26).

d

duispenm(us) = γm

1

4t

−1TM (u1

s − u0s) 1TM for i = 0

1TM (2uis − ui−1s − ui+1

s ) 1TM for i = 2, . . . , NT − 1

1TM(uNTs − uNT−1

s

)1TM for i = NT

(3.26)

The advantage of (3.25) is that unwanted behavior can be penalized, while preservingthe problems quadratic structure. But, by penalizing just the overall exhaust, one favors

3 DISCRETIZATION 42

sinks of gas, which means us < 0. By squaring us in the integral (3.27), we introduce athird penalty term for γm2 > 0 in (3.28) with its analytic counterpart (3.27).

penm(us) :=γm2

2

∫ T

0

d

dt

(∫Ω

us(x, t)2 dx

)2

dt (3.27)

penm2(us) :=γm2

2

NT∑n=1

4t(

1

4t((uns )TMuns − (un−1

s )TMun−1s

))2

(3.28)

By adding penm2 to the cost functional J(y, us) we lose its quadratic structure. Sincewe did not cover this case before we disregard the penalty term penm2 for the followingtheory.Collecting the penalty terms penτ and penm yields in the the discretized cost functional(3.29).

miny,us

J(y, us) =1

2(y − yδ)T Q(y − yδ) +

γ3

2uTsMus +

γ4

2uTs Rus

+γτ2uTs Wτ us + penm(us)

s.t. Sy = g.

(3.29)

It is already known that compared to (3.16) the adjoint equation stays unaffected so wehave to recalculate the discrete control equation.We set g = [Mu0

s,4tMu1s, . . . ,4tMuNTs ]T .

Lus(y, us, p) = γ3Mτ us + γ4Rτ us + γτWτ us +∇penm(us)− M [p0 p1 . . . pNTT ]T = 0.

Nevertheless the time dependent source problem is compared to the initial value problemNT times bigger, since there are NT times more values to reconstruct. To reduce theproblem size one can discretize us on a coarse time grid and interpolate it linearly onthe default time grid. Linearly because this conserves the quadratic’s problem structure.Suppose us in NT − 1 time steps is to be reduced to (us)ip with Nip − 1 time stepsover the same time interval [0, T ]. Then we define the linear interpolation matrix E ∈RNT ·Nfem×Nip·Nfem such that us = E(us)ip. For example we suppose NT = 5 and Nip = 3.The linear interpolation can then be found in (3.30), where ID denotes the identity matrixon RNfem×Nfem .

us = E (us)ip :⇔(u0s, . . . , u

5s

)=

ID 0 012ID 1

2ID 0

0 ID 00 1

2ID 1

2ID

0 0 ID

(u0

s)ip(u1

s)ip(u2

s)ip

(3.30)

3.6 Discretization of the Parameterized Problem

We now want to discretize the parametrization of the initial condition and the sourceterm, introduced as a function u = f(a;x, t). Suppose the coordinates of the finite elementmeshes’ nodes are given by X and the time intervals by T we receive the discretization

3 DISCRETIZATION 43

f = f(a; X, T ). Then we write g = [Mf0,4tMf 1s , . . . ,4tMfNTs ]. As always we have to

distinguish between the initial condition.

mina∈Aad, y∈RNfem×NT

J(y, a) :=1

2(y − yδ)T Q(y − yδ) +

γa2

(a− aref )T · (a− aref )

s.t. Sy = g

We follow the approach first discretize and then optimize, but we still will see the analogyto the continuous problem. Next we can write this as

mina∈Aad

J(a) =1

2(S−1g − yδ)T Q(S−1g − yδ) +

γa2

(a− aref )T · (a− aref ).

The gradient of J(a) is calculated for each component ai of a. So∇ai denotes the derivativein ai where i = 1, . . . , dim(A).

∇ai J(a) =1

2(S−1∇ai g)T Q(S−1g − yδ) +

1

2(S−1 − g)T Q(S−1∇ai g − yδ) + γa(a− aref )T

= (S−1∇ai g)T Q(S−1g − yδ) + γa(a− aref )T

= (∇ai g)T S−T Q(S−1g − yδ) + γa(a− aref )T

As before the Hessian is given component-wise, where ∇2ai,j

denotes the derivative in aiand aj for i, j = 1, . . . , dim(A).

∇2ai,jJ(a) = (S−1∇2

ai,jg)T Q(S−1g − yδ) + (S−1∇ai g)T Q(S−1∇aj g) + γaIk

= (∇2ai,jg)T S−T Q(S−1g − yδ) + (∇ai g)T S−T Q(S−1∇aj g) + γaIk

(3.31)

As a detail of implementation we look at the term (S−1∇ai g)T Q(S−1∇aj g) in (3.31). Since

a ∈ Rn calculating this term can be parallelized. Suppose J ,∇aJ ,∇2aJ are to be evaluated

at once. The most time intensive operation is one application of S−1 whose complexity wedenote with CS ∈ N0. For the values of J ,∇aJ we get 2CS. If one wants to calculate theentire Hessian, the most efficient way is not to choose the adjoint approach. This wouldmean evaluating S−T Q(S−1∇aj g) first and would result in 2CS + 2NACS. So it is betterto solve and store S−1∇aj , g =: dj ∀j ∈ 1, . . . , n and then evaluate dTi Qd

Tj , i ≤ j which

yields a complexity of 2CS + NACS. On the other hand a matrix free implementationwould just evaluate a product of a direction d ∈ Rk with the Hessian of J(a). So firstone calculates dT∇ag and then plugs this in d(QS−T S−1) which results in the reducedcomplexity (2 + 2)CS.To introduce a constraint with a ∈ Aad, suppose that there exist functions h : Rn → Rnh ,g : Rn → Rng , such that

a ∈ Aad ⇔ g(a) ≤ 0 and h(a) = 0.

Then the constrained minimization problem can be written as

mina∈Rn

J(a) s.t. g(a) ≤ 0 and h(a) = 0. (3.32)

3 DISCRETIZATION 44

A standard method is to use the Lagrange mechanism with a Lagrange multiplier

λ =

(λhλg

)∈ Rnh+ng and the Lagrange function

L(a, λ) = f(a) +

ng∑i=1

λg,igi(a)

nh∑i=1

λh,ihi(a).

To obtain an optimal solution we check for the KKT conditions

∇aL(a) = 0, λg,igi(a) = 0 ∀i,

g(a) ≤ 0,

h(a) = 0,

λg,i ≥ 0.

Here the first order optimality is defined as

‖∇aL(a, λ)‖ = ‖∇J(a) +∑

λg,i∇gi(a) +∑

λh,i∇hh,i(a)‖. (3.33)

We gave this short definitions, because in Section 4.5 we will use an interior-point Algo-rithm based on [6] for solving (3.32). For more information about interior-point Algo-rithms refer to [26, pp. 392-425].

4 NUMERICAL TESTS 45

4 Numerical Tests

For the implementation of all the previous scenarios and algorithms suitable software isneeded. Since for the initial value problem there was already code available in [27] wefollow Petra and Stadler’s software choice and use modifications from [3].For the finite element discretization the FEM Software COMSOL Multiphysics 3.5a isused. It provides a MATLAB interface up to version 7.8. So we use MATLAB v.7.8(2009a) for all parts mainly involving FEM discretization. In order not to be forced toimplement basic algorithms from scratch the MATLAB Optimization Toolbox is used,mostly for parametrization related problems. To benefit from computers with multipleprocessors (multicore) the MATLAB Parallel Computing Toolbox was explored and usedin a very basic way.All calculations were carried out on the two different machines with the following prop-erties.

(A) Lenovo Thinkpad, Intel Core 2 Duo P8600, 2x2.40GHz, 8GB of memory

(B) LRZ Linux Cluster, 8-way AMD Opteron (Dual-Core), 32x2.69 GHz,32GB of memory 6

For production jobs on machine (B) MATLAB v.8 (2012b) was used. Since the FEM dis-cretization delivers sparse matrices, many other matrices emerging from these basic onesshare the property of sparsity, which can be seen in Fig. 4. So the usage of a proper toolset,which utilizes this sparsity for a better performance, is essential. Certainly not so crucial

Mass matrix M on Ω Mass matrix Q on Γm Preconditioner for timedependent source us

Figure 4: sparsity patterns of some matrices (nz - number of nonzero entries)

but still resulting in faster LU-decomposition of sparse matrices the more recent versionUMFPACK 5.6.1 is used. To solve the system Ax = b utilizing an LU-decompositionUMFPACK gives a permutation matrix P and a column reordering matrix Q such that atriangular matrix L and a upper triangular matrix U are the LU-decomposition of P A Qi.e. PAQ = LU . For this refer to [10].

6Refer to http://www.lrz.de/services/compute/linux-cluster/overview/.


4.1 Setup

For better comparability the standard setup from [27] for the shape of Ω and Γm ischosen. It can be seen in Fig. 5 that Ω ⊂ R2 is an unit square box with two rectangularobstacles. The measurements of the contaminant concentration are taken on the marginΓm marked in blue. The diffusion coefficient is set to κ = 10−3 and the Reynolds numberto Re = 102. This setup does not directly correspond to a particular application soit describes a more abstract situation. The scenario takes place in the time interval[0, T ] = [0, 4] and measurements on Γm are taken in [T1, T ] = [1, 4]. For the finite element

Figure 5: Ω with the margin Γ and the margin Γm where measurements are taken

discretization on needs a triangulation of Ω, which we get via a suitable mesh. Themesh generated by COMSOL’s built-in algorithm, which is based on an advancing frontalgorithm, can be seen in Fig. 6. When not specified otherwise, this is the default mesh.The inner boundary Γm is represented by 50 nodes, which corresponds to 50 triangular

triangulation element quality

Figure 6: Two-dimensional FEM mesh with 1012 vertices, 3889 nodes resulting in 1864triangular elements (162 boundary elements) and the corresponding element quality plot.

boundary elements. The quality q of an element is defined as a quantity between 0 and


1, where 1 marks perfect quality. For a triangular element with area A and side lengthsh1, h2, h3 the element quality q is defined in (4.1).

q =4√

3A

h21 + h2

2 + h23

(4.1)

In this case for q > 0.3 the FEM mesh should not affect the quality of the solution. Asone can see in Fig. 6 the standard triangulation has very good element quality.

4.2 Stationary Incompressible Navier Stokes

One generates a velocity field ~v satisfying the steady incompressible Navier-Stokes equa-tions (1.11)-(1.13) for boundary values related to Example 1.5. The Reynolds number isset to Re = 100. The corresponding implementation in Matlab was then adopted from[27]. Here the built in solver, namely a damped Newton method, provided by ComsolMultiphysics 3.5a is used. An overview of the implementation of incompressible Navier-Stokes is covered by the documentation [7], which also gives an detailed example of thesteady case setup in two dimensions. The FEM discretization of Ω follows the schemedescribed in Section 3. Here second order Langrange elements (L2) are used to discretizethe velocity field ~v and first order Lagrange elements (L1) for the pressure p respectively.For the underlying discretization of the result, which can be seen in Fig. 7, the same

Figure 7: Velocityfield ~v as solution of the Navier-Stokes equation. Given as vectorfieldby arrows and a corresponding streamline plot.

mesh that can be seen in Fig. 6 was used.

4.3 The Synthesizing of Measurement Data yδ

For appropriate numerical tests one has to get hold of suitable measurements yδ. Sincethere is no actual data provided by experiments we depend on synthesizing suitable dis-cretized data yδ for each scenario. There is never something as the measurement of the


contaminant concentration at a point x ∈ Γm at time t ∈ [T1, T ] because the process ofmeasuring underlies systematic and random errors which leads ultimately to uncertainty.In general measurements at a point (x, t) would be given in the standard form for statinguncertainties as in [30, p. 34],

yδ(x, t) = yδbest(x, t) ± δyδ (x, t) ∈ Γm × [T1, T ].

There yδbest(x, t) describes the best estimate and δyδ the uncertainty or error in the mea-surement, in the sense that one is confident that the correct value of yδ(x, t) lies mostprobably somewhere in [yδbest(x, t)− δyδ, yδbest(x, t) + δyδ]. Unlike [15] we don’t expand theuncertainty quantification to the whole process of reconstruction, which basically meanswe don’t make further use of δyδ and only focus on yδbest. Nevertheless this shows we haveto take discrepancy into account when synthesizing measurement data.In analog to the implementation of [27] the following approach is chosen. In the begin-ning of the procedure an initial gas distribution u0 and a gas source us is chosen. Thenthe wind field is calculated for the given topology depending on full knowledge about itsdirection and speed on the margin, when given Dirichlet boundary conditions. With thealready known discretization scheme we set

yδo = Qy = QS−1g(u).

One general thought on calculating quantities involving S−1 or its transposed. For thetheory the implicit Euler scheme, as a fairly big system of equations is just written inone matrix S. But one would never actually set up this big matrix in an implementation.Instead one successively calculates for example yn as in (3.10). Then the most timeconsuming part is the application of L−1 = (M +4tN)−1 for S−1 or L−T for S−T . Sincethe inverse of L is heavily used it gives a good return to calculate a LU-decompositionfor L and respectively LT .Suppose our setup equals with good enough discretization the real physics, we now haveso to say, perfect measurements yδo ± 0. As discussed before this will never be the case.The needed perturbation is now done by utilizing the normal distribution. Let N be avector of standard normal distributed random variable N (0, 1) with size dim(yδ). Wechoose now a noise ξ > 0 typically ξ ∈ [0, 0.1] and set

yδ = yδ + ξ ·max(|yδ|) · N .

In the standard form for uncertainty specification this is then written as

yδ(x, t) ± ξ ·max(|yδ|) for all (x, t) ∈ Γm × [T1, T ].

Unfortunately the setup consisting of

the shape of Ω,

the equation describing the wind field or the wind field in general,

Reynolds number Re, diffusion coefficient κ,

assumptions on the gas,


and the advection diffusion equation

does not describe the reality without errors. It is quite hard to introduce quantities whichmodel this systematic error. Despite all constants, which can always be modified withrandom noise one can never know a priori the discrepancy between an equation modelinga system and the real system itself.To partially take into account a systematic error and in order to not deceive ourselves inthe reconstruction process we model peaks of gas sources in the synthesizing process notwith an exponential, but a parabel Ansatz. This nonsmooth function should in generallead to more disturbance, while it is still unlikely that a gas concentration has actuallythis shape.In the Ansatz for the parameterized reconstruction one uses a Gaussian bell. Then ofcourse we will never obtain the original value of the parabel’s maximal intensity I, becausethe function shapes differ to much. On the other hand this introduces an actual systematicerror in this theory, which is highly desirable in this context. To give an impression ofthe quality of measurements and the enviromental conditions for taking measurementsrefer to Fig. 8. In Fig. 8b one can see the gas concentration over time in the forwardsimulation at the edge marked in Fig. 8a. The part on this edge, which contributes tothe synthesized measurements yδ is showed in 8c. Here measures are taken for t ∈ [1, 4].

(a) wall (b) original simulation (c) with noise ξ = 0.01

Figure 8: Gas concentration at one edge of the obstacles plotted over time t ∈ [0, 4].

4.4 Unparameterized Models

In this section we want to present numerical tests for the problems proposed in Sections3.4-3.5. But first we introduce an Algorithm for solving these discretized problems.

4.4.1 Globalized Newton Method

Since the main problem is of quadratic nature the Newton method should converge inone step. So for the sake of completeness we present a globalized Newton method inAlgorithm 1. The basic Algorithm was taken from [34] and slightly modified. Supposeone wants to minimize a function f ∈ C2(Rn,R), n ∈ N. By solving the Newton equation(4.2) for an iterate xk ∈ Rn one receives a direction sk ∈ Rn.

∇2f(xk)sk = −∇f(xk) (4.2)


When confronted with high dimension n calculation of the Hessian, ∇2f(xk) might onlybe possible at high costs. In the case one gets hold off the matrix vector product atreasonable costs ∇2f(xk)sk an iterative solver is applied to (4.2). In [26] a conjugategradient method (CG) is applied, here we chose the generalized minimal residual method(GMRES). This method essentially takes two parameters, the absolute tolerance tol-GMRES> 0 on the residual and the maximal number of iterations maxit-GMRES> 0. Inthe case that∇2f(xk) is ill-conditioned or that it is for another reason not possible to solve(4.2), within the defined parameters tol-GMRES> 0, maxit-GMRES> 0 the GMRESmethod is regarded as failed to converge. Algorithm 1 provides the gradient as a fall-back.Then the number of iterations in the Newton method depends highly on maxit-GMRESand tol-GMRES. If tol-GMRES is too small one might have unnecessary high precisionon the Newton step, is it too high the Newton step is not good enough. Confrontedwith a simple convex quadratic problem this can lead to more than one iteration in theNewton method. So tol-GMRES always has to be chosen with respect to the aborttolerance on gradient in the Newton method. As our first choice we set tol-GMRES=

min(

0.5,√‖∇f(xk)‖

)‖∇f(xk)‖. Of course there exists a variety of well-known methods

to improve convergence, but the objective here is to show that there is no differencebetween solving the unparameterized problem via the first order optimality conditionsor the Newton method. Also the correct implementation of a quadratic problem can bevalidated by quadratic convergence.

Algorithm 1 Globalized Newton-GMRES method (GNE-GMRES)

Require: Initial point x0 ∈ Ω, α1, α2 > 0, p > 0 and β, γ ∈ (0, 1) (Armijo)Set k := 0.while ∇f(xk) 6= 0 do

Solve the Newton equation ∇2f(xk)sk = −∇f(xk) for sk via GMRES.if GMRES did not converge then

Set sk = −∇f(xk).else if −∇f(xk)T sk < minα1, α2‖sk‖p‖sk‖2 then

Set sk = −∇f(xk).end ifChoose a maximal quasi Armijo stepsize σk ∈ 1, β, β2, . . . with xk + σks

k ∈ Ω and

f(xk + σksk)− f(xk) ≤ σkγ∇f(xk)T sk

Set xk + 1 = xk + σksk.

increase kend whilereturn Number of iterations k and xk.

4.4.2 Initial Clouds u0 with us ≡ 0

We start now in the familiar unit square box setting and set the noise level for thesynthesized measurements to 5% (ξ = 0.05). Here one supposes us ≡ 0. For the forward


simulation the gas source is modeled with the downward opened parabel

u0(x) = max(0, I

(1− σ(x− x)T · (x− x)

) ), σ = 40, I = 0.5, x = (0.35, 0.7).

This can be seen in Fig. 9. After the synthesizing of measurement data we start thereconstruction process by finding a suitable regularization parameters γ1, γ2 > 0 for thegiven noise ξ = 0.05.With δg = [Mδu0, 0, . . . , 0]T one has to find u0 such that

t = 0 t = 2 t = 4

Figure 9: Forward simulation of one initial contaminant source over t ∈ [0, 4]

δgT S−T Q(S−1g − yδ) + γ1Mu0 + γ2Ru0 =

δgT S−T QS−1g + γ1Mu0 + γ2Ru0 − S−T Qyδ = 0 for all δg.(4.3)

With a matrix G of dimension Nfem ×NT ·Nfem defined as

G := [M, 0, . . . , 0],

one writes the control equation (4.3) as a linear equation

GT S−T QS−1G u0 + γ1Mu0 + γ2Ru0 = GT S−T Qyδ (4.4)

This equation (4.4) in the form Ax = b is now solved with an iterative method, where wefavor GMRES over the CG-Method. We want to precondition the system (4.4). For thiswe chose

P := max(γ1, 10−11)M + γ2R

as preconditioner. One then solves

P−1(GT S−T QS−1Gu0 + γ1M + γ2R

)u0 = P−1 GT S−T Qyδ.

Note that P is still a sparse matrix. So utilizing the UMFPACK package for a fast LUPQ-decomposition of P , one speeds up the GMRES-method by externally applying the inverseP−1 with the LUPQ-scheme. Then the preconditioner is not handled by Matlab’s GMRESinternally. The number of iterations doesn’t change, but each iteration takes less time. Sowith a sufficient amount of iterations the LUPQ-decomposition becomes economic. In thefollowing the default precision of CG and GMRES for solving equations is set to 10−10.In order to not receive a solution mainly dominated by the measurement data’s noise one


adds the regularization terms γ1

2‖u0‖2

L2(Ω) and γ2

2‖∇u0‖2

L2(Ω) to the minimization problem.

To find good parameters γ1, γ2 > 0 we follow the instructions of [20, p. 1494]. If an upperbound for the norm of the measurement error or its magnitude is known, one tries to finda regularization i.e. parameters γ1, γ2 > 0 such that the residual norm (misfit) equalsthat boundary. For example suppose ξ is the an upper bound for the measurement error,then we choose γ1, γ2 > 0 such that for the solution u0 of (4.4) the following holds∣∣(S−1Gu0 − yδ)T Q(S−1Gu0 − yδ) − ξ

∣∣ < ε.

Here ε > 0 is a tolerance significantly smaller then ξ. The first disadvantage is, that thesolution u0 for calculating the residual norm depends on the regularization parameters γ1

and γ2. So essentially one has to find these parameters by systematic testing. Accordingto [20] the discrepancy principle leads to over-smoothing of the solution by taking γ1 toohigh. This leads to the arbitrary dropping of entries in the solution vector and thus thelost of possible solution data. Also such an upper bound might not always be known.Another heuristic approach and mostly a better way to determine the parameters is theL-curve method. Here we follow the instructions of [35, pp. 106-108]. The solution norm‖u0‖2

L2(Ω) is plotted against the overall residual norm ‖y(u0)−yδ‖2L2([T1,T ]×Γm) for different

values γ1, γ2, with a fixed ratio γ1/γ2 = const.. The arising curve has the shape of a L.The solutions’ more vertical part is still dominated by the errors in data whereas the flathorizontal part corresponds to parameters which over-smoothen the solution and lead toso called regularization errors. So it is desirable to choose a parameter γ correspondingto the L’s corner. According to [20, p. 1494] it is easier to find this corner, or morespecific the point of maximal curvature, on a double-logarithmic scale as one can see inthe following plot. Rather than estimating the parameters by finding the corner of theL-curve manually the Regularization Tools for Matlab by Hansen [18] provide tools forthis task. The documentation and the software is available at http://www2.imm.dtu.dk/

~pcha/Regutools/. The result of the L-curve method applied on the initial value problemcan be seen in Fig. 10. It shows an L shaped curve with the corner for γ1 = 1.3 ·10−5 andγ1 = 1.3 · 10−6 which are now the regularization parameters. The reconstruction foundwith these parameters can be seen in Fig. 11, where one can find for comparison theoriginal and a reconstruction with unsuitable parameters γ1, γ2.


Figure 10: L-Curve plot for initial condition at γ1/γ2 = 10−1. Here the solution normcorresponds to ‖u0‖2

L2(Ω) and the residual norm to ‖y(u0)− yδ‖2L2([T1,T ]×Γm).

original u0 improper regularization γ1 = 1.3 · 10−6,γ1 = 10−11, γ2 = 10−10 γ2 = 1.3 · 10−5

Figure 11: Comparison between reconstructions with different regularization and theoriginal.

4.4.3 Constant Source us ≡ const.

One assumes that us is constant, that means uns = us, ∀n = 0, . . . , NT . In the forwardsimulation, Fig. 12, one sees a contaminant distribution with one constant source givenby

us = max(0, e−90((x−0.35)2+(y−0.7)2) − 0.5).

The reconstruction is done by solving the first order optimality condition (3.18), given asLus(y, us, p)δus = 0 ∀δus ∈ RNfem . With δg = [Mδus,4tMδus, . . . ,4tMδus]

T one hasto find us such that

δgT S−T Q(S−1g − yδ) = δgT(S−T QS−1g − S−T QS−1yδ

)= 0 for all δg.


t = 0 t = 2 t = 4

Figure 12: Forward simulation of an constant contaminant source from t = [0, T ].

With a matrix G of dimension Nfem ×NT ·Nfem defined as

G := [M,4tM, . . . ,4tM ],

and adding the regularization, this can be written as solving a linear equation in formAus = b.

GT S−T QS−1g + γ1Mus + γ2Rus = GT S−T QS−1yδ (4.5)

Equation (4.5) can again be solved by a GMRES or CG Method, where we prefer GMRESand set the tolerance to GMRES-tol= 10−10 and GMRES-maxit=250.To find suitable parameters for the Tikhonov regularization we again utilize the L-curvemethod. Here for noise level ξ = 0.05, Fig. 13 suggests to chose γ1 = 10−5 and γ2 = 10−6.

Figure 13: L-Curve plot for constant source at γ1/γ2 = 10−1. Here the solution normcorresponds to ‖us(0)‖2

L2(Ω) and the residual norm to ‖y(us)− yδ‖2L2([T1,T ]×Γm).


original reconstruction

Figure 14: Comparison between the original source term us and its reconstruction for theconstant source scenario.

For the reconstruction, which can be seen in Fig. 14, the CG-Method needed 178 iterationswhereas the GMRES-Method, only needed 72 gradient evaluations. The FEM mesh isacquired by COMSOL’s dynamic mesh algorithm. Next we solve (4.5) for different meshrefinements. First a mesh is chosen, then the Navier-Stokes problem is solved followed bythe synthesizing of the measurement data. Also the minimization problem is solved bythe Newton method, Algorithm 1. The results are given in Table 4. With growing numberof nodes Nfem, i.e. finer mesh, the percentage of values y < 0 in the solution vector yof the forward simulation drops. The reason is, that with finer mesh and thus smallerPeclet number, the amount of oscillations decreases. Solving the first-order optimalityconditions with the GMRES method appears to be independent of the mesh, since thenumber of iterations and the relative residual of the acquired solution of (4.5) barelychanges. In the Newton method, Algorithm 1, the relative tolerance of GMRES is setto tol-GMRES-Newton=0.1, while the maximum number of iterations is set to maxit-GMRES-Newton=200. The tolerance on Algorithm 1 is tol-Newton = 10−10 and themaximum number of iterations is maxit-Newton=20. To coerce convergence in one step,one has to set tol-GMRES=tol-Newton, which is nothing else then applying GMRESon (4.5). But indeed the iteration history appears to be independent from the meshrefinement, except for two outliers, where the Newton method fails to converge, dueto stagnation in the inner GMRES. Table 5 gives the results for a similar experiment,but instead of changing the space discretization, one performs tests with different timediscretization. All tests were performed on machine (A).


Table 4: Convergence results for different FEM mesh refinements. Here Nfem describesthe number of nodes in the triangular FEM mesh and No. y < 0 in %, the percentage ofnegative entries in the solution vector y of the forward simulation.

Nfem 661 1323 1959 3889 12425 47651No. y < 0 in % 32.35 3.64 2.46 0.53 0.00 0.00GMRESIter. 67 71 71 72 72 72RelRes. 7.9e-11 7.7e-11 9.7e-11 9.7e-11 9.5e-11 8.3e-11Time [s] 0.70 1.35 2.09 5.31 23.11 107.97NewtonIter. 3 3 3 3 3 4GMRES-Iter. 218 151 153 155 155 360RelRes. DNC 3.358e-14 3.034e-14 3.556e-14 4.978e-14 DNCTime [s] 2.00 2.43 3.82 9.98 42.13 548.36

Table 5: Convergence results for different equidistant refinements of the time interval[0, T ].

NT 20 60 180 540 1620 4860Nfem 3889 3889 3889 3889 3889 3889No. y < 0 in % 0.53 2.11 6.14 9.04 10.16 10.58GMRESIter. 72 73 73 73 73 73RelRes. 9.7e-11 8.2e-11 9.0e-11 7.9e-11 7.9e-11 7.2e-11Time [s] 5.31 12.89 35.92 105.60 327.59 986.98NewtonIter. 3 3 3 3 3 3GMRES-Iter. 155 155 157 155 155 155RelRes. 3.585e-14 4.321e-14 3.186e-14 6.122e-14 4.977e-14 3.582e-14Time [s] 9.76 27.38 80.93 253.54 1015.28 5090.78


4.4.4 Time Dependent Source us

One assumes that us is not even space but also time dependent. So we look at theunparameterized case us > 0. For this case we choose as forward simulation a gas sourcemoving at a linear path through the unit square box, plotted in Fig. 15. The source termus can be seen in Fig. 16. For this scenario the noise level is set to ξ = 5 · 10−3. Next

t = 0 t = 2 t = 4

Figure 15: Forward simulation of constant source us in linear movement.

t = 0 t = 2 t = 4

Figure 16: Constant source us in linear movement.

we deviate from the standard setup to improve the solvability of this scenario. Since thereconstruction us should deliver information about every point at all times we have tobroaden the spatial variety on the location of the sensors. Instead of measuring on Γm wenow take measurements on 20 uniformly distributed points in the unit square box, whichcan be seen in Fig. 17. With very small circular areas around these measurement pointsthis then can be decoded in the matrix Q as a mass integral. All in all the new matrix Qdoes not access more nodes of the FEM discretization than in the standard setup, theyare just better distributed. We start with solving the first order optimality conditions(4.6), which equal the first step of a globalized Newton method with starting point zero.

(S−1)T Q(S−1g − yδ) + γ3Mus + γ4Rus + γτWτ us +∇penm(us) = 0

⇔ (S−1)T QS−1g + γ3Mus + γ4Rus + γτW us +∇penm(us) = (S−1)T yδ(4.6)

The linear equation (4.6) can be solved with a Conjugate Gradient Method (CG), butdue to better performance we stick with GMRES.As mentioned before, when solving a linear system Ax = b for x the CG and GMRESmethod benefit from suitable preconditioning with a quadratic symmetric matrix P , in


Figure 17: Randomly selected points of measurements.

the sense that one solves P−1Ax = P−1b. Looking at equation (4.6), while disregardingthe misfit term, a first guess for the Preconditioner P is

P := γ3M + γ4R + γτWτ .

But here the matrix W is causing problems. Although Wτ is symmetric and sparseit has three times more non-zero entries than M or R, because of its two block typesecondary diagonals. This blows up the size of P and increases calculation time too much,considering γτ is not the most important regularization parameter. The block diagonal ofWτ is written as diagblock(Wτ ) and almost equals M . Thus one chooses diagblock(Wτ ) in thepreconditioner P which doesn’t increase the storage size. When using FEM matrices themass matrix is always a good preconditioner, so one should as in [27] at least preconditionthe mass matrix M with a factor 10−11. These considerations lead to a preconditioner, asstated in (4.7).

P := max(γ3, 10−11)M + γ4R + γτdiagblock(Wτ ) (4.7)

If we want to conduct a reconstruction of barely moving source or even an in-time constantsource as in section 4.4.3 we have to attach importance to γτ , i.e. we expect a suitableγτ to grow with increasing in-time steadiness of one or multiple sources. But the ideaof this Ansatz was to reconstruct us without further knowledge of its time dependentbehavior. One favors steady reconstructions over oscillating ones, because physics itself isalways pursuing an equilibrium, for example Diffusion itself. With the already introducedL-curve method we want to obtain the regularization parameters γ3, γ4, γτ and γm. Todetermine these parameters L-curves were plotted for different fixed ratios γ3/γ4, γ3/γτ andγ3/γm. We then chose one of the curves with very good L-shape, which can be seen in Fig.18. For the following fixed ratios

γ3/γ4 = 1.11 · 104, γ3/γτ = 3.76 · 102, γ3/γm = 9.10 · 103,


the regularization parameters found by the L-curve method according to Fig. 18 are

γ3 = 1.70 · 10−3, γ4 = 1.55 · 10−7, γτ = 4.59 · 10−6 and γm = 1.89 · 10−7.

As already mentioned several times the Newton method implemented in Algorithm 1

Figure 18: Here the solution norm corresponds to ‖us‖2L2([0,T ]×Ω) and the residual norm

to ‖y(u0)− yδ‖2L2([T1,T ]×Γm).

should converge in one step when given us = 0 as a starting point. But the first runin table 6 needs two steps. This run uses the function in Algorithm 1 to determine theresidual tolerance for the GMRES in each Newton step, resulting in an inexact Newtonmethod. In the second run we make sure that this tolerance is smaller than 10−10, whichleads to convergence in one step. The corresponding reconstruction along with an examplefor almost no regularization can be seen in Fig. 19. One can see that the regularizationis highly necessary for a good reconstruction. Since it takes some time for the gas toreach the sensors and thus deliver information about its source, the discrepancy betweenoriginal and reconstruction at the end of the measurement period [T1, T ] is expected.

Table 6: Iteration history for the Newton method at different tolerances for GMRES.

Iter. NormGrad Step GMRES-it stepsize Armijo-Iter. errors0 0.007141561 - NaN 1 0 -1 1.191672e-05 Newton 28 1 0 -2 1.201428e-12 Newton 123 1 0 -Elapsed time is 33.180695 seconds.

0 0.007141561 - NaN 1 0 -1 4.612693e-11 Newton 110 1 0 -Elapsed time is 24.526433 seconds.


t = 0 t = 1 t = 2 t = 4

Original source term us

Reconstruction without regularization, i.e. γ3, γ4, γτ , γm, γm2 = 10−14.

Reconstruction with regularization found by the L-curve method.

Figure 19: Reconstruction of the source term us.


4.5 Parametrization

The theory already provides us with the gradient and the Hessian of the parametrization’sreduced cost functional. But there still remain constraints on the parameters. For thegiven sample problem in the unit square box one sets the basic linear constraints

0 ≤ σ ≤ 1000 0 ≤ I ≤ 1 0 ≤ (x)i ≤ 1 ∀i = 1, 2. (4.8)

These are plausible limits for the gas source which derive from the setting’s size and theshape of Ω. Note that the numerical constraints on x can be more refined such thatthey exactly reflect their continuous counterpart x ∈ Ω. It is also possible to combineconstraints on the position x with the size σ since we assume that a contaminant sourceis fully contained in box Ω.The MATLAB Optimization Toolbox already provides a suitable solver called fmincon7.It implements an interior-point algorithm and an simple interface to pass the linear con-straints, cost functional, gradient and Hessian.This algorithm adds a logarithmic barrier-function to handle the constraints via solvingan approximate problem. It uses two types of steps. First a Newton step, which is basedon solving the KKT conditions. In case the Newton step can’t be obtained a conjugategradient step, using a trust region is performed. This means that by default the Newtonstep is used, which should yield a similar convergence behavior for different but sufficientrefinements of the FEM discretization.The first order optimality given as output by the interior-point algorithm differs slightlyfrom the one presented in (3.33). Let the lower bounds on the parameters be given in avector l and the upper bounds in u as in (4.8). Then the first order optimality used inthe interior-point Algorithm is given in (4.9). Note that in case of an admissible point athe first order optimality (4.9) is just the L∞ norm of ∇aJ(a).

First Order Optimality := max

‖L(a, λ)‖∞|li − ai|λlower,i|ai − ui|λupper,i

(4.9)

All in all, three mayor numerical tests have been implemented. First we focus on thereconstruction of an initial condition.

4.5.1 Initial Condition u0

We stay in the same setting as the unparameterized case in Section 4.4.2, where onesupposed us ≡ 0 and kept the noise level at 5% (ξ = 0.05). The corresponding forwardsimulation can be seen in Fig. 9. First one has to determine how to set the parameterγa > 0. Experiments have shown that the interior-point Algorithm converges indepen-dently from the choice of γa. The reconstruction does not benefit from the regularizationparameter γa. This can be seen in Fig. 20. The reference value for the Gaussian Bellis aref = (500, 0.5, 0.5, 0.5) and for the Parabel Ansatz aref = (40, 0.5, 0.5, 0.5). So wechoose γa = 10−12 since the main goal is to minimize the misfit. Table 7 gives thereconstructions found by fmincon with the Parabel and Gaussian bell Ansatz without

7from MATLAB 7.8.0


Figure 20: Residual norm (misfit) plotted against the solution norm for values γa ∈[10−1, . . . 10−14]. Here the solution norm is defined as ‖a− aref‖, where aref is a referencevalue. The residual norm is given as ‖y(f0(a))− yδ‖L2(T1,T ;Γm).

original u0 unparametrized Gaussian bell Parabel AnsatzAnsatz

Figure 21: Overview over different reconstructions for the initial value problem.

regularization i.e. γa = 10−11, which can also be seen in Fig. 21 As expected bothparametrizations find the right space coordinates and have a higher variance on the sizeand intensity. The Gaussian bell does not coincide with the original parabolic shape ofu0 which leads to the higher intensity I. To further compare the Gaussian bell and the

Table 7: Comparison of Gaussian bell and parabel Ansatz for the initial condition u0.

type starting point convergence time resultσ I x σ I x

Gaussian bell 500 0 (0, 0) 233.15s 222.732 0.692 (0.351, 0.704)Parabel 10 0 (0, 0) 44.47s 37.703 0.459 (0.353, 0.698)Parabel original - 40 0.5 (0.350, 0.700 )

Parabel Ansatz we plot for γa = 0 the mass∫

Ωf0(a, x) dx for different values of noise. As


already said the shape of these two functions differ. Both give an accurate position butdiffer in the distribution of their mass. The Gaussian bell Ansatz tends to estimate moremass than the Parabel Ansatz, which can be seen in Fig. 22a. Compared to the originalcloud, formed by a Parabel, both Ansatz types underestimate the mass, until the result isnot a proper reconstruction. The plot in Fig. 22 suggest that the parabel Ansatz is morerobust against noise, but this is not necessarily true, since the synthesized data stemsfrom a parabel. On the other hand synthesizing the initial value u0 with a Gaussian Bellsuggests the same hypothesis as can be seen in Fig. 22b.

(a) Initial parabel (b) Initial Gauss. bell

Figure 22: Mass∫

Ωf0(a, x) dx for the corresponding reconstruction plotted for different

values of noise

4.5.2 Multiple Initial Sources in u0

Since the parametrized reconstruction in the previous section went well, we extend thismodel to multiple initial sources. For this one chooses for u0 two sources of parabolicshape.

u0(x) =2∑i=1

max(0, Ii

(1− σi(x− xi)T · (x− xi)

) ),

x1 = (0.4, 0.6), x2 = (0.65, 0.5), σ1 = 40, σ2 = 50, I1,2 = 0.5.

As can be seen in Fig. 23 the two sources are positioned close to each other. The intentionbehind this is to make the reconstruction more likely to fail to distinguish between twoseparated sources. Initially the number of clouds is unknown. So one starts with aGaussian bell Ansatz at a higher number of clouds than suspected. Successively thenumber is decreased as long as the reconstruction reveals clouds of insignificant mass,where the mass of a cloud f0 is defined as

∫Ωf0(x) dx. In Table 8 one can see the

results of this approach. It shows the reconstructed intensity, width and coordinates ofthe clouds. With decreasing number of Ansatz clouds, the ones with relative little massdrop out first, while the total mass in the system doesn’t undergo mayor changes. Onlythe one-cloud Ansatz yields an outlier. There the total mass drops abruptly by ca. 50%,


t = 0 t = 2 t = 4

Figure 23: Forward simulation of multiple initial contaminant sources over t ∈ [0, 4].

which can be interpreted as a sign that something is wrong with this Ansatz. Still there isa small discrepancy left regarding the mass of the two Gaussian clouds compared to theiroriginals, which probably stems from the different Ansatz shapes and may be improvedby a penalty on the high peaks in the intensity.Unlike the unparameterized case the parametrization delivers the Hessian∇2

aJ(a) directly,which enables us to calculate its smallest eigenvalue λmin and condition κ. The interior-point algorithm terminates for all four Ansatz functions after the first-order optimalitydrops below a tolerance of 10−10. The iteration history is given in Fig. 24, which revealsthe fast convergence of the algorithm for every Ansatz. In Fig. 25 one can see that the

Figure 24: First order optimality plotted over the iteration history of MATLAB’s interior-point algorithm for the four different Ansatz functions.

parametrization gives a good reconstruction. Nevertheless this can be misleading aboutthe accuracy, since the unparametrized case Fig. 25a is with suitable regularization just ablurry version of Fig. 25b. There it is still possible to identify two circular shaped clouds.So this does not mean, that Fig. 25b is somehow a better result than Fig. 25a, since theparametrization always implies a sharp image, even if the algorithm didn’t converge atall.


Table 8: Results of the parametrized problem for u0 at changing number of Ansatz func-tions

sources 4 3 2 1 originalσ 318.083505 312.085435 292.545589 71.546512 (40)I 0.692838 0.683816 0.649709 0.359682 (0.5)x1 0.649670 0.648840 0.688025 0.606320 0.65x2 0.513981 0.514480 0.510663 0.514497 0.5mass 0.004676 0.004638 0.005014 0.005207 0.0065σ 263.967297 303.810110 221.908778 (40)I 0.719529 0.782447 0.672794 (0.5)x1 0.408764 0.411467 0.403547 0.4x2 0.597043 0.593638 0.598108 0.6mass 0.006160 0.006329 0.006407 0.0052σ 563.224920 1.721137I 0.027930 0.014136x1 0.011185 0.001000x2 0.999992 0.146709mass 0.000001 0.000083σ 4.245228I 0.012490x1 0.340076x2 0.194425mass 0.000056

total mass 0.0109 0.0110 0.0114 0.0052 0.0117

λmin of ∇2aJ(a) 9.6267e-13 5.2279e-10 2.7185e-09 1.2164e-06 3.2269e-07

condition κ 3.1643e+12 7.0370e+09 1.3017e+09 1.0173e+06 7.7585e+06

(a) GMRES (b) parameterized

Figure 25: Comparison between unparametrized and parametrized case.


4.5.3 Linear Moving Source us of Constant Intensity over Time

Until now one only considered stationary and in-time-constant sources in the standardsetup with sensors placed on Γm. Now we are curious whether it is possible to track amoving gas source, as given in the following scenario. A gas source, in-time-constant inwidth σ = 300 and intensity I = 0.3, moves along a linear slope from point x1 = (0.35, 0.7)to the point x2 = (0.9, 0.2) at constant speed over the time interval [0, T ]. This can beseen in 26. One takes, as before, measurements on Γm over time [T1, T ]. We get these

t = 0 t = 2 t = 4

Figure 26: Forward simulation of a linear moving in-time constant intensity contaminantsource for t ∈ [0, 4].

by applying the noise ξ = 0.01 on the forward simulation results’. This converges with

t = 0 t = 4

Figure 27: Reconstruction of us for the linear moving source problem.

fmincon fairly fast and independent from the starting point as can be seen in the resultsof Table 9. The corresponding reconstruction with start- and endpoint can be seen in Fig.27. Since this scenario turned out to work well, the next section deals with a generalizationof this case.


Table 9: Convergence results for the interior-point Algorithm applied on the linear movingsource problem for different starting points.

starting point time resultσ I x1 x2 σ I x1 x2

100 0.5 (0.1,0.1) (0.1,0.1) 27.30s 306.27 0.3049 (0.354,0.699) (0.901,0.194)500 0.5 (0.1,0.1) (0.1,0.1) 36.57s 306.27 0.3049 (0.354,0.699) (0.901,0.194)100 0.5 (0.1,0.1) (0.9,0.9) 24.73s 306.27 0.3049 (0.354,0.699) (0.901,0.194)300 0.1 (0.9,0.1) (0.1,0.9) 24.84s 306.27 0.3049 (0.354,0.699) (0.901,0.194)300 0.1 (0.1,0.9) (0.9,0.1) 19.08s 306.27 0.3049 (0.354,0.699) (0.901,0.194)

4.5.4 General Parametrized us

Previously, one fundamental assumption was, that the time t when the contaminant isreleased is known as t = 0. Now we flip this standard scenario. Measurements arenow taken for t ∈ [0, T ] on the common Γm. A contaminant cloud of constant intensityand shape starts moving along a circular-shaped path at t = T1 = 1 till the end of thesimulation t = T = 4. The actual path and the corresponding forward simulation canbe seen in Fig. 28. For synthesizing the noise level is kept on ξ = 0.01. One now

t = 0 t = 1 t = 3 t = 4

Figure 28: Forward simulation of a circular moving contaminant source for t ∈ [T1, T ]with in-time constant intensity. The circular shaped path of the source is marked with ablack line, where the white x marks the current midpoint of the contaminant source.

searches for parameter vectors aip1 , · · · , aipNip, Nip = 8, such that their linear interpolation

ai, . . . , aNT , NT = 20 on the time grid, describes time dependent parameters for Gaussianbell shaped source uis = fgauss(ai), ∀i = 1, . . . , NT . This us should then minimize the cost

functional J(aip), where we set aip = [aip1 , · · · , aNipip ] such that aip ∈ R4·Nip . The standard

linear constraints are applied on aip. Solving with fmincon’s interior-point Algorithm,yields convergence in 625 iterations, after the norm of first order optimality conditionsdrops below 10−10 to 5.510738 · 10−11. With high Nip calculations becomes more timeconsuming for each iteration, which yields a total of 1h 13min running time on machine(A). The acquired solution is visualized in Fig. 29. The first figure, Fig. 29a, shows thereconstructed path as a linear spline, where the supporting points also known as aipi aremarked by red dots. For comparability the original path is marked with a blue dashedline. Obviously the major part of the path is correct, except for the first t = 0 and lastpoint t = 4. But keep in mind that Fig. 29a illustrates only the x and y-coordinatesfrom aip. The remaining factors, intensity I and radius σ of the Gaussian bell, have to


(a) reconstructed path

(b) exhaust over time

Figure 29: Results for a circular moving contaminant source. The blue lines mark theoriginal data.

be considered together. The exhaust mass integral∫Mfs(a

ip, t) dx is plotted over time inFig. 29b, here the original exhaust is also represented by the dashed blue line. The spacecoordinates of the outliers at t = 0 and t = 4 can be regarded as irrelevant since their massis way below average. Additionally we are finally able to retrieve an estimate for the timeT1, when the massive exhaust of contaminant starts, though we didn’t explicitly searchfor it. The explanation for the instant drop in mass exhaust at the end of the time periodcomes quite natural. Contaminant needs a certain time to actually reach the sensor areaΓm. Since we didn’t apply additional terms to the cost functional for penalizing non-constant behavior, the reconstruction is unable to provide information about a time, forwhich no sensor data was acquired. So in fact numerical experiments showed, that itdoesn’t matter whether there is a peak in exhaust (but restrained on a very small area),or no exhaust at all, because the sensors won’t pick up any information about that, aslong as they are too far away.


original

reconstruction

t = 0 t = 1 t = 2 t = 3

Figure 30: Comparison between original source term us and its reconstruction.

4.6 Outlook on Sparsity (`1-penalization)

For the discretized problem the vectors us and u0 have a large amount of entries. Sincewe always can assume that a gas source is concentrated in a small part of the area Ω, wecan assume that most of the entries in the discretized unparameterized reconstruction arezero. This is also referred to as sparsity. So we prefer a reconstruction us to be sparse.But as we could see Section 4.4 the Newton method delivers solutions with many entriesclose to zero, but not strictly zero. Enhancing the sparsity of a solution means forcingsome of these entries to be actually zero. The `1-penalization given by the L1-norm as‖x‖1 = |x1|+ |x2|+ · · ·+ |xn| is known to enhance sparsity.In this section we adopt the theory, Algorithms and results of [25] to give an outlook onenhancing sparsity of solutions us and u0 from Section 4.4. Also [29] provides theory inthe function space similar to Problem (2.18) with additional sparsity `1-regularization.More about the Newton method on sparsity can be found in [16], where the more generaltheory is described in [33]. For additional Algorithms, also tested by Milzarek [25], referto [14].The `1-penalization of the Tikhonov regularized problem is issued in (4.10) and (4.11).Here we set the penalty parameter µ > 0 and define the initial value problem

minu0

Φ (u0) := J(u0) + µ‖u0‖1, (4.10)

and the generalized source problem

minus

Φ (us) := J(us) + µ‖us‖1. (4.11)

For the sake of a clear notation we present Algorithms for the more general problem(4.12),

minx∈Ω

f(x) (4.12)


for which we want to find a solution with enhanced sparsity. So one issues the problem

minx∈Ω

Φ(x) := f(x) + µ‖x‖1. (4.13)

Here x ∈ Rn, with dimension n and of course the function f with its derivatives arenothing else than a substitution for u0 or us. First we set x := u0 ∈ RNfem and f := Jwith the derivatives g := ∇J and H = ∇2J , which we are able to provide by the Lagrangemechanism from Section 3.3. For the generalized source we set x := us ∈ RNfem·NT andanalogously f := J , g := ∇J and H = ∇2J .We call x ∈ Rn a stationary point of Φ if it satisfies

Φ′(x)d ≥ 0 ∀d ∈ Rn.

Next we present Algorithms, which search for stationary points of Φ and thus a possiblesolution of (4.13).

4.6.1 SNF and FXP-QA

First we have to introduce additional functions and notations. We fix τ > 0 and letP[−µ,µ] : R→ [−µ, µ] denote the projection onto the set [−µ, µ], i.e.P[−µ,µ](t) := minmax−µ, t, µ, which is applied component-wise. The gradient andthe Hessian of f are written as g(x) := 5f(x) and H(x) := 52f(x) respectively. Thenwe define

Fτ (x) := g(x)− P[−µ,µ]

(g(x)− x

τ

)and

Gτ (z) := z − τg(z) and Sν(z) := z − P[−ν,ν](z).

Note that Fτ is semismooth because it is piecewise in C1. With Tikhonov regularizationthe cost functional J is not only convex but strictly convex, so we will proceed with thestrictly convex case for the following Algorithms. We set

gk := 5f(xk), 4k := (gk)Tdk + µ(‖Sτµ(Gτ (x

k))‖1 − ‖xk‖1

).

The first Algorithm we are going to present is Algorithm 2, which is taken from [25]. Itis also related to a method proposed by Wen, Yin, Goldfarb and Zhang in [37]. It isa fixed point method with a monotone quasi-Armijo rule. For the next Algorithm onecombines the Fixed Point Method with Quasi-Armijo rule given in Algorithm 2, with asemismooth Newton method and a multidimensional filter. So in case the step given bythe semismooth Newton method is somehow not acceptable or can’t be calculated at all,one falls back on the Fixed Point Method.We adopt the filter with the filter function θ : Rn → Rp+ as defined in (4.14). For thedecomposition scheme given in (4.15) one chooses pF ∈ 1, . . . , n. Note that, accordingto experiment’s of Milzarek [25], the Algorithm 3 is quite insensitive to the choice of pF .

θj(x) :=1√|Ij|‖Fτ,Ij(x)‖ =

1

Ij

∑i∈Ij

Fτ,i(x)2

1/2

∀j ∈ 1, . . . , pF (4.14)


Algorithm 2 Fixed Point Method with Quasi-Armijo Rule (FPC-QA)

Require: Initial point x0 ∈ Ω, τ > 0, β, γ ∈ (0, 1)Set k := 0.while dk 6= 0 do

Calculate the direction: dk = Sτµ(Gτ (xk))− xk.

Choose a maximal quasi armijo stepsize σk ∈ 1, β, β2, . . . with xk + σkdk ∈ Ω and

φ(xk + σkdk)− φ(xk) ≤ σkγ4k

Set xk + 1 = xk + σkdk and increment k.

end whilereturn Number of iterations k and xk.

I1 := 1, . . . , l I2 := l + 1, . . . , 2l . . . IpF := (pF − 1)l + 1, . . . , n

l =

⌊n

pF

⌋= max

k ∈ Z|k ≤ n

pF

(4.15)

In the case pF = n, leading to Ij = j, we obtain

θ(x) = (|Fτ,1(x)|, |Fτ,2(x)|, . . . , |Fτ,n(x)|)T .

The nonsmooth Newton equation is given as

M(xk)sk = −Fτ (xk),

where M(xk) is a generalized derivative of Fτ in xk given as

M(x) := (I −D(x))H(x) +1

τD(x).

Here D(x) is defined as a diagonal matrix with the components

Dii(x) =

0 |gi(x)− 1

τ| > µ

1 |gi(x)− 1τ| < µ for all i = 1, . . . , n

∈ 0, 1 |gi(x)− 1τ| = µ

.

The semismooth Newton steps can be robustified with a function ρ(t) = ctp with p ∈(0, 1], c > 0 by replacing M(x) with

Mρ(x) := (I −D(x)) (H(x) + ρ(‖Fτ (x)‖)I) +1

τD(x). (4.16)

With (4.16) one tries to solve Mρ(xk)sk = −Fτ (xk) in each step of Algorithm 3. Since this

is mostly done using a matrix free method one uses a simple block elimination techniqueto reduce the complexity. For this one introduces the indexing sets

I := i : Dii(x) = 1 and A := i : Dii(x) = 1.


Algorithm 3 Semismooth Newton Method withMulti-dimensional Filter Globalization (SNF)

Require: Initial point x0 ∈ Rm , τ > 0, β, γ ∈ (0, 1) (quasi-Armijo parameters) γF ∈(0, 1), F−1 = ∅ (filter parameters) and αi > 0 for i ∈ 1, 2, 3, ν ∈ (0, 1).

1: Set k := 0, ϕ0 :=∞, ρ0 :=∞ and the filter F0 = θ(x0).2: while Fτ (x

k) 6= 0 do3: Compute the semismooth Newton step sk via M(xk)sk = −Fτ (xk).4: if Calculation of Newton step is possible then5: Set xk+1 = xk + sk and check xk+1 lies in Ω and is acceptable for filter Fk6: if xk+1 ∈ Ω and

max1≤j≤p

(qj − θj(xk+1) ≥ γF max1≤j≤p

θj(xk+1) ∀q ∈ Fk

then7: Set ϕk+1 = ϕk, ρk+1 = minρk‖Fτ (xk+1)‖ and increment k8: Add θ(xk) to the filter by Fk = Fk−1 ∪ θ(xk).9: continue with next iteration

10: end if11: end if12: Compute the direction dk = Sτµ(Gτ (x

k))− xk and choose a maximal quasi-Armijostep σ ∈ 1, β, β2, β3, . . . ⊂ (0, 1] satisfying

φ(xk + σkdk)− φ(xk) ≤ σkγ 4k .

13: Set xk+1 = xk + σkdk and ϕk = φ(xk+1) and increment k.

14: Set Fk = Fk−1.15: end while16: return Number of iterations k and xk.


The smaller problem is then given as

skI = −τ[Fτ (x

k)]I ,[

H(xk)]A,A = −

[Fτ (x

k)]A −

[H(xk)

]A,I s

kI .

Adapting the numerically robust formulation from (4.16) yields in the system (4.17).

skI = −τ[Fτ (x

k)]I ,[

H(xk) + ρI]A,A = −

[Fτ (x

k)]A −

[H(xk)

]A,I s

kI

(4.17)

In the later implementation we set ρ(xk) = ‖Fτ (xk)‖.Note that there are several methods to solve the Newton equation M(xk)sk = −Fτ (xk).Basically an iterative solver as in Algorithm 1 is used. One has the choice between aCG or GMRES method, where GMRES provides higher performance for ill-conditionedsystems. These solvers are early terminated. This means, they are terminated, whenthe Newton equation is solved to a desired relative tolerance or the maximum number ofiterations is exceeded. Especially the maximum number of iterations is chosen fairly low,which yields GMRES to terminate quite early. Because of this choice the obtained Newtonstep is quite inexact. If there are no other errors during the application of GMRES, weconsider the case “Calculation of Newton step is possible” to hold. This is almost alwaysthe case, so the decision whether to proceed with the Newton step depends mainly on theacceptance by the filter. So this means that the filter framework has to compensate forpoor calculation of the Newton step.

4.6.2 Application

In the next step the Algorithms from Section 4.6 are applied to solve the adapted problem(4.10). Here, the default parameters can be found in Table 10. Note that the relativetolerance tol-sparse is calculated with respect to the first step of each Algorithm for anall zero starting point. Here all calculations were carried out on machine (B). In the firstexperiment we show, that it is possible to use the FPC-QA Algorithm and gain plausibleresults, although the costs of calculation are quite high. As a starting point for FPC-QAwe use the solution of the discrete first order optimality conditions (4.4) at a precision oftol-GMRES=10−10, which were already obtained in Section 4.4.2.As we can see in Table 11 one obtains solutions with enhanced sparsity since the numberof nonzero entries decreases. For this refer to Fig. 32, where the solution for µ = 10−4

is obviously to much penalized, while for µ = 10−6 FPC-QA yields a quite denoisedreconstruction. For a comparison to the original refer to Fig. 31. Also a pleasant effectis, that negative entries, which make no sense in a physical way, tend to disappear. Notethat experiments found that decreasing the tolerance in the GMRES method when solving(4.4) does not lead to a decreasing number of negative entries in the solution vector u0.Now presentable solutions for a moderate choice of tol-sparse were acquired. As we canalready estimate from the convergence times given in Table 11 and especially Table 12,high precision solutions for tol-sparse=10−6 would exceed a normal time frame. Since theenhancement in sparsity seems sufficient for this problem we stick with tol-sparse=10−3.


Table 10: Summary of parameters and their default values for SNF and FPC-QA

tol-sparse relative terminating tolerance of the SNF and theFPC-QA Algorithm, tol-sparse=10−3

maxit-FPC maximum number of iterations for theFPC-QA Algorithm, maxit-FPC=2 · 105

maxit-SNF maximum number of iterations for the SNF Algorithm,maxit-SNF=15000

β, γ parameters for the quasi-Armijo conditionβ = 0.1, γ = 0.1

τ parameter for the nonsmooth function Fτ , τ = 4γF factor for the filter acceptance criterion, γF = 10−3

pF filter decomposition factor, pF = 1000GMRES-tol, GMRES-maxit parameters to control the accuracy of the

GMRES method, GMRES-tol=0.1, GMRES-maxit=5PCG-tol, PCG-maxit parameters to control the accuracy of the PCG method,

PCG-tol=0.1, PCG-maxit=10

Table 11: Convergence results for the initial value problem. Parameters are the defaultparameters from Table 10. Here N<0 denotes the number of negative entries and N6=0 thenumber of nonzero entries in the solution vector.

Method GMRES FPC-QAµ - 10−4 10−5 10−6 10−7 10−8

Rel.tol. 1e-6 1.00e-03 1.00e-03 1.00e-03 1.00e-03 1.53e-04misfit 1.3690e-04 1.1289e-03 2.3783e-04 1.4378e-04 1.3659e-04 1.3690e-04Iter. 45 16944 36346 81924 16938 0Time 2.59s 27min 9s 58min 41s 2h 13min 54s 27min 44s 0min 0sN<0 1765 0 3 101 1378 1765N6=0 3889 53 285 749 3253 3889

Table 12: FPC-QA applied on the initial value problem for lower tolerance tol-sparse=10−4.

µ 10−4 10−5 10−6 10−7 10−8

Rel.tol. 1.00e-04 1.00e-04 1.00e-04 1.00e-04 1.00e-04misfit. 1.1280e-03 2.3876e-04 1.4522e-04 1.3736e-04 1.3685e-04Iter. 31799 92604 228740 351223 28435Time 51min 15s 2h 31min 47s 6h 8min 8s 9h 26min 59s 46min 3sN<0 0 3 102 519 1736N6=0 52 278 721 1741 3838

This Section should give a short outlook on sparsity, so we apply now the SNF on the initialvalue problem. Here the convergence behavior is dominated by the choice of GMRES-tol,GMRES-maxit or PCG-tol, PCG-maxit respectively and the filter acceptance criterioncontrolled by γF . Table 13 shows results for different parameter combinations.


(a) original u0 (b) γ1 = 1.3 · 10−6,γ2 = 1.3 · 10−5, µ = 10−6

(c) γ1 = 1.3 · 10−6,γ2 = 1.3 · 10−5, µ = 0

Figure 31: Comparison between the plain Tikhonov regularized solution and the Tikhonovregularized solution with `1 penalization.

µ = 10−4 µ = 10−5 µ = 10−6

µ = 10−7 µ = 10−8

Figure 32: Plotted results of the FPC-QA method corresponding to Table 11 at differentvalues µ.

The main objective is to keep the amount of rejected Newton steps low, since their cal-culation is cost intensive. On the other hand one has to ensure a certain quality of eachperformed Newton step to yield convergence, which can be done by increasing γF , whichyields more rejected steps. The other, rather costly option is to improve the Newtonsteps’ quality with higher GMRES-maxit or PCG-maxit, where at more than ten itera-tions GMRES clearly outperforms PCG.As can be seen in Table 13, SNF convergences for µ = 10−7, 10−8 in a reasonable timeframe, but only for γF = 10−2, GMRES-tol=0.1, GMRES-maxit=20. In all other casesSNF fails to converge. One reason is that too many Newton steps were rejected causingSNF to be heavily outperformed by the basic FPC-QA.But, considering the results in Fig. 32 one can see that the `1-penalization has the poten-


tial to increase the solution quality for this application. Although further investigationexceeds the scope of this thesis, we can conclude that with different parameters and pos-sibly additional Algorithms one can probably achieve better solutions in less time. Giventhe broad possibilities for applications and the recent developments in sparsity enhancingtheory this would be an interesting topic for further research.


Table 13: Results of the SNF Algorithm for different parameter combinations, with GM-RES or PCG method. Here N<0 denotes the number of negative entries and N6=0 thenumber of nonzero entries in the solution vector u0.

µ 10−4 10−5 10−6 10−7 10−8

γF = 10−3, maxit-SNF=15000, PCG-tol=0.1, PCG-maxit=10, pF = 1000N<0 370 993 1298 1503 1787N6=0 697 1774 3337 3837 3887‖Fτ‖ 2.375e-03 5.397e-04 8.982e-05 9.064e-06 8.933e-06Rel.tol. 1.00 1.40e-01 2.21e-02 2.22e-03 2.19e-03Misfit. 8.0632e-02 8.1325e-04 1.4542e-04 1.3810e-04 1.4142e-04Newt. steps 12 23 1454 34 10Fxp. steps 14988 14977 13546 93 0Iter. 15000 15000 15000 127 10Time 3h 40min 56s 3h 44min 25s 4h 21min 52s 2min 41s 14sγF = 10−2, maxit-SNF=15000,GMRES-tol=0.1, GMRES-maxit=20, pF = 1000

N<0 69 168 255 1055 1744N6=0 157 564 1092 2808 3876‖Fτ‖ 1.151e-03 1.782e-04 1.515e-05 4.083e-06 3.987e-06Rel.tol. 4.85e-01 4.63e-02 3.73e-03 1.00e-03 9.76e-04Misfit. 9.5011e-04 2.5098e-04 1.4535e-04 1.4063e-04 1.4092e-04Newt. steps 28 59 68 26 25Fxp. steps 14972 14941 14932 1454 56Iter. 15000 15000 15000 1480 81Time 2h 34min 54s 2h 38min 31s 2h 37min 53s 16min 49s 56s

γF = 10−2, maxit-SNF=1000, GMRES-tol=0.1, GMRES-maxit=100, pF = 50N<0 12 149 976 1620 1740N6=0 78 622 2533 3877 3885‖Fτ‖ 3.560e-04 2.013e-04 5.283e-05 5.639e-06 2.997e-06Rel.tol. 1.50e-01 5.23e-02 1.30e-02 1.38e-03 7.33e-04Misfit. 1.1120e-03 2.3750e-04 1.4477e-04 1.3740e-04 1.3976e-04Newt. steps 998 1000 1000 869 11Fxp. steps 2 0 0 131 0Iter. 1000 1000 1000 1000 11Time 11min 16s 18min 3s 30min 56s 58min 13s 0min 20s

γF = 10−2, maxit-SNF=15000, PCG-tol=0.1, PCG-maxit=5, pF = 1000N<0 957 848 488 1214 1739N6=0 1716 2058 1637 3726 3880‖Fτ‖ 2.101e-03 2.422e-04 9.279e-06 3.579e-05 9.869e-06Rel.tol. 8.84e-01 6.29e-02 2.29e-03 8.76e-03 2.42e-03Misfit. 1.7603e-02 5.4588e-04 1.4648e-04 1.6482e-04 1.4180e-04Newt. steps 5210 1018 214 8 17Fxp. steps 9790 13982 14786 197 337Iter. 15000 15000 15000 205 354Time 2h 38min 18s 2h 32min 56s 2h 40min 33s 2min 39s 4min 29s


4.7 Extension Into Three Dimensions

At last the model is extended onto three dimensions. Here κ = 10−3 and Re = 100. Asbasic geometry we choose a closed box, in which we place, similar to the two dimensionalcase, two obstacles. This can be examined in Fig. 33. In analogy to the previous examplethe measurement area Γm is set as the surface of the obstacles, which is marked bluein Fig. 33. The used FEM discretization consists of 68781 nodes, which form 47075

Figure 33: Three dimensional empty box with one central and one displaced obstacle.

tetrahedral elements. This results in 5954 tetrahedral elements for the two dimensionalboundary. For a tetrahedral element the element quality q is defined in (4.18), where Vis the volume and h1, . . . , h6 are the side lengths of the element.

q =72√

3V

(h21 + h2

2 + h23 + h2

4 + h25 + h2

6)3/2

(4.18)

Figure 34 shows the element quality for different viewpoints, which fulfills q > 0.1 so itshould not affect the solutions quality. The Navier-Stokes equations were solved in threedimensions for boundary conditions enforcing a rotating velocityfield. To acquire conver-gence with the damped Newton method, the damping factor was set to 10−2 and spooleswas used as the linear solver. It can be seen in Fig. 35 that the air in the containmentmoves clockwise with enforced downward drag on the y-z-plane for x=0 and an upwarddrag on the y-z-plane for x=2. This results in mild turbulences, which are revealed bythe flow lines plot.For the forward simulation an initial gas source was set up with a paraboloid intensitydistribution (4.19). The noise is set as ξ = 5 · 10−3.

u0(x) =f0(I, σ, x;x) = max(0, I

(1− σ(x− x)T · (x− x)

) ),

σ = 20, I = 0.5, x = (1.5, 0.3, 0.7)T(4.19)

Density plots in three dimensions are always tricky to realize, especially for different timesteps. Figure 36 shows besides the standard isosurface plot, the gas concentration ondifferent slices. In this plot, values smaller than 0.02 were excluded, so one gets a betteroverview.To apply the techniques presented in the two dimensional case, an unparameterized anda parameterized reconstruction are presented. With the L-curve method, see Fig. 37, thecurve closest to an L-shape was acquired with the ratio γ1/γ2 = 10. Then the regularization


parameters are found as γ1 = 10−6 and γ2 = 10−7. Here, GMRES with GMRES-tol=10−6

solves the discrete control equation in 342 iterations. The solution is displayed in Fig. 38,where values below 0.02 are excluded for the sliced plot.The parameterized reconstruction is done according to Section 4.5.2, except that here theGaussian bell is extended to three space dimensions. The results of applying the interior-point Algorithm are given in Fig. 39 and Table 14.We now arrive at the last result, which reflects the main aspects of this thesis.The discretized problem was set up consistent to the infinite dimensional theory. In sucha setting one expects from the application of Algorithms using mainly Newton steps ona nonlinear problem the same convergence behavior for different mesh refinements. Here,after the FEM mesh reached a certain quality one would expect this result. In this casewe apply the interior-point Algorithm on the parameterized problem for different finiteelement mesh (triangulation) refinements. Note that in every case the measurement datais synthesized with the corresponding mesh. As we can see in Table 15 the percentageof negative entries in the forward simulation No. y<0/Nfem·NT decreases with better finiteelement discretization.But the main result is, that the number of iterations settles around 30 while the numberof nodes, Nfem, and thus the mesh refinement is increasing. So, for the parameterizedproblem in three dimensions we empirically observe, that the number of iterations isindependent of the mesh refinement, which is a quite satisfying result.


Figure 34: Finite element mesh in three dimensions. The elements are tetrahedrons.

Figure 35: Clockwise rotating velocityfield ~v as solution of the Navier-Stokes equation.


-

t = 0 t = 2 t = 4

Figure 36: Forward simulation of one initial contaminant source over t ∈ [0, 4] in threedimensions. Plotted as isosurface and slices (for y > 0.02).

Figure 37: L-Curve plot for the initial value problem in three dimensions at γ1/γ2 =10. Here the solution norm corresponds to ‖u0‖2

L2(Ω) and the residual norm to

‖y(u0)− yδ‖2L2(T1,T ;Γm).


z = 0.2 z = 0.4 z = 0.6 z = 0.8

Figure 38: Unparameterized reconstruction.

two one

Figure 39: parameterized reconstruction with a two and one source Ansatz.


Table 14: Results of the parametrized problem for u0 at a changing number of Gaussianbell Ansatz functions.

sources 2 1 originalσ 193.0154 85.4964 (20)I 0.3515 0.8303 (0.5)x1 1.3205 1.4941 1.5x2 0.2485 0.3618 0.3x3 0.6863 0.7227 0.7mass 0.0114 0.0610 0.0027σ 193.0604I 0.9764x1 1.5513x2 0.3065x3 0.6716mass 0.0318Iter. 28 33

Table 15: Convergence results for different FEM mesh refinements for initial value problemin three dimensions.

fmincon interior-pointNfem 1118 2833 4934 12342 23697 68781Iter. 31 23 28 32 31 33FirstOrdOpt. 1.6e-12 4.007e-11 2.565e-11 3.2e-11 1.602e-12 3.2e-11

J(a) 2.807e-06 1.668e-06 1.606e-06 1.851e-06 1.849e-06 1.832e-06No. y<0/Nfem·NT 0.61 0.38 0.51 0.25 0.17 0.10GMRESIter. 271 311 326 335 340 342RelRes. 1.0e-06 9.8e-07 9.9e-07 9.9e-07 9.8e-07 9.9e-07

5 CONCLUSIONS AND OUTLOOK 84

5 Conclusions and Outlook

We presented a brief overview from the infinite dimensional theory to actual numericalexperiments. During the literature research it became clear, that due to a broad varietyof results for different infinite dimensional settings, statements e.g. regarding the NavierStokes equation, can be refined under suitable assumptions.The purpose of unparameterized and parameterized reconstruction was not to comparethese Ansatz types in their performance, but to show the capabilities of different Ansatzspaces. With additional penalty functions and constraints, each Ansatz can be tailoredto a certain application, such that the reconstruction’s quality benefits from that.Also multiple enhancements to the synthesizing of measurement data are possible, partic-ularly to improve the guess of the systematic error. In this thesis matrix free methods areessential due to the problem size. Further extensions may make use of matrix free trust-region SQP methods or other interior point algorithms provided e.g. by Ipopt [36]. Thisalso shows that the strong theoretical background gives a good return in the numericalpart. When directly comparing a broad set of algorithms or processing an application,more considerations regarding the synthesizing of measurement data are mandatory.While not crucial for the presentation of solver techniques within the scope of this thesis,the use of an upwind scheme is recommended in case of an application.The validation of reconstructions by an uncertainty analysis was left uncovered, althoughit is immanent for an all embracing analysis of the localization of contaminant sources.But according to recent publications this is also an advancing field of mathematics, wherewe can expect more results to emerge soon.

5 CONCLUSIONS AND OUTLOOK 85

Appendix

The data disc provided with this thesis contains different versions of MATLAB sourcecode for the numerical tests, saved results, and the LaTeX source code for this document.The following table gives an overview over the content of the data disc.

Path Description/latex LATEX source of this document/matlab/constantSource Unparameterized constant source us = const./matlab/FEM Finite element node function example/matlab/generalSource Unparameterized general source us/matlab/initial Unparameterized initial source u0, us ≡ 0/matlab/parametrization2 Examples for parametrization

REFERENCES 86

References

[1] Robert A. Adams and John J. F. Fournier, SOBOLEV SPACES, Second Edition, PURE ANDAPPLIED MATHEMATICS, vol. 140, ELSEVIER, Department of Mathematics, The University ofBritish Columbia, Vancouver, Canada, 2003.

[2] Hans Wilhelm Alt, Lineare Funktionalanalysis, 6., uberarbeitete Auflage: Eine anwendungsorientierteEinfuhrung, Springer-Lehrbuch Masterclass, Springer Berlin Heidelberg, 2012.

[3] Jakob Ameres, Vera Bommer, Matija Hustic, and Julia Wenzel, An Inverse Problem to Localize GasClouds, SCoNDO III - Students’ Conference on Nonlinear and Discrete Optimization (14. July 2012).http://www-m9.ma.tum.de/SS2012/SCoNDO.

[4] Boris S. Bokstein, Mikahil I. Mendelev, and David J. Srolovitz, Thermodynamics and Kinetics inMaterials Science A short course: A short course, Oxford University Press Inc., 2005.

[5] Dietrich Braess, Finite Elemente: Theorie, schnelle Loser und Anwendungen in der Elas-tizitatstheorie, 4. Auflage, Springer-Verlag Berlin Heidelberg, 2007.

[6] R.H. Byrd, J. C. Gilbert, and J. Nocedal, A Trust Region Method Based on Interior Point Techniquesfor Nonlinear Programming, Mathematical Programming 89 (2000), no. 1, 149–185.

[7] COMSOL Multiphysics Modeling Guide: COMSOL 3.5a, COMSOL AB, November 2008.

[8] Jacky Cresson, Messoud Efendiev, and Stefanie Sonner, Necessary and Sufficient Conditions for thePositivity of Solutions of systems of Stochastic PDEs (2010), 10-35.

[9] E. L. Cussler, Diffusion: Mass Transfer in Fluid Systems, Third Edition, Cambridge Series in Chem-ical Engineering, Cambridge University Press, University of Minnesota, 2009.

[10] Timothy A. Davis, ACM Transactions on Mathematical Software (TOMS) 30.2 (June 2004), 165-195, DOI 10.1145/992200.992205.

[11] Efendiev M. and Sonner S., On Verifying Mathematical Models with Diffusion, Transport and Inter-action, GAKUTO International Series Mathematical, Sciences and Applications 32 (2010), 41-67.

[12] Heinz W. Engl, Johannes Kepler, Martin Hanke, and Andreas Neubauer, Regularization of In-verse Problems, Mathematics and Its Applications, Kluwer Academic Publishers,AA Dordrecht, TheNetherlands, 1996.

[13] Lawrence C. Evans, Partial Differential Equations, SECOND EDITION, Graduate Studies in Math-ematics, vol. 19, American Mathematical Society, Providence, Rhode Island, Department of Mathe-matics, University of California, Berkely, 1997.

[14] Mario A. T. Figueiredo, Robert D. Nowak, and Stephen J. Wright, Gradient projection for sparsereconstruction: Application to compressed sensing and other inverse problems, IEEE Journal of Se-lected Topics in Signal Processing 1.4 (2007), 586-597.

[15] P. Flath, L. C. Wilcox, V. Akcelik, J. Hill, B. van Bloemen Waanders, and O. Ghattas, Fast Al-gorithms for Bayesian Uncertainty Quantification in Large-Scale Linear Inverse Problems Based onLow-Rank Partial Hessian Approximations, SIAM Journal on Scientific Computing 33 (2011), no. 1,407–432.

[16] R. Griesse and D. A. Lorenz, A semismooth Newton method for Tikhonov functionals with sparsityconstraints, Inverse Problems 24.3 (2008), 035007-035026.

[17] Elaine T. Hale, Wotao Yin, and Yin Zhang, Fixed-point Continuation for `1-Minimization: Methologyand Convergence, SIAM J. Optim. 19.3 (2008), 1107–1130.

[18] Per Christian Hansen, Regularization Tools version 4.0 for Matlab 7.3 46.2 (October 2007), 189-194,DOI 10.1007/s11075-007-9136-9.

[19] Per Christian Hansen, The L-Curve and its Use in the Numerical Treatment of Inverse Problems,WIT Press, 2000.

REFERENCES 87

[20] Per Christian Hansen and D. O’Leary, The Use of the L-Curve in the Regularization of Dis-crete Ill-Posed Problems, SIAM Journal on Scientific Computing 14 (1993), no. 6, 1487-1503, DOI10.1137/0914086.

[21] J. O. Hirschfelder, C. F. Curtiss, and R. B. Bird, Molecular Theory of Gases and Liquids, WileyNew York, 1954.

[22] M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich, Optimization with PDE Constraints, MathematicalModelling: Theory and Applications, vol. 23, Springer-Verlag, 2009.

[23] Peter Knabner and Lutz Angermann, Numerical Methods for Elliptic and Parabolic Partial Differ-ential Equations, Texts in Applied Mathematics, vol. 44, Springer-Verlag New York, 2003.

[24] Pierre-Louis Lions, Mathematical Topics in Fluid Mechanics, Oxford Lecture Series in Mathematicsand its Applications, vol. 1 and 2, CLARENDON PRESS - OXFORD, Department of Mathematics,University of California, Berkely, 1996.

[25] Andre Milzarek and Micheal Ulbrich, A semismooth Newton method with multi-dimensional filterglobalization for L1-Optimization (2012).

[26] Jorge Nocedal and Stephen J. Wright, Numerical Optimization, Springer Series in Operations Re-search, Springer-Verlag New York, Inc., 1999.

[27] Noemi Petra and Georg Stadler, Model Variational Inverse Problems Governed by Partial Differ-ential Equations, Technical Report 10-05, The University of Texas at Austin, The Institute forComputational Engineering and Sciences, March 2011.

[28] Alfio Quarteroni, Riccardo Sacco, and Fausto Saleri, Numerical Mathematics: Second Edition, Textsin Applied Mathematics, vol. 37, Springer Berlin Heidelberg, 2007.

[29] Georg Stadler, Elliptic optimal control problems with `1-control cost and applications for the place-ment of control devices, Computational Optimization and Applications Archive 44.2 (November2009), 159-181.

[30] John R. Taylor, An Introduction to Error Analysis: Second Edition, University Science Books, Sausal-ito, California, 1997.

[31] Roger Temam, Navier-Stokes Equations: Theory and Numerical Analysis, Studies in Mathematicsand its Applications, vol. 2, Elsevier Science Publishers B.V., Department of Mathematics, Universityof California, Berkely, 1984.

[32] F. Troltzsch, Optimale Steuerung partieller Differentialgleichungen, 2., uberarbeitete Auflage,Vieweg+Teubner, 2009.

[33] Michael Ulbrich, Semismooth Newton Methods for Variational Inequalities and Constrained Opti-mization Problems in Function Spaces, MOS-SIAM Series on Optimization, Society for Industrialand Applied Mathematics, Technische Universitat Munchen, 2011.

[34] Michael Ulbrich and Stefan Ulbrich, Nichtlineare Optimierung, Mathematik Kompakt, vol. VIII,Birkhauser, Springer Basel, 2012.

[35] Curtis R. Vogel, Computational methods for inverse problems, Frontiers in applied mathematics,SIAM Society for Industrial and Applied Mathematics, Montana State University, Bozeman, Mon-tana, 2002.

[36] A. Wachter and L. T. Biegler, On the Implementation of a Primal-Dual Interior Point FilterLine Search Algorithm for Large-Scale Nonlinear Programming, Mathematical Programming 106.1(2006), pp. 25-57.

[37] Z. Wen, W. Yin, D. Goldfarb, and Y. Zhang, A fast algorithm for sparse reconstruction based onshrinkage, subspace optimization and continuation, SIAM J. SCI.COMPUT. 32, no. 4, 1832–1857.

· Inverse Probleme unter advektivem und di usivem Transport von Sto en Masterarbeit -...

Documents

Transcript of · Inverse Probleme unter advektivem und di usivem Transport von Sto en Masterarbeit -...