opus4.kobv.de · Zusammenfassung Parameterschätzung behandelt die Bestimmung des Zustands eines...


Asymptotic error estimates of the 3D MUSIC algorithm in the case of systematic errors

Asymptotische Fehlerschranken des 3D MUSIC Algorithmus im Fall von systematischen Fehlern

Der Naturwissenschaftlichen Fakultät der Friedrich-Alexander-Universität

Erlangen-Nürnberg

zur Erlangung des Doktorgrades Dr. rer. nat.

vorgelegt von

Patrick Luff

aus Weißenburg i. Bay.


Als Dissertation genehmigt von der Naturwissenschaftlichen Fakultät der Friedrich-Alexander-Universität Erlangen-Nürnberg

Tag der mündlichen Prüfung: 07.03.2013

Vorsitzende/r der Promotionskommission: Prof. Dr. Johannes Barth

Erstberichterstatter/in: Prof. Dr. Eberhard Bänsch

Zweitberichterstatter/in: Prof. Dr. Barbara Kaltenbacher


Don’t stop believin’
Journey, Escape (1981)


Zusammenfassung

Parameterschätzung behandelt die Bestimmung des Zustands eines gegebenen Systems aufgrund von indirekten Sensormessungen. Unter den vielen Methoden in Verwendung wird der Multiple Signal Classification (MUSIC) Algorithmus als einer der erfolgreichsten in mehreren Anwendungsgebieten angesehen, darunter Magnetoenzephalographie, Elektroenzephalographie, Radar und drahtlose Kommunikation. Der Algorithmus durchsucht das Arbeitsvolumen mit einem Modell einer Einzelquelle und berechnet die Ähnlichkeit zwischen dem Quellenunterraum und einem genäherten Signalunterraum. Es wird dabei angenommen, dass die Quelle räumlich statisch ist.

Die meisten praktischen Anwendungen haben mit Ungenauigkeiten des zugrunde liegenden physikalischen Modells zu kämpfen. Diese Ungenauigkeiten beeinflussen die Parameterschätzung und führen zu einem inkorrekten Ergebnis, sogar im Fall von rauschfreien Sensoren. Diese Arbeit konzentriert sich auf die Modellierung und Quantifizierung dieser Schätzfehler im asymptotischen Fall. Dafür wird angenommen, dass die Modellfehler Zufallsvariablen sind, und Erwartungswert und Momente höherer Ordnung des Schätzfehlers werden genähert.

Bereits bestehende Ansätze erster Ordnung werden auf potentiell zweite Ordnung erweitert. Zusätzlich wird dreidimensionale Parameterschätzung betrachtet. Wie sich herausstellt, werden im dreidimensionalen Fall Ableitungen von Eigen- und Singulärsystemen benötigt. Für beide Systeme werden in dieser Arbeit Lösungen in geschlossener Form angegeben. Auftretende numerische Probleme werden in einem separaten Kapitel behandelt. Die vorgestellten numerischen Vereinfachungen machen eine Berechnung von asymptotischen Schätzfehlern in einer praktikablen Zeitspanne überhaupt erst möglich.

Als ein untergeordneter Beitrag wird in dieser Arbeit der Fall einer sich bewegenden Quelle behandelt. Verwendet wird ein stichprobenbasierter Partikelfilter-Algorithmus, der die A-posteriori-Wahrscheinlichkeitsdichte des Zustands des Systems aufgrund der eingegangenen Messungen nähert. Eine neue Technik für die Gewichtung der Stichproben basierend auf dem MUSIC-Algorithmus wird vorgestellt.

Die vorgestellten Konzepte und Algorithmen werden an einer Anzahl von Testfällen validiert.



Abstract

Parameter estimation deals with the problem of determining the state of a given system through indirect measurements by sensors. Among the abundance of methods in use, the multiple signal classification (MUSIC) algorithm is considered one of the most successful in various areas, such as magnetoencephalography, electroencephalography, radar or wireless communication. The algorithm searches the operating volume with a one-source model and computes the similarity of the source subspace to an estimated signal subspace. The source is assumed to be spatially static.

Most practical applications suffer from inaccuracies of the underlying physical model. These will disturb the parameter estimation even in the case of noiseless sensors. This thesis focuses on modeling and quantifying these parameter estimation errors in the asymptotic case. To this end, the modeling errors are considered as random variables, and expectation and higher moments of the parameter estimation error are approximated. Existing first order approaches are augmented to near second order, and the case of three-dimensional parameter estimation is treated. The three-dimensional case necessitates the use of derivatives of eigensystems and singular systems, for which closed-form solutions are provided. Arising problems in numerical complexity are treated in a separate chapter. The presented numerical simplifications make a calculation of asymptotic error estimates in a feasible time possible in the first place.

As a minor contribution of this thesis, we investigate the case of a moving source in the operating volume. Our method of choice is a sample based particle filter algorithm which estimates the posterior probability density of the system state given the measurements. We propose a new technique of weighting the samples based on the MUSIC algorithm.

The presented concepts and algorithms are validated through a number of test cases.



Contents

Zusammenfassung i
Abstract ii
Contents iii
List of Figures vi
Acknowledgements viii

1 Introduction 1

2 Physical background 3
2.1 Electromagnetic source estimation 3
2.1.1 Magnetic induction 3
2.1.2 Measurement of induced voltage 6
2.1.3 Summary 7
2.2 Estimation of direction of arrival 8
2.3 Systematic noise 10

3 The MUSIC Algorithm 12
3.1 Introduction 12
3.2 Scalar-valued problem 13
3.3 Vector-valued problem 15

4 Derivatives of eigenvalues and eigenvectors 17
4.1 Introduction 17
4.2 Algorithm 18
4.2.1 The method of Lee and Jung 18
4.2.2 Extension to arbitrary derivatives 18

5 Derivatives of singular values and singular vectors 21
5.1 Introduction 21
5.2 Proposed method 22
5.2.1 Distinct singular values 22
5.2.2 Closed-form expressions of the derivatives 23
5.2.3 Repeated singular values 25
5.2.4 Example of repeated singular values with repeated derivatives 28

6 Derivatives of 3D MUSIC 30

7 Expectation of estimation error 32
7.1 Introduction 32
7.2 Scalar-valued x 32
7.2.1 Estimation error as ratio of quadratic forms 32
7.2.2 Moments of quadratic forms 33
7.2.3 Extension to higher order estimates 35
7.3 Vector-valued x 37
7.3.1 Estimation error as ratio of quadratic forms 37
7.3.2 First and second order expansion 39
7.3.3 Extension to higher order estimates 40
7.3.4 Higher order moments 41
7.3.5 Outlook and numerical problems 42

8 Implementation 44
8.1 General techniques 44
8.1.1 Economic computation of the trace of a matrix product 44
8.1.2 Storage issues 45
8.2 Gaussian, zero-mean distributed h 46
8.2.1 Scalar h 46
8.2.2 Multivariate h 46
8.2.3 Nonzero entries of the Gaussian covariance matrix 47
8.2.4 Summary and computational effort 48
8.2.5 Critical simplification step 49

9 Moving source: tracking via sequential Bayesian filtering 51
9.1 Introduction 51
9.2 Particle filtering 53

10 Simulations 57
10.1 Trace calculation 57
10.1.1 Convergence rate of truncated trace calculation 57
10.2 Moments of localization error 60
10.2.1 1D direction of arrival estimation 60
10.2.2 3D parameter estimation problem 63
10.2.3 Higher order moments 68
10.3 Particle filters 70

11 Conclusion and Outlook 75

A Appendix 77
A.1 Multi-index notation 77
A.2 Example of operator vec 77
A.3 Repeated singular values 77
A.4 Closed form of the inverse of a 3×3 matrix 81
A.5 Quadratic forms ratio expectation 81
A.5.1 Kronecker product calculus 81
A.5.2 Proof of (7.7) 82
A.6 Nonzero entries of the covariance matrix 83
A.6.1 Examples for possible configurations 83
A.6.2 Examples of the covariance matrix R 84
A.6.3 An algorithm to calculate all possible permutations of a given set 84
A.7 Particle filter algorithms 86


List of Figures

2.1 Left: direct solution of magnetic flux equation. Right: magnetic moment approximation 4
2.2 Equivalent circuit of simple measurement setup 7
2.3 Left: polar coordinates. Right: EM plane wave impinging at origin 9
2.4 Systematic errors in the case of distorted sensors 11
3.1 Example of principal angles in three dimensions. Left: two two-dimensional subspaces. The principal angles are designated by α and β; β = 0 since the two subspaces share one dimension. Right: one one-dimensional and one two-dimensional subspace. The principal angle is the actual angle between vector and plane. 16
9.1 Illustration of SIR algorithm. From left to right: initial state and particle distribution, state and particle update, particle weighting, resampling stage; particle 2 is abandoned, therefore particle 3 is duplicated. 55
10.1 Number of permutations and associated number of indices 58
10.2 Effects of truncated calculation 1 59
10.3 Effects of truncated calculation 2 60
10.4 Sensor position and source angles 61
10.5 Left: mean error and estimated first moment of DOA estimation for different incident angles for source 2. Right: absolute residual 62
10.6 Left: mean error and estimated first moment of DOA estimation for different systematic noise. Right: absolute residual 63
10.7 Sensor and source position 64
10.8 Left: norm of sample mean and estimated mean of quadratic form for one source position over all systematic errors for 31 parameters. Right: absolute value of the residual 65
10.9 Left: norm of sample mean and estimated mean of quadratic form over all source positions for 31 parameters, σ = 0.01. Right: absolute value of the residual 65
10.10 Left: norm of sample mean error and estimation of mean error over all source positions for 31 parameters, σ = 0.01. Right: absolute value of the residual 66
10.11 Left: norm of sample mean error and estimation of mean error for one source position over all systematic errors for 31 parameters. Right: absolute value of the residual 67
10.12 Sample mean error and estimation of mean error over all positions for 62 parameters 67
10.13 Sample mean error and estimation of mean error over all systematic errors for 62 parameters 68
10.14 Sample mean squared error and estimated second moment over all positions for 31 parameters 69
10.15 Sample mean squared error and estimated second moment with 31 parameters for near- and far-field estimation 70
10.16 Marker path in noiseless case 71
10.17 Left: sample of true and estimated source path for σp = 0.003, σm = 0.3. Marker path (blue) and estimated paths on top; Euclidean distance between source and particle mean at the bottom. Right: sample for σp = 0.003, σm = 1.5. 72
10.18 Left: true and estimated source path sample for σp = 0.008, σm = 0.3. Marker path (blue) and estimated paths on top; Euclidean distance between source and particle mean at the bottom. Right: sample for σp = 0.008, σm = 2. 73
10.19 Left: mean distance of all timesteps and samples for tested process and measurement noise. Right: mean distance for a random walk state function 73
A.1 Sparsity pattern of R 84


Acknowledgements

First and foremost, I would like to thank my supervisor Prof. Eberhard Bänsch for his support, encouragement, guidance and contributions throughout the entire course of my postgraduate studies. I would also like to thank my co-referee Prof. Barbara Kaltenbacher for her comments and remarks on the final version of this work.

At Siemens AG Erlangen, I would like to thank Dr. Johannes Reinschke for introducing me to a fascinating topic and providing me with funding, background, contacts and advice from the beginning to well past the time of direct collaboration. Dr. Reinschke provided me with resources and access to laboratory equipment and experimental data which formed the groundwork of this thesis. I thank Mario Bechtold, Dirk Hiebel and Dr. Wolfgang Schmid for their support and practical and theoretical advice. I would like to thank Dr. Alexander Juloski for bringing the field of Bayesian estimation to my attention.

I thank Dr. Nicolass Neuß, Dr. Dmitri Kuzmin and Dr. Cornelia Scheider of the University Erlangen-Nürnberg for their mathematical advice, feedback and clarifications on my notation.

I thank the whole research group of the chair of Applied Mathematics III, too many to name here, for helping to make the daily routine both productive and a pleasure.

In the end, my thanks go to my family, my friends and my love, for without them this work would not have come to life.

Erlangen, September 2012



1. Introduction

Parameter estimation problems arise in many fields of research and applications. Typically, the user wants to obtain information about the state of a system from indirect measurements by sensors. The usually nonlinear dependence of the sensor data on the system state necessitates the use of fitting techniques. The mathematical problem of parameter estimation is twofold.

• On one hand, a physical model is required to forecast sensor data given that the system parameters are known. This is commonly referred to as the forward problem or forward solution. We consider two applications in this thesis. The main focus is on estimating position and orientation of active electromagnetic (EM) point sources in the near-field of the sensors. The signal of the sources can be considered to reach the sensors instantaneously. Therefore, the location of the sources must be determined from the spatial diversity of the signal intensity across the sensors. As a second application, we consider the direction of arrival (DOA) estimation problem arising in many applications, e.g. in radar or wireless communication. The sources lie in the far-field of the sensors, such that the signal intensity does not contain enough information. Here, time disparity of the arriving signals is the source of utilizable information.

• On the other hand, once a feasible forward solution is found, the inverse problem arises of inferring the parameters of the system from potentially noisy sensor data. Using common simplifications, the two mentioned applications have forward solutions which share critical characteristics. This enables the use of the same algorithms for both with only slight modifications. The multiple signal classification (MUSIC) algorithm uses the spectral properties of the forward solution to scan the operating volume. Although generally suboptimal, the MUSIC algorithm has the advantage of searching for one source at a time, compared to the multi-source fit necessary in a least-squares solution.
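The subspace scan behind MUSIC can be sketched for the classical far-field DOA setting. The following is an illustrative sketch only, not the implementation used in this thesis: the uniform linear array geometry, source angles, snapshot count and noise level are all made up for the example.

```python
import numpy as np

# Illustrative 1D MUSIC sketch for a uniform linear array (made-up setup).
n_sensors, n_sources = 8, 2
d = 0.5                                    # sensor spacing in wavelengths
true_angles = np.deg2rad([-20.0, 35.0])    # hypothetical source directions

def steering(theta):
    # Far-field one-source model: phase delays across the array.
    return np.exp(2j * np.pi * d * np.arange(n_sensors) * np.sin(theta))

# Simulate snapshots: random source amplitudes plus small sensor noise.
rng = np.random.default_rng(0)
A = np.column_stack([steering(t) for t in true_angles])
S = rng.standard_normal((n_sources, 200)) + 1j * rng.standard_normal((n_sources, 200))
Y = A @ S + 0.01 * (rng.standard_normal((n_sensors, 200))
                    + 1j * rng.standard_normal((n_sensors, 200)))

# Estimated signal subspace from the sample covariance; its complement
# (the noise subspace) is spanned by the minor eigenvectors.
R = Y @ Y.conj().T / Y.shape[1]
_, eigvec = np.linalg.eigh(R)              # eigenvalues in ascending order
En = eigvec[:, : n_sensors - n_sources]    # noise subspace basis

# Scan with the one-source model: peaks appear where the steering vector
# is (almost) orthogonal to the noise subspace.
grid = np.deg2rad(np.linspace(-90.0, 90.0, 1801))
spectrum = np.array([1.0 / np.linalg.norm(En.conj().T @ steering(t)) ** 2
                     for t in grid])
peaks = [i for i in range(1, len(grid) - 1)
         if spectrum[i - 1] < spectrum[i] > spectrum[i + 1]]
peaks = sorted(peaks, key=lambda i: spectrum[i], reverse=True)[:n_sources]
est = sorted(np.rad2deg(grid[p]) for p in peaks)
```

Note how the scan fits one source at a time: each grid point is tested with a single steering vector, regardless of how many sources are present.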

This thesis concentrates on the investigation of modeling errors. In all applications, the assumptions on the forward solution are inevitably met only to a certain extent. These sources of error can be unknown influences on the sensor data, construction inaccuracies or model parameters known only within tolerances. This leads to a discrepancy between forward solution and observed data even in the noiseless case with perfectly known system state, which in turn leads to an erroneous inverse solution. Analytical expressions for the expected error and second order moment of the state estimation are given in the asymptotic case. Existing error estimates concentrate largely on one-dimensional problems and employ first order approximations in one or several stages of the derivation. In



this work, these estimates are extended to potentially fully second order accuracy and to the three-dimensional (3D) case. Arising numerical problems are treated thoroughly.

As a minor result, we examine the case of a moving EM source in space. Algorithms used in this field are based on Bayesian inference, with the Kalman filter being a prominent member in the linear Gaussian case. A new weighting technique is proposed for the applications mentioned above which improves performance in the case of low sensor noise.

The remainder of this thesis is organized as follows. In Chapter 2, we present the physical models of the considered applications and our model of systematic errors. Chapter 3 introduces the MUSIC algorithm in its one- and three-dimensional variants. Some preliminary work on derivatives of eigensystems and singular systems is done in Chapters 4 and 5. A minor direct result of these chapters are derivatives of the 3D MUSIC cost function, which are stated in Chapter 6. The main contribution of this work can be found in Chapter 7, where we extend the existing asymptotic error estimates to near second order and to the 3D case. This 3D case introduces numerical problems which are addressed in Chapter 8. We consider the problem of a moving source in Chapter 9, where we propose a new weighting technique for a common sample based algorithm. We show the validity of the presented algorithms through some test cases in Chapter 10.



2. Physical background

We provide the physical background of the applications we consider in this work. The first setup is a source coil emitting an EM field which induces voltages in receiving sensor coils. The second application is a direction of arrival estimation problem where we use classical simplifying assumptions.

2.1. Electromagnetic source estimation

2.1.1. Magnetic induction

The voltage $u_{sc}$ induced in a closed filament wire sensor coil by a source coil with known current $i_{mc}$ of angular frequency $\omega = 2\pi f_0$ can be modeled by means of the mutual inductance $L$ between the two coils (see [1] pp. 256):

$$u_{sc} = L\,\frac{d i_{mc}(t)}{dt}.$$

Given an alternating sinusoidal current $i_{mc} = I_{mc}\, e^{j(\omega t + \phi_{mc})}$ with known amplitude $I_{mc}$ and phase $\phi_{mc}$, the induced voltage is given by

$$u_{sc} = j\omega L\, i_{mc}(t). \tag{2.1}$$

The mutual inductance is given by means of the magnetic flux $\Phi$ ([33] pp. 216),

$$L = \frac{\Phi}{i_{mc}(t)},$$

where

$$\Phi = \int_{A_{sc}} \vec{B} \cdot d\vec{A}_{sc} \tag{2.2}$$

is the integral of the normal component of the magnetic field $\vec{B}$ over the sensor coil area $A_{sc}$, and $\vec{B}$ satisfies the laws of magnetostatics (see [33] p. 180),

$$\nabla \times B = \mu_0 J, \tag{2.3}$$
$$\nabla \cdot B = 0,$$

where $J$ is the current density and $\mu_0$ is the permeability constant. The magnetic field $\vec{B}(\vec{x})$ can be described through a vector potential $A$,

$$\vec{B} = \nabla \times A, \tag{2.4}$$



Figure 2.1.: Left: direct solution of magnetic flux equation. Right: magnetic momentapproximation

which has the general form

$$A(\vec{x}) = \frac{\mu_0}{4\pi} \int_V \frac{J(\vec{x}\,')}{|\vec{x} - \vec{x}\,'|}\, dx' + \nabla\Psi(\vec{x}), \tag{2.5}$$

where $J(\vec{x}\,')$ is the current density distributed over the volume $V$ and $\Psi(\vec{x})$ is an arbitrary scalar function. $\Psi(\vec{x})$ can be specified by a gauge transformation; in magnetostatics, the use of the Coulomb gauge $\nabla \cdot A = 0$ is common. Combining (2.3), (2.4) and the Coulomb gauge leads to

$$\nabla(\nabla \cdot A) - \nabla^2 A = \mu_0 J, \tag{2.6}$$
$$\nabla^2 A = -\mu_0 J. \tag{2.7}$$

The solution of (2.7) is (2.5) ([33] pp. 181) with $\Psi$ constant, e.g.

$$A(\vec{x}) = \frac{\mu_0}{4\pi} \int_V \frac{J(\vec{x}\,')}{|\vec{x} - \vec{x}\,'|}\, dx'. \tag{2.8}$$

If we consider the source coil as a thin wire, $J(\vec{x}\,') = i_{mc}\,\delta(\vec{x}\,' - \vec{x}_{mc})\,\vec{t}(\vec{x}_{mc})$ holds ([41] p. 279), where $\vec{x}_{mc}$ is the position of the source coil wire and $\vec{t}$ is its tangent with $|\vec{t}\,| = 1$.

Later, we will use the so-called magnetic moment approximation, but for the sake of completeness, we show the magnetic flux resulting from directly integrating (2.2). Using (2.4) and (2.8) in (2.2) and Stokes' theorem yields (see also [33] pp. 234)

$$\Phi = \int_{A_{sc}} \nabla \times A \cdot d\vec{A}_{sc} = \oint_{\partial A_{sc}} A \cdot d\vec{x}_{sc} = \frac{\mu_0 i_{mc}}{4\pi} \oint_{\partial A_{sc}} \oint_{\partial A_{mc}} \frac{\vec{t}(\vec{x}_{mc})}{|\vec{x}_{sc} - \vec{x}_{mc}|}\, dx_{mc} \cdot d\vec{x}_{sc}. \tag{2.9}$$

4

Page 19: opus4.kobv.de · Zusammenfassung Parameterschätzung behandelt die Bestimmung des Zustands eines gegebenen Systems aufgrundvonindirektenSensormessungen. Unterden vielen Methoden ...

2. Physical background

The setup of the direct solution of the magnetic flux using the vector potential is illustrated in Figure 2.1, left.

Rather than (2.9), we will use a simplified model which is valid if the distance between sensor and source coil is much greater than the radius of the source coil. We expand the denominator in (2.8), where we shift the origin of the coordinate system in a way that $\vec{x}_{mc}$ is small, typically to the center of the source coil $\vec{x}^{\,c}_{mc}$. We denote the shifted values with superscript $s$ and get by Taylor expansion (see [33] pp. 185 and [41] pp. 279)

$$\frac{1}{|\vec{x}^{\,s}_{sc} - \vec{x}^{\,s}_{mc}|} = \frac{1}{|\vec{x}^{\,s}_{sc}|} + \frac{\vec{x}^{\,s}_{sc} \cdot \vec{x}^{\,s}_{mc}}{|\vec{x}^{\,s}_{sc}|^3} + \cdots.$$
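The second-order accuracy of this two-term expansion can be checked numerically; the coordinates in the following sketch are made up for illustration.

```python
import numpy as np

# Check the two-term expansion of 1/|x_sc - x_mc| (made-up coordinates):
# the truncation error is second order in |x_mc|.
x_sc = np.array([0.0, 0.0, 1.0])

def truncation_error(x_mc):
    exact = 1.0 / np.linalg.norm(x_sc - x_mc)
    r = np.linalg.norm(x_sc)
    two_terms = 1.0 / r + (x_sc @ x_mc) / r**3
    return abs(exact - two_terms)

# Halving |x_mc| reduces the error by roughly a factor of four.
e1 = truncation_error(np.array([0.05, 0.0, 0.1]))
e2 = truncation_error(np.array([0.025, 0.0, 0.05]))
assert e2 < e1 / 3
```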

It can be shown that the first term of this expansion inserted in (2.8) is equal to zero. We define the magnetic moment $\vec{m} := \frac{1}{2}\int \vec{x}^{\,s}_{mc} \times J(\vec{x}^{\,s}_{mc})\, dx^{s}_{mc}$. In the case of a planar source coil with surface area $A_{mc}$, the magnetic moment reduces to ([1] pp. 200)

$$\vec{m} = i_{mc}\, A_{mc}\, \vec{n}_{mc},$$

where $A_{mc}$ is the source coil area and $\vec{n}_{mc}$ is its surface normal. The second term of the expansion of the vector potential inserted in (2.8) yields

$$A(\vec{x}^{\,s}) = \frac{\mu_0}{4\pi}\, \frac{\vec{m} \times \vec{x}^{\,s}}{|\vec{x}^{\,s}|^3},$$
$$A(\vec{x}_{sc}) = \frac{\mu_0}{4\pi}\, \frac{\vec{m} \times (\vec{x}_{sc} - \vec{x}^{\,c}_{mc})}{|\vec{x}_{sc} - \vec{x}^{\,c}_{mc}|^3}. \tag{2.10}$$

Higher order terms of the expansion are neglected. Regarding (2.9), there are two possible ways to calculate the magnetic flux $\Phi$ using the magnetic moment model (2.10): either we calculate the curl of $A(\vec{x})$ analytically and integrate over $A_{sc}$, or we calculate the line integral over $\partial A_{sc}$.

The solution of $\nabla \times A(\vec{x})$ using the magnetic moment approximation is given by

$$B(\vec{x}) = \nabla \times A(\vec{x}) = \frac{\mu_0}{4\pi |\vec{p}\,|^3} \left( \frac{3\vec{p}\,(\vec{p} \cdot \vec{m})}{|\vec{p}\,|^2} - \vec{m} \right) = \underbrace{\frac{\mu_0}{4\pi |\vec{p}\,|^5} \left( 3\vec{p}\,\vec{p}^{\,T} - |\vec{p}\,|^2 I \right)}_{:=K(\vec{x})} \vec{m} = K(\vec{x})\,\vec{m}, \tag{2.11}$$

where $\vec{p} := \vec{x} - \vec{x}^{\,c}_{mc}$ and $I$ is the identity. One can observe that (2.11) depends linearly on $\vec{m}$. This is the foundation of the MUSIC algorithm, which we will present in Chapter 3. We write (2.10) in a similar form which emphasizes the linear dependency on $\vec{m}$:

$$A(\vec{x}) = \underbrace{\frac{\mu_0}{4\pi |\vec{p}\,|^3} \begin{pmatrix} 0 & \vec{p}_3 & -\vec{p}_2 \\ -\vec{p}_3 & 0 & \vec{p}_1 \\ \vec{p}_2 & -\vec{p}_1 & 0 \end{pmatrix}}_{=:K(\vec{x})} \vec{m} = K(\vec{x})\,\vec{m}, \tag{2.12}$$
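Both linear forms are easy to verify numerically. The sketch below, with made-up coordinates and moment and hypothetical helper names, evaluates the kernels of (2.10)–(2.12):

```python
import numpy as np

MU0 = 4e-7 * np.pi   # vacuum permeability

def K_field(x, x_mc):
    # Dipole-field kernel of (2.11): B(x) = K_field(x) m.
    p = x - x_mc
    r = np.linalg.norm(p)
    return MU0 / (4 * np.pi * r**5) * (3 * np.outer(p, p) - r**2 * np.eye(3))

def K_skew(x, x_mc):
    # Skew-symmetric kernel of (2.12): A(x) = K_skew(x) m.
    p = x - x_mc
    r = np.linalg.norm(p)
    return MU0 / (4 * np.pi * r**3) * np.array([[0.0, p[2], -p[1]],
                                                [-p[2], 0.0, p[0]],
                                                [p[1], -p[0], 0.0]])

def A_potential(x, x_mc, m):
    # Vector potential (2.10): A = mu0/(4 pi) (m x p) / |p|^3.
    p = x - x_mc
    r = np.linalg.norm(p)
    return MU0 / (4 * np.pi * r**3) * np.cross(m, p)

x = np.array([0.3, -0.1, 0.5])     # made-up field point
x_mc = np.zeros(3)                 # source at the origin
m = np.array([0.0, 0.0, 1e-2])     # made-up magnetic moment

# The skew-matrix form reproduces the cross product ...
assert np.allclose(K_skew(x, x_mc) @ m, A_potential(x, x_mc, m))
# ... and both kernels are linear in m, the property MUSIC relies on.
assert np.allclose(K_field(x, x_mc) @ (3.0 * m), 3.0 * (K_field(x, x_mc) @ m))
```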


where $\vec{p}_i$, $i = 1, 2, 3$, denote the components of $\vec{p} = \vec{x} - \vec{x}^{\,c}_{mc}$.

The integral (2.9) can be solved with a parameterization $\xi_{sc}(\varphi_{sc})$, $\varphi_{sc} \in [0, a]$:

$$\Phi = \oint_{\partial A_{sc}} A \cdot d\vec{x}_{sc} = \int_0^a A(\xi_{sc}(\varphi_{sc})) \cdot \xi'_{sc}(\varphi_{sc})\, d\varphi_{sc} = \int_0^a \xi'^{\,T}_{sc}(\varphi_{sc})\, K(\xi_{sc}(\varphi_{sc}))\, \vec{m}\, d\varphi_{sc} = \int_0^a \xi'^{\,T}_{sc}(\varphi_{sc})\, K(\xi_{sc}(\varphi_{sc}))\, d\varphi_{sc}\; \vec{m}. \tag{2.13}$$

Further simplifications are possible if the radius of the sensor coils is sufficiently small. In this case, the surface integral which results from using (2.11) in the first line of (2.9) can be approximated by the magnetic flux density in the center of the coil times the area of the coil. This results in

$$\Phi = \int_{A_{sc}} \nabla \times A \cdot d\vec{A}_{sc} = \int_{A_{sc}} K(\vec{x})\,\vec{m} \cdot d\vec{A}_{sc} = \int_{A_{sc}} \left[ K(\vec{x})\,\vec{m} \right] \cdot \vec{n}_{sc}\, dA_{sc} = \int_{A_{sc}} \vec{n}^{\,T}_{sc}\, K(\vec{x})\, dA_{sc}\; \vec{m} \tag{2.14}$$

$$\approx A_{sc}\, \vec{n}^{\,T}_{sc}\, K(\vec{x}^{\,c}_{sc})\, \vec{m}, \tag{2.15}$$

where $\vec{x}^{\,c}_{sc}$ denotes the center of the sensor coil. Figure 2.1, right, illustrates (2.14). (2.14) gives the same result as (2.13) but has the disadvantage that a surface integral has to be solved numerically; (2.15) is a straightforward approximation which gives an analytical solution.
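For a small sensor coil, the line integral (2.13) and the center approximation (2.15) can be compared numerically. The following sketch uses a made-up on-axis geometry and the skew-symmetric kernel of (2.12); it is an illustration, not the thesis's implementation.

```python
import numpy as np

MU0 = 4e-7 * np.pi

def K_skew(x, x_mc):
    # Skew-symmetric kernel of (2.12): A(x) = K_skew(x) m.
    p = x - x_mc
    r = np.linalg.norm(p)
    return MU0 / (4 * np.pi * r**3) * np.array([[0.0, p[2], -p[1]],
                                                [-p[2], 0.0, p[0]],
                                                [p[1], -p[0], 0.0]])

# Made-up geometry: dipole at the origin, small circular sensor coil
# centered on the dipole axis.
x_mc, m = np.zeros(3), np.array([0.0, 0.0, 1e-2])
center, n_sc, radius = np.array([0.0, 0.0, 0.4]), np.array([0.0, 0.0, 1.0]), 0.01

# (2.13): line integral of A along the coil boundary (periodic trapezoid rule).
t = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
flux_line = 0.0
for ti in t:
    xi = center + radius * np.array([np.cos(ti), np.sin(ti), 0.0])
    dxi = radius * np.array([-np.sin(ti), np.cos(ti), 0.0])  # xi'(t)
    flux_line += dxi @ (K_skew(xi, x_mc) @ m)
flux_line *= 2.0 * np.pi / len(t)

# (2.15): flux density at the coil center times the coil area, using the
# dipole-field kernel of (2.11).
p = center - x_mc
r = np.linalg.norm(p)
K_B = MU0 / (4 * np.pi * r**5) * (3 * np.outer(p, p) - r**2 * np.eye(3))
flux_center = np.pi * radius**2 * (n_sc @ (K_B @ m))

# For a coil much smaller than its distance to the source, the two agree.
assert abs(flux_line - flux_center) < 5e-3 * abs(flux_center)
```

The relative discrepancy scales with (coil radius / source distance)², consistent with the validity condition stated above.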

2.1.2. Measurement of induced voltage

In an actual application, it is not possible to measure the voltage induced by the source in a sensor coil directly. Measurement devices, parasitic currents and the self inductance of the sensor coil will influence the measurements. We show one possibility to model the source-sensor measurement setup in Figure 2.2. The influence of cables is neglected. The sensor coil is characterized by its resistance $R_s$, capacitance $C_s$ and self inductance $L_s$. The measurement device is characterized through its resistance $R_m$. The source coil influence is modeled through the mutual inductance $L$, where $L$ can be calculated as described



Figure 2.2.: Equivalent circuit of simple measurement setup

above. With Kirchhoff's circuit laws, the measured voltage $u_{meas}$ is given by

$$u_{meas} = \frac{j\omega R_m}{R_s + R_m + j\omega L_s + j\omega C_s R_m R_s - \omega^2 L_s C_s R_m}\; i(t)\, L(x_{mc}).$$

As can be seen, all measurement-related factors can be summarized in a linear factor $l := \frac{j\omega R_m}{R_s + R_m + j\omega L_s + j\omega C_s R_m R_s - \omega^2 L_s C_s R_m}$, which may vary from sensor to sensor. The measured voltage is then given by

$$u_{meas} = l\, i(t)\, L(x_{mc}).$$

Further real-world parameters such as the number of windings of the coils are neglected here but could easily be incorporated in $l$.
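The factor $l$ is a fixed complex number once the excitation frequency and the circuit components are known. A small sketch with made-up component values (none taken from the thesis):

```python
import cmath

def transfer_factor(omega, R_s, R_m, L_s, C_s):
    # Linear measurement factor l of the equivalent circuit above.
    num = 1j * omega * R_m
    den = (R_s + R_m + 1j * omega * L_s
           + 1j * omega * C_s * R_m * R_s
           - omega**2 * L_s * C_s * R_m)
    return num / den

omega = 2.0 * cmath.pi * 20e3     # made-up 20 kHz excitation
l = transfer_factor(omega, R_s=10.0, R_m=1e6, L_s=1e-4, C_s=1e-10)

# u_meas = l * i(t) * L(x_mc): l does not depend on the source position,
# so it can be absorbed into a per-sensor measurement gain.
```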

2.1.3. Summary

Overall, we presented four different possible calculations of the magnetic flux. (2.9) is the derivation of the magnetic flux through Faraday's law of induction. The magnetic moment approximation of the vector potential $A$ results in two equivalent possibilities to calculate the magnetic flux, (2.13) and (2.14), where (2.13) requires the numerical calculation of one integral while (2.14) requires two. (2.14) gives the possibility to approximate the integral through the sensor surface (2.15), which may be sufficient in certain applications.

We only measure the effective value $u_{eff} = \frac{\mathrm{Re}(u_m)}{\sqrt{2}}$ of the induced voltage, so all parameters can be considered to be real-valued. Considering $n$ sensors and $p$ measurements, the measurements are taken over time as $U = [u_{eff}(t_1), \ldots, u_{eff}(t_p)]$ for one sensor and are concatenated in $y(t) = [U_1^T, \ldots, U_n^T]^T \in \mathbb{R}^{n \times p}$. Using the magnetic moment approximation, the measurements of sensor $i$ and the source are connected through

$$y_i(t) = a(x_{mc})_i\; \omega_{mc}(t), \tag{2.16}$$

where ωmc(t) = nmci(t) ∈ R3×p is the normed source orientation, which is defined asthe surface normal of the source coil times the time series of current. In the following,this will be referred to as the orientation of the source. Although not rigorously correct,this notation has a nice geometrical interpretation. Given n sensors, we define a(xmc) :=[a(xmc)

T1 , . . . , a(xmc)

Tn

]T ∈ Rn×3 with the i−th row corresponding to the i−th sensor


and

    a(xmc)i =
      (1/2)√2 Re(ki Amc) ∫_0^{ai} ξsc,i^T(φsc) K(ξsc,i(φsc)) dφsc,
      (1/2)√2 Re(ki Amc) ∫_{Asc,i} nsc,i^T K(x) dAsc,i,
      (1/2)√2 Re(ki Amc) Asc,i nsc,i^T K(xmc),      i = 1, . . . , n,

depending on which calculation method of the magnetic flux is chosen. All linear parameters are summarized in a parameter ki for each sensor, called the measurement gain. In the case of m sources, we can simply concatenate the influence of the sources and get the final expression of the forward solution:

    y(t) = [a(xmc,1), . . . , a(xmc,m)] [ωmc,1(t); . . . ; ωmc,m(t)] =: A(x) ω(t),    (2.17)

where A(x) is the so-called lead-field matrix.

In this model, we do not consider mutual coupling between sensors. A sensor coil with nonzero current will itself emit an EM field, and therefore the sensors will influence each other. This would turn (2.17) into a system of equations M(x)y(t) = A(x)ω(t), with M(x) ∈ R^(n×n) describing the mutual inductance between source and sensors and among the sensors themselves. With the sensors not changing position, M(x) is constant except on the diagonal. It is easy to see that this system of linear equations can be reduced to the form (2.17). Therefore, and because the influence of mutual inductance is typically negligible, we do not model mutual inductance between sensors.
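The reduction argument can be illustrated numerically: solving M(x)y(t) = A(x)ω(t) for y(t) is the same as using the modified lead field M(x)⁻¹A(x) in (2.17). A minimal sketch with a randomly chosen lead field and a hypothetical, weakly coupled M (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 8, 2, 50                       # sensors, sources, time samples

A = rng.standard_normal((n, 3 * m))      # stand-in lead-field matrix
omega = rng.standard_normal((3 * m, p))  # stacked source orientations over time

# Hypothetical mutual-inductance matrix: identity plus small coupling terms
M = np.eye(n) + 0.01 * rng.standard_normal((n, n))

# M y = A omega  is equivalent to  y = (M^{-1} A) omega, i.e. a forward
# solution of the same form as (2.17) with effective lead field M^{-1} A.
y = np.linalg.solve(M, A @ omega)
A_eff = np.linalg.solve(M, A)
```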

2.2. Estimation of direction of arrival

Estimation of direction of arrival (DOA) of a signal is a classical problem in signal processing with applications in radar [53] or wireless communication [69], [26]. We assume an EM point source in the far field, characterized in polar coordinates through (θ, φ), emits an electromagnetic wave, see Figure 2.3 left. The wave impinges on n sensors located at positions rk, k = 1, . . . , n. The spatio-temporal wave field Ψ(r, t) propagates according to the wave equation ([33] Chapter 6 and [37] for a derivation from Maxwell's equations)

    (∇² − (1/c²) ∂²/∂t²) Ψ(r, t) = 0,

where ∇² is the spatial Laplacian operator and c is the velocity of propagation in the medium.

There are several common assumptions consistent with applications which result in a simplified model. We refer to the work of Doron [18] and Belloni [11]. The first assumption is φ = π, i.e. we consider a one-dimensional DOA problem in the xy-plane.
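As a quick consistency check (not part of the original derivation), a monochromatic plane wave satisfies the wave equation above whenever its wavenumber and frequency are related by ‖k‖ = ω/c:

```latex
\Psi(\mathbf{r},t) = e^{j(\omega t - \mathbf{k}\cdot\mathbf{r})}
\;\Rightarrow\;
\nabla^2\Psi = -\|\mathbf{k}\|^2\,\Psi,
\qquad
\frac{1}{c^2}\frac{\partial^2\Psi}{\partial t^2} = -\frac{\omega^2}{c^2}\,\Psi,
```
so that

```latex
\Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)\Psi
= \Bigl(\frac{\omega^2}{c^2} - \|\mathbf{k}\|^2\Bigr)\Psi = 0
\quad\Longleftrightarrow\quad \|\mathbf{k}\| = \frac{\omega}{c}.
```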


Figure 2.3.: Left: polar coordinates (θ, φ) of the EM source in the x, y, z frame. Right: EM plane wave impinging at origin, with sensor position rk at angle γk, source direction rs at angle θs and signed distance d

The sensors are considered isotropic, i.e. the radiation pattern of each sensor is unity for every EM source direction. We assume that the source, characterized by rs, ‖rs‖ = 1, or equivalently by θs, lies in the far field of the sensors, so that the signal received by the sensors can be considered a plane wave. The source signal s(t) is considered narrowband, i.e. it has a small bandwidth compared to the carrier frequency ωc = 2π/λ, where λ = c/f is the wavelength, f is the signal frequency and c is the propagation velocity of the wave in the medium. At the origin, the signal wave W(r, t) has the form

    W(0, t) = s(t) e^{jωc t}.

The origin is usually placed in the centroid of the sensors. Under these assumptions, the signal received by sensor k is then time-shifted by the distance between the sensor and the wave front through the origin ([26] and [16]), see Figure 2.3 right. The signed distance d between sensor and plane wave front through the origin is given by d = rk · rs or equivalently by d = ‖rk‖ cos(θs − γk), where θs, γk describe the incoming wave and sensor position angle, respectively. d is positive if the sensor is on the side of the plane wave with positive rs direction and negative otherwise. The time-shifted signal is given by ([40], [59])

    W(θs, t) = s(t + ‖rk‖ cos(θs − γk)) e^{jωc (t + ‖rk‖ cos(θs − γk))}.

Due to the narrowband assumption, we can assume s(t + ‖rk‖ cos(θs − γk)) ≈ s(t) and the sensor data is finally given by

    W(θs, t) = [s(t) e^{jωc t}] e^{jωc ‖rk‖ cos(θs − γk)}.

Given n sensors located at (r1, . . . , rn), we can define

    a(θs) = [e^{jωc ‖r1‖ cos(θs − γ1)}, . . . , e^{jωc ‖rn‖ cos(θs − γn)}]^T,


and assuming m signal sources impinging at x = (θ1, . . . , θm), we obtain a forward solution

    y(t) = [a(θ1), . . . , a(θm)] [e^{jωc t} s1(t); . . . ; e^{jωc t} sm(t)] = A(x) ω(t),    (2.18)

with A(x) := [a(θ1), . . . , a(θm)] and ω(t) := [e^{jωc t} s1(t); . . . ; e^{jωc t} sm(t)],

where A is again referred to as the lead-field matrix. As one can see, this forward solution has the same properties as (2.17) and is used frequently in DOA estimation problems ([62], [61], [74]).

This combination of linear (for orientation/time series) and nonlinear (for position/DOA) dependence of the measurements on the signal parameters, as illustrated in (2.17) and (2.18), can be found in various other applications such as electroencephalography (EEG) and magnetoencephalography (MEG) ([8]) or thunderstorm localization ([47]). It is the foundation of the MUSIC algorithm introduced in Chapter 3.
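To make the structure of (2.18) concrete, the following sketch builds the steering vectors and the lead-field matrix for a hypothetical uniform circular array in normalized units (array geometry, carrier frequency and source angles are all illustrative assumptions):

```python
import numpy as np

def steering_vector(theta_s, sensor_pos, omega_c):
    """Steering vector a(theta_s) of (2.18) for sensors at polar positions (r_k, gamma_k)."""
    r, gamma = sensor_pos[:, 0], sensor_pos[:, 1]
    return np.exp(1j * omega_c * r * np.cos(theta_s - gamma))

# Hypothetical uniform circular array: n sensors, radius 0.5, normalized units
n = 6
sensor_pos = np.column_stack([np.full(n, 0.5), 2 * np.pi * np.arange(n) / n])
omega_c = 2 * np.pi

# Lead-field matrix for two hypothetical sources at angles 0.3 and 1.2
A = np.column_stack([steering_vector(0.3, sensor_pos, omega_c),
                     steering_vector(1.2, sensor_pos, omega_c)])
```

Each entry of A is a pure phase factor, mirroring the exponential form of a(θs).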

2.3. Systematic noise

The models introduced in the last two sections will inevitably match the conditions in an application imperfectly. Sources of error can be of very different nature. First of all, the assumptions required to achieve the nonlinear/linear separation of the parameters shown in the last two sections may be violated in a substantial way. Furthermore, unknown EM sources may be present. Figure 2.4 shows an example where the sensor positions are not known exactly. Due to the distorted sensor positions, we have to expect erroneous parameter estimation even in the case of noise-free measurements. We will not treat the cases in which the physical model itself is not valid or unknown parameters influence the sensor data. To preserve tractability, we assume the model is correct but its parameters may not be. These errors will be called systematic errors or modeling errors. We model the erroneous forward solution in a very general sense as

y(t, h) = A(x, h)ω(t),

i.e., the systematic error is treated through the lead-field matrix. Of course, we have A(x, 0) = A(x) if the model has no systematic error. The influence of h on A depends on the given problem.

As an example, we consider the 3D estimation problem with erroneous sensor positions and the lead-field matrix as in (2.15). In this case, we have hi := [hi^k; hi^p; hi^o] ∈ R^7 for one sensor,


Figure 2.4.: Systematic errors in the case of distorted sensors: because of real vs. expected sensor positions, the inverse solution works on the measured sensor data y(t, h) instead of the expected sensor data y(t, 0), so the estimated state deviates from the true state x

h := [h1^T, . . . , hn^T]^T and

    a(xmc, hi)i = (1/2)√2 Re((ki + hi^k) Amc) Asc (nsc + hi^o)^T K(xmc + hi^p),

    a(xmc, h) = [a(xmc, h1)1^T, . . . , a(xmc, hn)n^T]^T,

    y(t, h) = [a(xmc,1, h), . . . , a(xmc,m, h)] [ωmc,1(t); . . . ; ωmc,m(t)],

    y(t, h) = A(x, h) ω(t).

We restrict ourselves to h ∈ Rk in this work, with k ∈ N depending on the problem.
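How h enters the lead field can be sketched as follows. The kernel K below is only a placeholder with an arbitrary 1/r² decay standing in for the field term of (2.15), and all geometry and gain values are hypothetical:

```python
import numpy as np

def K(x):
    """Placeholder field kernel with an arbitrary 1/r^2 decay (NOT the K of (2.15))."""
    r = np.linalg.norm(x)
    return x / r**3

def a_row(x_mc, k_i, A_mc, A_sc, n_sc, h=np.zeros(7)):
    """One lead-field entry with systematic error h = [h^k; h^p; h^o], as above."""
    h_k, h_p, h_o = h[0], h[1:4], h[4:7]
    return 0.5 * np.sqrt(2) * np.real((k_i + h_k) * A_mc) * A_sc \
        * (n_sc + h_o) @ K(x_mc + h_p)

# Hypothetical source position and sensor normal: unperturbed vs. perturbed entry
x = np.array([0.0, 0.0, 1.0])
nominal = a_row(x, 1.0, 1.0, 1.0, np.array([0.0, 0.0, 1.0]))
perturbed = a_row(x, 1.0, 1.0, 1.0, np.array([0.0, 0.0, 1.0]),
                  h=np.array([0.1, 0.0, 0.0, 0.05, 0.0, 0.0, 0.0]))
```

Even a small h changes the lead-field entry, which is exactly the systematic error the following chapters analyze.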


3. The MUSIC Algorithm

3.1. Introduction

The MUSIC algorithm is a long-known and successful tool in parameter estimation. To our knowledge, Schmidt [62] was the first to describe the MUSIC cost function. Mosher and Leahy introduced MUSIC in the field of EEG and MEG [50] (see also [47]). MEG and EEG brain imaging is an active field of research, and numerous variants of the algorithm have been introduced since then. Mosher and Leahy proposed an enhanced recursively applied (RAP-) MUSIC [48], [49], which is designed to treat difficult-to-resolve multiple sources and correlated sources. [17] introduced a hybrid algorithm which combines the global optimization technique space mapping (SM) with RAP-MUSIC. [31] suggested a precorrelated and orthogonally projected (POP-) MUSIC approach for a better resolution of correlated sources. [52] and [2] suggested the use of fourth-order (FO) moments of the acquired sample covariance matrix, which resulted in the FO-MUSIC algorithm. In the radar/wireless communication field, some specialized algorithms were invented. ROOT-MUSIC exploits the Vandermonde structure of the signal gain in the case of a uniform linear array of sensors [9]. If the sensors are aligned such that a subarray of sensors is just a translation of another subarray, the ESPRIT algorithm utilizes the translational invariance of the signal gain [61]. Variants for a uniform circular array (UCA) are introduced in [10].

Algorithms in the field of EEG/MEG can roughly be divided into two categories. MUSIC is considered to belong to the class of parametric algorithms, which provide a cost function that can be used in an optimization algorithm [8]. Examples of parametric algorithms besides MUSIC and its variants are least-squares estimation [8] and beamforming [72]. The other class of approaches, called imaging algorithms, uses a predefined grid on which a global solution is sought. Examples are LORETA [58] and FOCUSS [28], [29].

Throughout all the variants of MUSIC, the basic approach and cost function used by

Schmidt remained unchanged in its core, though. We will give the derivation and the ideas of Schmidt's approach for the case of scalar linear parameters in Section 3.2. In Section 3.3, we will treat the case of vector-valued linear parameters. In the aforementioned literature, this case was handled rather sloppily. Therefore, we will provide a more in-depth review of the necessary concept of principal angles.

We consider the problem of estimating the state parameters of a known number m of sources emitting signals measured by n sensors, where m < n. The MUSIC algorithm utilizes the special form of the forward solution (2.17) and (2.18)

y(t) = A(x)ω(t) + n(t)


derived in Sections 2.1.3 and 2.2 with additive noise n(t), where y ∈ R^n, A ∈ R^(n×k) and ω ∈ R^k. We consider the columns of A(x) = [a1(x), . . . , ak(x)] to be linearly independent to ensure the uniqueness of the solution, where k depends on the number of sources and the dimension of the linear parameters. We assume A, ω to be real-valued. This is justified by Section 2.1.3 in the 3D parameter estimation case. In the 1D case, A, ω are complex, but we restrict ourselves to the real case. The derivation of the MUSIC cost function in the complex case is a straightforward extension which can be found in various publications mentioned above (see [62], [21]).

The noise is assumed to be temporally and spatially uncorrelated Gaussian with zero mean and covariance E[n(t)n(t)^T] = σ² In. The signals of two different sources are assumed to be uncorrelated, and signal and noise are assumed to be uncorrelated as well. First, we will describe a one-dimensional approach as introduced in the DOA estimation problem in Section 2.2, in which the position of one source can be described through one parameter, i.e. k = m and ω = [ω1^T, . . . , ωm^T]^T with scalar ωi, i = 1, . . . , m. Later, we extend this approach to the fully three-dimensional case as shown in Section 2.1.3, i.e. ωi ∈ R³ and k = 3m, which will require the use of principal angles.

3.2. Scalar-valued problem

The asymptotic autocorrelation of y(t) is given by

    R = E[y(t)y(t)^T]
      = A(x) E[ω(t)ω(t)^T] A(x)^T + E[A(x)ω(t)n(t)^T] + E[n(t)ω(t)^T A(x)^T] + E[n(t)n(t)^T]
      = A(x) P A(x)^T + σ² In,    (3.1)

with P := E[ω(t)ω(t)^T], where we used the zero mean assumption of the noise. We have rank(R) = n and rank(P) = m, since we assumed the sources to be uncorrelated with the noise and among each other. When sorted in descending order of magnitude, the eigenvalues λi, i = 1, . . . , n, of R satisfy λ1 ≥ λ2 ≥ . . . ≥ λm > λm+1 = . . . = λn = σ², that is, the eigenvalues of A P A^T are shifted by the noise variance σ².
The MUSIC algorithm exploits this distribution of eigenvalues to split the space spanned by R into the so-called signal and noise subspaces through the eigendecomposition

    R = A(x) P A(x)^T + σ² I
      = [Φs Φn] [[Λ + σ²I, 0], [0, σ²I]] [Φs Φn]^T
      = Φs Λs Φs^T + Φn Λn Φn^T,

where Λ contains the eigenvalues of P, Λs = Λ + σ²I, Λn = σ²I, and the spans of the matrices Φs ∈ R^(n×m) and Φn ∈ R^(n×(n−m)) are the signal and the noise subspaces,


respectively. Since R is symmetric and of full rank, the signal and noise eigenvectors form an orthonormal basis of R^n. We define the MUSIC projector as

    Π := Φs Φs^T.    (3.2)

With Π being an orthogonal projection, obviously Π = I − Φn Φn^T also holds. If m is known, this partitioning is straightforward. In the case of m being unknown, a distinct drop in the magnitude of the eigenvalues of R can be used as an indicator of the number of sources [50]. The MUSIC projector can be used to determine the similarity between the signal space spanned by the received signal and the space spanned by an estimated source position x. The MUSIC cost function takes the form

    f(x) = a(x)^T Π a(x) / (a(x)^T a(x)),    (3.3)

where a(x) is a column of the lead-field matrix. (3.3) can be interpreted as a normalized projection of a(x) onto the space spanned by Π followed by a normalized scalar product between a(x)^T and the projected vector. Since the scalar product of two normalized vectors u and v can be interpreted as the cosine of the angle between u and v, we see that (3.3) will be 1 if a(x) lies entirely in the space spanned by Π and 0 if a(x) is orthogonal to it. The range of f(x) is [0, 1]. The lower bound is a direct consequence of the quadratic nature of f(x); the upper bound can easily be derived through the sub-multiplicative property of the 2-norm.
As a side note, the following identity also holds trivially:

    a(x)^T Π a(x) / (a(x)^T a(x)) = a(x)^T (I − Φn Φn^T) a(x) / (a(x)^T a(x)) = 1 − a(x)^T Φn Φn^T a(x) / (a(x)^T a(x)).

Since a(x) and a(y) are linearly independent for x ≠ y by assumption, f(x) = 1 if and only if x is an actual source position. A slightly different cost function was introduced by Ferréol in [21] with

    f(x) = a(x)^T ΠF a(x),    (3.4)
    ΠF = I − A(x)A(x)^+ = I − y(t)y(t)^+,

where (·)^+ denotes the Moore-Penrose pseudo-inverse and x is the actual source position. This cost function attains its minima at true source positions and has the advantage of having a simple derivative, which we will use in Section 10.2.1.

In an application, we will not have the autocorrelation R, but rather an approximation

    R̂ = (1/N) Σ_{i=1}^N y(ti) y(ti)^T

based on N measurements. This will lead to approximations Φ̂s and Φ̂n of the signal and noise subspaces. We therefore have to seek source positions which give the best projection into the space spanned by Φ̂s or, equivalently, are nearly orthogonal to Φ̂n.
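The whole estimation chain of this section (sample autocorrelation, subspace split, scan of the cost function (3.3)) can be sketched for a single source. The real-valued sensor gain a(θ) below is an arbitrary stand-in chosen for illustration, not the physical model of Section 2.2:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, N = 8, 1, 500                      # sensors, sources, time samples
sigma = 0.1                              # noise standard deviation

def a(theta):
    """Real-valued sensor gain for a circular array (illustrative assumption)."""
    gamma = 2 * np.pi * np.arange(n) / n
    return np.cos(5.0 * np.cos(theta - gamma))

theta_true = 0.8
A = a(theta_true)[:, None]               # lead-field matrix, n x m
omega = rng.standard_normal((m, N))      # source time series
Y = A @ omega + sigma * rng.standard_normal((n, N))

R_hat = Y @ Y.T / N                      # sample autocorrelation
_, eigvec = np.linalg.eigh(R_hat)        # eigenvalues in ascending order
Phi_s = eigvec[:, -m:]                   # estimated signal subspace
Pi_hat = Phi_s @ Phi_s.T                 # approximated MUSIC projector, cf. (3.2)

def f(theta):                            # approximated cost function, cf. (3.3)
    v = a(theta)
    return v @ Pi_hat @ v / (v @ v)

grid = np.linspace(0.0, np.pi, 2000)
theta_hat = grid[np.argmax([f(t) for t in grid])]
```

With high SNR the scan attains values close to 1 near the true source position only.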


The approximated cost function f̂(x) = a(x)^T Π̂ a(x) / (a(x)^T a(x)) can be used as the cost function in an optimization routine.

Note that f̂(x) does not depend on the signal orientation ω(t); therefore, the space the optimizer has to operate on is noticeably reduced. Once a feasible optimum x̂ := argmax_x f̂(x) is found, the remaining unknown parameters can be determined through ([50])

    ω(t) = A(x̂)^+ y(t).

In the presence of systematic noise h ≠ 0, the sensor data is distorted as y(t, h) = A(x, h)ω(t) and therefore the distorted projection matrix is given by ΠF(h) = I − y(t, h)y(t, h)^+. Assuming the asymptotic case, we have to solve the perturbed cost function

    f(x, h) = a(x)^T ΠF(h) a(x).    (3.5)
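Ferréol's projector is convenient to form numerically with the Moore-Penrose pseudo-inverse; by construction it annihilates the columns of A(x), so (3.4) vanishes at true source positions. A minimal sketch with a random stand-in lead field:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 2
A = rng.standard_normal((n, k))           # stand-in lead field at the true state x

Pi_F = np.eye(n) - A @ np.linalg.pinv(A)  # Ferreol's projector of (3.4)

# Pi_F annihilates every column of A, so f(x) = a^T Pi_F a is zero at a true
# source position, and Pi_F is idempotent, as an orthogonal projector must be.
assert np.allclose(Pi_F @ A, 0)
assert np.allclose(Pi_F @ Pi_F, Pi_F)
```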

3.3. Vector-valued problem

We now consider a three-dimensional approach. In this case, a(x) ∈ R^(n×3) and rank(P) = m hold, and we cannot use a simple projection like (3.3). We will compare the subspace spanned by the forward solution at an estimated source position with the signal subspace Π by means of principal angles.

Principal angles: definition

We use the definition of principal angles given in [27]. Let F and G be subspaces in R^r whose dimensions satisfy q = dim(F) ≥ dim(G) = p ≥ 1. The principal angles Θ1, . . . , Θp ∈ [0, π/2] are defined recursively by

    cos(Θk) = max_{u∈F} max_{v∈G} u^T v =: uk^T vk,

subject to

    ‖u‖₂ = ‖v‖₂ = 1,
    u^T ui = 0, i = 1, . . . , k − 1,
    v^T vi = 0, i = 1, . . . , k − 1,

where ui, vi describe the principal vectors belonging to the principal angles.
This definition is somewhat cumbersome. A simple calculation method is available, though.

Principal angles: calculation

We assume the bases of the subspaces in question are given as matrices. Let A ∈ R^(r×p) and B ∈ R^(r×q) be of full column rank with 0 < p ≤ q ≤ r. [27] describes a Householder orthogonalization to obtain orthogonal bases of A, B. We use the singular value


Figure 3.1.: Example of principal angles in three dimensions. Left: two two-dimensional subspaces. The principal angles are designated by α and β; β = 0 since the two subspaces share one dimension. Right: one one-dimensional and one two-dimensional subspace. The principal angle is the actual angle between vector and plane.

decomposition (SVD) as proposed in [12]. Assume UA ∈ R^(r×p) and UB ∈ R^(r×q) are orthonormal bases of the spaces spanned by A and B obtained through SVD, i.e. UA, UB consist of the left singular vectors. We investigate the singular value decomposition in more detail in Section 5.1. We define

    M := UA^T UB ∈ R^(p×q)

and its singular value decomposition M = UM SM VM^T, SM = diag(σ1, . . . , σp). If σ1 ≥ . . . ≥ σp, then the principal angles are given by

    cos(Θk) = σk.
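The SVD-based computation just described translates directly into code. A sketch, reproducing the two cases of Figure 3.1 for a shared direction (angle 0) and two planes tilted by 45 degrees:

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles between span(A) and span(B) via SVD of UA^T UB, cf. [12]."""
    UA, _, _ = np.linalg.svd(A, full_matrices=False)  # orthonormal basis of span(A)
    UB, _, _ = np.linalg.svd(B, full_matrices=False)
    M = UA.T @ UB
    s = np.linalg.svd(M, compute_uv=False)            # cosines, descending order
    return np.arccos(np.clip(s, -1.0, 1.0))

# Two planes in R^3: the xy-plane and a plane sharing the x-axis
P1 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # span{e1, e2}
P2 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])   # span{e1, e2 + e3}
angles = principal_angles(P1, P2)
```

The smaller angle is 0 (the shared direction e1); the larger is π/4, the actual tilt between the planes.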

The principal angles Θ1, . . . , Θp reflect the similarities between the subspaces spanned by A and B in R^r. If r = 3, the principal angles are actually the angles between the subspaces as illustrated in Figure 3.1.
To achieve a quadratic form as in (3.3), we express the squared cosines of the principal angles as

    f(x) = λmax(UA(x)^T Π UA(x)),    (3.6)

with λmax denoting the maximum eigenvalue of the enclosed expression and UA(x) the left singular vectors of the lead-field matrix A(x). (3.6) calculates the square of the cosine of the minimal principal angle. Once again, the assumption of linear independence of the 3m columns of A ensures f(x) = 1 if and only if x is an actual source position.
In the presence of systematic noise, the cost function is constructed akin to (3.5) as

    f(x, h) = λmax(UA(x)^T Π(h) UA(x)).    (3.7)

In the following two chapters, we will lay the foundation for derivatives of (3.6) and (3.7) with respect to x and h.


4. Derivatives of eigenvalues and eigenvectors

4.1. Introduction

In the next two chapters, we will lay the foundation for partial derivatives of the 3D MUSIC cost function. In the 1D case (3.4), derivatives w.r.t. x and h are straightforward and known (see [21] and Section 10.2.1). To our knowledge, derivatives for the 3D case are not stated in the literature yet. Recalling (3.6), we see that derivatives of this MUSIC cost function will require the derivative of a function of the form f(x) = λ(A(x)), where λ denotes an eigenvalue of the matrix function A.

For the rest of this section we assume the matrix A(x) ∈ R^(n×n), n ∈ N, is symmetric, depends continuously on x ∈ R^m, is sufficiently smooth, and that the derivatives of A with respect to x are available up to any required order. The eigenvalues λ1(x), . . . , λn(x) and eigenvectors u1(x), . . . , un(x) of A are known pointwise and the eigenvectors are normalized as uj^T uj = 1. We define an eigenvalue λi to be unique if λi ≠ λj for all x and i ≠ j. Conversely, we define λi to be a repeated eigenvalue if there is an x with λi = λj, j ≠ i. If we want to analyze the behavior of the MUSIC algorithm, we will naturally have to investigate eigenvalue derivatives of symmetric matrices. As we will see later, the calculation of higher-order eigenvalue derivatives will also require derivatives of the corresponding eigenvectors.

A lot of research has been done investigating the existence of derivatives of eigensystems. If A is an analytic function with distinct eigenvalues and x ∈ R^1, there are well-known proofs that its eigenvalues and eigenvectors are also analytic functions (see Rellich, [60] Chapter 1.1). This was proven to extend to the case of repeated eigenvalues if A is Hermitian [4]. Unfortunately, for x ∈ R^m, m > 1, there are simple examples in which the eigenvalues and eigenvectors are not differentiable, although directional derivatives are known to exist [67]. Despite this result, our experience has shown that our proposed algorithm gives valid information on the derivatives of the MUSIC cost function.

To our knowledge, Nelson [51] was among the first to propose a simple algorithm to compute the derivatives of eigenvalues and eigenvectors. Mills-Curran [46] provided a procedure for the case of repeated eigenvalues with distinct eigenvalue derivatives. Andrew and Tan [5] expanded this procedure to symmetric eigensystems with repeated eigenvalues and repeated eigenvalue derivatives up to an order k. [71] gave an extension to general complex-valued eigensystems for the same case of repeated eigenvalues. It is interesting to note that the approach shown there is very similar to [75], which we will use extensively in Chapter 5, where we treat the derivatives of singular vectors. An iterative procedure for large matrices is presented in [6]. We will use and slightly expand a comparatively old approach of Lee and Jung [39], because of its nice features of being


numerically stable, easy to implement and easy to expand to higher derivatives. For the rest of this chapter, we will drop the dependency of all terms on x for notational convenience.

4.2. Algorithm

4.2.1. The method of Lee and Jung

We extend the approach of Lee and Jung [39] to calculate the derivative of a unique eigenvalue and its corresponding eigenvector. Consider the generalized eigensystem

    A uj = λj M uj,    (4.1)

where A is symmetric positive semi-definite and M is symmetric positive definite. λj is the j-th eigenvalue and uj is the corresponding eigenvector of A, where λj is assumed to be unique. uj is normalized with respect to M as

    uj^T M uj = 1.    (4.2)

Differentiating both (4.1) and (4.2) and rearranging leads to

    (A − λj M) u'j − M uj λ'j = λj M' uj − A' uj,    (4.3)
    u'j^T M uj = −0.5 uj^T M' uj,    (4.4)

where (·)' indicates the derivative with respect to a design parameter. Lee and Jung write equations (4.3) and (4.4) as a single matrix equation

    [ A − λj M   −M uj ] [ u'j ]   =   [ (λj M' − A') uj ]
    [ −uj^T M      0   ] [ λ'j ]       [ 0.5 uj^T M' uj  ].    (4.5)

It can be shown ([39]) that the matrix on the left-hand side of (4.5) is of full rank if λj is unique (including the case λj = 0, λk ≠ 0, k ≠ j). Solving (4.5) for [u'j^T, λ'j]^T gives the desired derivatives of the eigenvalues and eigenvectors. An LU decomposition of the matrix on the left-hand side can be used to minimize the computational effort. In our application, we have M = I and M' = 0.
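With M = I and M' = 0, the bordered system (4.5) is easy to set up and solve numerically. The sketch below does so for the largest eigenvalue of a random symmetric family A(x) = A0 + x A1 (values illustrative) and checks the eigenvalue derivative against a central finite difference:

```python
import numpy as np

def eig_derivative(A, dA, j):
    """Solve the bordered system (4.5) with M = I, M' = 0 for (u'_j, lambda'_j)."""
    n = A.shape[0]
    lam, U = np.linalg.eigh(A)           # ascending eigenvalues, orthonormal U
    u = U[:, j]
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = A - lam[j] * np.eye(n)   # A - lambda_j I
    K[:n, n] = -u                        # -u_j column
    K[n, :n] = -u                        # -u_j^T row
    rhs = np.concatenate([-dA @ u, [0.0]])  # (lam M' - A')u_j = -A'u_j; bottom is 0
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n]               # u'_j and lambda'_j

rng = np.random.default_rng(3)
A0 = rng.standard_normal((5, 5)); A0 = A0 + A0.T
A1 = rng.standard_normal((5, 5)); A1 = A1 + A1.T
du, dlam = eig_derivative(A0, A1, 4)     # derivative of the largest eigenvalue

eps = 1e-6
fd = (np.linalg.eigvalsh(A0 + eps * A1)[4]
      - np.linalg.eigvalsh(A0 - eps * A1)[4]) / (2 * eps)
```

For a simple eigenvalue this reproduces the classical identity λ'j = uj^T A' uj.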

4.2.2. Extension to arbitrary derivatives

The extension of (4.5) to derivatives of arbitrary order is straightforward. We use multi-index notation as introduced in Appendix A.1 to describe higher-order partial derivatives. We form the α-derivative of (4.1) and (4.2), rearrange the result in the same way as (4.5) and get

    [ A − λj M   −M uj ] [ uj^(α) ]   =   [ L1 − L2 ]
    [ −uj^T M      0   ] [ λj^(α) ]       [   L3    ],    (4.6)


with

    L1 := Σ_{β<α, β≠0} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) λj^(β) M^(α−β−γ) uj^(γ) + λj Σ_{γ<α} ( α choose γ ) M^(α−γ) uj^(γ),    (4.7)

    L2 := Σ_{β<α} ( α choose β ) A^(α−β) uj^(β),    (4.8)

    L3 := 0.5 ( Σ_{β<α, β≠0} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) uj^(γ)T M^(α−β−γ) uj^(β) + Σ_{γ<α} ( α choose γ ) uj^(γ)T M^(α−γ) uj ),    (4.9)

where for the multi-indices α, β ∈ N0^n we define β < α if there exists an i with βi < αi and βj ≤ αj for all j ≠ i.

Proof of (4.7) and (4.8): Forming the α-th derivative of (4.1) with the Leibniz rule, we get

    (A uj)^(α) = (λj M uj)^(α),

    Σ_{β≤α} ( α choose β ) A^(α−β) uj^(β) = Σ_{β≤α} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) λj^(β) M^(α−β−γ) uj^(γ).    (4.10)

Rearranging the left and right side of (4.10), we get

    left side = A uj^(α) + Σ_{β<α} ( α choose β ) A^(α−β) uj^(β),

    right side = λj^(α) M uj + Σ_{β<α} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) λj^(β) M^(α−β−γ) uj^(γ)

              = λj^(α) M uj + Σ_{β<α, β≠0} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) λj^(β) M^(α−β−γ) uj^(γ) + λj Σ_{γ≤α} ( α choose γ ) M^(α−γ) uj^(γ)

              = λj^(α) M uj + Σ_{β<α, β≠0} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) λj^(β) M^(α−β−γ) uj^(γ) + λj Σ_{γ<α} ( α choose γ ) M^(α−γ) uj^(γ) + λj M uj^(α).

These left and right sides form the first row of (4.6).

Proof of (4.9):


Forming the α-th derivative of (4.2), we get

    (uj^T M uj)^(α) = 1^(α) = 0,

    Σ_{β≤α} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) uj^(γ)T M^(α−β−γ) uj^(β) = 0,

    Σ_{β<α, β≠0} Σ_{γ≤α−β} ( α choose β )( α−β choose γ ) uj^(γ)T M^(α−β−γ) uj^(β) + Σ_{γ<α} ( α choose γ ) uj^(γ)T M^(α−γ) uj = −2 uj^T M uj^(α),

which proves (4.9).
(4.6) has the nice feature that, regardless of the order of the derivative, the matrix on the left-hand side of the linear system of equations remains unchanged. Therefore, the system is always solvable. Using an LU decomposition to reduce the computational load is a natural choice.


5. Derivatives of singular values and singular vectors

5.1. Introduction

In the last chapter, we assumed that the derivatives of a matrix A, whose eigenvalue derivatives we calculate, are given up to arbitrary order. Regarding the MUSIC cost function (3.6), we see that these derivatives will require the derivatives of the left singular vectors of the lead-field matrix.

Let A ∈ R^(m×n), m, n ∈ N, m ≥ n and rank(A) = o ≤ n. The singular value decomposition (SVD) of A is a factorization

    A = U S V^T,

where U ∈ R^(m×m), V ∈ R^(n×n) are orthonormal matrices and S ∈ R^(m×n) with diag(S) = (σ1, . . . , σo) and zeros filling the remaining diagonal and off-diagonal elements of S. σi is called a singular value, and the columns of U and V form the left and right singular vectors, respectively. The singular values are nonnegative by construction and are typically arranged in descending order. We use the operator diag() akin to the built-in Matlab function diag which, given x ∈ R^m, generates an m×m diagonal matrix with the elements of x on the diagonal or, given X ∈ R^(m×m), generates a vector x ∈ R^m consisting of the diagonal elements of X. Efficient numerical methods to compute the SVD are well known [27].

Let A(x) depend smoothly on x ∈ R and let the derivatives of A with respect to x be available up to any required order. The existence and non-uniqueness of an analytic singular value decomposition (ASVD)

    A(x) = U(x) S(x) V(x)^T,    (5.1)

where A, U, S and V are all analytic functions, was first proven by Bunse-Gerstner et al. [14], along with an algorithm to calculate ASVDs. [45], [34] and [75] proposed additional algorithms for computing the ASVD. Derivatives of singular values and vectors are not investigated on their own there, but Wright [75] shows means to calculate these derivatives to use them in an explicit Runge-Kutta scheme. [55] explicitly investigates derivatives of the singular values but basically repeats the results of [75]. To our knowledge, there is no theory on the existence of singular value derivatives if A is not analytic, but, as in the case of eigenvalue derivatives, our numerical experience shows that the algorithm we present in the following sections gives feasible results.


We use the normalization of the singular vectors

    U(x)^T U(x) = I,    (5.2)
    V(x)^T V(x) = I.    (5.3)

The first o left singular vectors, i.e. the first o columns of U, are determined uniquely. The remaining columns of U are chosen such that U builds an orthonormal system of R^m, although they are not needed to build A. V is built analogously. The normalization (5.2) also holds if only the first o columns of U are considered, but, as will become apparent, we also need the following normalization, which only holds if U and V are square orthonormal matrices:

    U(x) U(x)^T = I,    (5.4)
    V(x) V(x)^T = I.    (5.5)

Computing full U and V matrices for given x poses no problem, since basically all numerical routines provide means to calculate a full SVD (often these routines are performed by default, and the rectangular versions of U, V are called 'sparse'). For the rest of this chapter, as before, we will not indicate the dependence on x explicitly for ease of notation.

5.2. Proposed method

We adapt and extend the approach by Wright ([75]); the following section therefore largely duplicates his work. Wright uses derivatives of singular values and vectors to build ASVDs. The importance of this approach for computing singular value derivatives is not explored there. Furthermore, only a non-defective square A is examined explicitly. We expand this approach to rectangular matrices and provide closed-form expressions. These closed-form expressions allow the calculation of arbitrary derivatives in the same form as (4.6). Furthermore, we examine the case σp = σq, σ'p = σ'q, σ''p ≠ σ''q, p, q = 1, . . . , o, p < q, where (·)' indicates the derivative w.r.t. a design parameter as before. To our knowledge, this has not been treated in the existing literature yet.

5.2.1. Distinct singular values

We differentiate (5.1), (5.2) and (5.3):

A' = U'SV^T + US'V^T + USV'^T, (5.6)

0 = U^T U' + U'^T U,

0 = V^T V' + V'^T V.

By setting Z := U^T U' and W := V^T V' we get

Z + Z^T = 0,

W + W^T = 0,


thus Z and W are anti-symmetric and diag(Z) = diag(W) = 0. With (5.4), (5.5), we have U' = UZ and V' = VW, so it remains to determine Z and W in order to calculate the derivatives of the singular vectors. We set Q := U^T A' V and restate (5.6) as

S' = Q − ZS + SW. (5.7)

Since Z and W are anti-symmetric and S is diagonal, we have diag(S') = diag(Q), and thus the derivatives of the singular values are already determined. Wright continues to calculate the off-diagonal elements of Z and W element-wise using (5.7). W.l.o.g. we assume p < q, p = 1, ..., o − 1, q = 2, ..., o and get

σ_q Z_pq − σ_p W_pq = Q_pq, (5.8)

−σ_p Z_pq + σ_q W_pq = Q_qp. (5.9)

Excluding the trivial case of either σ_p = 0 or σ_q = 0, these two equations can be used to obtain Z_pq, W_qp:

Z_pq = (σ_p Q_qp + σ_q Q_pq) / (σ_q^2 − σ_p^2), (5.10)

W_qp = (Q_pq − σ_q Z_pq) / σ_p. (5.11)

Clearly, these equations only hold for σ_q ≠ σ_p; the case σ_q = σ_p will be examined in Section 5.2.3. Since p, q ≤ o, several elements of Z, W still have to be determined. Let p = 1, ..., o, k = o + 1, ..., m and l = o + 1, ..., n. Then (5.7) determines the remaining elements through

σ_p Z_kp = Q_kp,
σ_p W_pl = −Q_pl. (5.12)

The remaining blocks Z_kl, W_kl, k, l > o, are not determined by (5.7) and are therefore set to zero, as proposed by Wright.
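The relation diag(S') = diag(Q) derived above can be verified numerically. The following sketch (an assumed example with an invented matrix family A(x), not from the thesis) compares the analytic singular value derivatives with finite differences:

```python
import numpy as np

def A_of(x):
    # invented smooth matrix family with distinct singular values
    base = np.array([[3.0, 0.2, 0.1],
                     [0.1, 2.0, 0.3],
                     [0.2, 0.1, 1.0],
                     [0.0, 0.3, 0.2]])
    direction = np.array([[0.5, 0.0, 0.1],
                          [0.2, 0.4, 0.0],
                          [0.1, 0.3, 0.2],
                          [0.3, 0.0, 0.1]])
    return base + x * direction

x0, d = 0.3, 1e-6
U, s, Vt = np.linalg.svd(A_of(x0), full_matrices=True)
A_prime = (A_of(x0 + d) - A_of(x0 - d)) / (2 * d)   # exact here: A is linear in x
Q = U.T @ A_prime @ Vt.T                            # Q = U^T A' V

s_prime = np.diag(Q[:3, :3])                        # diag(S') = diag(Q)

# finite-difference reference for the singular value derivatives
s_fd = (np.linalg.svd(A_of(x0 + d), compute_uv=False)
        - np.linalg.svd(A_of(x0 - d), compute_uv=False)) / (2 * d)
assert np.allclose(s_prime, s_fd, atol=1e-4)
```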

5.2.2. Closed-form expressions of the derivatives

The formulas (5.10), (5.11) and (5.12) of the previous section allow the calculation of Z and W element-wise. It may be favorable to obtain a closed form for both. A complete closed form is not possible: as (5.10), (5.11) and (5.12) show, for n ≠ o ≠ m several cases for the calculation of Z and W have to be distinguished (generally to avoid dividing by 0). However, for each of these separate cases, closed-form expressions are possible. We decompose Z and W into submatrices as

Z = [  Z_1    Z_2 ]        W = [  W_1    W_2 ]
    [ −Z_2^T   0  ],           [ −W_2^T   0  ],


where Z_1, W_1 ∈ R^{o×o} are anti-symmetric, Z_2 ∈ R^{o×(m−o)} and W_2 ∈ R^{o×(n−o)}. The following expressions are inspired by Anderson [3], where a closed expression for the derivatives of singular vectors is stated. Unfortunately, only some of the derivatives shown there are correct, since Anderson sets Z_1 = W_1 = 0, which is erroneous. To see this, consider A to be square and of full rank. In this case, Z_2 and W_2 vanish and the singular vector derivatives would always be zero, which is obviously wrong. We split the singular value decomposition into the parts which do and do not contribute to A:

A = [U_1 U_2] [ Λ  0 ] [ V_1^T ]
              [ 0  0 ] [ V_2^T ],

where U_1 ∈ R^{m×o}, U_2 ∈ R^{m×(m−o)}, V_1 ∈ R^{n×o}, V_2 ∈ R^{n×(n−o)} and Λ ∈ R^{o×o}. Now, Q is structured as

Q = [ Q_1  Q_2 ] = [ U_1^T A' V_1   U_1^T A' V_2 ]
    [ Q_3  Q_4 ]   [ U_2^T A' V_1   U_2^T A' V_2 ].

Rewriting (5.7) with these expressions, we immediately get ([3])

Z_2^T = −U_2^T A' V_1 Λ^{−1},   W_2 = −Λ^{−1} U_1^T A' V_2, (5.13)

and

Λ' = Q_1 − Z_1 Λ + Λ W_1. (5.14)

We introduce a new operator vec, where vec(A) regroups the upper diagonal elements of A into a vector in a row-wise manner. A simple example is shown in Appendix A.2. Now we can define z := vec(Z_1) ∈ R^{o(o−1)/2}, w := vec(W_1) ∈ R^{o(o−1)/2} and y := [vec(Q_1), vec(Q_1^T)] ∈ R^{o(o−1)}. Let

s_1 := [σ_2, ..., σ_o, σ_3, ..., σ_o, ..., σ_{o−1}, σ_o, σ_o] ∈ R^{o(o−1)/2},

s_2 := [σ_1, ..., σ_1 (o−1 times), σ_2, ..., σ_2 (o−2 times), ..., σ_{o−1}] ∈ R^{o(o−1)/2},

S_1 := diag(s_1),

S_2 := diag(s_2).

(5.14) can now be regrouped as a system of linear equations

[  S_1  −S_2 ] [ z ]
[ −S_2   S_1 ] [ w ] = y, (5.15)

where we denote the matrix on the left-hand side by S. Its inverse can easily be found using techniques for the inverse of a block matrix. (5.15) is of full rank for σ_i ≠ σ_j. To see this, consider

det(S) = det(S_1) det(S_1 − S_2 S_1^{−1} S_2),


where det(S_1) = ∏_{i=1}^{o} σ_i^{i−1} ≠ 0. The second determinant is also diagonal, and we have to check whether zero entries on the diagonal are possible. Examining s_2 more closely, we see that it is separated into o − 1 blocks of decreasing length: block l has length o − l and the constant value σ_l, while the corresponding entries of s_1 run through σ_{l+1}, ..., σ_o. For an index k inside block l with (s_1)_k = σ_q, q > l, we obtain

(S_1 − S_2 S_1^{−1} S_2)_kk = σ_q − σ_l (1/σ_q) σ_l,

which is zero only if σ_l = σ_q. Since the singular values are arranged in decreasing order, this would already imply σ_l = σ_{l+1}, which suffices to prove that (5.15) is of full rank if σ_i ≠ σ_j.

We now show the validity of (5.15) by proving that all equations in (5.8) and (5.9) are incorporated in (5.15). Let p = 1, ..., o − 1, q = 2, ..., o, p < q, i = 1, ..., o(o−1)/2, j = o(o−1)/2 + 1, ..., o(o−1). We introduce the equalities

i = Σ_{k=1}^{p−1} (o − k) + q − p,
j = i + o(o−1)/2, (5.16)

where one can observe that Q_pq = y(i) and Q_qp = y(j). For given i, the elements of the i-th row of S are zero, except for S_ii and S_ij. By construction, (s_1)_i = σ_q and (s_2)_i = σ_p. Therefore, the i-th row of the system of linear equations (5.15) is equivalent to (5.8). The same approach can be used to prove the equivalence between the j-th row of (5.15) and (5.9), which concludes the proof.

Higher derivatives of Z, W, and thus higher derivatives of U, V and S, can be obtained by constructing derivatives of (5.13) and (5.15) analogously to (4.6). This will not be fleshed out here, but it is clear that we again retain the benefits of the LR decomposition.
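The assembly of (5.15) can be sketched as follows (an assumed NumPy implementation; the helper names vec_upper and solve_zw are hypothetical, and the result is cross-checked against the element-wise formula (5.10)):

```python
import numpy as np

def vec_upper(M):
    """Row-wise vectorization of the upper diagonal elements (the vec operator)."""
    o = M.shape[0]
    return np.array([M[p, q] for p in range(o) for q in range(p + 1, o)])

def solve_zw(sigma, Q1):
    """Solve (5.15) for z = vec(Z1), w = vec(W1), distinct singular values assumed."""
    o = len(sigma)
    s1 = np.array([sigma[q] for p in range(o) for q in range(p + 1, o)])
    s2 = np.array([sigma[p] for p in range(o) for q in range(p + 1, o)])
    S = np.block([[np.diag(s1), -np.diag(s2)],
                  [-np.diag(s2), np.diag(s1)]])
    y = np.concatenate([vec_upper(Q1), vec_upper(Q1.T)])
    zw = np.linalg.solve(S, y)
    k = o * (o - 1) // 2
    return zw[:k], zw[k:]

sigma = np.array([3.0, 2.0, 1.0])          # distinct singular values
rng = np.random.default_rng(1)
Q1 = rng.standard_normal((3, 3))
z, w = solve_zw(sigma, Q1)

# cross-check against the element-wise formula (5.10) for (p, q) = (1, 2)
p, q = 0, 1
Z_pq = (sigma[p] * Q1[q, p] + sigma[q] * Q1[p, q]) / (sigma[q]**2 - sigma[p]**2)
assert np.isclose(z[0], Z_pq)
```

For repeated singular values the matrix S becomes singular, which is exactly the situation addressed in Section 5.2.3.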

5.2.3. Repeated singular values

Similar to repeated eigenvalues, the singular vectors associated with repeated singular values are not unique: for σ_p = σ_q, any linear combination of the singular vectors u_p, u_q is again a singular vector. As in the case of repeated eigenvalues (see for example [5]), one has to utilize information from higher derivatives of the SVD.

Wright [75] already showed the derivation of Z and W in the case of one repeated singular value pair with distinct derivatives. We will follow the derivation there, but will incorporate it into our closed-form expression (5.15). Consider q = p + 1, σ_p = σ_q and σ'_p ≠ σ'_q. We will not treat the extension to multiple equal singular values σ_k = ... = σ_l, l − k > 1, explicitly, but as will become clear, this extension is a straightforward continuation of our approach.

All entries of Z, W except the entries (p, q), (q, p) are computable. To see this, let S̄ ∈ R^{(o(o−1)−2)×(o(o−1)−2)} be the matrix S where rows i, j and columns i, j are eliminated, with the indices (i, j) computed according to (5.16). The reduced vectors z̄, w̄ and ȳ are constructed analogously by erasing entries i, j. Now, by construction, the system of linear equations

S̄ [ z̄ ]
   [ w̄ ] = ȳ (5.17)


is of full rank. In combination with (5.13), all entries of Z, W except (p, q), (q, p) are then known. Since Q is readily computable, S' is also known.

We introduce the concept of reduced matrices, which will simplify the notation. Let Z_R be the reduced matrix of Z, where (Z_R)_kl = Z_kl for (q, p) ≠ (k, l) ≠ (p, q) and (Z_R)_pq = (Z_R)_qp = 0. Then, for P, Q ∈ R^{m×m},

(ZQ)_pq = (Z_R Q)_pq + Z_pq Q_qq,
(QZ)_pq = (Q Z_R)_pq + Z_pq Q_pp,
(PZQ)_pq = (P Z_R Q)_pq − P_pq Z_pq Q_pq + P_pp Z_pq Q_qq

holds. If σ_p = σ_q, then (5.8) and (5.9) do not include sufficient information to determine Z_pq, W_pq uniquely. [75] calculates the respective derivatives and adds (5.8)' to (5.9)' to obtain a second equation. With the use of reduced matrices and

Q' = U^T A'' V − ZQ + QW, (5.18)

we obtain

2(σ'_q − σ'_p) Z_pq + 2(σ'_q − σ'_p) W_pq = (U^T A'' V)_pq + (U^T A'' V)_qp
                                           − (Z_R Q)_pq − (Z_R Q)_qp
                                           + (Q W_R)_pq + (Q W_R)_qp. (5.19)

(5.8) (or (5.9), which contains the same information) and (5.19) determine Z_pq and W_pq uniquely, and the first derivatives of U and V can be calculated. In the case of σ_k = ... = σ_l, l − k > 1, we have to eliminate additional rows and columns in (5.17) and consequently consider more equations similar to (5.19) and (5.8).

To our knowledge, there is no existing procedure in the literature which deals with the case σ_p = σ_q, σ'_p = σ'_q and σ''_p ≠ σ''_q. [75] addresses this issue briefly as being 'rather long'. The derivation, which is indeed complicated and technical, is presented in Appendix A.3. We show here the resulting system of linear equations. As before, we assume Z_kl, W_kl, (k, l) ≠ (p, q) are already computed by (5.17) and (5.13), and S' is computed from Q. Z_pq, W_pq can then be calculated by solving

[ σ_p                              −σ_q                              0^T ] [ Z_pq ]   [ Q_pq            ]
[ l_1 + l_2 + l_3,z + l_4,z         l_1 + l_2 + l_3,w + l_4,w         v_l ] [ W_pq ] = [ r_2 + r_3 + r_4 ] (5.20)
[                M                                                    S̄  ] [  z̄'  ]   [ r_5             ]
                                                                           [  w̄'  ]


with

l_1 = P''_qq − P''_pp − (Z_R Q)_qq + (Z_R Q)_pp + (Q W_R)_qq − (Q W_R)_pp,

l_2 = 2(P''_qq − P''_pp),

r_2 = P'''_pq + P'''_qp − 2(Z_R P'')_pq + 2(P'' W_R)_pq − 2(Z_R P'')_qp + 2(P'' W_R)_qp,

l_3,z = −(Z_R Q)_qq + (Z_R Q)_pp + 2(Q W_R)_qq − 2(Q W_R)_pp,

l_3,w = −2(Z_R Q)_qq + 2(Z_R Q)_pp + (Q W_R)_qq − (Q W_R)_pp,

r_3 = (Z_R Z_R Q)_pq + (Z_R Z_R Q)_qp + (Q W_R W_R)_pq + (Q W_R W_R)_qp
      − 2((Z_R Q W_R)_pq + (Z_R Q W_R)_qp),

l_4,z = (1/σ_p)(Q_2 Q_2^T)_qq − (1/σ_q)(Q_2 Q_2^T)_pp,

l_4,w = (1/σ_p)(Q_3^T Q_3)_qq − (1/σ_q)(Q_3^T Q_3)_pp,

r_4 = (Λ^{−1}(Λ' Z_2 − (W_R)_1 Q_3^T − W_2 Q_4^T + P''_3^T + Q_1^T Z_2) Q_3)_pq + (...)_qp
      + (Q_2 [W_2^T Λ + Q_2^T (Z_R)_1 + Q_4^T Z_2^T − P''_2^T − W_2^T Q_1^T] Λ^{−1})_pq + (...)_qp,

H = P''_1 − Z_{1,R} Q_1 − Z_2 Q_3 + Q_1 W_{1,R} − Q_2 W_2^T,

h = [ vec(H)   ]
    [ vec(H^T) ],

r_5 = h − S' [ z ]
             [ w ],

where, for ease of notation, the term (...) always contains the same components as the preceding bracketed expression. We used P defined by

P' := U^T A' V = Q,   P'' := U^T A'' V,

P'' = [ P''_1  P''_2 ] = [ U_1^T A'' V_1   U_1^T A'' V_2 ]
      [ P''_3  P''_4 ]   [ U_2^T A'' V_1   U_2^T A'' V_2 ],

and v_l ∈ R^{1×(o(o−1)−2)}, M ∈ R^{(o(o−1)−2)×2} are calculated by Algorithms 2 and 3 shown in Appendix A.3, respectively. Note that solving (5.20) also provides all entries of Z' and W' except the entries (p, q), (q, p). This would come in handy for the calculation of higher derivatives of the singular vectors, but we will focus on the first derivatives. A small numerical example illustrating the validity of (5.20) is shown in the next section.


5.2.4. Example of repeated singular values with repeated derivatives

We present an example of our calculation of singular value and vector derivatives in the case σ_p = σ_q, σ'_p = σ'_q, σ''_p ≠ σ''_q. Let

U := (1/√3) [  cos(x)     1       sin(x)   −1     ]
            [ −sin(x)    −1       cos(x)   −1     ]
            [  1         −sin(x)   1        cos(x) ]
            [ −1          cos(x)   1        sin(x) ],

V := [ cos(x)  −sin(x)  0 ]
     [ sin(x)   cos(x)  0 ]
     [ 0        0       1 ],

S := diag(e^x, ln(x+1) + 1, e^{x+1}),

and set A := USV^T. This is a modified example taken from [5]. U and V are orthonormal, and A has two equal singular values with equal singular value derivatives at x = 0; we take p = 1, q = 2, i.e. i = 1, j = 4. We can derive Z and W and their respective derivatives from these analytic expressions, with Z_{1,2} = −1/3, W_{1,2} = −1. With (5.13), we obtain Z_2 = [−1/3, 1/3, 1/3]^T and W_2 = ∅. The system of linear equations (5.17) reads as

[  σ_3    0    −σ_1    0  ] [ Z_{1,3} ]   [ Q_{1,3} ]
[  0     σ_3    0    −σ_2 ] [ Z_{2,3} ] = [ Q_{2,3} ]
[ −σ_1    0    σ_3     0  ] [ W_{1,3} ]   [ Q_{3,1} ]
[  0    −σ_2    0     σ_3 ] [ W_{2,3} ]   [ Q_{3,2} ]

with the result

[Z_{1,3}, Z_{2,3}, W_{1,3}, W_{2,3}]^T = [1/3, 1/3, 0, 0]^T.

With these elements, the reduced matrices have the shape

Z_R = [  0     0     1/3  −1/3 ]
      [  0     0     1/3   1/3 ]
      [ −1/3  −1/3    0    1/3 ]
      [  1/3  −1/3  −1/3    0  ],     W_R = 0 ∈ R^{3×3}.

We state now the elements of (5.20):

l_1 = −2,   l_2 = −4,
l_3,z = l_3,w = 0,   l_4,z = l_4,w = 0,
r_2 = 8,   r_3 = 0,   r_4 = −1/9,

r_5 = [2e/9, 7e/9, 0, −1]^T − [−e/3, e/3, −1/3, −1/3]^T.

M and v_l are calculated through Algorithms 2 and 3 as

M = [  e/3    0   ]
    [ −e/3    0   ]
    [  0    −1/3  ]
    [  0     1/3  ],     v_l = [−1/3, −1/3, e/3, e/3].


With these elements, we can solve Equation (5.20) and obtain

[ Z_{1,2} ]   [ −1/3 ]        [ z̄' ]
[ W_{1,2} ] = [ −1   ],       [ w̄' ] = [0, 1/3, 0, 0]^T,

which are indeed the desired missing elements of Z and W as established at the beginning of this section. The correctness of the elements z̄' and w̄' can easily be checked with the given analytic singular values and singular vectors, which concludes the example.


6. Derivatives of 3D MUSIC

With the algorithms provided in Chapters 4 and 5, we are now able to differentiate the MUSIC cost function with systematic errors (3.7). We will need these derivatives in the analysis of the asymptotic behavior of the estimation error in Chapter 7. To our knowledge, no such derivatives are given in the vast literature on MUSIC and its proposed algorithms. They might be of interest in themselves, since exact rather than approximate knowledge of the derivatives has definite advantages for solving the inverse optimization problem. We recall the perturbed MUSIC projection function

F(x, h) = U^T(x) Π(h) U(x) (6.1)

and the cost function (3.7)

f(x, h) = λ_max(U^T(x) Π(h) U(x)) =: λ_max(F(x, h)). (6.2)

We calculate the required derivatives of the maximum eigenvalue through the algorithm presented in Chapter 4, but since λ_max in (6.2) is unique by assumption, we are able to express the derivative of (6.2) with respect to h_i in closed form as (see [44])

∂_{h_i} f(x, h) = u_max^T ∂_{h_i}[F(x, h)] u_max,

where u_max(x, h) is the normalized eigenvector corresponding to λ_max(F). This allows us to express higher order derivatives of f in closed form as well, as opposed to solving a linear system. First and second order derivatives of the MUSIC projector Π are calculated in [73], Appendix B. To abbreviate notation, we write K_{h_i} := ∂K/∂h_i and K_{x_i} := ∂K/∂x_i for some quantity K. We set y_{h_i} := A_{h_i} x, y_{h_i h_j} := A_{h_i h_j} x and Π^⊥ := I − Π and get

∂_{h_i} Π = (Π^⊥ y_{h_i} y^+) + (...)^T,

∂_{h_i h_j} Π = (−Π_{h_j} y_{h_i} y^+ + Π^⊥ y_{h_i h_j} y^+ + Π^⊥ y_{h_i} (y^T y)^{−1} y_{h_j}^T Π^⊥ − Π^⊥ y_{h_i} y^+ y_{h_j} y^+) + (...)^T.

To get a feeling for the characteristics of the derivatives of (6.2), we show some of them; extensions to higher derivatives are straightforward. Derivatives w.r.t. h:

∂_{h_i} f(x, h) = u_max^T U^T Π_{h_i} U u_max,

∂_{h_i h_j} f(x, h) = (u_max)_{h_j}^T U^T Π_{h_i} U u_max + u_max^T U^T Π_{h_i h_j} U u_max
                    + u_max^T U^T Π_{h_i} U (u_max)_{h_j}.
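The closed-form derivative of a simple maximum eigenvalue can be checked numerically. The following sketch (an assumed example with an invented symmetric family F(h)) compares u_max^T F' u_max with a finite difference of λ_max:

```python
import numpy as np

def F_of(h):
    # invented smooth symmetric family with a simple largest eigenvalue
    B = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 0.5]])
    C = np.array([[0.4, 0.1, 0.0],
                  [0.1, 0.2, 0.3],
                  [0.0, 0.3, 0.1]])
    return B + h * C

h0, dh = 0.2, 1e-6
lam, vecs = np.linalg.eigh(F_of(h0))
u_max = vecs[:, -1]                       # eigenvector of the largest eigenvalue
F_prime = (F_of(h0 + dh) - F_of(h0 - dh)) / (2 * dh)

analytic = u_max @ F_prime @ u_max        # closed-form derivative of lambda_max
fd = (np.linalg.eigvalsh(F_of(h0 + dh))[-1]
      - np.linalg.eigvalsh(F_of(h0 - dh))[-1]) / (2 * dh)
assert np.isclose(analytic, fd, atol=1e-5)
```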


Derivatives w.r.t. x:

∂_{x_i} f(x, h) = u_max^T (U_{x_i}^T Π U + U^T Π U_{x_i}) u_max,

∂_{x_i x_j} f(x, h) = (u_max)_{x_j}^T (U_{x_i}^T Π U + U^T Π U_{x_i}) u_max
                    + u_max^T (U_{x_i}^T Π U + U^T Π U_{x_i})_{x_j} u_max
                    + u_max^T (U_{x_i}^T Π U + U^T Π U_{x_i}) (u_max)_{x_j}.

Mixed derivatives:

∂_{h_i x_j} f(x, h) = (u_max)_{x_j}^T U^T Π_{h_i} U u_max + u_max^T U_{x_j}^T Π_{h_i} U u_max
                    + u_max^T U^T Π_{h_i} U_{x_j} u_max + u_max^T U^T Π_{h_i} U (u_max)_{x_j}.

Note that, despite the different appearance, we have f_{x_i h_j}(x, h) = f_{h_j x_i}(x, h).


7. Expectation of estimation error

7.1. Introduction

The main goal of this work is to establish an expression for the bias of the MUSIC cost function in the presence of modeling errors. In Section 3.3 we established the perturbed 3D MUSIC cost function

f(x, h) = λ_max(U^T(x) Π(h) U(x)). (7.1)

The derivatives of f w.r.t. x and h up to arbitrary order are treated in Chapter 6. Several authors have considered error estimates for the distance between the true source position x̄ and the estimated perturbed position x due to modeling errors in the case of scalar x, in which (7.1) takes the form (3.3). We assume the asymptotic case, in which the projection matrix Π(h) is fully known. Generally, the perturbation h is not known in an application. To obtain results, it is treated as a random variable, and the expectation or higher order moments of the distance estimate are calculated.

The existing approaches vary in the modeling of h and the order of the utilized methods. [35] examined the effects of finite sampling, which we do not address in this work. Swindlehurst [68] and Friedlander [25] used infinite sampling and a first order approach for both position and modeling error. As a result of the first order modeling error, the MUSIC cost function was found to be unbiased. Ferréol et al. [21] expanded this approach to a first order position error with a second order treatment of the modeling error, which results in a biased estimator even for zero-mean h. Their results are expanded in [22] to the probability of the proper resolution of two distinct sources.

To our knowledge, no author has treated fully 3D problems yet. We will use the approach in [21], but will treat the modeling error as generally as possible. Furthermore, we propose a fixed-point iteration scheme to augment the position error to potentially second order. Sections 7.2.1 and 7.2.2 recap and simplify the approach used by Ferréol. Section 7.2.3 gives a method to generate a second order position error. We adapt the methods from these sections to 3D problems in the remaining sections of this chapter.

7.2. Scalar valued x

7.2.1. Estimation error as ratio of quadratic forms

In this section, we consider scalar x and h ∈ R^m to clarify the ideas used in our approach. We will rely on several Taylor expansions to create an approximation


of the estimation error as a ratio of quadratic forms. The derivation parallels the one shown in [21]. In Section 7.3, we expand this to vector-valued x and h. This does not change the theoretical approach significantly, but it increases the size of the involved matrices and therefore the computational load drastically; this problem has to be considered separately (see Chapter 8).

We want to derive an estimate for the estimation error x̄ − x, where x̄ denotes the true position and x the perturbed estimate. Let f_x denote the derivative of the MUSIC cost function f(x, h) w.r.t. x ∈ R. With the assumption f_xx(x, h) ≠ 0 for all x, h, we can use a first-order Taylor expansion of f_x with respect to x at x̄:

0 = f_x(x, h) ≈ f_x(x̄, h) + f_xx(x̄, h)(x − x̄),

x̄ − x ≈ f_x(x̄, h) / f_xx(x̄, h). (7.2)

(7.2) is still nonlinearly dependent on h. To overcome this, we use a second order Taylor expansion with respect to h at 0 to write f_x and f_xx as quadratic forms:

f_x(x̄, h) ≈ f_x(x̄, 0) + f'_x(x̄, 0) h + 0.5 f''_x(x̄, 0) h²

          = [1]^T [ f_x(x̄, 0)   f'_x(x̄, 0)      ] [1]
            [h]   [ 0            0.5 f''_x(x̄, 0) ] [h]   =: ε(h)^T Q_1(x̄) ε(h), (7.3)

f_xx(x̄, h) ≈ f_xx(x̄, 0) + f'_xx(x̄, 0) h + 0.5 f''_xx(x̄, 0) h²

           = [1]^T [ f_xx(x̄, 0)   f'_xx(x̄, 0)      ] [1]
             [h]   [ 0             0.5 f''_xx(x̄, 0) ] [h]   =: ε(h)^T Q_2(x̄) ε(h), (7.4)

with ε(h) ∈ R^{m+1}, where (·)' and (·)'' designate first and second order derivatives w.r.t. h. Combining (7.2), (7.3) and (7.4) leads to the main formula of this section:

x̄ − x ≈ f_x(x̄, h) / f_xx(x̄, h) ≈ (ε^T Q_1(x̄) ε) / (ε^T Q_2(x̄) ε). (7.5)

This estimate is a ratio of quadratic forms and is valid, but it requires knowledge of h. In the next section, we introduce means to calculate the expectation and higher order moments of (7.5) in the case of h being a random variable.
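The construction (7.3) can be illustrated for scalar h (an assumed toy cost whose x-derivative is exactly quadratic in h, so the quadratic form reproduces f_x up to finite-difference rounding; all values are invented):

```python
import numpy as np

def fx(x, h):
    # invented derivative w.r.t. x of a smooth cost, exactly quadratic in h
    return np.sin(x) * (1 + h) + 0.5 * x * h**2

xbar, h = 0.7, 1e-2
# Taylor data of fx in h at 0, obtained by finite differences
fx0 = fx(xbar, 0.0)
d = 1e-5
fx1 = (fx(xbar, d) - fx(xbar, -d)) / (2 * d)            # f'_x(xbar, 0)
fx2 = (fx(xbar, d) - 2 * fx0 + fx(xbar, -d)) / d**2     # f''_x(xbar, 0)

# assemble Q1 as in (7.3) and evaluate the quadratic form
Q1 = np.array([[fx0, fx1],
               [0.0, 0.5 * fx2]])
eps = np.array([1.0, h])
assert np.isclose(eps @ Q1 @ eps, fx(xbar, h), atol=1e-6)
```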

7.2.2. Moments of quadratic forms

The moments of a quadratic form are well known [44], but there are few results on ratios of hermitian or quadratic forms [23], [32], and these have limited applicability to our case. We restate the results on moments of quadratic forms presented in [21], where a thorough


proof is also provided. We give an outline of this proof, as well as an introduction to the necessary Kronecker calculus, in Appendix A.5. Let the event A_1 be given by

A_1 : 0 < ε^T Q_2 ε < 2 E[ε^T Q_2 ε]. (7.6)

A_1 is likely to hold if the covariance matrix of ε is small. This condition ensures the convergence of the following expression. The conditional expectation of the quadratic ratio (7.5) is given by

E[ (ε^T Q_1(x̄) ε) / (ε^T Q_2(x̄) ε) | A_1 ] = (tr(Q_1 R) / tr(Q_2 R)) (1 + γ(R, R^{(4)}, ...)), (7.7)

where

γ(R, R^{(4)}, ...) = Σ_{n=1}^{∞} Σ_{i=0}^{n−1} (−1)^{2n−i−1} (n−1 choose i) tr(Q_2 R)^{−i}
                     · ( tr(Q_2^{⊗(i+1)} R^{(2i+2)}) / tr(Q_2 R) − tr((Q_1 ⊗ Q_2^{⊗i}) R^{(2i+2)}) / tr(Q_1 R) ),

Q^{⊗2} = Q ⊗ Q,   Q^{⊗i} = Q ⊗ ... ⊗ Q (i factors),

R = R^{(2)} = E[ε ε^T],

R^{(2i)} = E[ε^{⊗i} (ε^{⊗i})^T],

where ⊗ denotes the Kronecker product and tr(·) the trace. Note that (R^{(2)})^{⊗2} ≠ R^{(4)}. From now on, we will refer to the Q-matrices as system matrices, since Q_1 and Q_2 depend only on the given physical system. The R^{(2i)} will be called covariance matrices for all i; these are independent of the physical system and depend only on the assumptions made on the systematic noise.

Note

[21] requires Q_1, Q_2 to be hermitian (symmetric in the real case). However, in the proof provided in Appendix C there, this property is never used. Consequently, the proof extends, without modification, to arbitrary square Q_1, Q_2 as used in our approach. See also the outline of the proof in Appendix A.5.

Higher order moments

Higher order moments of the estimation error can easily be calculated through (see [21])

E[ ((ε^T Q_1 ε)/(ε^T Q_2 ε))^n | A_1 ] = (tr(Q_1^{⊗n} R^{(2n)}) / tr(Q_2^{⊗n} R^{(2n)})) (1 + γ(R^{(2n)}, R^{(4n)}, ...)).

We will investigate the higher order moments of the estimation error in more detail in the three-dimensional setting in Section 7.3.


Summary

We used a first order Taylor expansion to get an approximation of the estimation error (7.2), and second order Taylor expansions to transform this approximation into a ratio of quadratic forms (7.3), (7.4). The moments of this ratio (7.5) can be calculated up to arbitrary order through (7.7), but there is little reason to expand beyond second order, since (7.3) is only of second order. Additionally, as we will see in the three-dimensional case in the next section, the numerical costs increase exponentially, limiting the feasible order in (7.7) to two.

In numerical simulations, each of these approximations has to be investigated separately, and we can expect (7.2) to be the 'weakest link' because that expansion is only of first order. For later use, we denote the expansions of the expectation up to different orders as single expressions. First order expansion of the expectation:

E[x̄ − x]_0^{1D} := (tr(Q_1 R)/tr(Q_2 R)) [1 − (tr(Q_2 R)/tr(Q_2 R) − tr(Q_1 R)/tr(Q_1 R))] = tr(Q_1 R)/tr(Q_2 R). (7.8)

As can be seen, the term for i = 0 (the difference in brackets) always vanishes in (7.8).

Second order expansion of the expectation:

E[x̄ − x]_1^{1D} := (tr(Q_1 R)/tr(Q_2 R)) [1 + (1/tr(Q_2 R)) ( tr(Q_2^{⊗2} R^{(4)})/tr(Q_2 R) − tr((Q_1 ⊗ Q_2) R^{(4)})/tr(Q_1 R) )]. (7.9)

7.2.3. Extension to higher order estimates

Fixed-point iteration

We noted previously that the first order estimate (7.2) is expected to be the step which performs poorest numerically. This can be alleviated by a fixed-point iteration. We expand (7.2) to second order:

0 ≈ f_x(x̄, h) + f_xx(x̄, h)(x − x̄) + (1/2) f_xxx(x̄, h)(x − x̄)². (7.10)

This can be rewritten as a fixed-point iteration scheme with x̃ := x̄ − x as

x̃_0 = f_x(x̄, h) / f_xx(x̄, h), (7.11)

x̃_n = f_x(x̄, h) / f_xx(x̄, h) + (1/2) (f_xxx(x̄, h) / f_xx(x̄, h)) x̃_{n−1}². (7.12)


According to Banach's fixed-point theorem, if Φ(y) := f_x(x̄, h)/f_xx(x̄, h) + (1/2)(f_xxx(x̄, h)/f_xx(x̄, h)) y² is a contraction with Lipschitz constant L < 1 in a neighborhood of x̃, then (7.12) converges with at least a linear rate. In our approach, this leads to

‖Φ(y_1) − Φ(y_2)‖_2 ≤ L ‖y_1 − y_2‖_2,

‖ (f_xxx(x̄, h) / (2 f_xx(x̄, h))) (y_1² − y_2²) ‖_2 ≤ L ‖y_1 − y_2‖_2, (7.13)

where y_1, y_2 lie in a neighborhood of x̃. (7.13) will be difficult to prove for a given application. However, numerical experience shows that we can expect Φ to be a contraction in most cases. Given that Φ is a contraction, we have a good chance of convergence if y is small, i.e. x̃ is small, and x̃_0 provided by the linear expansion (7.11) gives a reasonable first order estimate.

Expectation of higher order estimates

In this section, we incorporate the fixed-point iteration into the expansion of the expectation. The first iterate reads

x̃_1 = f_x(x̄, h)/f_xx(x̄, h) + (1/2) f_xxx(x̄, h) f_x²(x̄, h) / f_xx³(x̄, h). (7.14)

We define

Q_3 := [ f_xxx(x̄, 0)   f'_xxx(x̄, 0)      ]
       [ 0              0.5 f''_xxx(x̄, 0) ]

and get, by a Taylor expansion analogous to (7.3), f_xxx(x̄, h) ≈ ε(h)^T Q_3(x̄) ε(h). [21] proved that (u^T A u)(u^T B u) = (u^{⊗2})^T (A ⊗ B) u^{⊗2}, see also Appendix A.5. Utilizing this, we write (7.14) as a ratio of quadratic forms:

x̃_1 ≈ (ε^T Q_1(x̄) ε)/(ε^T Q_2(x̄) ε) + (1/2) ((ε^{⊗3})^T (Q_3 ⊗ Q_1^{⊗2}) ε^{⊗3}) / ((ε^{⊗3})^T Q_2^{⊗3} ε^{⊗3}). (7.15)

(7.7) is directly applicable, though higher powers of the Kronecker product have to be considered. Note that, when calculating the expectation of (7.15), we cannot simply insert the expectation of the previous iterate, i.e.

E[ (1/2) ((ε^T Q_3 ε)/(ε^T Q_2 ε)) ((ε^T Q_1 ε)/(ε^T Q_2 ε)) ] ≠ E[ (1/2) ((ε^T Q_3 ε)/(ε^T Q_2 ε)) E[(ε^T Q_1 ε)/(ε^T Q_2 ε)] ].

For higher iterates, these powers increase even further and are therefore not taken into consideration. The expectation of (7.15), with the second term expanded to first order, is given by

E[x̄ − x]_{1.5}^{1D} := E[x̄ − x]_1^{1D} + (1/2) tr((Q_3 ⊗ Q_1^{⊗2}) R^{(6)}) / tr(Q_2^{⊗3} R^{(6)}). (7.16)


Higher order moments

We will show the calculation of the second order moment and adapt the fixed-point iteration shown in (7.11) and (7.12). Taking the square of both equations results in

x̃_0² = f_x²(x̄, h) / f_xx²(x̄, h),

x̃_n² = f_x²(x̄, h)/f_xx²(x̄, h) + (f_x(x̄, h) f_xxx(x̄, h)/f_xx²(x̄, h)) x̃_{n−1}² + (1/4)(f_xxx²(x̄, h)/f_xx²(x̄, h)) x̃_{n−1}⁴.

The zeroth and first iterates, approximated as quadratic forms, read

x̃_0² ≈ ((ε^{⊗2})^T Q_1^{⊗2}(x̄) ε^{⊗2}) / ((ε^{⊗2})^T Q_2^{⊗2}(x̄) ε^{⊗2}),

x̃_1² ≈ ((ε^{⊗2})^T Q_1^{⊗2}(x̄) ε^{⊗2}) / ((ε^{⊗2})^T Q_2^{⊗2}(x̄) ε^{⊗2})
      + ((ε^{⊗4})^T (Q_3 ⊗ Q_1^{⊗3}) ε^{⊗4}) / ((ε^{⊗4})^T Q_2^{⊗4} ε^{⊗4})
      + (1/4) ((ε^{⊗6})^T (Q_3^{⊗2} ⊗ Q_1^{⊗4}) ε^{⊗6}) / ((ε^{⊗6})^T Q_2^{⊗6} ε^{⊗6}).

Again, we can calculate the expectation of these terms with (7.7), where the size of the corresponding matrices grows very quickly with the order.

7.3. Vector-valued x

7.3.1. Estimation error as ratio of quadratic forms

For vector-valued x ∈ R³ and h ∈ R^m, we have to consider the Jacobian, Hessian and higher tensor derivatives of f(x, h). First of all, (7.2) takes the form

x̄ − x ≈ f_xx(x̄, h)^{−1} f_x(x̄, h). (7.17)

Since we restricted ourselves to one source, f_xx(x̄, h) ∈ R^{3×3}. We set

(Q_2)_{ij} = [ f_{x_i x_j}(x̄, 0)   f'_{x_i x_j}(x̄, 0)      ]
             [ 0                    0.5 f''_{x_i x_j}(x̄, 0) ],

where (·)' and (·)'' again denote the Jacobian and Hessian of the bracketed expression w.r.t. h. Note that (Q_2)_{ij} ∈ R^{(m+1)×(m+1)} for vector-valued h. With ε := [1; h] ∈ R^{m+1}, we get the quadratic form

f_xx(x̄, h) ≈ [ ε^T (Q_2)_{11} ε   ε^T (Q_2)_{12} ε   · ]                 [ (Q_2)_{11}   (Q_2)_{12}   · ]
             [ ε^T (Q_2)_{21} ε   ·                  · ]   = (I_3 ⊗ ε)^T [ (Q_2)_{21}   ·            · ] (I_3 ⊗ ε).
             [ ·                  ·                  · ]                 [ ·            ·            · ]


We set

Q_{2_j} := [ ε^T (Q_2)_{1j} ε ]
           [ ε^T (Q_2)_{2j} ε ]   ∈ R³,   j = 1, ..., 3,
           [ ε^T (Q_2)_{3j} ε ]

and write the inverse of the Hessian as

f_xx^{−1}(x̄, h) ≈ (1 / ⟨Q_{2_1}, Q_{2_2} × Q_{2_3}⟩) [ (Q_{2_2} × Q_{2_3})^T ]
                                                      [ (Q_{2_3} × Q_{2_1})^T ]
                                                      [ (Q_{2_1} × Q_{2_2})^T ],

where ⟨·, ·⟩ and × denote the scalar and cross product, respectively. We used a simple technique to write the inverse of a 3×3 matrix in closed form, see Appendix A.4. Similarly, we define

(Q_1)_i := ε^T [ f_{x_i}(x̄, 0)   f'_{x_i}(x̄, 0)      ] ε ∈ R,
               [ 0                0.5 f''_{x_i}(x̄, 0) ]

so that f_x(x̄, h) ≈ [(Q_1)_1, (Q_1)_2, (Q_1)_3]^T, and we are now able to rewrite (7.17) as

x̄ − x ≈ ( [ (Q_{2_2} × Q_{2_3})^T ]   [ (Q_1)_1 ] )
        ( [ (Q_{2_3} × Q_{2_1})^T ] · [ (Q_1)_2 ] ) / ⟨Q_{2_1}, Q_{2_2} × Q_{2_3}⟩. (7.18)
        ( [ (Q_{2_1} × Q_{2_2})^T ]   [ (Q_1)_3 ] )

We want to write (7.18) as a ratio of quadratic forms similar to (7.5). Therefore, we introduce the operators ⊠ and ⊡ as cross-product and scalar-product extensions of the Kronecker product ⊗. Let A = [A_1, A_2, A_3]^T, B = [B_1, B_2, B_3]^T and define

A ⊠ B := [ A_2 ⊗ B_3 − A_3 ⊗ B_2 ]
         [ A_3 ⊗ B_1 − A_1 ⊗ B_3 ]
         [ A_1 ⊗ B_2 − A_2 ⊗ B_1 ],

A ⊡ B := A_1 ⊗ B_1 + A_2 ⊗ B_2 + A_3 ⊗ B_3.

We already showed that (u^T A u)(u^T B u) = (u^{⊗2})^T (A ⊗ B) u^{⊗2}; this extends directly to our cross- and scalar-product operators. Let a := [u^T A_1 u, u^T A_2 u, u^T A_3 u]^T, b := [u^T B_1 u, u^T B_2 u, u^T B_3 u]^T; then

a × b = [ (u^T A_2 u)(u^T B_3 u) − (u^T A_3 u)(u^T B_2 u) ]                      [ A_2 ⊗ B_3 − A_3 ⊗ B_2 ]
        [ (u^T A_3 u)(u^T B_1 u) − (u^T A_1 u)(u^T B_3 u) ] = (I_3 ⊗ u^{⊗2})^T [ A_3 ⊗ B_1 − A_1 ⊗ B_3 ] u^{⊗2}
        [ (u^T A_1 u)(u^T B_2 u) − (u^T A_2 u)(u^T B_1 u) ]                      [ A_1 ⊗ B_2 − A_2 ⊗ B_1 ]

      =: (I_3 ⊗ u^{⊗2})^T (A ⊠ B) u^{⊗2},

and

⟨a, b⟩ = (u^T A_1 u)(u^T B_1 u) + (u^T A_2 u)(u^T B_2 u) + (u^T A_3 u)(u^T B_3 u)
       = (u^{⊗2})^T (A_1 ⊗ B_1 + A_2 ⊗ B_2 + A_3 ⊗ B_3) u^{⊗2}
       = (u^{⊗2})^T (A ⊡ B) u^{⊗2}.
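The cross-product extension can likewise be verified numerically (an assumed sketch; the matrices A_i, B_i and the vector u are random, and A ⊠ B is assembled explicitly from Kronecker products):

```python
import numpy as np

rng = np.random.default_rng(4)
As = [rng.standard_normal((3, 3)) for _ in range(3)]
Bs = [rng.standard_normal((3, 3)) for _ in range(3)]
u = rng.standard_normal(3)
u2 = np.kron(u, u)                                   # u^{(2)}

a = np.array([u @ M @ u for M in As])
b = np.array([u @ M @ u for M in Bs])

# A box B, stacked block-wise as in the definition above
AboxB = np.vstack([np.kron(As[1], Bs[2]) - np.kron(As[2], Bs[1]),
                   np.kron(As[2], Bs[0]) - np.kron(As[0], Bs[2]),
                   np.kron(As[0], Bs[1]) - np.kron(As[1], Bs[0])])

lhs = np.cross(a, b)
rhs = np.kron(np.eye(3), u2.reshape(-1, 1)).T @ AboxB @ u2
assert np.allclose(lhs, rhs)
```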


With these definitions, we are able to write (7.18) as a ratio of quadratic forms. Let

Q_{2_j} := [ (Q_2)_{1j} ]                              [ (Q_1)_1 ]
           [ (Q_2)_{2j} ] ∈ R^{3(m+1)×(m+1)},   Q_1 := [ (Q_1)_2 ] ∈ R^{3(m+1)×(m+1)}
           [ (Q_2)_{3j} ]                              [ (Q_1)_3 ]

(stacking the matrices rather than the scalars), and we have

x̄ − x ≈ ( (I_3 ⊗ ε^{⊗3})^T [ (Q_{2_2} ⊠ Q_{2_3})^T ⊡ Q_1 ]         )
        (                   [ (Q_{2_3} ⊠ Q_{2_1})^T ⊡ Q_1 ] ε^{⊗3} ) / ( (ε^{⊗3})^T [(Q_{2_2} ⊠ Q_{2_3}) ⊡ Q_{2_1}] ε^{⊗3} ), (7.19)
        (                   [ (Q_{2_1} ⊠ Q_{2_2})^T ⊡ Q_1 ]         )

where the numerator is in R³ and the denominator is in R. As we can see, the expansion of the expectation of a quadratic ratio (7.7) is directly applicable, though we now have to deal with ε^{⊗3} instead of ε. The first and second order expansions of (7.7) in the vector-valued case are straightforward. Nevertheless, we present the results here, since upcoming problems are illustrated well by these formulas.

7.3.2. First and second order expansion

In this section, we expand (7.19) using (7.7). For ease of notation, we write the numerator and denominator of (7.19) as

Q_{1_1} := (Q_{2_2} ⊠ Q_{2_3})^T ⊡ Q_1,   Q_{1_2} := (Q_{2_3} ⊠ Q_{2_1})^T ⊡ Q_1,
Q_{1_3} := (Q_{2_1} ⊠ Q_{2_2})^T ⊡ Q_1,   Q_2 := (Q_{2_2} ⊠ Q_{2_3})^T ⊡ Q_{2_1},

with Q_{1_j}, Q_2 ∈ R^{(m+1)³×(m+1)³}, j = 1, ..., 3.

First order

We expand (7.19) using the notation established at the beginning of this section, for n = 1:

E[ ((ε^{⊗3})^T Q_{1_j}(x̄) ε^{⊗3}) / ((ε^{⊗3})^T Q_2(x̄) ε^{⊗3}) | A_1 ]
  ≈ (tr(Q_{1_j} R^{(6)}) / tr(Q_2 R^{(6)})) [1 − ( tr(Q_2 R^{(6)})/tr(Q_2 R^{(6)}) − tr(Q_{1_j} R^{(6)})/tr(Q_{1_j} R^{(6)}) )]
  = tr(Q_{1_j} R^{(6)}) / tr(Q_2 R^{(6)}), (7.20)

with j = 1, ..., 3; the term for i = 0 always vanishes in (7.7), just like in the 1D case. We denote

E[x̄ − x]_0 := tr(Q_{1_j} R^{(6)}) / tr(Q_2 R^{(6)}). (7.21)


Second order

The second order expansion E_1 of (7.19) (i.e. n = 2) is given by

E[ ((ε^{⊗3})^T Q_{1_j}(x̄) ε^{⊗3}) / ((ε^{⊗3})^T Q_2(x̄) ε^{⊗3}) | A_1 ]
  ≈ (tr(Q_{1_j} R^{(6)}) / tr(Q_2 R^{(6)})) [1 + (1/tr(Q_2 R^{(6)})) ( tr(Q_2^{⊗2} R^{(12)})/tr(Q_2 R^{(6)}) − tr((Q_{1_j} ⊗ Q_2) R^{(12)})/tr(Q_{1_j} R^{(6)}) )]
  =: E[x̄ − x]_1, (7.22)

with j = 1, ..., 3.

7.3.3. Extension to higher order estimates

We will derive a fixed-point iteration akin to (7.12). In the vector-valued case, the second-order expansion of (7.17) formulated as a fixed-point iteration reads as

\[
\tilde{x}_{n+1} \approx f_{xx}(x,h)^{-1} f_x(x,h) + \tfrac{1}{2}\, f_{xx}(x,h)^{-1} \left[\tilde{x}_n^T f_{xxx}(x,h)\, \tilde{x}_n\right],
\tag{7.23}
\]

with \(\tilde{x} = x - \bar{x}\), where f_{xxx} is a tensor of third order. We will now rewrite the bracketed expression as a quadratic form of ε:

\[
\tilde{x}^T f_{xxx}(x,h)\, \tilde{x} =
\begin{bmatrix}
\tilde{x}^T \frac{\partial}{\partial x_1} f_{xx}(x,h)\, \tilde{x} \\[2pt]
\tilde{x}^T \frac{\partial}{\partial x_2} f_{xx}(x,h)\, \tilde{x} \\[2pt]
\tilde{x}^T \frac{\partial}{\partial x_3} f_{xx}(x,h)\, \tilde{x}
\end{bmatrix}.
\tag{7.24}
\]

The first iterate can be transformed to

\[
\tilde{x}^T f_{xxx}(x,h)\, \tilde{x} \approx
\begin{bmatrix}
\vdots \\[2pt]
\dfrac{\varepsilon^{\otimes 7\,T} \left[Q_1^T \boxtimes M_p \boxtimes Q_1\right] \varepsilon^{\otimes 7}}
{\varepsilon^{\otimes 6\,T} (Q_2 \otimes Q_2)\, \varepsilon^{\otimes 6}} \\[2pt]
\vdots
\end{bmatrix},
\tag{7.25}
\]

with p = 1, …, 3, M_p ∈ R^{3(m+1)×3(m+1)} and

\[
M_p = \frac{\partial}{\partial x_p}
\begin{bmatrix}
M_{11} & \cdots & M_{13} \\
\vdots & \ddots & \vdots \\
M_{31} & \cdots & M_{33}
\end{bmatrix},
\qquad
M_{ij} = \frac{\partial^2}{\partial x_i\, \partial x_j}\, Q_0, \quad i, j = 1, \ldots, 3,
\]
\[
Q_0 = \begin{bmatrix} f(x,0) & f'(x,0) \\ 0 & \tfrac{1}{2} f''(x,0) \end{bmatrix},
\qquad
Q_1 = [Q_{1_1}, Q_{1_2}, Q_{1_3}]^T \in \mathbb{R}^{3(m+1)^3 \times (m+1)^3}.
\]


Proof of (7.25): A second order Taylor expansion of f(x,h) with respect to h at 0 leads to f(x,h) ≈ ε^T Q_0 ε. Consider one element of (7.24):

\[
\tilde{x}^T \left[\frac{\partial}{\partial x_p}\, \frac{\partial^2}{\partial x^2}\, f(x,h)\right] \tilde{x}
\approx \sum_{i,j}
\left(\frac{\varepsilon^{\otimes 3\,T} Q_{1_i}\, \varepsilon^{\otimes 3}}{\varepsilon^{\otimes 3\,T} Q_2\, \varepsilon^{\otimes 3}}\right)^{\!T}
\frac{\partial}{\partial x_p} \left[\frac{\partial^2}{\partial x_i\, \partial x_j}\, \varepsilon^T Q_0\, \varepsilon\right]
\left(\frac{\varepsilon^{\otimes 3\,T} Q_{1_j}\, \varepsilon^{\otimes 3}}{\varepsilon^{\otimes 3\,T} Q_2\, \varepsilon^{\otimes 3}}\right)
= \frac{\varepsilon^{\otimes 7\,T} \left[Q_1^T \boxtimes M_p \boxtimes Q_1\right] \varepsilon^{\otimes 7}}
{\varepsilon^{\otimes 6\,T} (Q_2 \otimes Q_2)\, \varepsilon^{\otimes 6}},
\]

p = 1, …, 3, which proves (7.25). Connecting (7.19), (7.23) and (7.25) yields

\[
x - \bar{x} \approx A_1 + \frac{1}{2}\,
\frac{\left(I_3 \otimes \varepsilon^{\otimes 9\,T}\right)
\begin{bmatrix} (Q_{2_2} \boxtimes Q_{2_3})^T \\ (Q_{2_3} \boxtimes Q_{2_1})^T \\ (Q_{2_1} \boxtimes Q_{2_2})^T \end{bmatrix}
\boxtimes
\begin{bmatrix} Q_1^T \boxtimes M_1 \boxtimes Q_1 \\ Q_1^T \boxtimes M_2 \boxtimes Q_1 \\ Q_1^T \boxtimes M_3 \boxtimes Q_1 \end{bmatrix}
\varepsilon^{\otimes 9}}
{\varepsilon^{\otimes 9\,T} (Q_2 \otimes Q_2 \otimes Q_2)\, \varepsilon^{\otimes 9}}
=: A_1 + \frac{\left(I_3 \otimes \varepsilon^{\otimes 9\,T}\right) Q_3\, \varepsilon^{\otimes 9}}
{\varepsilon^{\otimes 9\,T} Q_2^{\otimes 3}\, \varepsilon^{\otimes 9}},
\tag{7.26}
\]

where A_1 is the first order estimate calculated by (7.19) and Q_3 := [Q_{3_1}, Q_{3_2}, Q_{3_3}]^T is the enclosed numerator of the second term. The expectation of (7.26) can be calculated by (7.7). As we have seen in (7.22), the power of the required Kronecker products doubles in the second order expansion. Given the already high power in our approach, we will only use a first order expansion of the second term of (7.26) in the fixed-point iteration. This results in

\[
E[x - \bar{x}]_{1.5} := E_1 + \frac{1}{2}\,
\frac{\operatorname{tr}(Q_3 R^{(18)})}{\operatorname{tr}(Q_2^{\otimes 3} R^{(18)})},
\tag{7.27}
\]

where E_1 is calculated through (7.22). We use the notation E_{1.5} to emphasize that we do not have a full second order approximation of the expected estimation error, since we only use one iteration step in the fixed-point iteration (7.23) and a first order expansion of the second term in (7.26).

7.3.4. Higher order moments

Derivation of higher order moments is, like the extension to higher order estimates, analogous to the scalar case presented in Section 7.2.3. We show the derivation of the second order moment. Taking the square of (7.17) and (7.23) component-wise results in

\[
(\tilde{x}_0^2)_j \approx \left(f_{xx}(x,h)^{-1} f_x(x,h)\right)^2_j,
\]
\[
(\tilde{x}_{n+1}^2)_j \approx \left(f_{xx}(x,h)^{-1} f_x(x,h)\right)^2_j
+ \left(f_{xx}(x,h)^{-1} f_x(x,h)\right)_j \left(f_{xx}(x,h)^{-1}\left[\tilde{x}_n^T f_{xxx}(x,h)\, \tilde{x}_n\right]\right)_j
+ \frac{1}{4} \left(f_{xx}(x,h)^{-1}\left[\tilde{x}_n^T f_{xxx}(x,h)\, \tilde{x}_n\right]\right)^2_j,
\]


j = 1, …, 3. The zeroth and first iterate approximated as quadratic forms result in

\[
(\tilde{x}_0^2)_j \approx
\frac{\varepsilon^{\otimes 6\,T} Q_{1_j}^{\otimes 2}\, \varepsilon^{\otimes 6}}
{\varepsilon^{\otimes 6\,T} Q_2^{\otimes 2}\, \varepsilon^{\otimes 6}},
\tag{7.28}
\]

\[
(\tilde{x}_1^2)_j \approx
\frac{\varepsilon^{\otimes 6\,T} Q_{1_j}^{\otimes 2}\, \varepsilon^{\otimes 6}}
{\varepsilon^{\otimes 6\,T} Q_2^{\otimes 2}\, \varepsilon^{\otimes 6}}
+ \frac{\varepsilon^{\otimes 12\,T} \left(Q_{1_j} \otimes Q_{3_j}\right) \varepsilon^{\otimes 12}}
{\varepsilon^{\otimes 12\,T} Q_2^{\otimes 4}\, \varepsilon^{\otimes 12}}
+ \frac{1}{4}\,
\frac{\varepsilon^{\otimes 18\,T} Q_{3_j}^{\otimes 2}\, \varepsilon^{\otimes 18}}
{\varepsilon^{\otimes 18\,T} Q_2^{\otimes 6}\, \varepsilon^{\otimes 18}}.
\tag{7.29}
\]

We denote the first, second and third summand in (7.29) as the basic, mixed and quadratic component of the fixed-point approximation of the second order moment. We calculate the expectation of these terms using (7.7) and define

\[
M[x - \bar{x}]_0 := \frac{\operatorname{tr}(Q_{1_j}^{\otimes 2} R^{(12)})}{\operatorname{tr}(Q_2^{\otimes 2} R^{(12)})},
\tag{7.30}
\]

\[
M[x - \bar{x}]_1 := M_0 \cdot \left[1 + \frac{1}{\operatorname{tr}(Q_2^{\otimes 2} R^{(12)})}
\left(\frac{\operatorname{tr}(Q_2^{\otimes 4} R^{(24)})}{\operatorname{tr}(Q_2^{\otimes 2} R^{(12)})}
- \frac{\operatorname{tr}\bigl((Q_{1_j}^{\otimes 2} \otimes Q_2^{\otimes 2})\, R^{(24)}\bigr)}{\operatorname{tr}(Q_{1_j}^{\otimes 2} R^{(12)})}\right)\right],
\tag{7.31}
\]

\[
M[x - \bar{x}]_{m_0} := \frac{\operatorname{tr}\bigl((Q_{1_j} \otimes Q_{3_j})\, R^{(24)}\bigr)}{\operatorname{tr}(Q_2^{\otimes 4} R^{(24)})},
\tag{7.32}
\]

\[
M[x - \bar{x}]_{q_0} := \frac{\operatorname{tr}(Q_{3_j}^{\otimes 2} R^{(36)})}{\operatorname{tr}(Q_2^{\otimes 6} R^{(36)})},
\tag{7.33}
\]

\[
M[x - \bar{x}]_{1.5} := M_1 + M_{m_0} + M_{q_0}.
\tag{7.34}
\]

We calculate the expectation of the mixed and quadratic components of the fixed-point approximation only up to first order to prevent the matrix sizes from becoming too large.

7.3.5. Outlook and numerical problems

The extension from the one dimensional problems treated in [21] and summarized in Section 7.2 to the fully three dimensional ones shown in Section 7.3.1 posed few theoretical challenges, apart from finding an adequate notation. The numerical problems introduced are severe, though. The covariance matrices R^{(2)} ∈ R^{(m+1)×(m+1)} and R^{(4)} ∈ R^{(m+1)^2×(m+1)^2} in (7.7) may still be computable for moderately large m. In the three dimensional case this expands to R^{(6)} ∈ R^{(m+1)^3×(m+1)^3} and R^{(12)} ∈ R^{(m+1)^6×(m+1)^6}. Even for small problems with m = 4, this results in R^{(12)} ∈ R^{5^6×5^6} with ≈ 244·10^6 elements. A fixed-point iteration will increase the required matrix sizes even further. For larger m, this will soon render numerical storage of the


system and covariance matrices impossible for any given system. Furthermore, the necessary numerical tasks, essentially traces of huge matrix products in (7.20) and (7.22), will be prohibitively expensive.

Therefore, in the next chapter we show general strategies to reduce the numerical load in computing the trace of a matrix product, and we show some crucial simplifications in the case of h being an uncorrelated Gaussian random vector with zero mean. These simplifications make the calculation of moments of the estimation error feasible in the first place.


8. Implementation

In this chapter we present techniques to reduce both the computational load and the required storage space for calculating the components of the approximation of the estimation error moments. For simplicity, we refer only to (7.22) as a template of the necessary tasks. Higher order terms like (7.30)-(7.34) have the same shape and require the same techniques. We show general techniques independent of the distribution of h, and consequently of the shape of R^{(2i)}, in Section 8.1. In Section 8.2, we restrict ourselves to an uncorrelated Gaussian, zero-mean distribution of h and show important simplifications and approximations of R^{(2i)}.

As a general observation, we can see that the components of the first order approximation (7.20) reappear several times in the second order approximation (7.22). We 'merely' have to compute four different traces of matrix-matrix products to obtain a second order estimate.

8.1. General techniques

8.1.1. Economic computation of the trace of a matrix product

Let Q ∈ R^{k×k} be arbitrary and R ∈ R^{k×k} be symmetric. We can split Q into a symmetric part Q^s and an anti-symmetric part Q^a as

\[
Q = \frac{1}{2}\left(Q + Q^T\right) + \frac{1}{2}\left(Q - Q^T\right) =: Q^s + Q^a.
\]

The trace of the matrix-matrix product QR can be decomposed as

\[
\operatorname{trace}(QR)
= \operatorname{trace}(Q^s R) + \underbrace{\operatorname{trace}(Q^a R)}_{=0}
= \sum_{i=1}^{k} \sum_{j=1}^{k} Q^s_{ij} R_{ji}
= \sum_{i=1}^{k} \Bigl( Q^s_{ii} R_{ii} + \sum_{j=i+1}^{k} (Q^s_{ij} + Q^s_{ji})\, R_{ij} \Bigr)
= \sum_{i=1}^{k} \Bigl( Q^s_{ii} R_{ii} + 2 \sum_{j=i+1}^{k} Q^s_{ij} R_{ij} \Bigr),
\tag{8.1}
\]

since Q^s and R are symmetric. We used

\[
\operatorname{trace}(Q^a R) = \sum_{i=1}^{k} \Bigl( \underbrace{Q^a_{ii}}_{=0} R_{ii} + \sum_{j=i+1}^{k} \underbrace{(Q^a_{ij} + Q^a_{ji})}_{=0} R_{ij} \Bigr).
\]


This way, we can reduce the computational effort from k^2 to ½(k^2 + k) multiplications. However, in our case we are not able to build up the whole matrix Q^s, since k is prohibitively large. This will be addressed in the next section.
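The decomposition (8.1) can be sketched compactly; the following is a minimal illustration with dense NumPy arrays (in the actual application the matrices are too large to be held in memory, so the same sum would be accumulated entry by entry):

```python
import numpy as np

def trace_product(Q, R):
    """trace(Q R) for symmetric R, using only the symmetric part of Q and
    the diagonal plus the strict upper triangle, cf. (8.1)."""
    Qs = 0.5 * (Q + Q.T)                     # the anti-symmetric part contributes nothing
    total = np.dot(np.diag(Qs), np.diag(R))  # diagonal terms Qs_ii * R_ii
    iu = np.triu_indices(Q.shape[0], 1)      # strict upper triangle, j > i
    total += 2.0 * np.sum(Qs[iu] * R[iu])    # off-diagonal terms counted twice
    return total
```

Only ½(k² + k) products of matrix entries enter the sum, as in (8.1), and the result matches trace(QR) for any symmetric R.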

8.1.2. Storage issues

As already mentioned, large problem sizes m make storage of the required system and covariance matrices difficult to impossible. We recall the definition and dimension of the required system matrices in (7.22) as introduced at the beginning of Section 7.3.2. We have little information about their structure in general; some of the elements are zero by construction (see Section 7.3.1). However, applying the diagonalization shown in the previous section will render this information useless. With no further information on the physical system, we have to assume these matrices are dense, so sparse matrix techniques are not applicable. We showcase the shape of the system matrices with Q_{1_1}:

\[
\begin{aligned}
Q_{1_1} = {} & \left((Q_2)_{22} \otimes (Q_2)_{33} - (Q_2)_{32} \otimes (Q_2)_{23}\right) \otimes (Q_1)_1 \\
& + \left((Q_2)_{32} \otimes (Q_2)_{13} - (Q_2)_{12} \otimes (Q_2)_{33}\right) \otimes (Q_1)_2 \\
& + \left((Q_2)_{12} \otimes (Q_2)_{23} - (Q_2)_{22} \otimes (Q_2)_{13}\right) \otimes (Q_1)_3.
\end{aligned}
\tag{8.2}
\]

As can be seen, one entry (Q_{1_1})_{ij}, i, j = 1, …, (m+1)^3, is a sum of Kronecker products of R^{(m+1)×(m+1)} matrices, which are all readily available for a given system. We won't build up the whole system matrix, but calculate its entries as needed. Therefore we need an algorithm which provides the indices i_1, …, i_3, j_1, …, j_3 used on the right hand side of (8.2) to calculate a required entry (i, j) of Q_{1_1}.

We provide an algorithm which gives the required indices i_1, …, i_n, j_1, …, j_n to calculate entry (i, j) of an arbitrary matrix Q^{⊗n}, Q ∈ R^{p×p}. The indices are calculated recursively through

\[
i_k = \left\lceil \Bigl[\, i - \sum_{l=1}^{k-1} (i_l - 1)\, p^{\,n-l} \Bigr] \Big/ p^{\,n-k} \right\rceil,
\qquad
j_k = \left\lceil \Bigl[\, j - \sum_{l=1}^{k-1} (j_l - 1)\, p^{\,n-l} \Bigr] \Big/ p^{\,n-k} \right\rceil,
\qquad k = 1, \ldots, n,
\tag{8.3}
\]

where ⌈·⌉ is the ceiling operator. With these indices, we can calculate

\[
Q^{\otimes n}_{ij} = Q_{i_1 j_1} \cdot Q_{i_2 j_2} \cdot \ldots \cdot Q_{i_n j_n}.
\tag{8.4}
\]

This enables us to calculate all entries of the system matrices in (7.22) as needed, eliminating the need to store the full matrices. For the sake of completeness, we present the inverse operation to (8.3):

\[
i = \sum_{k=1}^{n} (i_k - 1)\, p^{\,n-k} + 1,
\qquad
j = \sum_{k=1}^{n} (j_k - 1)\, p^{\,n-k} + 1.
\tag{8.5}
\]
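The recursion (8.3) together with the product (8.4) can be sketched directly; the function names below are illustrative, not taken from the thesis implementation:

```python
import math
import numpy as np

def kron_indices(i, n, p):
    """Factor indices (i_1, ..., i_n) of the 1-based index i of Q^{(x)n},
    Q in R^{p x p}, following the recursion (8.3)."""
    idx, rem = [], i
    for k in range(1, n + 1):
        ik = math.ceil(rem / p ** (n - k))   # the ceiling operator of (8.3)
        idx.append(ik)
        rem -= (ik - 1) * p ** (n - k)
    return idx

def kron_entry(Q, n, i, j):
    """Entry (i, j) of Q^{(x)n} computed on the fly via (8.4), without storing Q^{(x)n}."""
    p = Q.shape[0]
    val = 1.0
    for ik, jk in zip(kron_indices(i, n, p), kron_indices(j, n, p)):
        val *= Q[ik - 1, jk - 1]
    return val
```

The inverse operation (8.5) is simply the base-p expansion i = Σ_k (i_k − 1) p^{n−k} + 1, so a round trip through both maps reproduces the original index.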

The problem of calculating R(2i) will be addressed in the next section.


8.2. Gaussian, zero-mean distributed h

In this section, we will investigate means to calculate the covariance matrices R^{(2i)} efficiently.

8.2.1. Scalar h

First, we will assume the scalar random variable h to be normal with zero mean and variance σ². We state an important formula for the expectation of arbitrary powers of h:

\[
E[h^n] =
\begin{cases}
(n-1)!!\, \sigma^n, & \text{for } n \text{ even}, \\
0, & \text{for } n \text{ odd},
\end{cases}
\tag{8.6}
\]

where n!! = n·(n−2)·…·1 for n odd and n!! = n·(n−2)·…·2 for n even.

Proof of (8.6):

\[
E[h^n] = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} h^n \exp\left(-\frac{h^2}{2\sigma^2}\right) dh.
\]

If n is odd, we have an odd integrand and the integral vanishes. If n is even, the solution can be found in books with integral tables (see e.g. [13]) and

\[
E[h^n] \overset{n\ \text{even}}{=} \frac{1}{\sqrt{2\pi\sigma^2}}\, (n-1)!! \cdot \frac{2\sqrt{\pi}}{2^{\frac{n}{2}+1}} \cdot \left(\frac{1}{2\sigma^2}\right)^{-\left(\frac{n}{2}+\frac{1}{2}\right)},
\]

which, after some transformations, proves (8.6).
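Formula (8.6) translates into a few lines of code; a minimal sketch (the function name is illustrative):

```python
def gaussian_moment(n, sigma):
    """E[h^n] for scalar h ~ N(0, sigma^2), following (8.6):
    (n-1)!! * sigma^n for even n, and 0 for odd n."""
    if n % 2 == 1:
        return 0.0
    double_fact, k = 1.0, n - 1
    while k > 1:                 # (n-1)!! = (n-1)(n-3)...1 for even n
        double_fact *= k
        k -= 2
    return double_fact * sigma ** n
```

For example, gaussian_moment(4, σ) evaluates to 3σ⁴ and gaussian_moment(6, σ) to 15σ⁶, the familiar low-order Gaussian moments.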

8.2.2. Multivariate h

Now, we consider h ∈ R^m normally distributed with zero mean and covariance Σ = σ²I_m. We set

\[
\varepsilon := \begin{bmatrix} 1 \\ h \end{bmatrix}.
\]

As in Section 8.1.2, we will focus on calculating elements (i, j) of R^{(2n)} = E[ε^{⊗n} ε^{⊗n\,T}], n ∈ N, rather than trying to calculate the covariance matrix completely. Similar to the component-wise multiplication of the elements of the system matrix (8.4), we can describe R^{(2n)}_{ij} as a product of components of ε:

\[
R^{(2n)}_{ij} = E\left[\varepsilon_{i_1} \cdot \ldots \cdot \varepsilon_{i_n} \cdot \varepsilon_{j_1} \cdot \ldots \cdot \varepsilon_{j_n}\right],
\tag{8.7}
\]

where i_k, j_k, k = 1, …, n, are calculated with (8.3), ε_1 = 1 and ε_l = h_{l−1}, l = 2, …, m+1. Since we assumed h to be uncorrelated, we can split (8.7) into components:

\[
R^{(2n)}_{ij} = E\left[1^{q_0} \cdot h_1^{q_1} \cdot \ldots \cdot h_m^{q_m}\right]
= E\left[h_1^{q_1}\right] \cdot \ldots \cdot E\left[h_m^{q_m}\right],
\tag{8.8}
\]


where q_p ∈ N_0 is the multiplicity of h_p determined by the i_k, j_k, or the multiplicity of 1 for p = 0. Obviously, Σ_{p=0}^{m} q_p = 2n. We set (−1)!! := 1 and are able to apply (8.6):

\[
R^{(2n)}_{ij} =
\begin{cases}
1, & \text{if } q_0 = 2n \text{ and } q_p = 0 \text{ for } p \neq 0 \text{ (e.g. } i = j = 1\text{)}, \\
0, & \text{if some } q_p \text{ is odd},\ p = 1, \ldots, m, \\
\prod_{p=1}^{m} (q_p - 1)!!\, \sigma^{q_p}, & \text{if every } q_p,\ p = 1, \ldots, m, \text{ is even or zero}.
\end{cases}
\tag{8.9}
\]

As one can see, we can expect the vast majority of the entries of R^{(2n)} to be zero, namely all which have a single odd multiplicity q_{p>0}. Indeed, as we will see later, if m = 4, then R^{(12)} ∈ R^{5^6×5^6} with ≈ 244·10^6 elements has only ≈ 7.7·10^6 nonzero elements. However, we still need to calculate the resulting indices i_k, j_k for every entry R^{(2n)}_{ij} through (8.3), determine the multiplicities q_p and then calculate R^{(2n)}_{ij} using (8.9). We show an example of R^{(6)} and R^{(12)} in Appendix A.6.2. It would be favorable if we could determine the nonzero entries of R^{(2n)} beforehand and ignore all zero entries.
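Steps (8.3), (8.8) and (8.9) combine into a single entry-wise routine; the following is a minimal sketch under the stated assumptions (uncorrelated zero-mean Gaussian h; function names are illustrative):

```python
import math
from collections import Counter

def cov_entry(i, j, n, m, sigma):
    """Entry R^(2n)_{ij} for eps = [1, h_1, ..., h_m], h ~ N(0, sigma^2 I_m),
    computed via the index decomposition (8.3) and the moment formula (8.9)."""
    p = m + 1
    def factor_indices(idx):                 # recursion (8.3)
        out, rem = [], idx
        for k in range(1, n + 1):
            ik = math.ceil(rem / p ** (n - k))
            out.append(ik)
            rem -= (ik - 1) * p ** (n - k)
        return out
    # multiplicities q_p of each component of eps, cf. (8.8)
    q = Counter(factor_indices(i) + factor_indices(j))
    val = 1.0
    for comp, mult in q.items():
        if comp == 1:                        # eps_1 = 1 contributes a factor 1
            continue
        if mult % 2 == 1:                    # odd Gaussian moment vanishes, cf. (8.9)
            return 0.0
        df, k = 1.0, mult - 1                # (mult - 1)!! * sigma^mult, cf. (8.6)
        while k > 1:
            df *= k
            k -= 2
        val *= df * sigma ** mult
    return val
```

For m = 1, n = 2 (so ε^{⊗2} = [1, h, h, h²]) this reproduces, e.g., R^{(4)}_{44} = E[h⁴] = 3σ⁴ without ever forming the matrix.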

8.2.3. Nonzero entries of the Gaussian covariance matrix

As shown in the previous section, we can determine the value of R^{(2n)}_{ij} by specifying the corresponding vector q ∈ N_0^{m+1}. We assume we have a function f which determines the indices i_k, j_k with (8.3) and their multiplicities (see (8.8)):

\[
f : \mathbb{N}^2 \to \mathbb{N}_0^{m+1}, \qquad (i, j) \mapsto q = [q_0, q_1, \ldots, q_m].
\]

If just one entry q_{k>0} is odd, R^{(2n)}_{ij} = 0, so we can dismiss this entry. The goal is now to identify these entries beforehand, so that they can be ignored. Our further approach is best illustrated by an example.

Example

We set R^{(2n)}_{ij} = R^{(2n)}(q). Let the problem size be m = 3 and the power of the covariance matrix be n = 3. Then R^{(6)} ∈ R^{64×64}. From (8.8) and (8.9), we can observe

\[
R^{(6)}([4, 2, 0, 0]) = R^{(6)}([4, 0, 2, 0]) = R^{(6)}([4, 0, 0, 2]) = \sigma^2,
\]
\[
R^{(6)}([0, 4, 2, 0]) = R^{(6)}([0, 2, 0, 4]) = \ldots = 3\sigma^4 \cdot \sigma^2 = 3\sigma^6,
\tag{8.10}
\]

where (…) denotes all possible permutations of (0, 0, 2, 4) in which the first entry 0 does not change. Note that each q usually represents a whole set of indices (i, j), as in our example f(5, 2) = f(21, 1).

As illustrated in the example, the problem of determining the nonzero entries of R^{(2n)} can be transformed into the combinatoric problem of determining all possible permutations of certain m-tuples. We split this problem into three parts:

1. We need to determine all possible configurations of q which lead to nonzero entries of R^{(2n)}. These will be stored in a so-called configuration matrix A. In our example, we investigated the possible configurations of h_k^2 and of h_k^2 h_l^4, k ≠ l.

2. Then, we determine all possible permutations of these configurations. In our example, h_k^2 has three possible permutations; h_k^2 h_l^4, k ≠ l, has six possible permutations (not all shown).

3. Finally, we have to determine all indices corresponding to these permutations. In our example, permutation [4, 2, 0, 0] has 15 corresponding (i, j)-tuples.
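Step 2 amounts to enumerating the distinct orderings of a configuration. A brute-force sketch with Python's itertools is sufficient for the small tuples involved (the thesis uses the recursive Algorithm 4 instead; the helper name here is illustrative):

```python
from itertools import permutations
from math import factorial

def distinct_permutations(config):
    """All distinct orderings of a configuration tuple, e.g. the exponents (q_1, ..., q_m)."""
    return sorted(set(permutations(config)))

# the two example configurations behind (8.10): h_k^2, and h_k^2 h_l^4 with k != l
assert len(distinct_permutations((2, 0, 0))) == 3
assert len(distinct_permutations((0, 2, 4))) == 6

# Step 3 of the example: the number of placements of q = [4, 2, 0, 0] among the
# 2n = 6 factors of eps^{(x)3} (x) eps^{(x)3} is the multinomial (2n)! / prod q_p!
count = factorial(6) // (factorial(4) * factorial(2))
print(count)  # 15, matching the 15 (i, j)-tuples of the example
```

For larger configurations, generating permutations recursively (as Algorithm 4 does) avoids materializing the full, mostly duplicate, permutation set.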

In our implementation, Step 1 has to be provided by the user. This is a surprisingly easy but tedious task. In Appendix A.6.1, we show several examples and guidelines to build up A ∈ N^{s×(n+1)} for arbitrary problems, where s depends on n and m. The kth row of A will be called configuration a_k ∈ N^{n+1}.

To determine the possible permutations in Step 2, we provide the recursive Algorithm 4 in Appendix A.6.3. For all rows k = 1, …, s, we initialize Algorithm 4 with v = [0, 2, …, 2n] and p = a_k.

Step 3 determines all indices i, j which correspond to the permutations of q obtained in Step 2. Given the simplifications in Section 8.1.1, we see that we only need to determine indices with i ≥ j. In the context of the components of R^{(2n)}_{ij} in (8.7), this reads as

\[
i \geq j \iff \left(\forall k = 1, \ldots, n : \varepsilon_{i_k} = \varepsilon_{j_k}\right) \vee \tag{8.11}
\]
\[
\left(\exists k : \varepsilon_{i_k} > \varepsilon_{j_k} \wedge \varepsilon_{i_l} = \varepsilon_{j_l},\ l < k\right). \tag{8.12}
\]

This is accomplished by a modified version of Algorithm 4. We determine ε_{i_k} and ε_{j_k} alternately, rather than all entries of i_k and j_k consecutively, and check whether (8.11) or (8.12) holds. If neither property holds, the recursion is aborted. Note that if (8.12) holds true for a specific k ∈ (1, …, n), then ε_{i_l} < ε_{j_l} for l > k is feasible.

8.2.4. Summary and computational effort

In this section, we recap the previous sections and give an overview of the necessary tasks we have to perform to calculate the expectation of the estimation error (7.19) through the first and second order expansions (7.20) and (7.22), respectively. We assume the problem size m, the noise variance σ² and the Jacobian and Hessian of the MUSIC cost function (see Section 7.3.1) are available. Algorithm 1 shows all necessary steps to compute the required traces of matrix-matrix products.

Computational effort

The number of possible configurations depends strongly on the dimension m of a given problem. If m exceeds a number dependent on n, however, A no longer changes. The number of possible permutations l_p per configuration a_k is shown in (A.17) in the Appendix to be

\[
l_p = \frac{m!}{\prod_{i=1}^{m} a_i!}.
\]


Algorithm 1 Overview of tasks

Require: problem size m, variance σ² and the necessary system matrices.
for n as required by (7.20) and (7.22) do
    - The user provides the configuration matrix A by hand. Guidelines are given in Appendix A.6.1.
    for all possible configurations a_k do
        - Calculate the value of R^{(2n)}(q) corresponding to the configuration using (8.9).
        - Determine all possible permutations of the configuration using Algorithm 4.
        for all possible permutations do
            - Determine all possible indices of the permutation using a modified Algorithm 4 as described in Section 8.2.3.
            - Sum up entries of the system matrices as required in (7.20) and (7.22), according to (8.1), (8.2) and (8.4).
        end for
        - Calculate the portion of the traces as required in (7.20) and (7.22).
    end for
    - Form the result of the expansion of the expectation.
end for

The number of possible indices l_i per permutation q is given by

\[
l_i = \frac{1}{2}\left[\frac{(2n)!}{\prod_{i=1}^{m+1} q_i!} + \frac{n!}{\prod_{i=1}^{m+1} \left(\frac{q_i}{2}\right)!}\right],
\]

as can be seen by combinatorial considerations. The number of indices per permutation therefore does not depend on the dimension of the problem.

8.2.5. Critical simplification step

The shape of the configuration matrix allows us to sort the entries of the covariance matrix by their magnitude in σ. As n grows, we can expect some elements of R^{(n)} to approach infinity regardless of the magnitude of σ, since (n−1)!! grows to infinity faster than σ^n approaches zero. However, for small n and σ, the nonzero elements of R^{(n)} first approach zero before growing again beyond a certain n. We will prove this statement with the use of Stirling's formula. With n even, we have


\[
(n-1)!! = \frac{(n-1)!}{2^{(n-2)/2}\left(\frac{n-2}{2}\right)!}
\]

and the following holds:

\[
(n-1)!!\, \sigma^n
= \frac{(n-1)!}{2^{(n-2)/2}\left(\frac{n-2}{2}\right)!}\, \sigma^n
= \frac{n!}{2^{n/2}\, (n/2)!}\, \sigma^n
\approx \frac{\sqrt{2\pi n}\left(\frac{n}{e}\right)^n}{2^{n/2}\, \sqrt{2\pi n/2}\left(\frac{n}{2e}\right)^{n/2}}\, \sigma^n
= \left(\frac{n}{e}\right)^{n/2} \sqrt{2}\, \sigma^n
= \left(\frac{n \sigma^2\, \sqrt[n]{2}}{e}\right)^{n/2},
\]

which means the left side will approach zero as long as the bracketed expression is smaller than 1, which is the case if

\[
\frac{n \sigma^2\, \sqrt[n]{2}}{e} \leq \frac{n \sigma^2 \sqrt{2}}{e} < 1,
\qquad\text{i.e.}\qquad
n < \frac{e}{\sigma^2 \sqrt{2}}.
\tag{8.13}
\]

As can easily be verified numerically, this estimate is valid but pessimistic. If σ = 0.1, (8.13) leads to n < 192, whereas (n−1)!!σ^n ≈ 1 at n ≈ 250. Nevertheless, (8.13) shows that the elements of R^{(2n)} do indeed approach zero for moderately low n and small σ, which is the case in our application. Typically, n does not exceed 36 and we operate with systematic noise in the magnitude of σ ≈ 0.1.
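The decay predicted by (8.13) is quick to check numerically. The helper below (an illustrative name, not from the thesis implementation) evaluates (n−1)!!σ^n for even n, pairing each double-factorial factor with σ² to avoid intermediate under- and overflow:

```python
def moment_magnitude(n, sigma):
    """(n-1)!! * sigma^n for even n, i.e. the magnitude scale of the entries of R^(n)."""
    val = 1.0
    for k in range(1, n, 2):    # odd factors 1, 3, ..., n-1
        val *= k * sigma ** 2   # pair each factor with sigma^2 for numerical stability
    return val

sigma = 0.1
# within the range used in the application (n <= 36) the entries are tiny ...
print(moment_magnitude(36, sigma))
# ... and they keep shrinking well past n = 100, before eventually blowing up,
# consistent with the pessimistic bound n < 192 of (8.13)
assert moment_magnitude(100, sigma) < moment_magnitude(2, sigma)
```

Each factor k·σ² with k·σ² < 1 shrinks the product, so the magnitude decreases as long as the largest factor (n−1)σ² stays below 1, mirroring the Stirling argument above.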

This allows a crucial simplification step, as we are now able to ignore entries of R^{(2n)}_{ij} whose power of σ exceeds a magnitude yet to be defined. This magnitude has to be sought numerically, since we made no assumptions on the system matrices. To date, we are not aware of a procedure which would enable us to estimate the effect of the aforementioned simplification a priori.

Our numerical experience showed feasible results with truncation after reaching magnitude σ^4, thus allowing us to truncate the vast majority of the nonzero entries of R^{(2n)}. Chapter 10.1 shows some model calculations.


9. Moving source: tracking via sequential Bayesian filtering

9.1. Introduction

In the previous chapters, we assumed the source to be spatially static over the duration of the measurements. In several applications, like radar, this is clearly not the case. It may be feasible to assume the state to be quasi-static, i.e. the state changes slowly compared to the number of measurements possible in a given time frame. Nevertheless, additional information on the dynamics of the source, if available, would be ignored. We recall the 3D estimation problem introduced in Section 2.1. The EM source is now

considered to be changing its state z ∈ R^s over time in a discrete sequence {z_t, t ∈ N}. In the context of our inverse problem, the state of the source can be described through its position x and orientation ω, i.e. z = (x, ω). The state sequence over time is assumed to be characterized through a state function

\[
z_t = g_t(z_{t-1}, v_{t-1}),
\tag{9.1}
\]

where z_{1:t} = (z_1, …, z_t), z_i ∈ R^{s_p}, i = 1, …, t, is the sequence of states and v_{1:t} = (v_0, …, v_t), v_i ∈ R^{u_p}, i = 1, …, t, is the sequence of the so-called process noise. Note that the state function g_t may change over time and that (9.1) describes a Markov process of order 1. The state of the system is observed indirectly through a general measurement function

\[
y_t = f_t(z_t, n_t),
\tag{9.2}
\]

where n_t ∈ R^{u_m} is the measurement noise and y_{1:t} = (y_1, …, y_t), y_i ∈ R^{s_m}, i = 1, …, t, is the sequence of measurements. The statistics of the process and measurement noise are assumed to be known. Also, we need to assume that the measurements are conditionally independent given the state, i.e. p(y_t | z_{1:t}, y_{1:t−1}) = p(y_t | z_t).

In contrast to the deterministic approach presented in Chapter 3, we treat the unknown state parameters as random variables. In the Bayesian inference approach, rather than estimating the parameters of the system, one attempts to determine the conditional probability density function (pdf) p(z_t | y_{1:t}) of the state given the measurements, called the posterior probability density (PPD), recursively over time. Information on the actual state, as well as confidence intervals, is potentially deducible from this pdf. The prediction is recursive in that the PPD is updated as soon as new information on the system, in the form of measurements, is available. This means that one neither has to save preceding measurements nor has to process a whole batch of data. Research and algorithms in the field of Bayesian inference have a long and rich history. Among the immense body of work


and survey papers, we refer to the excellent tutorial paper of Arulampalam [7] and the thorough introduction of Chen [15]. We now give an introduction to the ideas and necessary tasks of Bayesian estimation.

Bayesian inference consists of two steps, prediction and update. In the prediction stage, we assume that the PPD p(z_{t−1} | y_{1:t−1}) at timestep t−1 is known. We calculate an estimate of the state probability density at time t through the prior density p(z_t | y_{1:t−1}). With the use of the Chapman-Kolmogorov equation (see [70], pp. 78), we have

\[
\begin{aligned}
p(z_t \mid y_{1:t-1})
&= \frac{p(z_t, y_{1:t-1})}{p(y_{1:t-1})}
= \frac{\int p(z_t, z_{t-1}, y_{1:t-1})\, dz_{t-1}}{p(y_{1:t-1})} \\
&= \frac{\int p(z_t \mid z_{t-1}, y_{1:t-1})\, p(z_{t-1} \mid y_{1:t-1})\, p(y_{1:t-1})\, dz_{t-1}}{p(y_{1:t-1})} \\
&= \int p(z_t \mid z_{t-1})\, p(z_{t-1} \mid y_{1:t-1})\, dz_{t-1},
\end{aligned}
\tag{9.3}
\]

where we used p(z_t | z_{t−1}, y_{1:t−1}) = p(z_t | z_{t−1}) due to the Markov property. The prior consists of the PPD of timestep t−1 and the transition probability p(z_t | z_{t−1}), which is defined through (9.1) and the known statistics of v_{t−1}.

With new information in the form of y_t arriving, we are able to calculate the posterior probability density. More precisely, we update the prior density using Bayes' rule as

\[
\begin{aligned}
p(z_t \mid y_{1:t})
&= \frac{p(z_t, y_{1:t})}{p(y_{1:t})}
= \frac{p(y_t \mid z_t, y_{1:t-1})\, p(z_t, y_{1:t-1})}{p(y_{1:t})} \\
&= \frac{p(y_t \mid z_t, y_{1:t-1})\, p(z_t \mid y_{1:t-1})\, p(y_{1:t-1})}{p(y_t \mid y_{1:t-1})\, p(y_{1:t-1})} \\
&= \frac{p(y_t \mid z_t, y_{1:t-1})\, p(z_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}
= \frac{p(y_t \mid z_t)\, p(z_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}.
\end{aligned}
\tag{9.4}
\]

The posterior density consists of

• the prior (9.3),

• the likelihood p(y_t | z_t) and

• the evidence p(y_t | y_{1:t−1}) = ∫ p(y_t | z_t) p(z_t | y_{1:t−1}) dz_t, which itself consists of the prior and the likelihood.

As one can see, all information on preceding timesteps is contained in the PPD of timestep t−1, i.e. no information on the past needs to be stored. Calculating or approximating these values is the core element of Bayesian estimation [7].


The actual state of the system can be estimated by calculating the expectation of the PPD. (9.3) and (9.4) can be considered templates which form the groundwork for optimal estimation. In general, these quantities are not known and have to be approximated. In some special cases, though, analytic solutions exist. In the case of linear state and measurement functions and additive Gaussian process and measurement noise, this results in the well-known Kalman filter [15]. Our method of choice is the particle filter described below, which provides an approximate solution through the use of weighted samples.

We will focus our attention on a special case of measurement functions, namely (2.17), which is used in MEG/EEG and is the main requirement for the use of the MUSIC algorithm. The consideration of Bayesian filtering in the context of MEG/EEG is relatively new. To our knowledge, Somersalo [63] first introduced a particle filter approach. Additionally, a dynamic scheme to detect a non-constant number of sources over time was provided. Sorrentino and Pascarella [64] presented a thorough introduction to filtering and a comparison with existing static estimators like RAP-MUSIC and Beamforming (see also [57], [65] for preliminary work) as well as a comparison between particle filter, Beamforming and subspace approaches [56]. If the EEG/MEG problem is considered as an imaging problem, the measurement function (9.2) is transformed into a linear function with respect to the orientations of the sources. In combination with a linear state function (9.1), the inference problem can be solved with the Kalman filter [66].

For the next section we assume a general state function (9.1). To solve the resulting inference problem we will introduce particle filters in Section 9.2. We propose a new method of weighting the samples at the end of that section.
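For the linear-Gaussian special case just mentioned, the prediction (9.3) and update (9.4) have a closed-form recursion, the Kalman filter. A minimal sketch for z_t = G z_{t−1} + v_{t−1} and y_t = F z_t + n_t (the matrices G, F and the noise covariances Qv, Rn are generic placeholders, not the MEG/EEG lead field):

```python
import numpy as np

def kalman_step(z, P, y, G, Qv, F, Rn):
    """One prediction/update cycle; returns the posterior mean and covariance."""
    # prediction, cf. (9.3): propagate mean and covariance through the state model
    z_pred = G @ z
    P_pred = G @ P @ G.T + Qv
    # update, cf. (9.4): correct the prediction with the new measurement y
    S = F @ P_pred @ F.T + Rn            # innovation covariance
    K = P_pred @ F.T @ np.linalg.inv(S)  # Kalman gain
    z_post = z_pred + K @ (y - F @ z_pred)
    P_post = (np.eye(len(z)) - K @ F) @ P_pred
    return z_post, P_post
```

With repeated measurements of a static state, the posterior mean converges to the true value and the posterior covariance shrinks, which is exactly the recursive accumulation of information described by (9.3) and (9.4).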

9.2. Particle filtering

The idea of particle filtering is to approximate the posterior probability density (9.4) by a set of independent random samples (called particles) x^i_{1:t} = (x^i_1, …, x^i_t) taken from the posterior p(z_{1:t} | y_{1:t}) with corresponding weights w^i_t, where i = 1, …, N_p denotes the index of a particular particle. The weights are normalized as Σ_{i=1}^{N_p} w^i_t = 1. The posterior probability density can therefore be approximated by

\[
p(z_{1:t} \mid y_{1:t}) \approx \sum_{i=1}^{N_p} w^i_t\, \delta(z_{1:t} - x^i_{1:t}),
\tag{9.5}
\]

where δ is the Dirac delta function. Note that in contrast to (9.4), we calculate the PPD of the whole system up to timestep t. We will drop this assumption later for a PPD in the shape of (9.4), but calculating (9.5) simplifies the following derivation. The expectation of the PPD is approximated by

\[
E[z_{1:t} \mid y_{1:t}] = \int z_{1:t}\, p(z_{1:t} \mid y_{1:t})\, dz_{1:t}
\tag{9.6}
\]
\[
\approx \sum_{i=1}^{N_p} w^i_t\, x^i_{1:t}.
\tag{9.7}
\]


Ideally, the samples should be drawn proportionally to the underlying PPD. Since we are not able to do this, it is common to draw the samples from a proposal density q(x^i_{1:t} | y_{1:t}), yet to be defined, and change the integral (9.6) to

\[
E[z_{1:t} \mid y_{1:t}] = \int z_{1:t}\, q(z_{1:t} \mid y_{1:t})\, \frac{p(z_{1:t} \mid y_{1:t})}{q(z_{1:t} \mid y_{1:t})}\, dz_{1:t}
\tag{9.8}
\]
\[
\approx \sum_{i=1}^{N_p} w^i_t\, x^i_{1:t}.
\tag{9.9}
\]

The crucial difference between (9.7) and (9.9) is that the samples are drawn from the proposal density q instead of the PPD and the weights are defined by w^i_t = p(x^i_{1:t} | y_{1:t}) / q(x^i_{1:t} | y_{1:t}). Since the numerator is not known, the weights are only known up to a proportionality, w^i_t ∝ p(x^i_{1:t} | y_{1:t}) / q(x^i_{1:t} | y_{1:t}), and therefore have to be normalized as w^i_t := w^i_t / Σ_{i=1}^{N_p} w^i_t ([15]). To be able to calculate weights and PPDs recursively over time, the importance density is defined to factorize as

\[
q(z_{1:t} \mid y_{1:t}) = q(z_t \mid z_{1:t-1}, y_{1:t})\, q(z_{1:t-1} \mid y_{1:t-1}).
\tag{9.10}
\]

It can then be shown that

\[
p(z_{1:t} \mid y_{1:t}) = \frac{p(y_t \mid z_t)\, p(z_t \mid z_{t-1})}{p(y_t \mid y_{1:t-1})}\, p(z_{1:t-1} \mid y_{1:t-1})
\propto p(y_t \mid z_t)\, p(z_t \mid z_{t-1})\, p(z_{1:t-1} \mid y_{1:t-1}).
\tag{9.11}
\]

Combining (9.10) and (9.11), we have

\[
w^i_t \propto w^i_{t-1}\, \frac{p(y_t \mid x^i_t)\, p(x^i_t \mid x^i_{t-1})}{q(x^i_t \mid x^i_{1:t-1}, y_{1:t})}.
\]

The application of the above recursion leads to the sequential importance sampling (SIS) algorithm, which is shown in [7].

In the case that q(z_t | z_{1:t−1}, y_{1:t}) = q(z_t | z_{t−1}, y_t), the importance density depends only on the last state and the current measurement. Only the particles x^i_t need to be stored in this case, in contrast to the whole state and measurement sequence [7]. The expectation of the PPD (9.4) can be approximated by

\[
E[z_t \mid y_{1:t}] \approx \sum_{i=1}^{N_p} w^i_t\, x^i_t.
\tag{9.12}
\]

Choice of proposal density

A crucial step in the above sampling is the choice of a good proposal density q(z_t | z_{1:t−1}, y_{1:t}). The best choice, in the sense that the variance of the weights is minimized, is ([7])

\[
q(z_t \mid z_{1:t-1}, y_{1:t}) = p(z_t \mid z_{1:t-1}, y_{1:t}).
\]



Figure 9.1.: Illustration of the SIR algorithm. From left to right: initial state and particle distribution; state and particle update; particle weighting; resampling stage, in which particle 2 is abandoned and particle 3 is therefore duplicated.

However, this probability density function is often not available. The most commonly chosen proposal density ([20], [19], [30]) is

\[
q(z_t \mid z_{1:t-1}, y_{1:t}) = p(z_t \mid z_{t-1}),
\]

which leads to a particularly simple weight update

\[
w^i_t \propto w^i_{t-1}\, p(y_t \mid x^i_t).
\]

Sample degeneracy and resampling

A known problem of the SIS algorithm is sample degeneracy, i.e. only a few particles will have nonzero weight after a few iterations. Thus, a significant amount of computing power is spent on particles that have negligible effect on the approximation of the PPD (9.5) or its approximated expectation (9.12). A possible, but undesirable, solution is to simply increase the number of particles $N_p$ or to adjust the sample size on the fly (see [24]). A different approach is to focus the computing power on areas with high weight and to cancel particles with low weight, which is called resampling.

A simple realization of this is the sampling importance resampling (SIR) algorithm, which performs resampling in every recursion step. The resampling step builds a new set of particles from the existing set $x_t^i$, $i = 1, \ldots, N_p$, where the probability of incorporating a particle from the old set into the new set is determined by the particle weight in the old set. Note that drawing a particle with high weight repeatedly is possible and actually desired. We present the resulting SIR and resampling algorithm in Appendix A.7. The weights are reset to $w_t^i = 1/N_p\ \forall i$, i.e. the weight update equation reduces to $w_t^i \propto p(y_t \mid x_t^i)$. The particles are now clustered in a few states with high weight. Therefore, resampling alleviates sample degeneracy but introduces a loss of variety among the observed states [15], [19]. To ensure that the particles fan out in the following timestep, [7] proposes in each time update to draw, for each particle, a sample $v_{t-1}^i$ of the process noise and to update the particle according to $x_t^i = g_t(x_{t-1}^i, v_{t-1}^i)$. An illustration of particle filtering with resampling is shown in Figure 9.1.
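The weight-resample recursion described above can be sketched in a few lines. This is a generic illustration, not the implementation from Appendix A.7; the Gaussian transition and likelihood in the usage example are placeholder choices for a toy one-dimensional state.

```python
import numpy as np

rng = np.random.default_rng(1)

def sir_step(particles, y, transition, likelihood):
    """One SIR recursion: time update, weighting, normalization, resampling."""
    particles = transition(particles)        # draw x_t^i from p(x_t | x_{t-1}^i)
    w = likelihood(y, particles)             # unnormalized weights p(y_t | x_t^i)
    w = w / w.sum()                          # normalize
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]                    # weights implicitly reset to 1/Np

# Toy usage: stationary true state 0, observed with unit-variance noise.
particles = rng.uniform(-5.0, 5.0, size=500)
for _ in range(20):
    y = rng.normal(0.0, 1.0)
    particles = sir_step(
        particles, y,
        transition=lambda p: p + rng.normal(0.0, 0.1, size=p.shape),
        likelihood=lambda y, p: np.exp(-0.5 * (y - p) ** 2),
    )
```

After a few recursions the particle cloud concentrates around the true state; drawing the process-noise sample inside the transition is what keeps duplicated particles from collapsing onto identical states.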


Choice of likelihood function

We still need to define the necessary likelihood function $p(y_t \mid x_t^i)$. Let the state $z = (x, \omega)$ denote the state of the system through position $x$ and orientation $\omega$ of the source. We assume the measurement function to be of the form

$$y_t = A(x_t)\,\omega_t + n_t,$$

where $n_t$ is uncorrelated, zero-mean Gaussian with covariance $\sigma^2 I_n$. The natural choice of likelihood of one particle for this Gaussian setting is

$$p(y_t \mid x_t^i) = \frac{1}{(2\pi)^{n/2}\, |\sigma^2 I_n|^{1/2}} \exp\left(-\tfrac{1}{2}\, \big(y_t - A(x_t^i)\omega_t^i\big)^T (\sigma^2 I_n)^{-1} \big(y_t - A(x_t^i)\omega_t^i\big)\right) = \frac{1}{(2\pi)^{n/2}\, \sigma^n} \exp\left[-\frac{1}{2\sigma^2}\, \big(y_t - A(x_t^i)\omega_t^i\big)^T \big(y_t - A(x_t^i)\omega_t^i\big)\right], \qquad (9.13)$$

which is the multivariate Gaussian probability density function. This treatment of the likelihood function is the most common, see [64], [7], [36] and [56]. However, as has already been pointed out in [63], this results in a highly peaked likelihood in the case of low measurement noise. The weights may then all be close to zero or numerically zero, resulting in an erroneous representation of the PPD, which worsens over time.

Our proposition is to use the MUSIC estimator (3.6) as the likelihood function to weigh the particles. In contrast to (9.13), the accuracy of the MUSIC estimator will not decrease with decreasing noise; we can actually expect it to increase. A first idea of this was proposed in [54]. We emphasize that the use of the MUSIC estimator is purely heuristic, since the estimator is not a probability density function (see [48]). Nevertheless, our numerical tests show encouraging results. We show numerical examples which compare different kinds of weighting functions in Section 10.3.
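The underflow of the peaked Gaussian likelihood can also be mitigated by evaluating the weights in the log domain and normalizing with the log-sum-exp trick. This standard remedy is shown only for illustration; it is not part of the MUSIC-based weighting proposed here.

```python
import numpy as np

def normalize_log_weights(log_w):
    """Normalize particle weights given as log-likelihoods.

    Subtracting the maximum before exponentiating avoids the underflow that
    turns highly peaked Gaussian likelihoods into exact zeros in double
    precision, which would leave no usable weight information.
    """
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

# Log-likelihoods near -1e4 underflow to 0.0 if exponentiated directly,
# yet their normalized weights are still perfectly well defined.
weights = normalize_log_weights([-1e4, -1e4 + 2.0, -1e4 + 1.0])
```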


10. Simulations

In this chapter, we use simulated data to test the concepts and algorithms presented in the previous chapters. The calculations necessary to form asymptotic error estimates are performed on a Quad-Core AMD Opteron Processor 2380. All of these calculations are done on the same system to achieve comparability of the required computing times. The computation of the necessary matrix traces is implemented in Fortran 90 and parallelized using OpenMP.

To validate the results of the asymptotic error estimates presented in Chapter 7, we compare the predictions to a sampled expectation and sampled second moment in test problems. To this end, we sample $h \sim \mathcal{N}(0, \sigma I)$ and solve the resulting perturbed inverse problem of determining the source location through measurements with systematic errors. We use the Levenberg-Marquardt algorithm to solve the underlying optimization problem. Implementations of the Levenberg-Marquardt algorithm are available in Matlab and Alglib¹. Measurement noise is set to zero since we consider the asymptotic case. It should be noted that, even in the perfect foresight model of $h = 0$, the true source position is determined with an error of about $10^{-7}$. This is due to the optimizer reaching certain optimality conditions and inevitable rounding errors. Therefore, we cannot expect to recreate a predicted asymptotic error of this magnitude using samples. Over 100000 sample solutions to the perturbed inverse problem are typically necessary to get a stable sample mean. Calculations for the sampled mean are done on a cluster of 48 AMD Opteron Processor 6180 SE and require 4 days of computing time in the 1D case and about 14 days in the 3D case for all considered source locations and systematic noise levels.

First, we test the validity of the truncated trace calculation introduced in Section 8.2.3. Next, the asymptotic error estimate of Chapter 7 is investigated in a 1D and a 3D test case. A test case on weighting techniques of the particle filter concludes this chapter.

10.1. Trace calculation

10.1.1. Convergence rate of truncated trace calculation

In Section 8.2.3 we presented means to determine and calculate the nonzero entries of $R^{(2i)}$ in decreasing order of $\sigma$. A critical step to achieve a reasonable computing time is to truncate the calculation of the full trace of the matrix-matrix products in (7.7) after the desired accuracy is obtained. Since we assumed that no information on the entries of the system matrices is available, we will have to determine a calculation threshold

¹ http://www.alglib.net/


[Figure 10.1: two plots of the number of indices corresponding to a permutation (logarithmic scale) and the number of permutations of a configuration versus configuration number; (a) indices and permutations for m = 6, (b) indices and permutations for m = 10.]

Figure 10.1.: Number of permutations and associated number of indices.

numerically. Calculating the expansion of first order with $n = 3$ poses little problem since computing times are typically below one minute. We therefore mainly investigate the problems $n = 6$ and $n = 9$ arising in the second order expansion (7.22) and the first order fixed-point iteration (7.26). In the following examples, we set $\sigma = 0.1$. Of course, the effect of truncation depends highly on the magnitude of $\sigma$.

Full trace calculation

For moderate problem size $m$, we are able to compute the required traces in (7.7) exactly. We use these results to establish a truncation threshold which we will use for further calculations. For $n = 6$ and $m \geq 6$, the configuration matrix $A \in \mathbb{R}^{30 \times 7}$ remains constant. Figure 10.1 shows the number of permutations per configuration and the number of indices associated with one permutation for $n = 6$ and $m = 6, 10$. Note that the number of possible indices per permutation has a logarithmic scale. We point out that the


[Figure 10.2: relative error of the truncated calculation of $\operatorname{tr}\big((Q_{1j} \otimes Q_2) R^{(12)}\big)$, $j = 1, 2, 3$, and $\operatorname{tr}\big((Q_2 \otimes Q_2) R^{(12)}\big)$ versus configuration number; (a) calculation of matrix-matrix product traces for m = 6, (b) calculation of matrix-matrix product traces for m = 10.]

Figure 10.2.: Effects of truncated calculation 1.

number of indices does not change for one given configuration. The number of possible permutations increases with increasing $m$, though. $R^{(12)}_{m=6} \in \mathbb{R}^{7^6 \times 7^6}$ has $\approx 122 \cdot 10^6$ nonzero entries and $R^{(12)}_{m=10} \in \mathbb{R}^{11^6 \times 11^6}$ has $\approx 3.44 \cdot 10^9$ nonzero entries, i.e. in both cases we are able to ignore about 99% of the covariance matrix since these entries are zero. For illustrative reasons, we choose to calculate

$$\operatorname{tr}\big(Q_2^{\otimes 2} R^{(12)}\big) \quad \text{and} \quad \operatorname{tr}\big((Q_{1j} \otimes Q_2) R^{(12)}\big), \quad j = 1, \ldots, 3,$$

as required in (7.22). Computing time was 3 minutes for $m = 6$ and 2 hours for $m = 10$. Figure 10.2 shows the results. We plot the relative error between truncated and complete calculation, where we do not show calculations if the relative error is zero in floating point representation. Both calculations show the same qualitative behavior. A relative error of below 1% is well achieved if the calculation is truncated after step 4. If the truncation is done as described, the number of processed entries of the covariance matrix is further reduced to $\approx 24 \cdot 10^3$ for $m = 6$ and $\approx 70 \cdot 10^3$ for $m = 10$, reducing the computing time in both cases to below 1 second.

As shown with these examples, by utilizing truncated trace calculation we are able to reduce the computing time from hours to seconds while suffering an error of only about 1%.
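The truncation rule itself can be sketched as a simple accumulation loop: contributions are processed per configuration in decreasing order of magnitude (decreasing powers of $\sigma$), and the sum is cut off once the next contribution falls below a relative tolerance. The contribution values below are made up; in the actual computation they are the per-configuration traces assembled as in Section 8.2.3.

```python
def truncated_trace(contributions, tol=1e-2):
    """Accumulate a trace from per-configuration contributions, largest first.

    `contributions` is assumed pre-sorted by decreasing magnitude; the loop
    stops once adding the next term changes the running sum by less than
    `tol` in relative terms, and reports how many configurations were used.
    """
    total = 0.0
    for k, c in enumerate(contributions):
        new_total = total + c
        if k > 0 and new_total != 0.0 and abs(c) / abs(new_total) < tol:
            return new_total, k + 1
        total = new_total
    return total, len(contributions)

# Made-up contributions decaying roughly like increasing powers of sigma = 0.1:
value, used = truncated_trace([1.0, 0.1, 0.01, 1e-4, 1e-6])
```

With the default tolerance of 1% the loop stops after the third configuration, mirroring the "truncate after step 4, relative error below 1%" behavior observed above.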

Truncated trace calculation

Problem sizes of $m > 20$ make truncated trace calculation mandatory, since a full calculation would require weeks of computing time. At this rate, a direct calculation of the error $x - \hat{x}$ with sampled $h$ and calculating the sample mean would be much more efficient. Figure 10.3 shows the results of a trace calculation with $m = 31$. The calculation is aborted after calculating all elements up to $\mathcal{O}(\sigma^6)$ due to time constraints. A relative error is therefore calculated by using the trace calculation which used the most configurations as a reference value. We show calculations for $\sigma = 0.1$ on the left and $\sigma = 0.05$ on


[Figure 10.3: relative error of the truncated calculation of $\operatorname{tr}\big((Q_{1j} \otimes Q_2) R^{(12)}\big)$, $j = 1, 2, 3$, and $\operatorname{tr}\big((Q_2 \otimes Q_2) R^{(12)}\big)$ versus configuration number; (a) calculation of matrix-matrix product traces for m = 31, σ = 0.1, (b) calculation of matrix-matrix product traces for m = 31, σ = 0.05.]

Figure 10.3.: Effects of truncated calculation 2.

the right. Figure 10.3(a) shows the errors potentially introduced by truncated trace calculation: we encounter a relative error of 10% if the calculation is truncated after 4 configurations. This error can be expected to increase with increasing $\sigma$. The error is not an issue if $\sigma$ is smaller, though, as illustrated in Figure 10.3(b). In our further 3D simulations, we typically use $\sigma \in [0.001, 0.1]$. We can expect the results of the trace calculation to be valid in the case of small $\sigma$. With $\sigma$ approaching 0.1, additional attention to the results is required.

10.2. Moments of localization error

10.2.1. 1D direction of arrival estimation

Test problem

We simulate 4 isotropic sensors located in the xy-plane. We assume 2 sources impinging at the sensors. One source is impinging from the fixed angle 0°; the incidence angle of the second source is varied from 20° to 160°, see Figure 10.4. We set $\omega_c \|r_k\| = \pi$, $s_1(t) = 2 \sin(t)$ and $s_2(t) = \cos(t)$ in (2.18). We disturb the sensor data as

$$A(x, h) = A(x) + h \cdot [1, 1].$$
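For readers unfamiliar with the unperturbed setup, a minimal sketch of noiseless 1D MUSIC on this kind of configuration follows. A uniform line array with $\omega_c \|r_k\| = \pi$ (half-wavelength spacing) is assumed as a simplification of the geometry in Figure 10.4; the steering vector and the grid evaluation are illustrative, not the thesis implementation.

```python
import numpy as np

def steering(theta_deg, n_sensors=4):
    """Assumed steering vector of a uniform line array with omega_c*||r_k|| = pi."""
    k = np.arange(n_sensors)
    return np.exp(1j * np.pi * k * np.cos(np.deg2rad(theta_deg)))

# Noiseless snapshots of s1(t) = 2 sin(t) at 0 deg and s2(t) = cos(t) at 60 deg.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X = np.outer(steering(0.0), 2.0 * np.sin(t)) + np.outer(steering(60.0), np.cos(t))

R = X @ X.conj().T / t.size            # sample covariance of the sensor data
_, eigvec = np.linalg.eigh(R)          # eigenvalues in ascending order
En = eigvec[:, :2]                     # noise subspace: 4 sensors minus 2 sources

def noise_projection(theta_deg):
    """Squared norm of the projection of a(theta) onto the noise subspace.

    This is the denominator of the MUSIC pseudospectrum; it vanishes at the
    true incidence angles in the noiseless case."""
    return np.linalg.norm(En.conj().T @ steering(theta_deg)) ** 2
```

Evaluating `noise_projection` on an angle grid and locating its near-zero minima recovers the two incidence angles; perturbing the rows of the steering matrix by $h$ shifts these minima, which is exactly the systematic error studied here.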


This simple example of a perturbed forward solution leads to very simple derivatives w.r.t. $h$:

$$\partial_{h_i} A(x, h) = \begin{bmatrix} 0 & 0 \\ \vdots & \vdots \\ 0 & 0 \\ 1 & 1 \\ 0 & 0 \\ \vdots & \vdots \\ 0 & 0 \end{bmatrix} \leftarrow i\text{th row}, \qquad \frac{\partial^2}{\partial h_i \partial h_j} A(x, h) = 0 \quad \forall i, j.$$

Derivatives of the 1D MUSIC cost function (3.4) w.r.t. $x$ are given by (see also [21])

$$f_x(x, h) = 2\,\mathrm{Re}\big[a_x(x)^H \Pi_F(h)\, a(x)\big],$$
$$f_{xx}(x, h) = 2\,\mathrm{Re}\big[a_{xx}(x)^H \Pi_F(h)\, a(x)\big] + 2\, a_x(x)^H \Pi_F(h)\, a_x(x),$$
$$f_{xxx}(x, h) = \mathrm{Re}\big[2\, a_{xxx}(x)^H \Pi_F(h)\, a(x) + 6\, a_{xx}(x)^H \Pi_F(h)\, a_x(x)\big].$$

1st order error moment

We compare our approach, with potentially second order error estimation, to the approaches shown in [21]. The predicted error expectation is compared to the sampled error expectation of independent trials. Thanks to the dimensionality of the problem and the small dimension of $h$, we can compute the covariance matrices required for the expectations $E^{1D}_0$–$E^{1D}_{1.5}$ (see (7.8), (7.9) and (7.16)) fully and do not require the truncated trace calculation as shown in Section 10.1. Figures 10.5-10.6 show the results of $1.3 \cdot 10^5$

[Figure 10.4: sketch in the xy-plane of the four sensors, the fixed source at angle 0°, and the variable incoming plane wave.]

Figure 10.4.: Sensor positions and source angles.


[Figure 10.5: DOA estimation for second-source angles 20°, 60° and 100°; mean error and the estimates $E^{1D}_0$, $E^{1D}_1$, $E^{1D}_{1.5}$ in degrees versus noise (left), absolute residual versus noise (right).]

Figure 10.5.: Left: Mean error and estimated first moment of DOA estimation for different incident angles for source 2. Right: absolute residual.

independent runs of the optimization algorithm compared to the different first moment approximations shown in Section 7.2. Figure 10.5 shows the result of estimated versus sampled mean error, in purple, for different incident angles of source 2, computed as

$$\text{error} = \frac{1}{n} \sum_{i=1}^{n} (x - \hat{x}_i). \qquad (10.1)$$

The figures on the left show the values of the error in degrees, while the figures on the right show the residual $|\text{error} - E^{1D}_{0\text{-}1.5}|$. A legend is only displayed in one subfigure since the graphs are all equally colored. We can distinguish 3 cases. If the incidence angle difference of the 2 sources is moderately large, the $E^{1D}_{1.5}$ estimate is able to predict the error with an accuracy of under 1%. In the case of the two sources being closely spaced, $E^{1D}_0$ and $E^{1D}_1$ fail to predict the mean error, while $E^{1D}_{1.5}$ is able to predict the error up to a threshold before diverging. If the second source is impinging nearly orthogonally to the first source, all estimators are able to predict the mean error qualitatively but not quantitatively. Figure 10.6 shows the mean error for all positions of the second source with 3 different noise values. The error is symmetric, as the symmetry of the test problem suggests. Again, 3 different cases can be observed. If the noise level is small, the second


[Figure 10.6: mean error and the estimates $E^{1D}_0$, $E^{1D}_1$, $E^{1D}_{1.5}$ in degrees versus position of the second source for noise levels 0.01, 0.09 and 0.2 (left), absolute residual versus position (right).]

Figure 10.6.: Left: Mean error and estimated first moment of DOA estimation for different systematic noise. Right: absolute residual.

order estimate outperforms the lower order estimates. When the noise is higher and the incidence angles are closely spaced, the second order approach diverges more slowly than the other approaches. In the case of the sources being nearly orthogonal, all estimates fail to predict the error correctly.

10.2.2. 3D parameter estimation problem

Test problem

We simulate 31 coil sensors located on a sensor plate in the xz-plane. The orientation of all sensors is $[0, 1, 0]^T$. We simulate 27 source positions placed in an operating volume of size $x \in [-0.1, 0.1]$, $y \in [-0.1, 0.1]$ and $z \in [0.04, 0.08]$. The source orientation is $\frac{1}{\sqrt{3}}[1, 1, 1]^T$ for all source positions. Figure 10.7 shows the source and sensor configuration. One sensor is characterized by 6 parameters, namely spatial position in Cartesian coordinates, normed orientation in polar coordinates and scalar measurement gain. The gain is set to $k = 1$ for all sensors. We use (2.15) to calculate the lead-field matrix. Note that this configuration of sensors actually violates the requirements of the MUSIC algorithm since for $a = [0, b, 0]^T$, the forward solution satisfies $A(a)\omega(t) = -A(-a)\omega(t)$,


[Figure 10.7: 3D view of the sensor plate in the xz-plane and the 27 numbered source positions in the operating volume.]

Figure 10.7.: Sensor and source positions.

i.e. the columns of $A$ are not linearly independent. This is a minor problem, which can be avoided by constrained optimization. The Levenberg-Marquardt algorithm allows the utilization of box-constraints, which enable us to restrict the feasible region of optimization to an operating volume on the positive x-axis. With this, the linear independence of the columns of $A(x)$ is ensured.

The calculation of the expectations $E_0$ (see (7.21)), $E_1$ (see (7.22)) and $E_{1.5}$ (see (7.27)) was done with truncated trace calculation as shown in Section 10.1. About $5 \cdot 10^5$ independent inverse problems with systematic noise are computed for every spatial position and noise level, where the noise level has a range of $\sigma \in [0.005, 0.1]$.
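Box-constrained nonlinear least squares of this kind is available off the shelf; the SciPy sketch below illustrates the mechanism on a hypothetical one-parameter toy problem (SciPy's `lm` method is an unconstrained Levenberg-Marquardt and rejects bounds, so the trust-region-reflective solver `trf` stands in for the box-constrained variant used in the text).

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical toy stand-in for the box-constrained localization: recover a
# decay rate from noiseless data while restricting it to a feasible box.
xdata = np.linspace(0.0, 2.0, 50)
ydata = np.exp(-1.3 * xdata)

def residual(p):
    return np.exp(-p[0] * xdata) - ydata

# 'trf' supports the box [0, 2]; method='lm' would reject the bounds argument.
sol = least_squares(residual, x0=[0.5], bounds=([0.0], [2.0]), method="trf")
```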

One parameter per sensor

We introduce systematic errors by disturbing the measurement gain of each sensor, i.e. $h \in \mathbb{R}^{31}$. Computation times of the predicted expectations are about 10 seconds, 25 seconds and 4 minutes for $E_0$, $E_1$ and $E_{1.5}$ for one point, respectively. We are able to calculate every desired noise magnitude for one point in this period of time, since the majority of computing time is required to assemble the entries of the system matrices using Algorithm 4. Inserting different noise levels into this assembly has a negligible effect on computing time.

As a preliminary result, we evaluate the validity of (7.21) and (7.22), i.e. the approximation of the expectation of a quadratic form in the shape of the right hand side of (7.19). These results are new, since existing asymptotic error estimates only considered the 1D case. This can also be considered as a test of the truncated trace calculation. Figures 10.8 and 10.9 show the expectation of the ratio of quadratic forms deduced through samples and predicted by $E_0$ and $E_1$, where the quadratic form is


[Figure 10.8: norm in m of the sampled quadratic form and the estimates $E_0$, $E_1$ versus noise (left), absolute residual versus noise (right).]

Figure 10.8.: Left: Norm of sample mean and estimated mean of quadratic form for one source position over all systematic errors for 31 parameters. Right: absolute value of the residual.

[Figure 10.9: norm in m of the sampled quadratic form and the estimates $E_0$, $E_1$ versus source position (left), absolute residual versus source position (right).]

Figure 10.9.: Left: Norm of sample mean and estimated mean of quadratic form over all source positions for 31 parameters, σ = 0.01. Right: absolute value of the residual.

calculated with $n$ samples through

$$\text{quadratic form} = \left\| \frac{1}{n} \sum_{i=1}^{n} \frac{\varepsilon_i^T Q_1 \varepsilon_i}{\varepsilon_i^T Q_2 \varepsilon_i} \right\|_2.$$
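The sampled side of this comparison can be reproduced in a few lines. The sketch below uses a hypothetical diagonal $Q_1$ and $Q_2 = I$ (not the actual system matrices), for which the expectation of the ratio has the closed form $\operatorname{tr}(Q_1)/m$ by symmetry, so the Monte Carlo estimate can be checked directly.

```python
import numpy as np

rng = np.random.default_rng(0)

m = 31                                   # number of systematic-error parameters
sigma = 0.05                             # systematic noise level
Q1 = np.diag(np.arange(1.0, m + 1.0))    # hypothetical stand-in for Q_1
Q2 = np.eye(m)                           # hypothetical stand-in for Q_2

# Monte Carlo estimate of E[(eps^T Q1 eps) / (eps^T Q2 eps)], eps ~ N(0, sigma^2 I).
eps = rng.normal(0.0, sigma, size=(100_000, m))
ratios = (np.einsum("ij,jk,ik->i", eps, Q1, eps)
          / np.einsum("ij,jk,ik->i", eps, Q2, eps))
estimate = ratios.mean()

# For Q2 = I the exact expectation is tr(Q1) / m, independent of sigma.
```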


[Figure 10.10: norm in m of the sampled mean error and the estimates $E_0$, $E_1$, $E_{1.5}$ versus source position (left), absolute residual versus source position (right).]

Figure 10.10.: Left: norm of sample mean error and estimation of mean error over all source positions for 31 parameters, σ = 0.01. Right: absolute value of the residual.

As one can expect, the first order expectation generally predicts the mean quadratic form more poorly than the second order approximation. Overall, though, both estimates perform well. The first order estimate generally predicts the quadratic form with an accuracy of roughly 1%. Exceptions are the source positions 1, 9, 18 and 27. These positions, as can be seen in Figure 10.7, have no sensors directly underneath them and therefore are highly sensitive to systematic errors. Regarding these results, the truncated trace calculation can be considered successful, since no noticeable drop in accuracy of the estimate occurs for higher noise values.

Figure 10.10 shows the normed sample mean error of the perturbed inverse problem $x - \hat{x}$ for $n$ samples as

$$\text{error} = \left\| \frac{1}{n} \sum_{i=1}^{n} (x - \hat{x}_i) \right\|_2, \qquad (10.2)$$

and the predicted estimation errors. The labeling of error and estimates is the same as in Section 10.2.1. The estimation error is shown for all source positions and $\sigma = 0.01$. We can see that the two estimates for the first order position error, $E_0$ and $E_1$, generally cannot predict the mean sampled error accurately. Occasionally, the second order expectation estimate $E_1$ gives a worse prediction than the first order prediction $E_0$. This is entirely possible, since $E_0$ and $E_1$ approximate the quadratic form (7.19), which is a first order estimate of the estimation error. An improved estimate of the quadratic forms ratio (7.19) does not necessarily lead to an improved estimate of the estimation error $x - \hat{x}$. $E_{1.5}$ provides a good approximation for all tested source positions. These results justify the increased computational load to calculate $E_{1.5}$. We encounter peaks at source positions no. 9, 18, 27 again, as already shown in Figure 10.9. Note that $E_1$ instead of


[Figure 10.11: norm in m of the sampled mean error and the estimates $E_0$, $E_1$, $E_{1.5}$ versus noise (left), absolute residual versus noise (right).]

Figure 10.11.: Left: Norm of sample mean error and estimation of mean error for one source position over all systematic errors for 31 parameters. Right: absolute value of the residual.

[Figure 10.12: norm in m of the sampled mean error and the estimates $E_0$, $E_1$, $E_{1.5}$ versus source position (left), absolute residual versus source position (right).]

Figure 10.12.: Sample mean error and estimation of mean error over all positions for 62 parameters.

$E_0$ was utilized in the calculation of $E_{1.5}$. Figure 10.10 shows that this is necessary for an accurate near second order estimate. To see this, examine source position no. 19: the predictions $E_0$ and $E_1$ differ significantly, so $E_{1.5}$ would be inaccurate if $E_0$ had been utilized to calculate it.

Figure 10.11 shows the sample mean error and the different predicted errors over the


[Figure 10.13: norm in m of the sampled mean error and the estimates $E_0$, $E_1$, $E_{1.5}$ versus noise (left), absolute residual versus noise (right).]

Figure 10.13.: Sample mean error and estimation of mean error over all systematic errors for 62 parameters.

tested noise levels for one source position. The two estimates for the first order position error, $E_0$ and $E_1$, quickly diverge in the case of growing systematic noise, whereas the fixed point iteration $E_{1.5}$ provides a good estimate for all tested noise levels. This can again be considered a justification of the truncated trace calculation, since it does not seem to impede an accurate estimate.

Two parameters per sensor

In this test, we perturb the orientation of each sensor, which can be characterized in polar coordinates by two parameters, i.e. $h \in \mathbb{R}^{62}$. This leads to a significant increase of the required computing time: about 3 minutes, 4.5 minutes and 90 minutes are required to calculate $E_0$, $E_1$ and $E_{1.5}$ for one source position, respectively. The calculation is still considerably faster than using samples to calculate the expectation directly, since we can calculate an arbitrary number of noise levels for one source position in one run.

Figures 10.12 and 10.13 show that the calculated and estimated error expectation exhibit the same qualitative behavior as in the 31 parameter case. This indicates that the predicted mean will also be accurate for higher dimensions, but computing time will quickly reach levels at which a direct sampling of the error expectation will be more reasonable.

10.2.3. Higher order moments

One parameter per sensor

To validate the calculation of the second order moment of the estimation error, we use the same optimization runs as in the first part of the last section. The gain of each


[Figure 10.14: norm in m of the sampled mean squared error and the estimates $M_0$, $M_1$, $M_{1.5}$ versus source position (left), absolute residual versus source position (right).]

Figure 10.14.: Sample mean squared error and estimated second moment over all positions for 31 parameters.

sensor is distorted, i.e. $h \in \mathbb{R}^{31}$. We calculate $M_0$, $M_1$ and $M_{1.5}$ as shown in (7.30), (7.31) and (7.34), respectively. When comparing $M_0$ with $E_1$ in (7.22), we can see that parts of the necessary calculations for $M_0$ are readily available through computing $E_1$. Since assembling the system matrices takes the majority of computing time, the remaining parts of $M_0$ can be calculated while computing $E_1$ without a significantly higher workload. Higher order estimates of the second moment require considerably more computing time, though. In the one parameter per sensor setting, calculation of $M_1$ and $M_{1.5}$ required 6 minutes and 2 hours, respectively.

Figure 10.14 shows the normed mean of the squared position estimation error,

$$\text{squared error} = \left\| \frac{1}{n} \sum_{i=1}^{n} \begin{bmatrix} (x_1 - \hat{x}_{i,1})^2 \\ (x_2 - \hat{x}_{i,2})^2 \\ (x_3 - \hat{x}_{i,3})^2 \end{bmatrix} \right\|_2, \qquad (10.3)$$

and the different estimates of the second moment. Unlike the expectation, all moments provide an overall good estimate of (10.3). The peaks at the source positions 9, 18 and 27 reappear, as already seen in Figure 10.10. A closer look at several points shows a more nuanced picture, though. Generally, we find $M_{1.5}$ to give a better estimate of (10.3) in the far field of the operating volume, i.e. at source locations no. 20-26. $M_0$ generally provides a comparable to superior estimate of the 2nd order moment for source positions in closer proximity to the sensors. This is further illustrated in Figure 10.15, which shows near and far field behavior of the predicted and sampled second moment. The reason for this inferior performance of a nominally higher order estimation scheme lies in the form of (7.34). Due to the high Kronecker power of the mixed and quadratic term in (7.29), we can only calculate the first order expansion of the expectations (7.32)


[Figure 10.15: norm in m of the sampled mean squared error and the estimates $M_0$, $M_1$, $M_{1.5}$ versus noise for the points (0, 0.04, 0) and (0, 0.08, 0) (left), absolute residual versus noise (right).]

Figure 10.15.: Sample mean squared error and estimated second moment with 31 parameters for near and far field estimation.

and (7.33). In the near field, this approximation is insufficient to describe the behavior of the mixed and quadratic term and actually introduces additional errors.

10.3. Particle filters

Test problem

We simulate the same sensor plate as used in Section 10.2.2. The state function is chosen to describe a circle in the measurement volume. We use the state function $g: \mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}^3$:

$$x_t = x_{t-1} + [r \sin(h_{t-1}),\, 0,\, r \cos(h_{t-1})]^T + v_{t-1}, \qquad (10.4)$$


with x0 = [−0.1, 0.15, r]T , r = 0.003, t0 = 0, ht = ht−1 + k · π/100, k = 1, . . . , N = 300and vt−1 ∼ N (0, σpI). We fix the source orientation as σ = [1, 0, 0]T for all timesteps.Note that in the one source case, there is no need for the orientation to exhibit a timeseries. The choice of N and the state function makes the source rotate one and a halftimes in a circle above the sensors in the noiseless case. If σp 6= 0, the source will soonleave the circle and will take an unpredictable path. The measurement function is givenby

yt = A(xt)σ + nt,

with nt ∼ N(0, σmI). We test a grid of 20 measurement noise levels with σm ∈ {0.1, 0.2, . . . , 2} and 10 process noise levels with σp ∈ {0.001, 0.002, . . . , 0.01}. The process noise ranges from barely disturbing the source path at σp = 0.001 to producing essentially a random walk at σp = 0.01. For each grid point, 50 samples of the whole source path are taken and 100 particles are used.
We test three possible weighting functions f. First, we use the classical Gaussian

covariance weighting function (9.13)

p(yt | xit) = (2π)^(−n/2) σm^(−n) exp[ −‖yt − A(xit)A+(xit)yt‖² / (2σm²) ]. (10.5)

With n = 31 in our application, we encountered the sum of the sample weights being numerically zero in virtually every timestep for small measurement noise. This renders the Gaussian weighting useless, since the particles then carry no feasible information. Therefore, we also test the least-squares weighting

fl(xit) = ‖yt − A(xit)A+(xit)yt‖². (10.6)

It is easy to see that (10.6) can be obtained from (10.5) by applying continuous functions and ignoring constants (this is well known, see e.g. [8]). Finally, we use the MUSIC

[Figure 10.16, 3D plot; axes x [m], y [m], z [m].]

Figure 10.16.: Marker path in noiseless case


[Figure 10.17, two sample plots: source path in the z–x plane (top) and distance [m] over time step (bottom), with curves covariance, least-squares, MUSIC.]

Figure 10.17.: Left: sample of true and estimated source path for σp = 0.003, σm = 0.3. Marker path (blue) and estimated paths on top. Euclidean distance between source and particle mean at the bottom. Right: sample for σp = 0.003, σm = 1.5.

estimator (3.6)

fm(xit) = λmax(A(xit)TΠA(xit)). (10.7)
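The three weightings (10.5)–(10.7) can be sketched as follows (an illustrative Python sketch, not the thesis' Matlab code; it assumes A(x) is the tall n × 3 lead-field matrix at a candidate position and Π the estimated projector used by MUSIC — all function names are our own):

```python
import numpy as np

def residual(A, y):
    """Projection residual y - A A^+ y appearing in (10.5) and (10.6)."""
    return y - A @ (np.linalg.pinv(A) @ y)

def weight_gaussian(A, y, sigma_m):
    """Gaussian covariance weighting (10.5); n is the number of sensors."""
    n = y.size
    r = residual(A, y)
    return (2.0 * np.pi) ** (-n / 2) * sigma_m ** (-n) \
        * np.exp(-(r @ r) / (2.0 * sigma_m ** 2))

def weight_least_squares(A, y):
    """Least-squares weighting (10.6); here smaller values are better."""
    r = residual(A, y)
    return r @ r

def weight_music(A, Pi):
    """MUSIC weighting (10.7): largest eigenvalue of A^T Pi A."""
    M = A.T @ Pi @ A
    return np.max(np.linalg.eigvalsh(0.5 * (M + M.T)))  # symmetrize for eigvalsh
```

For noiseless data y in the range of A, the least-squares weight vanishes and the MUSIC weight reduces to the largest eigenvalue of AᵀA when Π is the signal-space projector AA⁺.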

Remark

We assume the state and process noise statistics to be known for all calculations. However, we found that none of the particle filter variants was able to track an unexpected (non-circular) source path. The reason is that the particles do not fan out strongly enough in the particle update stage of Algorithm 5. One way to solve this problem would be to use a larger number of particles, which is undesirable due to computing time constraints. Our solution was to introduce artificial process noise in the particle update as σap := 20σp. The factor was obtained by trial and error and ensures particle diversity without drawing particles too far away from high weight areas.
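The inflated particle update for the circular state function (10.4) can be sketched as below (an illustrative Python sketch with our own function and parameter names; the thesis uses Matlab):

```python
import numpy as np

def propagate_particles(particles, h_prev, r=0.003, sigma_p=0.003,
                        inflation=20.0, rng=None):
    """Particle update for state function (10.4), with artificial process
    noise sigma_ap = inflation * sigma_p (inflation = 20 as above) to
    keep the particle cloud diverse."""
    rng = np.random.default_rng() if rng is None else rng
    drift = np.array([r * np.sin(h_prev), 0.0, r * np.cos(h_prev)])
    noise = inflation * sigma_p * rng.standard_normal(particles.shape)
    return particles + drift + noise
```

With sigma_p = 0 this reduces to the deterministic circular drift of (10.4).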

Results

Figure 10.17 shows the performance of the three weighting techniques for one sample. In the case of low sensor noise, the particle filter with Gaussian weighting diverges in the first time step. With increasing sensor noise, the Gaussian weighting becomes more stable, to the point where it slightly outperforms the other two weighting techniques. However, the Gaussian weighting is more sensitive to high process noise, as can be seen in Figure 10.18. It is noteworthy that the particle filter with Gaussian covariance weighting is able to regain track of the source position after seemingly diverging. In the case of both high process and sensor noise, all particle filter variants have difficulties tracking the source path correctly.


[Figure 10.18, same layout as Figure 10.17: source path in the z–x plane (top) and distance [m] over time step (bottom), with curves covariance, least-squares, MUSIC.]

Figure 10.18.: Left: true and estimated source path sample for σp = 0.008, σm = 0.3. Marker path (blue) and estimated paths on top. Euclidean distance between source and particle mean at the bottom. Right: sample for σp = 0.008, σm = 2.

[Figure 10.19, two surface plots: mean Euclidean distance [m] over process noise and measurement noise, with curves covariance, least-squares, MUSIC.]

Figure 10.19.: Left: mean distance over all timesteps and samples for the tested process and measurement noise. Right: mean distance for a random walk state function.

Figure 10.19, left, shows the mean Euclidean distance over all timesteps and all samples for the tested process and measurement noise. As seen in the samples, the Gaussian covariance weighting is prone to diverge in the case of small sensor noise. Least-squares and MUSIC weighting are comparable, with MUSIC weighting being superior, especially in the case of increasing process noise. Under ideal conditions, i.e. low process noise and moderately high sensor noise, the covariance weighting outperforms both other weighting techniques, albeit only slightly.
On the right-hand side of Figure 10.19, the result for a random walk state function is


shown as

xt = xt−1 + vt−1.

This state function is used to model brain currents in MEG (see [64], [63]). The results of tracking the random walk through particle filtering with the three possible weighting functions mirror the results for the circular path qualitatively, i.e. the MUSIC weighting performs comparably or superior to the other weighting techniques. However, all weighting techniques show a higher mean localization error compared to when information on the source's path is available. This reflects the generally unpredictable nature of a random walk and the low number of particles used to track it, especially in the case of high process noise.


11. Conclusion and Outlook

In this work, we investigated and augmented asymptotic error estimates for a certain field of parameter estimation problems. Existing one-dimensional asymptotic error estimates used a first order approximation in at least one step of the derivation. We augmented this scheme with a fixed point iteration to achieve potentially second order. The approach was tested on a model DOA problem with two incident wavefronts. In general, the fixed point iteration estimate performs better than existing ones if the incident source angles are neither too closely spaced nor nearly orthogonal. In these cases, all approaches fail to predict the asymptotic error accurately. A closer investigation of this phenomenon may be part of future research. Ferréol et al. [21] investigated the resolution probability of the MUSIC algorithm, i.e. the probability of two sources being distinguished correctly. This was not considered in this work, but our approach may give valuable additional insight into this problem.
The main focus of this work was to reformulate the fixed point approximation scheme

to cover the three-dimensional case. As side results of this task, we investigated the derivatives of eigensystems and singular systems. In the case of eigensystem derivatives, a linear system of equations was presented that is both easy to solve and to expand. In the case of singular systems, we derived a closed form expression similar to the eigensystem linear system. This enables a point-wise calculation of singular value and singular vector derivatives of any desired order. A special case of repeated singular values and singular value derivatives was examined and solved. These side results enabled us to formulate derivatives of the 3D MUSIC cost function. Although not covered in this work, analytical derivatives can be expected to enhance the performance of the MUSIC algorithm remarkably. The existing applications of three-dimensional MUSIC resorted to optimization algorithms with approximate derivatives ([48], [50]). A test of the MUSIC algorithm on simulated or real data with an optimization scheme which utilizes exact derivatives can be expected to give superior or at least accelerated results.
Asymptotic estimation error approximations were stated in the three-dimensional case.

This case also introduced the curse of dimensionality, for which the computational load grows exponentially with linearly growing problem size. A new truncated trace calculation scheme with approximated matrix products was introduced to alleviate this. This scheme relied heavily on the Gaussian assumption. In the case of the error expectation (first moment), the existing first order approaches have been shown to generally fail to predict the asymptotic error. In contrast, the fixed point iteration was able to predict the error accurately even in the case where the truncated trace calculation introduced significant errors. Considering the second moment of the asymptotic error showed a different picture. Overall, the test problems seem to indicate that utilizing second order approaches in this case is not worth the effort, since first order approaches give comparable results.


Even more so, since the terms of the first order moment estimate can be calculated on the fly in the second order expectation estimate calculation and therefore require negligible additional computational effort. The next logical step in this direction of research seems to be the test of these results on real world data.

In the case of a moving source and sufficient information on this movement, we presented a new approach to weighting in the sample based particle filter algorithm. In the case of low process and measurement noise, utilizing the MUSIC cost function as a weighting function performs comparably to traditional techniques. In the case of high process or measurement noise (or both), Gaussian weighting becomes unstable, while MUSIC weighting remains stable and also outperforms the least-squares approach. Since our approach showed its prowess also in the case of a random walk movement, its application to MEG/EEG data may be interesting.

Without doubt, the greatest future tasks arising from this work are reducing the computational burden and the extension to non-Gaussian systematic errors. The computational burden rises exponentially with linearly rising dimension of the systematic error h. The test case with dim(h) = 64 showed the temporal boundaries of our approach. Examining problems with higher dimensionality of h will soon require months of computation, even on parallelized clusters. In this case, simple error sampling will lead to better and faster results.

There are some approaches different from ours to calculate the expectation of a quadratic

forms ratio in the Gaussian case, which might be fruitful. The work of Magnus [42], [43] and Kumar [38] may be a good starting point. Although the expansion of the estimation error makes no assumptions on the systematic error and is therefore suited for the non-Gaussian case, practical considerations limit its applicability. In [21], some covariance matrices for non-Gaussian errors are shown, none of which are sparse. Therefore, our approach will not be applicable due to the overwhelming computational burden. Approximations as presented in the Gaussian case or other bounding techniques will be necessary. We are currently not aware of any research in this area, and we doubt that valuable information on non-Gaussian covariance matrices is easy to obtain both theoretically and numerically.


A. Appendix

A.1. Multi-index notation

Let α, β ∈ N0^n. We define the calculation rules

α ± β = (α1 ± β1, . . . , αn ± βn),
α ≤ β ⇔ αi ≤ βi ∀ i ∈ {1, . . . , n},
α! = α1! · . . . · αn!,
(α choose β) = (α1 choose β1)(α2 choose β2) · · · (αn choose βn) = α! / (β! (α − β)!),
|α| = α1 + . . . + αn,
∂^α/∂x^α = ∂^(α1)/∂x1^(α1) · . . . · ∂^(αn)/∂xn^(αn).

A definition of α < β is given in Section 4.2.2.

A.2. Example of operator vec

Let A ∈ R^(4×4) be

A =
[a11 a12 a13 a14;
 a21 a22 a23 a24;
 a31 a32 a33 a34;
 a41 a42 a43 a44],

then

vec(A) = [a12, a13, a14, a23, a24, a34]^T,

i.e. vec collects the strict upper triangle of A row by row.
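This row-wise strict-upper-triangle operator (which differs from the usual column-stacking vec) can be sketched in Python as follows (an illustrative sketch, not thesis code):

```python
import numpy as np

def vec(A):
    """vec as used in Appendix A.2: strict upper triangle of A,
    read row by row (not the usual column-stacking vec operator)."""
    return A[np.triu_indices_from(A, k=1)]  # row-major: (0,1),(0,2),...
```

For an o × o matrix, the result has o(o − 1)/2 entries, matching the example above.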

A.3. Repeated singular values

We consider σp = σq, σ′p = σ′q and σ″p ≠ σ″q. We will subsequently derive all elements of the left- and right-hand sides of (5.20), beginning with the first row.
As in the case considered in Section 5.2.3, (5.8) and (5.9) contain the same information, which is incorporated in the first row of (5.20). Also, one can see from (5.8) and (5.9) that

Qpq = −Qqp. (A.1)


Forming the second derivative of (5.8) and (5.9) and summing these equations yields

(σ″q − σ″p)Wpq + (σ″q − σ″p)Zpq = Q″pq + Q″qp, (A.2)

which will ultimately form the second row in (5.20), where

Q″pq = (−ZP″ + P‴ + P″W − Z′Q − Z[P″ − ZQ + QW] + [P″ − ZQ + QW]W + QW′)pq
= (2P″W − 2ZP″ + P‴ − Z′Q + QW′ + ZZQ + QWW − 2ZQW)pq. (A.3)

There are several quadratic and mixed terms in Zpq and Wpq on the left and right sides of (A.2), but as it will turn out, all these elements cancel each other. The derivatives Z′ and W′ are not known, and another system of equations has to be considered, which will result in the third row of (5.20).
Before we consider the derivatives of Z and W, we examine the different parts of (A.2) separately, beginning with the left side:

σ″q = Q′qq = P″qq − (ZRQ)qq + ZpqQpq + (QWR)qq + QqpWpq,
σ″p = Q′pp = P″pp − (ZRQ)pp − ZpqQqp + (QWR)pp − QpqWpq.

Taking (A.1) into account results in l1. The non-quadratic, non-mixed elements on the right side are

(2P″W − 2ZP″ + P‴)pq + (2P″W − 2ZP″ + P‴)qp
= −2(P″qq − P″pp)Zpq − 2(P″qq − P″pp)Wpq
+ P‴pq + P‴qp − 2(ZRP″)pq − 2(ZRP″)qp + 2(P″WR)pq + 2(P″WR)qp,

which form l2 and r2 in (5.20). Before we examine the quadratic and mixed elements, we state some easy to prove results on reduced matrices:

(ZZQ)pq = (ZRZRQ)pq + Zpq(ZRQ)qq + ZpqZqpQpq,
(ZQW)pq = (ZRQWR)pq + Zpq(QWR)qq + Wpq(ZRQ)pp + ZpqQqpWpq,
(QWW)pq = (QWRWR)pq + Wpq(QWR)pp + WpqWqpQpq.

With these results and (A.1), the quadratic and mixed elements result in

(ZZQ + QWW − 2ZQW)pq + (ZZQ + QWW − 2ZQW)qp
= (ZRZRQ)pq + Zpq(ZRQ)qq + (ZRZRQ)qp − Zpq(ZRQ)pp
+ (QWRWR)pq + Wpq(QWR)pp + (QWRWR)qp − Wpq(QWR)qq
− 2 [(ZRQWR)pq + Zpq(QWR)qq + Wpq(ZRQ)pp + (ZRQWR)qp − Zpq(QWR)pp − Wpq(ZRQ)qq],


which form l3,z, l3,w and r3 in (5.20).
Now we examine the derivatives of Z and W. Again, we utilize reduced matrices to get

(−Z′Q + QW′)pq + (−Z′Q + QW′)qp = (−Z′RQ + QW′R)pq + (−Z′RQ + QW′R)qp. (A.4)

Since p, q ≤ o, we get

(Z′RQ)pq = ((Z′R)1 Q1 − Z′2 Q3)pq,
(QW′R)pq = (Q1 (W′R)1 − Q2 W′2ᵀ)pq,

and with (5.13):

Z′2 = Λ⁻¹Λ′Λ⁻¹Q3ᵀ − Λ⁻¹Q′3ᵀ = −Λ⁻¹Λ′Z2 + Λ⁻¹(W1 Q3ᵀ + W2 Q4ᵀ − P″3ᵀ − Q1ᵀ Z2),
W′2ᵀ = −W2ᵀ Λ′Λ⁻¹ + (−Q2ᵀ Z1 − Q4ᵀ Z2ᵀ + P″2ᵀ + W2ᵀ Q1ᵀ) Λ⁻¹,

where we used [U′1, U′2] = [U1 Z1 − U2 Z2ᵀ, U1 Z2] and [V′1, V′2] = [V1 W1 − V2 W2ᵀ, V1 W2]. Finally we obtain

−((Z′RQ)pq + (Z′RQ)qp)
= (Λ⁻¹[Λ′Z2 − (WR)1 Q3ᵀ − W2 Q4ᵀ + P″3ᵀ + Q1ᵀ Z2] Q3)pq + (. . .)qp
− ((Z′R)1 Q1)pq − ((Z′R)1 Q1)qp − (1/σp) Wpq (Q3ᵀQ3)qq + (1/σq) Wpq (Q3ᵀQ3)pp,

and analogously

((QW′R)pq + (QW′R)qp)
= (Q2 [W2ᵀ Λ′ + Q2ᵀ (ZR)1 + Q4ᵀ Z2ᵀ − P″2ᵀ − W2ᵀ Q1ᵀ] Λ⁻¹)pq + (. . .)qp
+ (Q1 (W′R)1)pq + (Q1 (W′R)1)qp + (1/σq) (Q2 Q2ᵀ)pp Zpq − (1/σp) (Q2 Q2ᵀ)qq Zpq.

The components l4,w, l4,z and r4 of (5.20) are readily available through these equations. To incorporate the derivatives of Z and W into (5.20), we build a vector vl such that

vl [z′; w′] = ((Z′R)1 Q1)pq + ((Z′R)1 Q1)qp − (Q1 (W′R)1)pq − (Q1 (W′R)1)qp
= −Σ_{i<p} (Z′R)ip Qiq + Σ_{i>p, i≠q} (Z′R)pi Qiq − Σ_{i<q, i≠p} (Z′R)iq Qip + Σ_{i>q} (Z′R)qi Qip
− Σ_{i<q, i≠p} Qpi (W′R)iq + Σ_{i>q} Qpi (W′R)qi − Σ_{i<p} Qqi (W′R)ip + Σ_{i>p, i≠q} Qqi (W′R)pi.


We propose the straightforward Algorithm 2 in Matlab pseudo code. The algorithm calculates the position of the respective elements of Qmn in vl. For notational convenience, vl is initialized as a vector which also incorporates the entries i, j according to (5.16). These entries remain zero throughout the algorithm and are erased at the end.
We now need additional information to calculate the components of Z′R, W′R. Differentiating (5.17) yields

S [z′; w′] = q′ − S′ [z; w], (A.5)

where q′ contains the elements Q′ij, i < j, (i, j) ≠ (p, q). Obviously, some Q′ij contain the elements Zpq, Wpq, which we will store in a matrix M to incorporate (A.5) into (5.20). Algorithm 3 provides the means to build up M using the same ideas as in Algorithm 2. The main observation is that Q′ij = (P″ − ZRQ + QWR)ij for i ≠ p, q and j ≠ p, q. The

Algorithm 2 Calculate vl
Require: Q, p, q, o and resulting i, j according to (5.16).
vl = zeros(1, o(o − 1));
for m = 1 : o do
  if p ≠ m ≠ q then
    if m < p then
      n = Σ_{k=1}^{m−1} (o − k) + p − m;
      vl(n) = vl(n) − Qmq;
      n = n + 0.5 · o(o − 1);
      vl(n) = vl(n) − Qqm;
    else
      n = Σ_{k=1}^{p−1} (o − k) + m − p;
      vl(n) = vl(n) + Qmq;
      n = n + 0.5 · o(o − 1);
      vl(n) = vl(n) + Qqm;
    end if
    if m < q then
      n = Σ_{k=1}^{m−1} (o − k) + q − m;
      vl(n) = vl(n) − Qmp;
      n = n + 0.5 · o(o − 1);
      vl(n) = vl(n) − Qpm;
    else
      n = Σ_{k=1}^{q−1} (o − k) + m − q;
      vl(n) = vl(n) + Qmp;
      n = n + 0.5 · o(o − 1);
      vl(n) = vl(n) + Qpm;
    end if
  end if
end for
vl = vl([1 : i − 1, i + 1 : j − 1, j + 1 : o(o − 1)]);

Algorithm 3 Calculate M
Require: Q, p, q, o and resulting i, j according to (5.16).
M = zeros(o(o − 1), 2);
for m = 1 : o do
  if p ≠ m ≠ q then
    n = Σ_{k=1}^{p−1} (o − k) + m − p;
    M(n, 1) = M(n, 1) + Qqm;
    n = n + 0.5 · o(o − 1);
    M(n, 2) = M(n, 2) + Qmq;
    n = Σ_{k=1}^{q−1} (o − k) + m − q;
    M(n, 1) = M(n, 1) − Qpm;
    n = n + 0.5 · o(o − 1);
    M(n, 2) = M(n, 2) − Qmp;
  end if
end for
M = M([1 : i − 1, i + 1 : j − 1, j + 1 : o(o − 1)], :);


remaining elements can be divided into four different cases:

i = p, q ≠ j > p : Q′pj = (P″)pj − (ZRQ)pj − Zpq Qqj + (QWR)pj,
i = q, j > q : Q′qj = (P″)qj − (ZRQ)qj + Zpq Qpj + (QWR)qj,
j = q, p ≠ i < q : Q′iq = (P″)iq − (ZRQ)iq + (QWR)iq + Qip Wpq,
j = p, i < p : Q′ip = (P″)ip − (ZRQ)ip + (QWR)ip − Qiq Wpq.

All known elements of (A.5) form r5 in (5.20).
All components of (5.20) are defined at this point, and the linear system of equations can be solved to obtain Zpq, Wpq and additionally all elements of Z′, W′ except (p, q), (q, p). The validity of the results is shown by an example in Section 5.2.4.

A.4. Closed form of the inverse of a 3 × 3 matrix

Let A ∈ R^(3×3) be nonsingular, A := [a1, a2, a3] with ai ∈ R³, i = 1, . . . , 3. Then the inverse of A is given by

A⁻¹ = (1/det(A))
[ (a2 × a3)ᵀ ;
  (a3 × a1)ᵀ ;
  (a1 × a2)ᵀ ]
= (1/⟨a2 × a3, a1⟩)
[ (a2 × a3)ᵀ ;
  (a3 × a1)ᵀ ;
  (a1 × a2)ᵀ ].
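The cross-product formula above can be checked numerically with a short sketch (illustrative Python, function name our own):

```python
import numpy as np

def inv3(A):
    """Closed-form inverse of a nonsingular 3x3 matrix A = [a1, a2, a3]
    (columns), via the cross-product formula of Appendix A.4."""
    a1, a2, a3 = A[:, 0], A[:, 1], A[:, 2]
    rows = np.array([np.cross(a2, a3), np.cross(a3, a1), np.cross(a1, a2)])
    return rows / np.dot(np.cross(a2, a3), a1)  # <(a2 x a3), a1> = det(A)
```

The scalar triple product in the denominator equals det(A), so the formula fails exactly when A is singular.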

A.5. Quadratic forms ratio expectation

A.5.1. Kronecker product calculus

Let A be an m × n and B a p × q matrix, with components aij, bij. The Kronecker product A ⊗ B is defined as the mp × nq matrix

A ⊗ B :=
[ a11 B · · · a1n B ;
  . . .
  am1 B · · · amn B ].

Now let A and B be two k × k matrices. Then

(εᵀAε)(εᵀBε) = Σi Σj εi aij εj (εᵀBε) = Σi Σj εi εᵀ aij B εj ε
= (ε ⊗ ε)ᵀ (A ⊗ B)(ε ⊗ ε) = ε^(⊗2)ᵀ (A ⊗ B) ε^(⊗2). (A.6)

It is easy to prove with (A.6) and induction that

(εᵀAε)ⁿ = ε^(⊗n)ᵀ A^(⊗n) ε^(⊗n). (A.7)
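Identities (A.6) and (A.7) can be verified numerically for a random instance (an illustrative check; np.kron follows the same block-ordering convention as the definition above):

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3
A = rng.standard_normal((k, k))
B = rng.standard_normal((k, k))
eps = rng.standard_normal(k)

e2 = np.kron(eps, eps)                         # eps^(x2)
lhs_A6 = (eps @ A @ eps) * (eps @ B @ eps)     # left side of (A.6)
rhs_A6 = e2 @ np.kron(A, B) @ e2               # right side of (A.6)

e3 = np.kron(e2, eps)                          # eps^(x3)
lhs_A7 = (eps @ A @ eps) ** 3                  # left side of (A.7), n = 3
rhs_A7 = e3 @ np.kron(np.kron(A, A), A) @ e3   # right side of (A.7), n = 3
```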


Now consider ε being a random variable. We have

E[εᵀAε] = Σi Σj aij E[εj εi] = Σi Σj aij E[εεᵀ]ji = tr(A E[εεᵀ]). (A.8)

With (A.7) and (A.8), it is easy to see that

E[(εᵀAε)ⁿ] = tr(A^(⊗n) E[ε^(⊗n) ε^(⊗n)ᵀ]), (A.9)
E[(εᵀAε)^m (εᵀBε)ⁿ] = tr((A^(⊗m) ⊗ B^(⊗n)) E[ε^(⊗(m+n)) ε^(⊗(m+n))ᵀ]). (A.10)
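Identity (A.8) holds for any distribution of ε; for a discrete ε with finite support it can even be checked exactly by enumeration (an illustrative sketch with made-up numbers):

```python
import numpy as np

# epsilon takes each column of `support` with the corresponding probability.
support = np.array([[1.0, -1.0, 2.0],
                    [0.5,  1.0, -1.0]])
probs = np.array([0.5, 0.3, 0.2])
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Left side of (A.8): direct expectation of the quadratic form.
E_quad = sum(p * (e @ A @ e) for p, e in zip(probs, support.T))

# Right side: tr(A E[eps eps^T]).
R = sum(p * np.outer(e, e) for p, e in zip(probs, support.T))
```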

A.5.2. Proof of (7.7)

We prove here the expectation of a ratio of quadratic forms. This proof was already stated in [21] and is included here for the sake of completeness and with some additional explanations. With R = E[εεᵀ], we rewrite the ratio as

εᵀQ1ε / εᵀQ2ε = (tr(Q1R) / tr(Q2R)) · (1 + αu)/(1 + v), (A.11)

where

u = (εᵀQ1ε − tr(Q1R)) / tr(Q2R), v = εᵀQ2ε / tr(Q2R) − 1 and α = tr(Q2R) / tr(Q1R).

If |v| < 1, the second term of (A.11) can be expanded into a series as

(1 + αu)/(1 + v) = 1 + Σ_{n=1}^∞ (−1)ⁿ vⁿ⁻¹ (v − αu). (A.12)

Note that constraint A1 in (7.6) ensures absolute convergence of (A.12). Through the expansion, we eliminated the ratio and have to deal with a series of quadratic forms. This is the crucial step of this proof. In the remainder of this proof, we will rewrite (A.12) so that its moments can be calculated efficiently. It can be shown that [21]

(1 + αu)/(1 + v) = 1 + Σ_{n=1}^∞ Σ_{i=0}^{n−1} (−1)^(2n−i−1) (n−1 choose i) (α2^(i+1) − α1 α2^i), (A.13)

where α1 = εᵀQ1ε / tr(Q1R) and α2 = εᵀQ2ε / tr(Q2R).
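The factorization (A.11) and the convergence of the series (A.12) can be checked numerically on a well-conditioned example (illustrative sketch; Q2 is kept close to the identity so that |v| < 1 holds):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 4
Q1 = 2.0 * np.eye(k) + 0.5 * rng.standard_normal((k, k))
Q2 = np.eye(k) + 0.05 * rng.standard_normal((k, k))  # keeps |v| small
eps = np.array([1.0, 1.0, -1.0, 1.0])
R = np.eye(k)

t1, t2 = np.trace(Q1 @ R), np.trace(Q2 @ R)
u = (eps @ Q1 @ eps - t1) / t2
v = eps @ Q2 @ eps / t2 - 1.0
alpha = t2 / t1

ratio = (eps @ Q1 @ eps) / (eps @ Q2 @ eps)
factored = (t1 / t2) * (1.0 + alpha * u) / (1.0 + v)   # right side of (A.11)

# Partial sums of (A.12), valid here since |v| < 1.
series = 1.0 + sum((-1.0) ** n * v ** (n - 1) * (v - alpha * u)
                   for n in range(1, 200))
```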


With (A.9) and (A.10), the expectations of α2^(i+1) and α1 α2^i are

E[α2^(i+1)] = tr(Q2^(⊗(i+1)) E[ε^(⊗(i+1)) ε^(⊗(i+1))ᵀ]) / tr(Q2R)^(i+1), (A.14)
E[α1 α2^i] = tr((Q1 ⊗ Q2^(⊗i)) E[ε^(⊗(i+1)) ε^(⊗(i+1))ᵀ]) / (tr(Q1R) tr(Q2R)^i). (A.15)

Since (A.13) is absolutely convergent given constraint A1, we have

E[(1 + αu)/(1 + v) | A1] = 1 + E[Σ_{n=1}^∞ Σ_{i=0}^{n−1} (−1)^(2n−i−1) (n−1 choose i) (α2^(i+1) − α1 α2^i)]
= 1 + Σ_{n=1}^∞ Σ_{i=0}^{n−1} (−1)^(2n−i−1) (n−1 choose i) E[α2^(i+1) − α1 α2^i]. (A.16)

Combining (A.11), (A.14), (A.15) and (A.16) results in (7.7), which concludes the proof. Note that [21] required Q1, Q2 to be Hermitian. This is actually not required in the proof, and therefore Q1, Q2 can be arbitrary square matrices.

A.6. Nonzero entries of the covariance matrix

A.6.1. Examples for possible configurations

We show several examples of all possible configurations of q depending on the problem size m and the covariance matrix size n. These examples should make the build-up for different problems straightforward.
The possible configurations are summarized in configuration matrices Ak ∈ N^(s×(n+1)), where k indicates the number of the test case and s is the number of possible configurations. As can be seen in (8.10), the entries qp with p > 0 determine the magnitude of R^(2n)_ij, while q0 acts more like a filler. We therefore ignore q0 at this point and complete q with q0 = 2n − Σ_{i=1}^m qi later in the algorithm. The example matrices Ak show the shape of q1, . . . , qm.
The first column of Ak denotes the number of zeros in q1, . . . , qm, the second column denotes the number of twos, and so on. We consider the test cases

n1 = 3, m1 = 3;

n2 = 3, m2 = 2;


[Figure A.1, panels: (a) sparsity pattern of R for m = 1, n = 3, with entry magnitudes σ2, 3σ4, 15σ6; (b) sparsity pattern of R for m = 1, n = 6, with entry magnitudes 1, σ2, 3σ4, 15σ6, 105σ8, 945σ10, 10395σ12.]

Figure A.1.: Sparsity pattern of R

this results in

A1 =
[3 0 0 0;
 2 1 0 0;
 1 2 0 0;
 2 0 1 0;
 0 3 0 0;
 1 1 1 0;
 2 0 0 1],

A2 =
[2 0 0 0;
 1 1 0 0;
 0 2 0 0;
 1 0 1 0;
 0 1 1 0;
 1 0 0 1].

Note that row 5 of A1 is not possible in test case 2. Example (8.10) illustrated rows 2 and 6 of A1. If n increases, so does the number of possible configurations of q. We do not show these configuration matrices for reasons of readability and space. As a guideline, for n = 6, m = 2, the corresponding A will have 16 rows; if n = 6, m > 5, then A ∈ N^(30×7).

A.6.2. Examples of the covariance matrix R

Let the problem dimension be m = 1; we show the sparsity pattern of R^(2n) for n = 3, 6. The required configuration matrices are given by

A^(n=3) = I4, A^(n=6) = I7.

Figures A.1(a) and A.1(b) show the resulting nonzero entries of R^(6) ∈ R^(8×8) and R^(12) ∈ R^(64×64), respectively. R^(6) has 32 nonzero entries, while R^(12) has 2048 nonzero entries. To clarify the distribution of the entries, we show them sorted by magnitude.

A.6.3. An algorithm to calculate all possible permutations of a given set

We present an algorithm which calculates all permutations K ∈ N0^(k×l) of a given set. The set is characterized by its possible entries v ∈ N0^n and their respective multiplicities p ∈ N0^n, with Σ_{i=1}^n pi = k. The number l of permutations can be calculated beforehand. This is a simple combinatorial task which results in

l = k! / ∏_{i=1}^n pi!. (A.17)

Algorithm 4 uses a recursive scheme which gradually fills a global variable K with all possible permutations. In each step of the recursion, it checks whether the multiplicity of an entry of v is greater than zero. If so, the entry is used in the permutation and its multiplicity is decreased by one.

Note

In our application, with k ⪆ 30, we encountered floating point overflow when calculating k!. We advise cancelling the largest component of p in (A.17) instead of computing the full factorial in the numerator.

Algorithm 4 Calculate permutations of a set
Require: v ∈ N0^n, p ∈ N0^n, which define the set.
k = Σ_{i=1}^n pi; l = k! / ∏_{i=1}^n pi!;
global K = zeros(k, l);
global c2 = 1;
c1 = 1; pmt = zeros(k, 1);

Begin recursive procedure fill(c1, v, p, pmt):
if c1 > k then
  load K, c2;
  K(:, c2) = pmt;
  c2 = c2 + 1;
else
  for i = 1 : n do
    if p(i) ≠ 0 then
      pmtr = pmt;
      pmtr(c1) = v(i);
      pr = p;
      pr(i) = pr(i) − 1;
      call fill(c1 + 1, v, pr, pmtr);
    end if
  end for
end if
return K, all possible permutations.
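A compact Python analogue of the recursion in Algorithm 4 can be sketched as below; the output count matches (A.17). (Illustrative only; note that Python integers are arbitrary precision, so the factorial-overflow caveat above does not arise here.)

```python
from math import factorial

def multiset_permutations(v, p):
    """All distinct permutations of the multiset containing v[i] exactly
    p[i] times; mirrors the recursion of Algorithm 4."""
    k = sum(p)
    out = []

    def fill(prefix, counts):
        if len(prefix) == k:            # permutation complete
            out.append(tuple(prefix))
            return
        for i, value in enumerate(v):
            if counts[i] > 0:           # entry still available
                counts[i] -= 1
                fill(prefix + [value], counts)
                counts[i] += 1

    fill([], list(p))
    return out

def num_permutations(p):
    """Number of distinct permutations, (A.17): l = k! / prod(p_i!)."""
    k = sum(p)
    denom = 1
    for pi in p:
        denom *= factorial(pi)
    return factorial(k) // denom
```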


A.7. Particle filter algorithms

We present here the SIR and the resampling algorithm of [7] in Algorithms 5 and 6, respectively. Unfortunately, the statement of the resampling algorithm given there contains minor typos which render the algorithm incorrect for small Np; we state a corrected form. The resampling algorithm requires the ability to draw a random sample from the uniform distribution U[a, b] on a given interval [a, b] ⊂ R. The probability of drawing a sample i is determined by the cumulative distribution function (CDF) of the weights.

Algorithm 5 SIR particle filter
Require: xit−1, yt, transition pdf p(xt | xit−1) and likelihood density p(yt | xit).
for i = 1 : Np do
  Draw sample: xit ∼ p(xt | xit−1);
  Calculate weight: wit = p(yt | xit);
end for
Normalize weights: wit = wit / Σ_{k=1}^{Np} wkt, i = 1, . . . , Np;
Resample using Algorithm 6 with wit;
return particles corresponding to the current timestep t: xit.

Algorithm 6 Resampling algorithm
Require: xit, wit;
Initialize CDF: c(1) = w1t;
for m = 2 : Np do
  Build CDF: c(m) = c(m − 1) + wmt;
end for
Start at the bottom of the CDF: i = 1;
Draw starting point: u1 ∼ U[0, Np⁻¹];
for j = 1 : Np do
  Move along the CDF: uj = u1 + Np⁻¹ (j − 1);
  while uj > c(i) do
    i = i + 1;
  end while
  Assign sample: xjt = xit;
end for
return resampled particles: xjt.
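Systematic resampling as in Algorithm 6 can be sketched in Python as follows (an illustrative sketch, not thesis code; it returns the resampled particle array):

```python
import numpy as np

def systematic_resample(particles, weights, rng=None):
    """Systematic resampling in the spirit of Algorithm 6: build the CDF
    of the normalized weights, draw one uniform starting point in
    [0, 1/Np], then step along the CDF in increments of 1/Np."""
    rng = np.random.default_rng() if rng is None else rng
    Np = len(weights)
    cdf = np.cumsum(weights)
    u1 = rng.uniform(0.0, 1.0 / Np)
    out = np.empty_like(particles)
    i = 0                       # start at the bottom of the CDF
    for j in range(Np):
        uj = u1 + j / Np
        while uj > cdf[i]:      # advance to the CDF segment containing uj
            i += 1
        out[j] = particles[i]
    return out
```

A particle with weight w is duplicated roughly w · Np times, with only a single random draw for the whole sweep.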


Bibliography

[1] M. Albach. Elektrotechnik. Pearson, München, 2011.

[2] L. Albera, A. Ferréol, D. Cosandier-Rimélé, I. Merlet, and F. Wendling. Fourth order approaches for localization of brain current sources. In Conf. Proc. IEEE Eng. Med. Biol. Soc., pages 4498–4501, 2006.

[3] G. Anderson. A procedure for differentiating perfect-foresight-model reduced-form coefficients. Journal of Economic Dynamics and Control, 11:465–481, 1987.

[4] A. Andrew, K.-W. E. Chu, and P. Lancaster. Derivatives of eigenvalues and eigenvectors of matrix functions. SIAM J. Matrix Anal. Appl., 14:903–926, 1993.

[5] A. Andrew and R. Tan. Computation of derivatives of repeated eigenvalues and the corresponding eigenvectors of symmetric matrix pencils. SIAM J. Matrix Anal. Appl., 20:78–100, 1998.

[6] A. Andrew and R. Tan. Iterative computation of derivatives of repeated eigenvalues and the corresponding eigenvectors. Numerical Linear Algebra with Applications, 7:151–167, 2000.

[7] M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174–188, 2002.

[8] S. Baillet, J. Mosher, and R. Leahy. Electromagnetic brain mapping. IEEE Signal Processing Magazine, 18(6):14–30, 2001.

[9] A. Barabell. Improving the resolution performance of eigenstructure-based direction-finding algorithms. In Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '83, pages 336–339, 1983.

[10] F. Belloni and V. Koivunen. Unitary root-MUSIC technique for uniform circular array. In Proceedings of the IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), pages 451–454, Darmstadt, Germany, 2003.

[11] F. A. Belloni, A. Richter, and V. Koivunen. DoA estimation via manifold separation for arbitrary array structures. IEEE Transactions on Signal Processing, 55(10):4800–4810, 2007.

[12] A. Björck and G. Golub. Numerical methods for computing angles between linear subspaces. Mathematics of Computation, 27(123):579–594, 1973.


[13] I. Bronstein, K. Semendjajev, G. Musiol, and H. Mühlig. Taschenbuch der Mathe-matik. Verlag Harri Deutsch, Thun und Frankfurt am Main, 5th edition, 2000,2001.

[14] A. Bunse-Gerstner, R. Byers, V. Mehrmann, and N. Nichols. Numerical computationof an analytic singular value decomposition of a matrix valued function. NumerischeMathematik, 60:1–39, 1991.

[15] Z. Chen. Bayesian filtering: from kalman filters to particle filters, and beyond.Technical report, Adaptive Systems Lab, McMaster University Hamilton, 2003.

[16] L. Cirillo. Narrowband array signal processing using time-frequency distributions.PhD thesis, TU Darmstadt, 2007.

[17] G. Crevecoeur, H. Hallez, P. V. Hese, Y. D’Asseler, L. Dupré, and R. V. de Walle.A hybrid algorithm for solving the EEG inverse problem from spatio-temporal EEGdata. Medical and biological engineering and computing, 46(8):767–777, 2008.

[18] M. Doron and E. Doron. Wavefield modeling and array processing, Part I: Spatial sampling. IEEE Transactions on Signal Processing, pages 2549–2559, 1994.

[19] A. Doucet. On sequential simulation-based methods for Bayesian filtering. Technical Report CUED/F-INFENG/TR. 310, Cambridge University Department of Engineering, 1998.

[20] A. Doucet, N. de Freitas, K. Murphy, and S. Russell. Rao-Blackwellized particle filtering for dynamic Bayesian networks. In Proceedings of UAI 2000, 2000.

[21] A. Ferréol, P. Larzabal, and M. Viberg. On the asymptotic performance analysis of subspace DOA estimation in the presence of modeling errors: case of MUSIC. IEEE Transactions on Signal Processing, 54(3):907–920, 2006.

[22] A. Ferréol, P. Larzabal, and M. Viberg. On the resolution probability of MUSIC in presence of modeling errors. IEEE Transactions on Signal Processing, 56(5):1945–1953, 2008.

[23] G. Forchini. The distribution of a ratio of quadratic forms in noncentral normal variables. Communications in Statistics – Theory and Methods, 34(5):999–1008, 2005.

[24] D. Fox. KLD-sampling: Adaptive particle filters. In Advances in Neural Information Processing Systems, 2001.

[25] B. Friedlander. A sensitivity analysis of the MUSIC algorithm. IEEE Transactions on Acoustics, Speech and Signal Processing, 38(10):1740–1751, 1990.

[26] L. Godara, editor. Handbook of Antennas in Wireless Communications. CRC Press, 2001.

[27] G. Golub and C. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore and London, 3rd edition, 1996.


[28] I. Gorodnitsky, J. George, and B. Rao. Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. Electroencephalography and Clinical Neurophysiology, 95:231–251, 1995.

[29] J. Han and K. Park. Regularized FOCUSS algorithm for EEG/MEG source imaging. In Conference Proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, volume 1, pages 122–124, San Francisco, September 2004.

[30] A. Haug. A tutorial on Bayesian estimation and tracking techniques applicable to nonlinear and non-Gaussian processes. Technical report, MITRE, 2005.

[31] L. Hesheng and P. Schimpf. Efficient localization of synchronous EEG source activities using a modified RAP-MUSIC algorithm. IEEE Transactions on Biomedical Engineering, 53(4):652–661, 2006.

[32] G. Hillier, R. Kan, and X. Wang. Computationally efficient recursions for top-order invariant polynomials with applications. Econometric Theory, 25(1):211–242, 2009.

[33] J. Jackson. Classical Electrodynamics. John Wiley and Sons, Inc., 1999.

[34] D. Janovska, V. Janovsky, and K. Tanabe. An algorithm for computing the analytic singular value decomposition. World Academy of Science, 37:135–140, 2008.

[35] A. Kangas, P. Stoica, and T. Soderstrom. Finite sample and modeling error effects on ESPRIT and MUSIC direction estimators. IEE Proceedings – Radar, Sonar and Navigation, 141(5):249–255, 1994.

[36] J. Kotecha and P. Djuric. Gaussian particle filtering. IEEE Transactions on Signal Processing, 51(10):2592–2601, 2003.

[37] H. Krim and M. Viberg. Two decades of array signal processing research: the parametric approach. IEEE Signal Processing Magazine, 13(4):67–94, 1996.

[38] A. Kumar. Expectation of product of quadratic forms. Sankhya: The Indian Journal of Statistics, Series B, 35(3):359–362, 1973.

[39] I.-W. Lee and G.-H. Jung. An efficient algebraic method for the computation of natural frequency and mode shape sensitivities, Part I: Distinct natural frequencies. Computers & Structures, 62:429–435, 1997.

[40] J.-H. Lee and S.-H. Jo. On initialization of ML DOA cost function for UCA. Progress In Electromagnetics Research M, 3:91–102, 2008.

[41] G. Lehner. Elektromagnetische Feldtheorie. Springer-Verlag, Berlin, Heidelberg, 2009.

[42] J. Magnus. The expectation of products of quadratic forms in normal variables: the practice. Statistica Neerlandica, 33(3):131–136, 1979.


[43] J. Magnus. The exact moments of a ratio of quadratic forms in normal variables. Annals of Economics and Statistics, (4):95–109, 1986.

[44] J. Magnus. Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, Chichester, 2002.

[45] V. Mehrmann and W. Rath. Numerical methods for the computation of analytic singular value decompositions. Electronic Transactions on Numerical Analysis, 1:72–88, 1993.

[46] W. Mills-Curran. Calculation of eigenvector derivatives for structures with repeated eigenvalues. AIAA Journal, 26:867–871, 1988.

[47] J. Mosher. Localization from near-source quasi-static electromagnetic fields. PhD thesis, Los Alamos National Lab., NM (United States), 1993.

[48] J. Mosher and R. Leahy. Recursive MUSIC: a framework for EEG and MEG source localization. IEEE Transactions on Biomedical Engineering, 45(11), November 1998.

[49] J. Mosher and R. Leahy. Source localization using recursively applied and projected (RAP) MUSIC. IEEE Transactions on Signal Processing, 47(2):332–340, February 1999.

[50] J. Mosher, P. Lewis, and R. Leahy. Multiple dipole modeling and localization from spatio-temporal MEG data. IEEE Transactions on Biomedical Engineering, 39(6):541–557, June 1992.

[51] R. Nelson. Simplified calculation of eigenvector derivatives. AIAA Journal, 14(9):1201–1205, 1976.

[52] S. Niijima and S. Ueno. MEG source estimation using the fourth order MUSIC method. IEICE Transactions on Information and Systems, E85-D(1):167–174, January 2002.

[53] J. Odendaal, E. Barnard, and C. Pistorius. Two-dimensional superresolution radar imaging using the MUSIC algorithm. IEEE Transactions on Antennas and Propagation, 42(10):1386–1391, 1994.

[54] P. Luff and E. Bänsch. Tracking of a slowly moving magnetic dipole. In Proceedings of the X-th International Workshop on Optimization and Inverse Problems in Electromagnetism, pages 28–29, Ilmenau, Germany, 2008.

[55] T. Papadopoulo and M. Lourakis. Estimating the Jacobian of the singular value decomposition: theory and applications. Lecture Notes in Computer Science, 1842:554–570, 2000.

[56] A. Pascarella, A. Sorrentino, C. Campi, and M. Piana. Particle filters, beamformers and multiple signal classification for the analysis of magnetoencephalographic time series: a comparison of algorithms. Inverse Problems and Imaging, 4:169–190, 2010.


[57] A. Pascarella, A. Sorrentino, M. Piana, and L. Parkkonen. Particle filters and RAP-MUSIC in MEG source modeling: a comparison. In International Congress Series, pages 161–164, 2007.

[58] R. Pascual-Marqui, C. Michel, and D. Lehmann. Low resolution electromagnetic tomography: a new method for localizing electrical activity in the brain. International Journal of Psychophysiology, 18:49–65, 1994.

[59] L. Qin, H. Li, J. Yu, and Z. He. A new method of DOA estimation for uniform antenna array. In Proceedings of the 5th International Conference on Signal Processing (WCCC-ICSP 2000), volume 1, pages 437–440, 2000.

[60] F. Rellich. Perturbation theory of eigenvalue problems. Gordon and Breach, New York, 1969.

[61] R. Roy and T. Kailath. ESPRIT - estimation of signal parameters via rotational invariance techniques. IEEE Transactions on Acoustics, Speech and Signal Processing, 37(7):984–995, 1989.

[62] R. Schmidt. Multiple emitter location and signal parameter estimation. IEEE Transactions on Antennas and Propagation, AP-34(3):276–280, 1986.

[63] E. Somersalo, A. Voutilainen, and J. Kaipio. Non-stationary magnetoencephalography by Bayesian filtering of dipole models. Inverse Problems, 19:1047–1063, 2003.

[64] A. Sorrentino. Particle filters for magnetoencephalography. Archives of Computational Methods in Engineering, 17:213–251, 2010.

[65] A. Sorrentino, L. Parkkonen, A. Pascarella, C. Campi, and M. Piana. Dynamical MEG source modeling with multi-target Bayesian filtering. Human Brain Mapping, 30(6):1911–1921, 2009.

[66] A. Sorrentino, A. Pascarella, C. Campi, and M. Piana. Particle filters for the magnetoencephalography inverse problem: increasing the efficiency through a semi-analytic approach (Rao-Blackwellization). Journal of Physics: Conference Series, 124(1):012046, 2008.

[67] J. Sun. Multiple eigenvalue sensitivity analysis. Linear Algebra and its Applications, 137–138:183–211, 1990.

[68] A. Swindlehurst and T. Kailath. A performance analysis of subspace-based methods in the presence of model errors, Part I: The MUSIC algorithm. IEEE Transactions on Signal Processing, 40(7):1758–1774, 1992.

[69] N. A.-H. M. Tayem. Direction of arrival angle estimation schemes for wireless communication systems. PhD thesis, Wichita State University, 2005.

[70] N. G. van Kampen. Stochastic processes in physics and chemistry. North-Holland Publ., Amsterdam, 2008.


[71] N. van der Aa, H. T. Morsche, and R. Mattheij. Computation of eigenvalue and eigenvector derivatives for a general complex-valued eigensystem. Electronic Journal of Linear Algebra, 16:300–314, 2007.

[72] B. van Veen and K. Buckley. Beamforming: a versatile approach to spatial filtering. IEEE ASSP Magazine, 5:4–24, 1988.

[73] M. Viberg and B. Ottersten. Sensor array processing based on subspace fitting. IEEE Transactions on Signal Processing, 39:1110–1121, 1991.

[74] K. T. Wong and M. D. Zoltowski. Root-MUSIC-based azimuth-elevation angle-of-arrival estimation with uniformly spaced but arbitrarily oriented velocity hydrophones. IEEE Transactions on Signal Processing, 47(12):3250–3260, 1999.

[75] K. Wright. Differential equations for the analytic singular value decomposition of a matrix. Numerische Mathematik, 63(1):283–295, 1992.
