
Dependence of Non-Continuous Random Variables

Von der Fakultät für Mathematik und Naturwissenschaften

der Carl von Ossietzky Universität Oldenburg

zur Erlangung des Grades und Titels einer

Doktorin der Naturwissenschaften, Dr. rer. nat.

eingereichte Dissertation

vorgelegt von

Johana Nešlehová

geboren am 26.7.1977 in Prag

Oldenburg, den 11. Mai 2004


Gutachterin/Gutachter . . . . . . . . . . . .

Zweitgutachterin/-gutachter . . . . . . . . . . . .

Tag der Disputation . . . . . . . . . . . .


Acknowledgments

This thesis was completed during my time as a research assistant at the Institute of Mathematics of the Carl von Ossietzky University Oldenburg.

First of all, I would like to thank my supervisor, Prof. Dr. Dietmar Pfeifer, who introduced me to the fascinating subject of copulas. We agreed on this topic during one afternoon in the spring of 2003, when we were working on a draft version of our paper for the ASTIN Colloquium. Since then, we have enjoyed a truly exciting cooperation, during which I received much good advice and constructive criticism.

Besides my supervisor, special thanks go to my employer, Prof. Dr. Udo Kamps, for giving me the opportunity to work as a research assistant on the project "e-stat". He and my colleagues Dr. Katharina Cramer and Prof. Dr. Erhard Cramer have always been a great support to me and let me finish this thesis even when the final phase of the "e-stat" project claimed more than enough time.

Moreover, I wish to thank my other colleagues at the Institute of Mathematics in Oldenburg and in Hamburg who contributed to this work in one way or another, in particular Doreen Scholze. I also thank all those who managed to read draft versions of this thesis and provided me with many useful hints: Martina Neid, Michael McGillis and Birgit Malzen.

Last but not least, my very special words of gratitude go to my family and friends for their patience and faith in what I was doing, even if they had no idea about it.

Hamburg, May 2004          Johana Nešlehová


Abstract

Most of the research on copulas has been based upon the condition that the univariate margins are continuous. Without this assumption, many results are no longer valid and the interaction between the copula and the marginals is far less well understood. The aim of this thesis is to contribute to the modeling and description of dependence structures between non-continuous random variables. When discontinuities are allowed, major difficulties are caused by the nonuniqueness of the underlying copula. In fact, there exists a whole class of copulas which comply with the famous Sklar formula. However, it will be shown in the first part of the thesis that there exists at least one member of this class, the so-called standard extension copula, which captures the dependence structure of the joint distribution function in a way analogous to the continuous case, so that many of the well-known results concerning e.g. quadrant dependence, tail dependence and weak convergence can be carried over. This in particular makes it possible to obtain measures of dependence and concordance which are "non-continuous" counterparts of the quantities proposed in the literature so far.

Furthermore, this thesis also focuses heavily on modeling multivariate distributions with not necessarily continuous marginals using the copula approach. Dependence structures attainable within these models are studied, and results are obtained which extend the work done on this subject by Marshall (1996). Although the focus lies on non-continuous marginals, there are also a few new results concerning measures of association for continuous random variables, embedding e.g. the measures of dependence based on the Lp-distance by Schweizer and Wolff (1981) in a more general framework and yielding similarly constructed measures of concordance.

Because of the ambiguities caused by the non-continuity, the observations will often be restricted to the bivariate case as well as to discrete marginals with finite support. Nevertheless, this situation may well be viewed as a starting point for more general cases. Moreover, there exist practical applications where discrete variables are of particular use. An example here is the modeling of dependent risks in insurance, where the number of claims arising in a certain time period is usually described as a (discrete) random variable. Therefore, part of the results is applied to the modeling of negatively correlated Poisson counting variables, which in turn can be used as a basis for modeling and generating dependent risk processes (cf. Nešlehová and Pfeifer (2003)).


Zusammenfassung

Diese Arbeit setzt sich zum Ziel, einen Beitrag zum Studium der Abhängigkeit diskontinuierlicher Zufallsvariablen zu leisten. In einer solchen Situation liegen die meisten Schwierigkeiten darin begründet, dass die zugehörige Copula nicht länger eindeutig ist. Vielmehr gibt es in diesem Fall eine ganze Klasse von Funktionen, die der berühmten Formel von Sklar genügen. Im ersten Teil dieser Arbeit wird jedoch klar, dass es stets zumindest ein Mitglied dieser Klasse gibt, das die Abhängigkeitsstruktur der gemeinsamen Verteilungsfunktion in einer dem stetigen Fall analogen Weise widerspiegelt. Mit dieser sog. Standarderweiterungscopula können daher viele der bekannten „schönen“ Resultate, u.a. über die Quadrantabhängigkeit, Tailabhängigkeit und die schwache Konvergenz, übertragen werden. Insbesondere lassen sich so Abhängigkeitsmaße für diskontinuierliche Verteilungen konstruieren, die als Gegenstücke zu den aus der Literatur bekannten „stetigen“ Varianten angesehen werden können.

Ein weiterer Schwerpunkt dieser Arbeit liegt auf der Untersuchung des Copula-Ansatzes zur Modellierung multivariater Verteilungen mit nicht notwendigerweise stetigen Randverteilungen und somit auf einer Erweiterung der bereits bekannten Resultate von Marshall. Besonderer Wert wird dabei auf die Art von Abhängigkeit gelegt, die in solchen Modellen möglich ist. Darüber hinaus werden Modelle zur Konstruktion maximal negativ korrelierter Zufallsvariablen im Detail untersucht. Da sie für die Praxis von besonderer Bedeutung sind, wird eine explizite Methode zur Berechnung der Zähldichte hergeleitet.

Obwohl meist diskontinuierliche Verteilungen betrachtet werden, liefert diese Arbeit ebenfalls einige neue Beiträge zu Abhängigkeitsmaßen für stetige Zufallsvariablen. Es wird ein theoretischer Rahmen für diese Kenngrößen bereitgestellt, der nicht nur die auf der Lp-Norm basierten Abhängigkeitsmaße von Schweizer und Wolff (vgl. Schweizer and Wolff (1981)) als Spezialfälle beinhaltet, sondern auch zur Herleitung analoger distanzbasierter abstrakter Konkordanzmaße führt.

Die Komplikationen, die mit der Diskontinuierlichkeit der Randverteilungen verbunden sind, werden an vielen Stellen jedoch Einschränkungen erzwingen: häufig wird nur der zweidimensionale Fall betrachtet sowie diskrete Verteilungen mit endlichem Träger. Nichtsdestotrotz ist eine solche Situation interessant, zumal diese einen Ausgangspunkt für allgemeinere Fälle darstellt. Darüber hinaus gibt es Beispiele aus der Praxis, die auf diese Situation zurückzuführen sind: etwa die Modellierung von abhängigen Risikoprozessen, bei der die Anzahl gemeldeter Schäden innerhalb einer Zeitspanne als eine (endliche) diskrete Zufallsvariable dargestellt wird. Daher ist das letzte Kapitel dieser Arbeit der Anwendung einiger hergeleiteter Resultate auf die Modellierung maximal negativ korrelierter Poissonverteilungen gewidmet, die wiederum als Basis für die Beschreibung abhängiger Poissonprozesse aufgefasst werden kann (vgl. Nešlehová and Pfeifer (2003)).


Contents

Acknowledgments

Abstract

Zusammenfassung

1 Introduction

2 Theoretical Background
  2.1 Basic Definitions
  2.2 Copulas

3 Copulas in Statistical Settings
  3.1 Sklar's Theorem
  3.2 Sections of Copulas
  3.3 Conditional Probabilities
  3.4 Survival Copulas
  3.5 Examples of Copulas
    3.5.1 Spherical and Elliptical Copulas
    3.5.2 Archimedean Copulas
    3.5.3 Algebraically Constructed Copulas
    3.5.4 Shuffles of M

4 Dependence Concepts
  4.1 Perfect Dependence
    4.1.1 Bivariate Perfect Dependence
    4.1.2 Multivariate Perfect Dependence
  4.2 Nonparametric Dependence Concepts
    4.2.1 Orthant Dependence
    4.2.2 Tail Dependence
    4.2.3 Likelihood Ratio Dependence and Other Concepts
  4.3 Measures of Association
    4.3.1 Measures of Dependence
    4.3.2 Measures of Concordance
    4.3.3 Multivariate Extensions
    4.3.4 Tail Dependence Index
    4.3.5 Linear Correlation


5 Multivariate Discrete Distributions
  5.1 Bivariate Bernoulli Distribution
  5.2 The Class CX
    5.2.1 The Standard Extension
    5.2.2 Carley's Extensions
  5.3 Dependence Concepts
  5.4 Weak Convergence
  5.5 Measures of Association
    5.5.1 Basic Characteristics
    5.5.2 Measures of Concordance
    5.5.3 Distance-based Measures

6 Empirical Copulas
  6.1 Basic Properties
  6.2 Kendall's Tau and Spearman's Rho for Empirical Distributions

7 Modeling Multivariate Distributions with Copulas
  7.1 Copula Models
    7.1.1 Bivariate Bernoulli Distributions Revisited
  7.2 Dependence in Copula Models
    7.2.1 Linear Correlation
    7.2.2 Kendall's Tau and Spearman's Rho
  7.3 Minimum Correlated Discrete Bivariate Distributions
    7.3.1 The Class HW
    7.3.2 The North-West Corner Rule
    7.3.3 Discrete Distributions with Infinite Support

8 Negatively Correlated Bivariate Poisson Distributions
  8.1 Models
  8.2 Copula Models
  8.3 Calculation of the Minimum Correlation

A Proofs
  A.1 Chapter 4
  A.2 Chapter 5
  A.3 Chapter 8

B Results from Probability and Measure
  B.1 Differentiation of Lebesgue-Stieltjes Measures
  B.2 Conditional Probabilities

List of Symbols

Index

Bibliography

Curriculum Vitae

Erklärung


Chapter 1

Introduction

The study of dependence is one of the most frequently considered topics in multivariate statistics and probability theory, not only as a theoretical challenge, but also for its great importance in various practical applications, including insurance and finance.

The essential idea of how to describe the relationship between random variables has been independently conceived by various authors in various contexts. The first remarkable and systematic contribution to this subject was the 1940 Ph.D. dissertation "Maßstabinvariante Korrelationstheorie" by Wassilij Hoeffding. In his work, Hoeffding obtained a transformation of the joint distribution function which led him to the discovery of a scale-invariant function defined on the square of unit length symmetric about the point (0, 0), i.e. [−1/2, 1/2]². Furthermore, he was able to show that various dependence measures known at the time are functions of this so-called "normierte Summenfunktion" alone. Unfortunately, Hoeffding's results appeared in scarcely available German journals during World War II, and so his truly remarkable ideas passed unnoticed. They were, however, essentially the same ideas that led to the final breakthrough in the study of dependence almost twenty years later.

During the late fifties, Maurice Fréchet again raised the question of determining the relationship between a multivariate distribution function and its lower-dimensional margins. Finally, in 1959, Abe Sklar (1959) gave an answer to this query by discovering that at least one function always exists, for which he chose the name "copula", linking a joint distribution function to its marginals via

H(x1, . . . , xn) = C(F1(x1), . . . , Fn(xn)).
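Sklar's formula can be illustrated with a minimal numerical sketch (my own illustration, not taken from the thesis; the independence copula and the exponential marginal parameters are arbitrary choices for the example):

```python
import math

def F1(x, lam=1.0):
    # CDF of an Exp(lam) marginal
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

def F2(y, lam=2.0):
    return 1.0 - math.exp(-lam * y) if y > 0 else 0.0

def C_pi(u, v):
    # the independence copula Pi(u, v) = u * v
    return u * v

def H(x, y):
    # joint CDF obtained by plugging the marginal CDFs into the copula
    return C_pi(F1(x), F2(y))

# for independent variables, Sklar's formula must reproduce F1(x) * F2(y)
assert abs(H(1.0, 0.5) - F1(1.0) * F2(0.5)) < 1e-12
```

Substituting any other copula for Π yields a valid joint distribution with the same exponential marginals, which is the modeling idea the copula approach exploits.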

Since then, copulas have become a subject of growing interest in stochastics, especially after the discovery by Schweizer and Wolff in the mid-seventies that copulas can also be used to define nonparametric measures of dependence. From then on, the role played by copulas in mathematical statistics became more and more important, and further remarkable results and contributions to the modeling of multivariate distributions with fixed marginals were obtained.

Nowadays, the theory of copulas is a widely developed branch of modern statistics, currently used not only in modeling dependent random variables, but also in describing stochastic processes in order to capture temporal dependence.

From the theory developed so far, the impression could arise that the marginals carry just as little information about the coupling process as the copula does about the marginals. Supported by the existence of various satisfactory dependence concepts and measures that can be determined by the copula alone, it may seem reasonable to expect that any kind of relationship between the random variables is given by the copula alone. However, most of the research has been based upon the condition that the univariate margins are continuous, as this is precisely the case when the underlying copula is unique; in addition, there are often no more than two marginals involved. When these assumptions no longer hold, far less is known and the interaction between the copula and the marginals is far less well understood. Whereas the multivariate case has been considered by comparatively many mathematicians, mostly in connection with the so-called compatibility problem, which is sometimes regarded as one of the most challenging open questions concerning copulas, non-continuous marginals have been considered by only a few. A rare exception here is the work by Albert Marshall (1996), who noted that, as soon as discontinuous marginals are involved, most of the known results no longer apply and the dependence structure cannot be determined without the marginals being involved.

The aim of this thesis is to contribute to modeling and describing dependence structures between non-continuous random variables. When discontinuities are allowed, major difficulties are caused by the nonuniqueness of the underlying copula. In fact, there exists a whole class of copulas which comply with the aforementioned Sklar formula. In the first part of the thesis we will show, however, that there exists at least one member of this class which captures the dependence structure of the joint distribution function in a way analogous to the continuous case, so that many of the well-known results can be carried over. This in particular makes it possible to obtain measures of dependence and concordance which are "non-continuous" counterparts of the quantities proposed in the literature so far.

Furthermore, this thesis also focuses heavily on modeling multivariate distributions with not necessarily continuous marginals using the copula approach, extending the work by Marshall. Although the focus lies on non-continuous marginals, there are also a few new results concerning measures of association for continuous random variables, embedding inter alia the measures of dependence based on the Lp-distance by Schweizer and Wolff (1981) in a more general framework and yielding similarly constructed measures of concordance.

It is surely desirable to make as few restrictions as possible, but the ambiguities caused by the non-continuity will often force us to consider the bivariate case only, as well as discrete marginals with finite support. Nevertheless, this situation may well be viewed as a starting point for more general cases. Moreover, there exist practical applications where discrete variables are of particular use. An example here is the modeling of dependent risks in insurance, where the number of claims arising in a certain time period is usually described as a (discrete) random variable. Therefore, we will apply part of the results to the modeling of negatively correlated Poisson counting variables, which in turn can be used as a basis for modeling and generating dependent risk processes (cf. Nešlehová and Pfeifer (2003)).

The outline of this thesis is the following: while Chapters 2, 3 and 4 list and, where necessary, extend the major results on copulas known to date, Chapters 5 through 8 deal with non-continuous marginals and thus represent the major new contribution of this thesis. An exception here is Section 4.3 on measures of association, where a new approach to these quantities is given, which generalizes Schweizer and Wolff's idea of describing dependence via the distance between the surfaces corresponding to the underlying copula and independence.

Chapter 2 presents the mathematical background necessary for a full appreciation of the results developed later on. A precise definition of copulas is given there, which presents them as part of the class of n-increasing functions (also referred to as quasi-monotone or ∆-monotone). For better orientation in the subject, analytic properties of n-increasing functions are emphasized, such as continuity, differentiability and the way the component-wise properties affect the behavior of the function itself. Thereafter, copulas are discussed from the analytical point of view. Treated as a metric space, it is shown that the set of all n-dimensional copulas equipped with the supremum metric is compact. A new contribution here is Theorem 2.2.3, which generalizes a result obtained by Darsow et al. (1992) and which shows, as a consequence, that the set of all absolutely continuous copulas and the set of all singular copulas, respectively, are dense in the set of all copulas.

Chapter 3 concentrates on the role played by copulas in statistical settings. Opening with Sklar's theorem, we will extend the famous result on how copulas change under strictly monotone transformations to the non-continuous case (Proposition 3.1.2). It is also discussed how some important quantities, such as conditional probabilities and survival functions, can be described in terms of copulas. Thereafter, some of the most famous copula families and their properties are listed.

The true importance of copulas is revealed in Chapter 4, which describes the way copulas capture dependence between random variables. To start, the focus lies on the counterpart of independence, so-called perfect dependence. Although comonotonicity and countermonotonicity seem to be reasonable approaches here, another concept is mentioned as well: (arbitrary) functional dependence. This will also be the first place where restrictions to the bivariate case are unavoidable; however, an extension of the well-known interpretation of the so-called Fréchet-Hoeffding upper bound copula to higher dimensions (Section 4.1.2) is included. Furthermore, the most important dependence concepts and measures of association are introduced, as well as the way those quantities depend upon copulas when the marginals are continuous. An axiomatic definition is presented for both measures of dependence and concordance, and a new theoretical framework is established which makes it possible to obtain these quantities from an abstract distance. The dependence measures based on the Lp-norm by Schweizer and Wolff (1981) then follow as special cases. The major new contribution here is Section 4.3.2, where distance-based measures of concordance are presented, which are from a certain point of view concordance counterparts of the dependence measures proposed by Schweizer and Wolff. It will also be shown that choosing the L1-distance yields the well-known Spearman's rho.

For the sake of completeness, two other popular quantities for measuring association between two random variables are also discussed: the tail index and the linear correlation coefficient.
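The connection between Spearman's rho and the copula can be checked numerically. The following sketch (my own illustration, not code from the thesis) uses the classical representation ρ_S = 12 ∫∫ (C(u, v) − uv) du dv and the Farlie-Gumbel-Morgenstern copula, for which ρ_S = θ/3 is a known closed form:

```python
def fgm(u, v, theta=0.5):
    # Farlie-Gumbel-Morgenstern copula C(u,v) = uv + theta*u*v*(1-u)*(1-v)
    return u * v + theta * u * v * (1 - u) * (1 - v)

def spearman_rho(C, n=200):
    # midpoint-rule approximation of 12 * Int Int (C(u,v) - u*v) du dv
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += C(u, v) - u * v
    return 12.0 * total * h * h

rho = spearman_rho(fgm)          # theta = 0.5, so rho should be near 1/6
assert abs(rho - 0.5 / 3.0) < 1e-3
```

The same routine returns 0 for the independence copula, as it must for any measure of concordance.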

Chapter 5 deals with the dependence of non-continuous random variables. To start, it focuses on the class of all possible copulas, emphasizing its changes under monotone transformations and its bounds in case the marginals are discrete with finite support. Among its members, special interest is placed on the standard extension copula, which reflects a great deal of the dependence structure of the joint distribution function, such as orthant dependence, tail dependence and, under some restrictions, even weak convergence, as is revealed in Sections 5.3 and 5.4. Thereafter, measures of association are studied, following the idea of Schweizer and Wolff that, if arbitrary marginals are involved, one can work with one possible copula. The axiomatic definition of these quantities is reconsidered and some pitfalls caused by non-continuity are presented. A special focus is placed on measures of concordance, for which extensions of the results known from the continuous case are developed; in particular, discrete versions of Kendall's tau and Spearman's rho. It becomes clear that the role played by the underlying copula in the continuous case is again taken over by the standard extension copula here. It is also discussed that suitable measures of association will inevitably depend on the marginal distributions and that the relationship between the copula and the Fréchet-Hoeffding bounds is no longer symmetric, which is the reason why the general distance-based measures fail to satisfy the axiomatic definition and thus behave similarly to the linear correlation coefficient.

Chapter 6 is dedicated to empirical copulas, first obtained by Deheuvels (1979). His definition is extended to the case when ties in the observations are possible. The discrete versions of Kendall's tau and Spearman's rho derived in the preceding chapter are then revisited: it is shown that they agree with their sample versions proposed in the literature. Chapter 6 is also the only one touching on the subject of estimation and hypothesis testing. Although both interesting and challenging, these subjects go beyond the scope of this work and hence will be mentioned only very briefly.


In Chapter 7, the modeling of multivariate discrete distributions using copulas is considered. The pitfalls caused by the non-continuity of the marginal distributions are extensively illustrated by the bivariate Bernoulli distribution. The correlation, Kendall's tau and Spearman's rho attainable within these models are also discussed. Special attention is paid to distributions with the maximum possible negative dependence, which arise by choosing the Fréchet-Hoeffding lower bound copula. We show how this quest can be transformed into the mass transportation problem well known from linear programming and graph theory and, as a consequence, prove that, when the marginals are discrete, the joint probability densities can be calculated using the famous north-west-corner rule algorithm.
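The north-west-corner allocation itself can be sketched in a few lines. This is the standard transportation-problem variant; how exactly it is applied to obtain the minimum-correlation coupling (for instance, whether one margin is traversed in reversed order) is an assumption left open in this sketch rather than a statement of the thesis' algorithm:

```python
def north_west_corner(p, q):
    # Greedily fill the joint pmf from the top-left cell: put as much mass
    # as possible at (i, j), then move right or down once a marginal cell
    # is exhausted. Inputs are the two marginal pmfs (lists summing to 1).
    p, q = list(p), list(q)
    table = [[0.0] * len(q) for _ in range(len(p))]
    i = j = 0
    while i < len(p) and j < len(q):
        m = min(p[i], q[j])
        table[i][j] = m
        p[i] -= m
        q[j] -= m
        if p[i] <= 1e-15:   # row i exhausted, move down
            i += 1
        if q[j] <= 1e-15:   # column j exhausted, move right
            j += 1
    return table

# marginals p = (0.5, 0.5) and q = (0.3, 0.7)
joint = north_west_corner([0.5, 0.5], [0.3, 0.7])
# joint is approximately [[0.3, 0.2], [0.0, 0.5]]
```

By construction the row and column sums of the resulting table reproduce the prescribed marginals.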

To illustrate part of the results derived so far, Chapter 8 finally considers bivariate distributions with Poisson margins and negative dependence. Models proposed in the literature so far are listed, and the copula modeling approach is discussed in greater detail. The north-west-corner rule for the calculation of the joint probability densities is also illustrated. Furthermore, special focus is placed on the linear correlation coefficient and its calculation; explicit formulas for several choices of the marginal parameters are presented.

To make this thesis more readable and to avoid swamping the reader with formulas, those proofs which are either standard or unnecessarily notationally complex are transferred to Appendix A. Appendix B lists some additional results from probability theory, with a special focus on the differentiation of Lebesgue-Stieltjes measures in n-dimensional real space, for it casts some more light on the seemingly complex relationship between Lebesgue densities and partial derivatives of copulas.

The Index and List of Symbols are also meant to help the reader find his or her way through the manuscript. The page references include those paragraphs where the entries are either defined or essentially involved. The Bibliography lists not only those sources from which results or proofs are quoted, but also contains secondary references which I believe can provide the reader with additional interesting information on the subjects considered.

Although I have tried to give a clear presentation of the material, there are a few terms which are not explicitly discussed. I would therefore like the reader to take notice of the following items right at the beginning. If not stated otherwise, continuous or absolutely continuous always means absolutely continuous with respect to Lebesgue measure in Rn. Similarly, marginals or margins stand for univariate margins. The continuity condition or continuous case refers to the continuity of the univariate marginals, but not necessarily to the continuity of the joint distribution function or the quantity discussed. If this condition no longer holds, the underlying copula is not unique; in this so-called non-continuous case, by copula we usually mean one possible copula complying with Sklar's theorem. Finally, we refer to any number which describes dependence between random variables as a measure of association. If this quantity describes the strength of the dependence as such, i.e. if it ranges between independence and perfect dependence, we will call it a measure of dependence. A measure of concordance, on the other hand, is a quantity which additionally captures the kind of dependence, i.e. distinguishes positive from negative dependence.


Chapter 2

Theoretical Background

In this chapter, we establish the basic notational and theoretical framework of this thesis. First, we focus on n-increasing functions and describe some of their most important characteristics. These functions, also called delta- or quasi-monotone, are the starting point in the study of Lebesgue-Stieltjes measures, joint distribution functions, and, most importantly for us, copulas, which will play a crucial role in nearly all results obtained later on. After a brief introduction to these as well, we turn our attention to their basic analytic properties. As is often the case with multivariate functions, results concerning copulas turn out to be more complex (if valid at all) once the dimension exceeds two. This is also reflected in a great deal of the literature on the topic, where investigations are often restricted to the bivariate case. Here, however, we remain as general as possible and, at least in this introduction, examine copulas without making this restriction.

For further details, we refer to the introductory book on copulas by Nelsen (1999) and the literature given therein. Quasi-monotone functions are discussed e.g. in Bauer (1992), Hewitt and Stromberg (1975), Munroe (1968) and Kamke (1956).

2.1 Basic Definitions

In the following, by R we denote the extended real line [−∞, ∞], and by Rn the extended n-dimensional real space R × R × · · · × R. For points in Rn, we use the common vector notation, i.e. x = (x1, x2, . . . , xn). In this sense, we sometimes use the vector notation for numbers as well, i.e. 1 := (1, . . . , 1).

Furthermore, we consider the following (partial) ordering on Rn:

x ≤ y iff x1 ≤ y1 ∧ . . . ∧ xn ≤ yn.

If xi < yi for all i, we write x < y.

In the following, some results will use a norm on Rn. If the result does not depend on a particular choice of norm, we simply write ‖ · ‖ and mean an arbitrary norm by it.

Definition 2.1.1. For x ≤ y, x, y ∈ Rn, the (half open) n-box (x, y] is defined as the Cartesian product of n (half open) intervals in R:

(x, y] = (x1, y1] × · · · × (xn, yn].


In particular, we call the closed n-box [0, 1] × · · · × [0, 1] the unit n-cube In. Vertices of the n-box (x, y] are points v = (v1, v2, . . . , vn), where each vk equals either xk or yk. For each vertex, sign(v) is given by

sign(v) = 1 if vk = xk for an even number of k's,
sign(v) = −1 if vk = xk for an odd number of k's.

Furthermore, a function H from Rn to R will be referred to as an n-place real function. Its domain and range will be denoted by domH and ranH, respectively.

Definition 2.1.2. (right continuous) An n-place real function H is said to be right continuous if for all x in Rn the following holds:

∀ ε > 0 ∃ δ > 0 such that ∀ y ≥ x, ‖y − x‖ < δ : |H(x) − H(y)| < ε.

Definition 2.1.3. (H-volume) Let H be an n-place real function and (a, b] an n-box such that (a, b] ⊆ domH. Then the H-volume of (a, b] is defined by

(2.1) VH((a, b]) = ∆_a^b H = Σ_{v vertex of (a, b]} sign(v) H(v).

Remark 2.1.1. ∆_a^b H can alternatively be defined using first order differences of n-place functions,

∆_{ak}^{bk} H(x) = H(x1, . . . , xk−1, bk, xk+1, . . . , xn) − H(x1, . . . , xk−1, ak, xk+1, . . . , xn),

as the n-th order difference of H with regard to a and b:

∆_a^b H = ∆_{an}^{bn} ∆_{an−1}^{bn−1} · · · ∆_{a2}^{b2} ∆_{a1}^{b1} H(x).

The proof of this well-known statement is simple and therefore omitted.
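Definition 2.1.3 is easy to mirror in code. The following sketch, my own illustration rather than part of the thesis, computes the H-volume of an n-box via the signed vertex sum (2.1); for the product function H(x) = x1 · · · xn it reproduces the Lebesgue volume of the box.

```python
from itertools import product

def h_volume(H, a, b):
    """H-volume of the n-box (a, b] via the signed vertex sum (2.1).

    A vertex v has v_k in {a_k, b_k}; sign(v) is +1 if v_k = a_k for an
    even number of k's and -1 otherwise (inclusion-exclusion)."""
    n = len(a)
    total = 0.0
    for choice in product((0, 1), repeat=n):        # 0 -> a_k, 1 -> b_k
        v = [b[k] if choice[k] else a[k] for k in range(n)]
        num_lower = n - sum(choice)                 # coordinates taken from a
        total += (-1) ** num_lower * H(v)
    return total

# For H(x) = x1 * x2 * x3 the H-volume of (a, b] is the product of the
# side lengths, exactly as for Lebesgue measure:
H = lambda x: x[0] * x[1] * x[2]
vol = h_volume(H, (0.1, 0.2, 0.3), (0.5, 0.6, 0.7))
assert abs(vol - 0.4 * 0.4 * 0.4) < 1e-12
```

The same routine also evaluates the iterated differences of Remark 2.1.1, since both formulas describe the same quantity.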

Next, we use the above definition to introduce n-increasing functions:

Definition 2.1.4. (n-increasing) An n-place real function H is called n-increasing if ∆_a^b H ≥ 0 for all half open n-boxes (a, b] whose vertices all lie in the domain of H.

n-increasing functions are also often referred to as quasi-monotone or ∆-monotone. It is worth noting that an n-place real function whose domain is Rn and which is both right continuous and n-increasing uniquely defines a Borel measure on (Rn, Bn) (see Behnen and Neuhaus, 1984).

Next, if the domain of H is shaped like a Cartesian product of certain sets, we can study the behavior of H on its upper and lower boundary, which is of particular use when defining n-dimensional distribution functions and their marginals.

Definition 2.1.5. Suppose the domain of H is of the form S1 × · · · × Sn where each Si has a minimal element ai ∈ R. Then we say that H is grounded if H(c) = 0 for all c in domH such that ci = ai for at least one i.
If the domain of H is of the above form with each Si nonempty and having a maximal element bi ∈ R, then we say that H has margins. A k-dimensional margin (or simply a k-margin) is a k-place real function defined by fixing n − k places in H at the points bi. Hence, the univariate margins are functions Hk on Sk defined by

(2.2) Hk(x) = H(b1, . . . , bk−1, x, bk+1, . . . , bn) for all x in Sk.


In the following, where no confusion can arise, we refer to one-dimensional margins simply as "margins". Now, grounded and n-increasing functions are easily shown to be nondecreasing in each argument.

Lemma 2.1.1. If the domain of H is of the form S1 × · · · × Sn where each Si has a least element ai and H is n-increasing and grounded, then H is nondecreasing component-wise, i.e. if both (t1, . . . , tk−1, x, tk+1, . . . , tn) and (t1, . . . , tk−1, y, tk+1, . . . , tn) lie in the domain of H and x ≤ y, then

H(t1, . . . , tk−1, x, tk+1, . . . , tn) ≤ H(t1, . . . , tk−1, y, tk+1, . . . , tn).

Proof. This statement is an easy consequence of the n-increasing property and the groundedness. To see this, consider the points

a := (a1, . . . , ak−1, x, ak+1, . . . , an) and b := (t1, . . . , tk−1, y, tk+1, . . . , tn).

Clearly, both lie in domH, and due to the n-increasing property we have ∆_a^b H ≥ 0. On the other hand, since H is grounded, every vertex of (a, b] with some coordinate equal to an ai contributes zero to the signed vertex sum, so only the two vertices with all coordinates ti survive:

0 ≤ ∆_a^b H = Σ_{v vertex of (a, b]} sign(v) H(v) = H(t1, . . . , tk−1, y, tk+1, . . . , tn) − H(t1, . . . , tk−1, x, tk+1, . . . , tn).

From the above proof, the impression can arise that the groundedness cannot be omitted. This is indeed true, as is the fact that the statement cannot be reversed. For counterexamples see Nelsen (1999, Examples 2.1 and 2.2).

The next result shows that increments of grounded and n-increasing functions whose domains are Cartesian products of closed intervals in R are bounded from above by the increments of their margins.

Theorem 2.1.1. Let H be an n-place grounded and n-increasing real function with domain S := [a, b], a ≠ b in Rn, and margins Hk. Then, for any x and y in S,

(2.3) |H(y) − H(x)| ≤ Σ_{k=1}^n |Hk(yk) − Hk(xk)|.

Proof. Before we prove the above statement, we designate a volume as a real valued, nonnegative set function µ : Bn(S) → R which is finitely additive, i.e.

for any disjoint sets A1, . . . , Am ∈ Bn(S) : µ(⋃_{k=1}^m Ak) = Σ_{k=1}^m µ(Ak),

and which satisfies µ(∅) = 0. Then it can be shown (see Elstrodt, 1980, Theorem 5.11) that for any n-increasing real valued function H with domain S there exists a (unique) volume µH such that

µH((x, y]) = ∆_x^y H for any (x, y] ⊆ S.

Since H is grounded, this leads in particular to the following relation:

H(y) = µH((a, y]).


Now, for any x and y in S, the left hand side of (2.3) can be estimated, using the component-wise monotonicity of H from Lemma 2.1.1, as follows:

|H(y) − H(x)| ≤ H(max(x, y)) − H(min(x, y)),

with max(x, y) := (max(x1, y1), . . . , max(xn, yn)) and min(x, y) := (min(x1, y1), . . . , min(xn, yn)), respectively. Since H is grounded and µH finitely additive, the right hand side equals

H(max(x, y)) − H(min(x, y)) = µH((a, max(x, y)] \ (a, min(x, y)]).

By denoting

Bk := [a1, b1] × · · · × [ak−1, bk−1] × (min(xk, yk), max(xk, yk)] × [ak+1, bk+1] × · · · × [an, bn],

we get, regarding the finite additivity and non-negativity of µH,

µH((a, max(x, y)] \ (a, min(x, y)]) ≤ µH(⋃_{k=1}^n Bk) ≤ Σ_{k=1}^n µH(Bk).

Since H is grounded and has margins, the right hand side equals

Σ_{k=1}^n µH(Bk) = Σ_{k=1}^n (Hk(max(xk, yk)) − Hk(min(xk, yk))) = Σ_{k=1}^n |Hk(yk) − Hk(xk)|,

from which the inequality (2.3) readily follows.

Remark 2.1.2. As long as an n-increasing function has margins, continuity properties of the function clearly determine the continuity properties of the margins. Because it estimates the increments of the function by the increments of the margins, inequality (2.3) shows that in the situation of the above theorem this effect also works the other way round: the continuity properties of the one-dimensional margins affect the continuity properties of the function itself. In particular, if all the margins are continuous, H is continuous as well.

We close this section with a short survey on differentiability of grounded and n-increasing functions. By Lemma 2.1.1 and the well-known result of Lebesgue stating that a real-valued monotone function f defined on a closed interval [a, b] ⊂ R has a finite derivative almost everywhere on [a, b] (see e.g. Hewitt and Stromberg, 1975, Theorem 17.12), we have the following:

Theorem 2.1.2. Suppose H is a grounded and n-increasing function with domain of the form [a1, b1] × · · · × [an, bn], aj, bj in R. Then for all j = 1, . . . , n the partial derivative

∂H(x1, . . . , xn)/∂xj

exists for almost all (x1, . . . , xn) in domH in the sense of Lebesgue measure.
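As a concrete illustration of mine (not from the thesis), take the bivariate function H(u1, u2) = min(u1, u2), which is grounded and 2-increasing on the unit square. Its partial derivative with respect to u1 exists everywhere off the diagonal, a Lebesgue-null set, on which the one-sided difference quotients disagree:

```python
def M(u1, u2):
    """min(u1, u2): grounded and 2-increasing on the unit square."""
    return min(u1, u2)

def partial_u1(f, u1, u2, h=1e-7):
    """Central difference quotient approximating the partial derivative."""
    return (f(u1 + h, u2) - f(u1 - h, u2)) / (2 * h)

# Off the diagonal the derivative exists (and is 0 or 1):
assert abs(partial_u1(M, 0.3, 0.7) - 1.0) < 1e-6   # u1 < u2
assert abs(partial_u1(M, 0.7, 0.3) - 0.0) < 1e-6   # u1 > u2

# On the diagonal, the one-sided limits disagree, so the derivative
# fails to exist there; the diagonal has Lebesgue measure zero.
left  = (M(0.5, 0.5) - M(0.5 - 1e-7, 0.5)) / 1e-7
right = (M(0.5 + 1e-7, 0.5) - M(0.5, 0.5)) / 1e-7
assert abs(left - 1.0) < 1e-6 and abs(right) < 1e-6
```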


Now the question arises whether an analogous result also applies to partial derivatives of higher order. It is possible to show that the one-dimensional partial derivative, as a function of the remaining components x1, . . . , xj−1, xj+1, . . . , xn, is grounded and (n − 1)-increasing. This suggests that a similar relationship between monotonicity and differentiability may also exist in the multivariate case. Unfortunately, the statement given above only guarantees the first-order partial derivatives to exist almost everywhere, which is not enough when we need to differentiate them further. However, in case the first-order partial derivatives exist on a dense set in some neighborhood of the point considered, a compromise definition of the higher-order partial derivative can be made (see Munroe (1968)).

2.2 Copulas

We are now in the position to define copulas. Before doing so, however, we first mention n-dimensional distribution functions.

Definition 2.2.1. An n-dimensional joint distribution function is a function H with domH = Rn satisfying the following:

1. H is right-continuous,

2. H is n-increasing,

3. H is grounded,

4. H(∞, . . . , ∞) = 1.

Definition 2.2.2. An n-subcopula (or simply subcopula) is a function S satisfying

1. dom(S) = S1 × S2 × · · · × Sn, where each Si is a subset of [0, 1] containing both 0 and 1;

2. S is grounded and n-increasing;

3. S has one-dimensional margins Sk, k = 1, . . . , n, satisfying

Sk(u) = u for all u in Sk.

An n-copula (or simply copula) is an n-subcopula C whose domain is the whole unit n-cube In. The set of all n-copulas will be denoted by Cn (or simply by C where no confusion can arise).

In the following lemma we list some elementary properties of subcopulas, which can easily be derivedfrom the above definition.

Lemma 2.2.1. Let S be an n-subcopula. Then

1. For every u in S1 × · · · × Sn, S(u) = 0 if at least one coordinate of u is zero.

2. If all coordinates of a point u are 1 except the k-th, then S(u) = uk.

3. For every a and b in S1 × · · · × Sn such that a ≤ b, VS((a, b]) ≥ 0.

4. For every u in S1 × · · · × Sn, 0 ≤ S(u) ≤ 1.

Notation 2.2.1. If S is a subcopula and C a copula, then for any k, 2 ≤ k ≤ n, by

Sj1,...,jk and Cj1,...,jk, 1 ≤ j1 ≠ j2 ≠ · · · ≠ jk ≤ n,

we denote the k-dimensional margins of S and C, respectively.


It is straightforward to verify that all the k-margins are themselves subcopulas and copulas, respectively. Next, we present three basic examples of copulas.

Example 2.1.

1. The n-place real function

(2.4) πn(u) = u1 · u2 · · · un

is the so-called independence copula.

2. The n-place real function

(2.5) Mn(u) = min(u1, . . . , un)

is again an n-copula and is commonly called the Fréchet-Hoeffding upper bound.

3. The n-place function

(2.6) Wn(u) = max(u1 + u2 + · · · + un − n + 1, 0)

is called the Fréchet-Hoeffding lower bound. Unlike the above two functions, this one is a copula if and only if n = 2. To see this, note that a straightforward calculation yields

VW((1/2, 1]) = ∆_{1/2}^{1} W = 1 − n/2, with (1/2, 1] = (1/2, 1] × · · · × (1/2, 1],

and hence W fails to be n-increasing as soon as n exceeds 2.
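The calculation in item 3 is easy to verify numerically. The sketch below, an illustration of mine rather than part of the text, evaluates V_W((1/2, 1]) by the signed vertex sum of Definition 2.1.3 and confirms that it equals 1 − n/2, which is negative exactly when n exceeds 2:

```python
from itertools import product

def W(u):
    """Frechet-Hoeffding lower bound W_n(u) = max(u1 + ... + un - n + 1, 0)."""
    return max(sum(u) - len(u) + 1, 0)

def volume(H, a, b):
    """H-volume of (a, b] via the signed vertex sum (2.1)."""
    n = len(a)
    return sum((-1) ** (n - sum(ch)) * H([b[k] if ch[k] else a[k] for k in range(n)])
               for ch in product((0, 1), repeat=n))

for n in (2, 3, 4, 5):
    v = volume(W, [0.5] * n, [1.0] * n)   # V_W((1/2, 1])
    assert abs(v - (1 - n / 2)) < 1e-12   # equals 1 - n/2
    assert (v >= 0) == (n == 2)           # n-increasing fails for n > 2
```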

Remark 2.2.1. For any k, 2 ≤ k ≤ n, all k-dimensional margins of Mn, πn and Wn are of the same kind, that is Mk, πk and Wk, respectively.

In the following, we present some basic properties of subcopulas. First, we show that they are uniformly continuous, and hence continuous, due to Theorem 2.1.1.

Corollary 2.2.1. Let S be an n-subcopula. Then for every u and v in its domain,

(2.7) |S(v) − S(u)| ≤ Σ_{i=1}^n |vi − ui|,

and hence every subcopula is uniformly continuous on its domain.

Proof. The inequality (2.7) follows readily from Theorem 2.1.1 and the special form of the one-dimensional margins of subcopulas.

Remark 2.2.2. Among other things, this result allows us to see subcopulas, and hence copulas, from a different point of view: the continuity property together with the definition yields that subcopulas are restrictions to domS of n-dimensional joint distribution functions with uniform margins.

Since copulas are special grounded and n-increasing functions, Theorem 2.1.2 also applies in this case:

Corollary 2.2.2. Let C be an n-copula and 1 ≤ k ≤ n. Then the partial derivative

∂C(u1, . . . , un)/∂uk

exists for almost all u ∈ (0, 1) (in the sense of Lebesgue measure). Moreover,

(2.8) 0 ≤ ∂C(u1, . . . , un)/∂uk ≤ 1.


Proof. Since the existence of the partial derivatives follows directly from Theorem 2.1.2, we only need to show (2.8). But this is an immediate consequence of inequality (2.7) from Corollary 2.2.1.

Another famous result is that the set Cn is bounded pointwise.

Theorem 2.2.1. (Fréchet-Hoeffding inequality) For any n-subcopula S and any u ∈ In,

(2.9) Wn(u) ≤ S(u) ≤ Mn(u).

Proof. In order to see that all terms considered throughout the proof indeed make sense, recall that the domain of a subcopula is a Cartesian product of sets each containing 0 and 1.
Now, the right hand inequality follows easily from the fact that subcopulas are nondecreasing in each argument:

∀ (1 ≤ k ≤ n) : S(u1, . . . , un) ≤ S(1, . . . , 1, uk, 1, . . . , 1) = uk,

and hence S(u1, . . . , un) ≤ min(uk, 1 ≤ k ≤ n). The left hand inequality follows with the above Corollary 2.2.1:

|S(1) − S(u)| ≤ Σ_{i=1}^n |1 − ui|, i.e. 1 − S(u) ≤ n − (u1 + · · · + un),

so that S(u) ≥ u1 + · · · + un − n + 1. Together with S(u) ≥ 0 this yields S(u) ≥ Wn(u).
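A quick numerical sanity check of (2.9) in the bivariate case (a sketch of mine, not from the thesis; the test copula 0.5 π2 + 0.5 M2 is a convex combination of copulas and hence again a copula):

```python
def W(u, v): return max(u + v - 1.0, 0.0)             # W_2
def M(u, v): return min(u, v)                         # M_2
def PI(u, v): return u * v                            # independence copula
def mix(u, v): return 0.5 * PI(u, v) + 0.5 * M(u, v)  # convex combination

# The Frechet-Hoeffding bounds (2.9) hold pointwise on a fine grid:
grid = [i / 20 for i in range(21)]
for u in grid:
    for v in grid:
        for C in (PI, M, W, mix):
            assert W(u, v) <= C(u, v) + 1e-12
            assert C(u, v) <= M(u, v) + 1e-12
```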

As we saw in Example 2.1, Wn fails to be a copula if the dimension exceeds 2. Therefore, the question arises whether there exists a larger n-place function which is both a lower bound and a copula. The following result, due to Sklar, shows that Wn is nevertheless the best possible lower bound:

Theorem 2.2.2. For any n ≥ 3 and any u in the unit n-cube, there is a copula Cu (which depends on both n and u) such that

Cu(u) = Wn(u).

Proof. See Nelsen (1999, Theorem 2.10.13).

Furthermore, the Fréchet-Hoeffding inequality can be used to justify the following partial order on the set of all n-copulas, Cn.

Definition 2.2.3. Let C1 and C2 be n-copulas. We say that C1 is smaller than C2, in symbols C1 ≺ C2 (or C2 is larger than C1, in symbols C2 ≻ C1), if

(2.10) ∀ u ∈ In : C1(u) ≤ C2(u).

Remark 2.2.3. The just defined order is often referred to as the concordance ordering, due to the role it plays in measuring concordance, as we will see in later chapters. Also note that if we interpret Ci, i = 1, 2, as joint distribution functions with uniform margins restricted to the unit n-cube, the phrase "C2 is larger than C1" is equivalent to "C2 is stochastically smaller than C1".

Before we focus on the role played by copulas in statistical settings, we examine the set Cn more closely. By Corollary 2.2.1, it is a subset of the space of all continuous real valued functions on the unit n-cube, C(In), which is a Banach space with respect to the supremum norm ‖f‖∞ := sup{|f(u)|, u ∈ In}.
Moreover, it can easily be checked that Cn is a convex set. Since uniform convergence clearly implies pointwise convergence, an elementary calculation shows that if Ck is a uniformly convergent sequence of copulas with limit C, then C itself is a copula. Hence, the set Cn is complete with respect to the supremum metric d∞, d∞(f, g) := sup_{x∈In} |f(x) − g(x)|.
Next, the Fréchet-Hoeffding inequality yields that Cn is bounded pointwise, and Corollary 2.2.1 implies that Cn is equicontinuous. Hence, according to the Arzelà-Ascoli Theorem (see Rudin, 1973, p. 369), Cn is compact. The supremum metric thus provides a reasonable framework when dealing with copulas.

It is also sometimes useful to think of C(u) as an assignment of mass to the n-box (0, u] for each u in the unit n-cube In. This indeed makes sense since, as we mentioned in Remark 2.2.2, copulas can be viewed as restrictions of n-dimensional joint distribution functions with uniform margins to the unit n-cube In. Hence, every copula from Cn generates a unique probability measure PC on (In, B(In)) with

PC((a, b]) = ∆_a^b C for any half open n-box (a, b]

(see Shiryayev, 1984, Theorem II.3.2). We will also refer to this measure as C-measure. The converse is not true, however, since there exist joint distribution functions (or probability measures, respectively) on the unit n-cube whose margins are not uniform and which therefore fail to be copulas. Summarizing, we have (see Schmitz, 2003, Corollary 2.17) that a probability measure P on (In, B(In)) is induced by a copula if and only if

(2.11) ∀ xk ∈ I, 1 ≤ k ≤ n : P((0, (1, . . . , 1, xk, 1, . . . , 1)]) = xk.

Since copulas induce a probability measure on the unit n-cube, we can use the well known Lebesgue Decomposition Theorem B.1.1 to state the following

Definition and Theorem 2.2.1. For any copula C from Cn,

C(u) = AC(u) + SC(u),

where

(2.12) AC(u) := µac((0, u]) = ∫_{(0,u]} c(t) dλn(t), and SC(u) := µsing((0, u]) = C(u) − AC(u).

1. µac is the absolutely continuous part of PC w.r.t. λn and c its Lebesgue density, i.e. a Borel measurable, nonnegative function with ∫_{In} c(t) dλn(t) < ∞;

2. µsing is the singular part of PC w.r.t. λn, i.e. a measure satisfying µsing ⊥ λn.

AC is the so-called absolutely continuous component and SC the singular component of C, respectively. If C ≡ AC then C is absolutely continuous (w.r.t. the Lebesgue measure) and possesses a density c. Otherwise, if C ≡ SC, C is called singular.
In addition, the C-measure of the absolutely continuous component is AC(1), whereas the C-measure of the singular component is SC(1).

Remark 2.2.4. Since the margins of a copula are uniform and hence absolutely continuous, copulas have no "atoms", i.e. PC({u}) = 0 for all u in the unit n-cube.

Remark 2.2.5. By the support of a copula we understand the complement of the union of all open subsets of the unit n-cube with C-measure zero. Clearly, if the support of a copula C has Lebesgue measure zero, then C is singular, and vice versa. If the support of a copula is the whole unit n-cube, we say that C has full support. Note that there exist numerous copulas with full support which are neither absolutely continuous nor singular.

Remark 2.2.6. When considering densities of absolutely continuous copulas in the general n-dimensional case, we have to be aware of the fact that the relationship between (partial) differentiability and absolute continuity w.r.t. Lebesgue measure λn is far more complicated than in the univariate case. For in one dimension, we have the well-known result that every function which is absolutely continuous w.r.t. Lebesgue measure is the indefinite integral of its derivative. In the multivariate case, however, there is no unique concept of differentiability for Lebesgue-Stieltjes measures in Rn in the first place, and the theory differs for different types of derivatives (for further details, see Section B.1).

Example 2.2. The Fréchet-Hoeffding upper bound Mn is singular, its support being the main diagonal connecting 0 and 1. The independence copula πn is absolutely continuous, with density c(t) = 1_{In}(t).
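The contrast is easy to see by simulation (a sketch of mine, not from the thesis): all M-mass sits on the main diagonal, while under π a thin diagonal strip carries only about as much probability as its Lebesgue measure.

```python
import random

random.seed(0)
N = 10_000

# M_2 (comonotone): all mass on the main diagonal u1 = u2.
m_sample = [(u, u) for u in (random.random() for _ in range(N))]
assert all(u == v for u, v in m_sample)

# pi_2 (independence): mass spread over the whole square, so the strip
# {|u - v| < eps}, of Lebesgue measure about 2*eps, carries little mass.
pi_sample = [(random.random(), random.random()) for _ in range(N)]
eps = 0.01
frac = sum(abs(u - v) < eps for u, v in pi_sample) / N
assert frac < 0.1        # under M this fraction would be 1
```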

As we obtained earlier in this section, the set of all copulas equipped with the supremum metric is a compact space and hence also separable, i.e. it contains a countable dense set. The following theorem shows one possibility to obtain such sets and, as a consequence, allows us to recognize that the sets of all absolutely continuous and of all singular copulas, respectively, are dense in C. The result is based upon a generalization of the approximation idea used by Darsow et al. (1992) as well as Li et al. (1998) and Kulpa (1999) in order to replace an arbitrary copula by an absolutely continuous one up to an arbitrarily small error (for a detailed proof of this procedure, see Schmitz (2003)). The approximation is based upon dividing the unit interval I into N subintervals of equal length 1/N:

I = [0, 1/N] ∪ (1/N, 2/N] ∪ · · · ∪ ((N − 1)/N, 1].

By dividing I in this manner in each coordinate, we get a partition of the unit n-cube into N^n disjoint cubes of equal volume,

Sj := ((j1 − 1)/N, j1/N] × ((j2 − 1)/N, j2/N] × · · · × ((jn − 1)/N, jn/N], j := (j1, . . . , jn), 1 ≤ jk ≤ N.

With IN being the set of all such grid point indices j ∈ {1, . . . , N}^n, we get

In = ⋃_{j∈IN} Sj.

To express the position of each point u in the unit cube with regard to the above partition, we introduce for each j in the index set IN the following function:

(2.13) mj : In → In, u ↦ mj(u), mj(u)k := max(min[N(uk − (jk − 1)/N), 1], 0), 1 ≤ k ≤ n,

which affects each of the coordinates of u in the following way:

(2.14) uk ↦ 0 if uk < (jk − 1)/N; uk ↦ N(uk − (jk − 1)/N) if (jk − 1)/N ≤ uk ≤ jk/N; uk ↦ 1 if uk > jk/N.

Theorem 2.2.3. Let C∗ and C be copulas from Cn. Then for any N ∈ N the function defined by

(2.15) C^N_{C∗}(u) := Σ_{j∈IN} VC(Sj) C∗(mj(u)), u ∈ In,

is an n-copula. Furthermore, for any ε > 0 a number N can be chosen such that

(2.16) d∞(C, C^N_{C∗}) = ‖C − C^N_{C∗}‖∞ < ε.
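The construction (2.15) is straightforward to implement. The following sketch, my own illustration using the bivariate choices C = M2 and C∗ = π2, checks both the grid-point agreement (2.17) used in the proof and the uniform error bound 2n/N:

```python
from itertools import product

def M(u): return min(u)          # target copula C = M_2
def PI(u):                       # background copula C* = pi_2
    p = 1.0
    for x in u:
        p *= x
    return p

def vol(C, a, b):
    """C-volume of (a, b] via the signed vertex sum (2.1)."""
    n = len(a)
    return sum((-1) ** (n - sum(ch)) * C([b[k] if ch[k] else a[k] for k in range(n)])
               for ch in product((0, 1), repeat=n))

def m_j(u, j, N):
    """The rescaling map (2.13) for the cube S_j."""
    return [max(min(N * (uk - (jk - 1) / N), 1.0), 0.0) for uk, jk in zip(u, j)]

def C_N(C, C_star, N, u):
    """The approximation (2.15): sum over j of V_C(S_j) * C*(m_j(u))."""
    n = len(u)
    return sum(vol(C, [(jk - 1) / N for jk in j], [jk / N for jk in j])
               * C_star(m_j(u, j, N))
               for j in product(range(1, N + 1), repeat=n))

N = 10
# Exact agreement at the grid points (i1/N, i2/N), as in (2.17):
for i in range(N + 1):
    assert abs(M([i / N, i / N]) - C_N(M, PI, N, [i / N, i / N])) < 1e-12
# The uniform error stays below the bound 2n/N from the proof (n = 2):
pts = [(i / 40, k / 40) for i in range(41) for k in range(41)]
err = max(abs(M(list(u)) - C_N(M, PI, N, list(u))) for u in pts)
assert err < 2 * 2 / N
```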


Proof. To begin with, we show that C agrees with C^N_{C∗} at the grid points (i1/N, . . . , in/N) for any index i in IN. Indeed, we clearly have

(2.17) C(i1/N, . . . , in/N) = VC(⋃_{j∈IN, j≤i} Sj) = Σ_{j∈IN, j≤i} VC(Sj) = C^N_{C∗}(i1/N, . . . , in/N)

by (2.14) and the fact that C∗ is grounded and satisfies C∗(1) = 1.
Next, we prove that C^N_{C∗} is a proper copula. Since C^N_{C∗} is defined on the whole unit n-cube, it remains to show that

1. C^N_{C∗} is grounded: Suppose, without loss of generality, that one coordinate of the point u is zero, say uk. In this case we get

mj(u) = (mj(u)1, . . . , mj(u)k−1, max(min(N(0 − (jk − 1)/N), 1), 0), mj(u)k+1, . . . , mj(u)n)
= (mj(u)1, . . . , mj(u)k−1, max(1 − jk, 0), mj(u)k+1, . . . , mj(u)n)
= (mj(u)1, . . . , mj(u)k−1, 0, mj(u)k+1, . . . , mj(u)n).

Hence, since C∗ is grounded, C∗(mj(u)) = 0 for any j ∈ IN, which in turn implies C^N_{C∗}(u) = 0.

2. C^N_{C∗} is n-increasing: If we choose a ≤ b from the unit n-cube, we have, by interchanging the summation,

∆_a^b C^N_{C∗} = Σ_{j∈IN} VC(Sj) ∆_a^b (C∗ ∘ mj).

The difference operator ∆_a^b operates on C∗ ∘ mj as follows:

∆_a^b (C∗ ∘ mj) = ∆_{mj(a)}^{mj(b)} C∗.

The right hand side is nonnegative by the n-increasing property of C∗ as soon as we show that mj(a) ≤ mj(b). But this is a straightforward consequence of the fact that both max(·, 0) and min(·, 1) are nondecreasing. Hence, C^N_{C∗} is n-increasing.

3. C^N_{C∗} has uniform margins: We show that, for u = (1, . . . , 1, uk, 1, . . . , 1), C^N_{C∗}(u) equals uk for any k and any uk in the unit interval. It is clearly possible to find an ik ∈ {1, . . . , N} such that

(ik − 1)/N < uk ≤ ik/N.

Thus, by (2.14), we have

mj(u) = (1, . . . , 1, 0, 1, . . . , 1) if jk > ik,
mj(u) = (1, . . . , 1, N(uk − (ik − 1)/N), 1, . . . , 1) if jk = ik,
mj(u) = (1, . . . , 1, 1, 1, . . . , 1) if jk < ik.


Since C∗ is a copula, this yields

C^N_{C∗}(u) = Σ_{j∈IN, jk<ik} VC(Sj) · 1 + Σ_{j∈IN, jk=ik} VC(Sj) · N(uk − (ik − 1)/N) + Σ_{j∈IN, jk>ik} VC(Sj) · 0,

which by (2.17) equals

C(1, . . . , 1, (ik − 1)/N, 1, . . . , 1) + N(uk − (ik − 1)/N) (C(1, . . . , 1, ik/N, 1, . . . , 1) − C(1, . . . , 1, (ik − 1)/N, 1, . . . , 1))

= (ik − 1)/N + N(uk − (ik − 1)/N) (ik/N − (ik − 1)/N)

= (ik − 1)/N + uk − (ik − 1)/N = uk,

and hence C^N_{C∗} has uniform margins.

To finally show that ‖C − C^N_{C∗}‖∞ < ε holds for a proper choice of N, we use Corollary 2.2.1 and the fact that the disjoint cubes Sj cover the whole unit n-cube. In the first place, any u in the unit cube lies in exactly one of the cubes, say Si. If we now choose N > 2n/ε, we have by Corollary 2.2.1 and (2.17),

|C(u) − C^N_{C∗}(u)| ≤ |C(u) − C(i1/N, . . . , in/N)| + |C(i1/N, . . . , in/N) − C^N_{C∗}(i1/N, . . . , in/N)| + |C^N_{C∗}(i1/N, . . . , in/N) − C^N_{C∗}(u)|

≤ Σ_{k=1}^n |uk − ik/N| + 0 + Σ_{k=1}^n |uk − ik/N| ≤ 2 Σ_{k=1}^n 1/N = 2n/N < ε,

and hence ‖C − C^N_{C∗}‖∞ < ε.

Remark 2.2.7.

1. In the above result, the copula C∗ does not play any special role; in fact, it is possible to choose a different copula C∗j for each of the cubes Sj. Since this is notationally more complex and of no particular use here, we omit this even more general construction.

2. The copula C influences the above construction through the quantities VC(Sj) and is therefore crucial for the validity of ‖C − C^N_{C∗}‖∞ < ε. Therefore, it is not necessarily true that ‖C1 − C^N_{C∗}‖∞ < ε for any other copula C1 in Cn. However, if we replace C in the above construction by a copula C̃ which agrees with C at all the grid points, the estimate remains true.

The above theorem has the following consequence concerning absolutely continuous and singular copulas.

Corollary 2.2.3. Both the subset of all absolutely continuous copulas, An, and the subset of all singular copulas, Sn, are dense in Cn.


Proof. That Sn is dense is a straightforward application of the above theorem, obtained by choosing C∗ singular, e.g. the Fréchet-Hoeffding upper bound Mn. To see that An is dense, we can use the above construction with C∗ = πn. If we then focus on the quantities

VC(Sj) πn(mj(u))

for an arbitrary u in the unit n-cube, we get after an elementary integral calculation that, if uk ≥ (jk − 1)/N for all k,

VC(Sj) πn(mj(u)) = VC(Sj) · N^n · ∫_{(((j1−1)/N, . . . , (jn−1)/N), u]} 1_{Sj}(t) dλn(t);

if uk > jk/N for all k = 1, . . . , n,

VC(Sj) πn(mj(u)) = VC(Sj) · N^n · ∫_{Sj} 1_{Sj}(t) dλn(t);

and VC(Sj) πn(mj(u)) is zero otherwise. Hence C^N_{C∗} is absolutely continuous with density

(2.18) c^N_{C∗}(u) = Σ_{j∈IN} VC(Sj) N^n 1_{Sj}(u).
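The density (2.18) is piecewise constant: on each cube Sj it takes the value VC(Sj) · N^n. A small sketch of mine (again with C = M2, for illustration only) confirms that these values are nonnegative and integrate to one, and that for M2 the mass concentrates in the diagonal cells:

```python
from itertools import product

def M(u): return min(u)

def vol(C, a, b):
    """C-volume of (a, b] via the signed vertex sum (2.1)."""
    n = len(a)
    return sum((-1) ** (n - sum(ch)) * C([b[k] if ch[k] else a[k] for k in range(n)])
               for ch in product((0, 1), repeat=n))

N, n = 8, 2
vols = {j: vol(M, [(jk - 1) / N for jk in j], [jk / N for jk in j])
        for j in product(range(1, N + 1), repeat=n)}

def c_N(u):
    """Density (2.18): value V_C(S_j) * N^n on the cube S_j containing u
    (defined lambda^n-almost everywhere; cell-boundary ties do not matter)."""
    j = tuple(min(int(uk * N) + 1, N) for uk in u)
    return vols[j] * N ** n

assert all(v >= -1e-12 for v in vols.values())
# c_N is constant on each cell, so its integral is an exact finite sum:
total = sum(vols[j] * N ** n * (1 / N) ** n for j in vols)
assert abs(total - 1.0) < 1e-12
# For M_2 only the N diagonal cells carry mass:
assert c_N((0.05, 0.05)) == N ** n * vols[(1, 1)]
assert c_N((0.05, 0.95)) == 0.0
```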

Remark 2.2.8.

1. Note that the choice C∗ := πn yields the aforementioned approximation by Darsow et al. (1992).

2. The above corollary shows, among other things, that an arbitrary copula can be approximated by an absolutely continuous one as well as by a singular one. In particular, any absolutely continuous copula can be substituted by a singular one up to a negligible ε. It is useful to keep this in mind when using the (normalized) distance

‖πn − C‖∞

as a measure of dependence.


Chapter 3

Copulas in Statistical Settings

In the preceding chapter, we defined copulas as certain n-increasing functions on the unit cube. The reason why these functions are of such immense importance in mathematical statistics already becomes apparent from the way Abe Sklar first introduced them in 1959. He chose the name "copula" for a function which, as he discovered, has the power to couple univariate margins together into a joint distribution function. Sklar's Theorem has thus become a starting point for nearly any research on multivariate distributions.
It is the aim of this chapter to introduce copulas from a statistical point of view. Starting with Sklar's Theorem, we show how many useful quantities, such as conditional probabilities or survival functions, can be expressed in terms of copulas. In addition, we introduce some of the basic ways copulas can be constructed and list several well-known copula families, which will be used later on. For further information on copulas, see the introductory monographs by Nelsen (1999) or Joe (1997) as well as Schweizer (1991) and the further literature mentioned therein.

3.1 Sklar’s Theorem

Theorem 3.1.1. (Sklar’s Theorem) Let H be an n-dimensional joint distribution function with mar-gins F1, F2, . . . , Fn. Then there exists an n-copula C such that for all x in R

n,

(3.1) H(x1, . . . , xn) = C(F1(x1), . . . , Fn(xn)).

If F1, F2, . . . , Fn are all continuous, then C is unique, otherwise, C is uniquely determined on ranF1 ×ranF2 ×· · · × ranFn. Moreover, if F

(−1)1 , F

(−1)2 , . . . , F

(−1)n are quasi-inverses of the marginal distribu-

tion functions, then for any u in ranF1 × ranF2 × · · · × ranFn, the copula C satisfies

(3.2) C(u1, . . . , un) = H(F(−1)1 (u1), F

(−1)2 (u2), . . . , F

(−1)n (un)).

Conversely, if C is an n-copula and F1, F2, . . . , Fn are real distribution functions, then the function Hdefined by (3.1) is an n-dimensional joint distribution function with margins F1, F2, . . . Fn.

Proof. See Nelsen (1999, Theorem 2.10.9) and further references given therein.
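The converse half of the theorem can be sketched directly (my own example, with C = M2 and exponential margins chosen purely for illustration): H defined by (3.1) has the prescribed margins and nonnegative H-volumes.

```python
import math

def M(u, v): return min(u, v)                                  # copula C = M_2

def F1(x): return 1.0 - math.exp(-x) if x > 0 else 0.0         # Exp(1) margin
def F2(x): return 1.0 - math.exp(-2.0 * x) if x > 0 else 0.0   # Exp(2) margin

def H(x, y):
    """Joint distribution function built via (3.1)."""
    return M(F1(x), F2(y))

# The margins are recovered by letting the other argument tend to infinity:
assert abs(H(1.0, 1e9) - F1(1.0)) < 1e-9
assert abs(H(1e9, 1.0) - F2(1.0)) < 1e-9

# H is 2-increasing: every H-volume is nonnegative.
pts = [0.1 * k for k in range(1, 30)]
for a1, b1 in zip(pts, pts[1:]):
    for a2, b2 in zip(pts, pts[1:]):
        assert H(b1, b2) - H(a1, b2) - H(b1, a2) + H(a1, a2) >= -1e-12
```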

Remark 3.1.1. Sklar’s theorem can be viewed as a multivariate generalization of the following wellknown result for one dimensional distribution functions (see Ferguson, 1967, p.216).


Let C denote the distribution function of the uniform distribution U(0, 1) restricted to the unit interval I, i.e. C(u) ≡ u. Then for any real distribution function F, C(F(x)) = F(x). Moreover, if F^(−1) denotes the quasi-inverse of F, then for any u ∈ ran F, F(F^(−1)(u)) = C(u) = u. Consequently, if F is continuous and X a random variable with X ∼ F, then F(X) is uniformly distributed.

As is commonly known, this result is enormously useful in statistical inference, especially in the construction of "distribution-free" procedures: if all required information is stored in the ranks of the observations only, we may work with the transformed values F(X) instead, which are uniformly distributed whenever F is continuous.
From this point of view, Sklar's Theorem presents copulas as multivariate extensions of the univariate uniform distribution: if all F_i are continuous, then the random vector (F_1(X_1), ..., F_n(X_n)) has joint distribution function C. It can therefore be expected that a copula stores information based on the ranking of the observations only and thus allows us to formulate "margin-free" methods.
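The univariate result above is easy to check numerically. The following sketch (ours, using only the standard library) samples from the Exp(1) distribution and verifies that the transformed values F(X) behave like a U(0, 1) sample:

```python
import math
import random

# Illustration of the univariate result behind Sklar's theorem:
# if X ~ F with F continuous, then F(X) is uniformly distributed.
# Here F is the Exp(1) distribution function, F(x) = 1 - exp(-x).

random.seed(1)
n = 100_000
# Sample X ~ Exp(1) by inversion: X = F^(-1)(U) = -log(1 - U).
xs = [-math.log(1.0 - random.random()) for _ in range(n)]
us = [1.0 - math.exp(-x) for x in xs]          # transformed values F(X)

# The empirical moments of F(X) should be close to those of U(0, 1):
mean = sum(us) / n                              # close to 1/2
second_moment = sum(u * u for u in us) / n      # close to 1/3
```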

Remark 3.1.2. Sklar's theorem can basically be used in two ways.

1. If the marginal distributions are all continuous, equation (3.2) provides a method for constructing copulas from multivariate joint distribution functions. In most cases, however, this procedure turns out to be rather complicated, and other construction methods are more favorable, as we will see later. Nevertheless, it yields several important copula families, for example spherical and elliptical copulas, which will be introduced in Section 3.5.1.

2. The other way round, equation (3.1) allows us to construct a wide range of multivariate distributions with prescribed marginals by choosing a (suitable) copula. Such models may not possess a stochastic interpretation, but various interesting properties of the joint distribution function can be achieved in this way. This modeling procedure will be the subject of Chapter 7.

To state further copula properties which are useful for statistical inference, we will need the following notation.

Notation 3.1.1. If H is an n-dimensional joint distribution function with continuous margins, we denote by C_H the copula satisfying (3.1). Similarly, if X is an n-dimensional random vector with joint distribution function H and continuous margins, C_X will stand for the underlying copula (i.e. the copula satisfying (3.1)).
In case the margins are not necessarily continuous, the underlying unique subcopula defined on ran F_1 × ··· × ran F_n will be denoted by S_H and S_X, respectively. In addition, the class of all possible copulas (i.e. copulas satisfying (3.1)) will be denoted by 𝒞_H and 𝒞_X, respectively.
Since any uniformly continuous mapping f from dom f into a complete metric space can be uniquely extended¹ to a uniformly continuous mapping on the closure of dom f, any subcopula possesses a unique (and uniformly continuous) extension to the closure of ran F_1 × ··· × ran F_n by means of Corollary 2.2.1. We will denote this unique extension by S̄_H and S̄_X, respectively².

As a consequence of the Transformation Theorem for Lebesgue densities (recall Theorem B.2.3), the densities of a joint distribution function and of the corresponding copula are related in the following way:

Corollary 3.1.1. Let H be an n-dimensional joint distribution function with absolutely continuous margins F_1, ..., F_n; the corresponding Lebesgue densities are denoted by f_1, ..., f_n. Assume there exists an open subset 𝒳 in ℬ^n such that the probability measure P_H induced by H on (ℝ^n, ℬ^n) satisfies

P_H(𝒳^c) = 0

and that the transformation T defined by

T : 𝒳 → I^n, (x_1, ..., x_n) ↦ (F_1(x_1), ..., F_n(x_n))

is injective and differentiable. Then

1. if the underlying copula C_H is absolutely continuous w.r.t. Lebesgue measure with density c_H, then H is absolutely continuous w.r.t. Lebesgue measure. Moreover, the density f_H of H is a.s. given by

(3.3) f_H(x) = c_H(F_1(x_1), ..., F_n(x_n)) · ∏_{i=1}^n f_i(x_i) · 1_𝒳(x).

2. if H is absolutely continuous w.r.t. Lebesgue measure with a density f_H and if the set

N := {x ∈ 𝒳 | ∃ 1 ≤ i ≤ n : f_i(x_i) = 0}

has Lebesgue measure zero, then the corresponding copula is also absolutely continuous, with density a.s. given by

(3.4) c_H(u) = [f_H(F_1^(−1)(u_1), ..., F_n^(−1)(u_n)) / ∏_{i=1}^n f_i(F_i^(−1)(u_i))] · 1_{{u : T^(−1)(u) ∉ N}}(u).

¹ For a proof, see e.g. Royden (1988, Proposition 11, p. 149).
² Note that any copula in 𝒞_H [𝒞_X] hence agrees with S̄_H [S̄_X] on the closure of ran F_1 × ··· × ran F_n.

Proof. Both results follow straightforwardly from Sklar's theorem and Theorem B.2.3.
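Relation (3.4) can be made concrete for the bivariate normal distribution, where all ingredients are available in closed form. The following sketch (helper names are ours) computes the Gaussian copula density as the ratio of the joint density to the product of the marginal densities, evaluated at the quantiles:

```python
import math
from statistics import NormalDist

# Sketch of relation (3.4) for the bivariate normal distribution: the
# copula density is the joint density divided by the product of the
# marginal densities, evaluated at the quantiles F_i^(-1)(u_i).

N01 = NormalDist()  # standard normal: cdf, inv_cdf, pdf

def bivariate_normal_pdf(x, y, rho):
    """Density of N(0, [[1, rho], [rho, 1]])."""
    q = (x * x - 2.0 * rho * x * y + y * y) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - rho * rho))

def gaussian_copula_density(u, v, rho):
    """c_H(u, v) = f_H(F1^(-1)(u), F2^(-1)(v)) / (f1(...) * f2(...))."""
    x, y = N01.inv_cdf(u), N01.inv_cdf(v)
    return bivariate_normal_pdf(x, y, rho) / (N01.pdf(x) * N01.pdf(y))
```

For ρ = 0 the ratio is identically 1, the density of the independence copula.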

Next, we obtain two immensely important results concerning the behavior of the unique (sub)copulas under strictly monotone transformations. The first proposition states that the unique subcopula remains invariant under (component-wise) strictly increasing transformations, whereas the second describes the way it changes when the transformation is decreasing in at least one component.
In case the margins are continuous, the results have been noted by Schweizer and Wolff, who thereafter went on to say:

"... for us the true importance of copulas lies in a combination of Sklar's Theorem and the theorem concerning changes of copulas under strictly monotone transformations. For, from the structure of (3.2) and the fact that under a.s. strictly increasing transformations of X and Y the copula is invariant while the margins may be changed at will, it follows that it is precisely the copula which captures those properties of the joint distribution which are invariant under a.s. strictly increasing transformations. Hence the study of rank statistics - insofar as it is the study of properties invariant under such transformations - may be characterized as the study of copulas and copula-invariant properties." (Schweizer, 1991, pp. 32-33)

Proposition 3.1.1. (Schweizer and Wolff) Let X = (X_1, ..., X_n) be a random vector with marginal distribution functions F_1, ..., F_n and subcopula S_X defined on ran F_1 × ··· × ran F_n. If α is a component-wise strictly increasing and continuous transformation on ran X_1 × ··· × ran X_n, then the transformed vector α(X) = (α_1(X_1), ..., α_n(X_n)) has the same subcopula, i.e.

S_{α(X)} = S_X.


Proof. See Embrechts et al. (2002, Proposition 2).

Remark 3.1.3. The continuity of the transformations is necessary in the general case. If all the margins are continuous, this requirement can be omitted (see e.g. Nelsen, 1999, Theorem 2.4.3).

Although the next proposition considers the bivariate case only, it also indicates what possible extensions to higher dimensions would look like.

Proposition 3.1.2. (Schweizer and Wolff) Let X and Y be random variables with distribution functions F_X and F_Y, respectively, and subcopula S_XY. Furthermore, let α and β be continuous and strictly monotone on ran X and ran Y, respectively. Then

1. if α is strictly increasing and β strictly decreasing, then

S_{α(X)β(Y)}(u, v) = u − S_XY(u, 1 − v);

2. if α is strictly decreasing and β strictly increasing, then

S_{α(X)β(Y)}(u, v) = v − S_XY(1 − u, v);

3. if α and β are both strictly decreasing, then

S_{α(X)β(Y)}(u, v) = u + v − 1 + S_XY(1 − u, 1 − v);

whenever (u, v) ∈ ran F_{α(X)} × ran F_{β(Y)}.

Proof. To begin with, suppose that α is increasing and β decreasing. We now show that for any real x and y,

(†) H_{α(X)β(Y)}(x, y) = F_{α(X)}(x) − S_XY(F_{α(X)}(x), 1 − F_{β(Y)}(y)).

On one hand,

F_{α(X)}(x) = P[α(X) ≤ x] = P[X ≤ α^(−1)(x)] = F_X(α^(−1)(x)),

and, on the other, because β is decreasing,

F_{β(Y)}(y) = P[β(Y) ≤ y] = P[Y ≥ β^(−1)(y)] = 1 − F_Y(β^(−1)(y)−).

Moreover, note that

(‡) P[X ≤ x, Y < y] = lim_{n→∞} P[X ≤ x, Y ≤ y − 1/n] = lim_{n→∞} S_XY(F_X(x), F_Y(y − 1/n)) = S_XY(F_X(x), F_Y(y−))

since (F_X(x), F_Y(y−)) lies in dom S_XY. Hence, the right-hand side of (†) equals

F_{α(X)}(x) − S_XY(F_{α(X)}(x), 1 − F_{β(Y)}(y))
= F_X(α^(−1)(x)) − S_XY(F_X(α^(−1)(x)), 1 − 1 + F_Y(β^(−1)(y)−))
= F_X(α^(−1)(x)) − S_XY(F_X(α^(−1)(x)), F_Y(β^(−1)(y)−))
(‡)= P[X ≤ α^(−1)(x)] − P[X ≤ α^(−1)(x), Y < β^(−1)(y)]
= P[X ≤ α^(−1)(x), Y ≥ β^(−1)(y)] = P[α(X) ≤ x, β(Y) ≤ y]
= H_{α(X)β(Y)}(x, y).


Secondly, if α is decreasing and β increasing, then, with 1.,

S_{α(X)β(Y)}(u, v) = S_{β(Y)α(X)}(v, u) = v − S_YX(v, 1 − u) = v − S_XY(1 − u, v);

and, finally, if both α and β are decreasing, then, with 1. and 2.,

S_{α(X)β(Y)}(u, v) = u − S_{α(X)Y}(u, 1 − v) (∗)= u − [1 − v − S_XY(1 − u, 1 − v)] = u + v − 1 + S_XY(1 − u, 1 − v).

((∗) follows from the fact that S_{α(X)Y} is a restriction of the (uniformly continuous) function C∗, C∗(u, v) := v − S_XY(1 − u, v), which is defined on a closed set.)
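For continuous margins and β(y) = −y, case 1 of the proposition can be verified exactly on empirical copulas, since the ranks of −Y are n + 1 minus the ranks of Y. A small sketch (function names are ours):

```python
import random

# Numerical check of Proposition 3.1.2 (case 1, alpha increasing,
# beta decreasing) on empirical copulas: with continuous margins and
# beta(y) = -y, the empirical copula of (X, -Y) satisfies
#   C'_n(k/n, l/n) = k/n - C_n(k/n, (n-l)/n)
# exactly, because the ranks of -Y are n + 1 - (ranks of Y).

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, i in enumerate(order):
        r[i] = pos + 1
    return r

def empirical_copula(rx, ry):
    n = len(rx)
    def C(k, l):  # C_n(k/n, l/n)
        return sum(1 for i in range(n) if rx[i] <= k and ry[i] <= l) / n
    return C

random.seed(7)
n = 200
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [0.8 * x + 0.6 * random.gauss(0, 1) for x in xs]   # dependent pair

rx, ry = ranks(xs), ranks(ys)
C = empirical_copula(rx, ry)
C_t = empirical_copula(rx, [n + 1 - r for r in ry])      # ranks of (X, -Y)

max_err = max(abs(C_t(k, l) - (k / n - C(k, n - l)))
              for k in range(0, n + 1, 20) for l in range(0, n + 1, 20))
```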

3.2 Sections of Copulas

Next, we examine k-sections of a copula at some fixed point a ∈ I^n. These can be viewed as projections of the copula in the direction of the k-th coordinate at the point considered. In two dimensions especially, sections turn out to be of particular use when interpreting certain dependence properties.

Definition 3.2.1. Let C be a copula, and let a be any point in I^n.

1. The k-section of C at a is the function from I to I defined by

t ↦ C(a_1, ..., a_{k−1}, t, a_{k+1}, ..., a_n).

In particular, when n = 2, the function t ↦ C(a_1, t) is called the vertical section of C at a and the function t ↦ C(t, a_2) the horizontal section of C at a.

2. The diagonal section of C is the function δ_C from I to I defined by

t ↦ δ_C(t) := C(t, ..., t).

The following statement is an immediate consequence of the fact that every copula is uniformly continuous and nondecreasing in each argument.

Corollary 3.2.1. For any k, 1 ≤ k ≤ n, a k-section as well as the diagonal section of a copula is nondecreasing and uniformly continuous on I.

Proof. For k-sections, the result follows immediately from Lemma 2.1.1 and Corollary 2.2.1. As to the diagonal section, consider t_1 ≤ t_2 in I. If we denote (t_1, ..., t_1) by t_1 and (t_2, ..., t_2) by t_2, the n-box [0, t_1] is included in [0, t_2], i.e.

[0, t_1] ⊆ [0, t_2].

Now, since C(t_i) is the probability assigned by C to the n-box [0, t_i], i = 1, 2, we clearly have C(t_1) ≤ C(t_2). Finally, the uniform continuity follows again from Corollary 2.2.1.
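The diagonal sections of the three basic 2-copulas Π, M and W illustrate Corollary 3.2.1; a brief sketch (ours):

```python
# Diagonal sections delta_C(t) = C(t, t) for three basic 2-copulas:
# the product copula Pi(u, v) = u*v, the upper bound M(u, v) = min(u, v)
# and the lower bound W(u, v) = max(u + v - 1, 0).

def Pi(u, v): return u * v
def M(u, v): return min(u, v)
def W(u, v): return max(u + v - 1.0, 0.0)

def diagonal(C):
    return lambda t: C(t, t)

# delta_M(t) = t, delta_Pi(t) = t^2, delta_W(t) = max(2t - 1, 0);
# each is nondecreasing on [0, 1], in line with Corollary 3.2.1.
grid = [i / 10 for i in range(11)]
for C in (Pi, M, W):
    values = [diagonal(C)(t) for t in grid]
    assert values == sorted(values)   # nondecreasing along the diagonal
```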


3.3 Conditional Probabilities

In this section we show how conditional probabilities of a random vector X can be expressed using copulas. To get a first impression of where we are heading, consider a random vector (U, V) with uniform margins and copula C. Then the conditional probabilities can be re-written as follows:

(3.5) P[V ≤ v | U = u] = lim_{h→0} [C(u + h, v) − C(u, v)] / h = ∂/∂u C(u, v);

and, similarly, P[U ≤ u | V = v] = ∂/∂v C(u, v). The general case, however, certainly requires greater precision. To provide it, we start with a random vector U having uniform margins and joint distribution function C and base our survey upon Theorems B.2.1 and B.2.2. But first, for the sake of simplicity, let us introduce some notation (recall Definition B.2.1).

Notation 3.3.1. Let U = (U_1, ..., U_n) be an n-dimensional random vector with uniform margins and copula C. Moreover, let I = {i_1, ..., i_r} and J = {j_1, ..., j_s} be arbitrary subsets of {1, ..., n} and U_I = (U_{i_1}, ..., U_{i_r}) and U_J = (U_{j_1}, ..., U_{j_s}) the r- and s-margins of U, respectively, with joint distribution functions C_I and C_J. If they exist, the corresponding Lebesgue densities will be denoted by c_I and c_J, respectively.
In addition, if u_I denotes the point (u_{i_1}, ..., u_{i_r}) in I^r and u_J the point (u_{j_1}, ..., u_{j_s}) in I^s, then c_{I|J}(u_I | u_J) will stand for the Lebesgue density of the conditional distribution P^{U_I | U_J = u_J} of U_I given U_J = u_J, provided one exists. Furthermore, whenever we mention I ∪ J as a set of indices, we always assume its elements to be given in ascending order.

Theorem 3.3.1. Let U = (U_1, ..., U_n) be an arbitrary random vector with uniform margins and a copula C which is absolutely continuous w.r.t. Lebesgue measure λ^n. Furthermore, let c denote the (Lebesgue) density of C. Then

1. For any subset I = {i_1, ..., i_r} of {1, ..., n}, the margin U_I is also absolutely continuous w.r.t. Lebesgue measure, and its density is a.s. given by

c_I(u_I) = ∫_{I^{n−r}} c(u) dλ^{n−r}(u_i : i ∉ I).

2. If I = {i_1, ..., i_r} and J = {j_1, ..., j_s} are disjoint subsets of {1, ..., n}, then for any u_J in I^s, the conditional distribution P^{U_I | U_J = u_J} exists and is absolutely continuous w.r.t. λ^r. Its Lebesgue density is (a.s.) given by

(3.6) c_{I|J}(u_I | u_J) = c_{I∪J}(u_{I∪J}) / c_J(u_J), if c_J(u_J) > 0,

and by some arbitrary λ^r-density otherwise.

Proof. The statement follows straightforwardly from Theorem B.2.2.

Theorem 3.3.1 provides us with formulas for conditional densities, which are of rather theoretical use. Since conditional copulas are important especially for computer simulations, we now focus on finding a more suitable expression. Equation (3.5) suggests that the partial derivatives of the underlying copula C may play an important role here. This is indeed so, but as mentioned in Remark 2.2.6, in the multivariate case the relationship between (partial) differentiability and absolute continuity w.r.t. the Lebesgue measure λ^n is far more complex than in the univariate case (for further details, see Appendix B.1). Nevertheless, the following holds:


Corollary 3.3.1. (Schmitz, 2003, Theorem 2.27) Let U be an n-dimensional real-valued random vector with uniform margins and copula C. Furthermore, assume that C is (n−1)-times continuously partially differentiable w.r.t. the first n−1 arguments. Then

[∂^{n−1}/(∂u_1 ··· ∂u_{n−1}) C(U_1, ..., U_{n−1}, u_n)] / [∂^{n−1}/(∂u_1 ··· ∂u_{n−1}) C(U_1, ..., U_{n−1}, 1)]

is a version of P[U_n ≤ u_n | U_1, ..., U_{n−1}].

Clearly, formulas for the conditional probabilities P^{U_{i_k} | U_{i_1}, ..., U_{i_{k−1}}} can be calculated analogously. Such results are particularly useful for computer simulations (cf. Nelsen (1999) and Schmitz (2003)).
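For the Gaussian copula introduced in Section 3.5.1, the conditional distribution function is available in closed form, which makes the conditional distribution method for random variate generation explicit. A sketch (ours, assuming the standard formula for the conditional cdf of the bivariate Gaussian copula):

```python
import math
import random
from statistics import NormalDist

# Conditional distribution method for the bivariate Gaussian copula
# (a sketch; helper names are ours). For this copula the conditional cdf
#   P[V <= v | U = u] = Phi((Phi^(-1)(v) - rho*Phi^(-1)(u)) / sqrt(1 - rho^2))
# is available in closed form, so inverting it gives a sampler.

N01 = NormalDist()

def conditional_cdf(v, u, rho):
    x, y = N01.inv_cdf(u), N01.inv_cdf(v)
    return N01.cdf((y - rho * x) / math.sqrt(1.0 - rho * rho))

def sample_pair(rho, rng):
    u, t = rng.random(), rng.random()
    # invert the conditional cdf: v = Phi(rho*x + sqrt(1-rho^2)*Phi^(-1)(t))
    x = N01.inv_cdf(u)
    v = N01.cdf(rho * x + math.sqrt(1.0 - rho * rho) * N01.inv_cdf(t))
    return u, v

rng = random.Random(0)
u, v = sample_pair(0.7, rng)
```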

3.4 Survival Copulas

Survival copulas couple univariate survival functions into a joint survival function, in a similar way as copulas couple univariate distribution functions into a joint distribution function. Moreover, they are closely related to the copulas of the corresponding joint distribution functions.

Let (X, Y) denote a pair of random variables with joint distribution function H, copula³ C and marginal distribution functions F and G, respectively. Furthermore, let H̄ denote the joint survival function and F̄ and Ḡ the univariate ones, i.e.

F̄(x) = P[X > x] = 1 − F(x).

Definition 3.4.1. If C is a 2-dimensional copula, the function Ĉ from I² to I defined by

(3.7) Ĉ(u, v) = u + v − 1 + C(1 − u, 1 − v)

is called a survival copula.

It is easy to show that Ĉ is indeed a copula (c.f. Nelsen (1999, Section 2.6)), satisfying

H̄(x, y) = Ĉ(F̄(x), Ḡ(y)).

Therefore, it is not true that a survival copula is the survival function of two uniform margins having the corresponding copula C; but the two are closely related:

C̄(u, v) = P[U > u, V > v] = Ĉ(1 − u, 1 − v).

On this occasion, we define two further functions closely related to copulas, the dual of a copula and the co-copula.

Definition 3.4.2. Let C be a 2-dimensional copula. The dual of the copula C is the function C̃ defined by

C̃(u, v) = u + v − C(u, v);

the co-copula of C is the function C∗ given by

C∗(u, v) := 1 − C(1 − u, 1 − v).

Neither of these functions is a copula, but both have useful probabilistic interpretations. The dual of a copula satisfies

C̃(F(x), G(y)) = P[X ≤ x or Y ≤ y],

³ If the marginals are not necessarily continuous, C is some arbitrary copula in 𝒞_H.


whereas the co-copula expresses

C∗(F̄(x), Ḡ(y)) = P[X > x or Y > y].

In the multivariate case, the situation is slightly more complicated. Suppose (X_1, ..., X_n) is a random vector with joint distribution function H and marginal distribution functions F_i; H̄ and F̄_i being their survival counterparts. Moreover, if C is the corresponding copula with k-dimensional margins C_{j_1,...,j_k} (recall Notation 2.2.1), we have, according to the sieve formula (see Billingsley, 1995, p. 24),

P[X_1 > x_1, ..., X_n > x_n] = 1 + Σ_{k=1}^{n} Σ_{1≤j_1<···<j_k≤n} (−1)^k P[X_{j_i} ≤ x_{j_i} | 1 ≤ i ≤ k]

= 1 − n + Σ_{i=1}^{n} F̄_i(x_i) + Σ_{k=2}^{n} Σ_{1≤j_1<···<j_k≤n} (−1)^k C_{j_1,...,j_k}(1 − F̄_{j_i}(x_{j_i}) | 1 ≤ i ≤ k).

With this in mind, a survival copula can now be defined similarly as in the bivariate case.

Definition 3.4.3. Let C be an n-variate copula and C_{j_1,...,j_k} its k-dimensional margins. The function Ĉ from I^n to I defined by

(3.8) Ĉ(u_1, ..., u_n) = 1 − n + Σ_{i=1}^{n} u_i + Σ_{k=2}^{n} Σ_{1≤j_1<···<j_k≤n} (−1)^k C_{j_1,...,j_k}(1 − u_{j_i} | 1 ≤ i ≤ k)

is called a survival copula.

As in the bivariate case, this function satisfies

Ĉ(F̄_1(x_1), ..., F̄_n(x_n)) = H̄(x_1, ..., x_n),

which, in case the margins are uniform with joint distribution function C and joint survival function C̄ (restricted to I^n), turns into

Ĉ(1 − u_1, ..., 1 − u_n) = P[U_1 > u_1, ..., U_n > u_n] =: C̄(u_1, ..., u_n).

As an easy consequence, we get that Ĉ itself is a copula.
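For n = 2, the three functions just defined are immediate to implement; a minimal sketch (ours), checked on the independence copula Π(u, v) = uv, whose survival copula is again Π:

```python
# Bivariate survival copula, dual and co-copula derived from a copula C,
# as in (3.7) and Definition 3.4.2 (a small sketch; helper names are ours).

def survival_copula(C):
    return lambda u, v: u + v - 1.0 + C(1.0 - u, 1.0 - v)

def dual(C):
    return lambda u, v: u + v - C(u, v)

def co_copula(C):
    return lambda u, v: 1.0 - C(1.0 - u, 1.0 - v)

Pi = lambda u, v: u * v                  # independence copula
W = lambda u, v: max(u + v - 1.0, 0.0)   # lower Frechet-Hoeffding bound

# The independence copula is its own survival copula:
#   u + v - 1 + (1-u)(1-v) = uv.
Pi_hat = survival_copula(Pi)
```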

In the univariate case, survival functions can be used to express symmetry. For if we call a real random variable X satisfying

X − a ∼ a − X

symmetric about a point a in ℝ, we clearly have

X symmetric about a ⇔ F̄(a − x) = F(a + x) for all x ∈ ℝ.

In the multivariate case, however, the understanding of symmetry is far more ambiguous and can be conceived in a number of ways. The most common approaches are those of marginal symmetry, joint symmetry and radial symmetry, all of which can be viewed as generalizations of the univariate concept. Joint symmetry is the strongest one, and since jointly symmetric random vectors must be uncorrelated when the required second-order moments exist (see Randles and Wolfe, 1979, Section 1.3), joint symmetry seems to be too strong a property. Marginal symmetry, on the other hand, seems to be too weak, as there exist marginally symmetric distributions which do not agree with an intuitive understanding of symmetry (see Schmitz, 2003, p. 27). Radial symmetry lies between these two, and the condition required involves both the joint distribution and survival functions in a manner analogous to the univariate case. Consequently, we will focus on this concept only and refer to Nelsen (1999, Section 2.7) for the other two as well as for further literature on this subject.

Definition 3.4.4. Let X = (X_1, ..., X_n) be a real-valued random vector and a = (a_1, ..., a_n) a point in ℝ^n. The vector X is called radially symmetric about a if (a − X) and (X − a) have the same distribution, i.e.

(X_1 − a_1, ..., X_n − a_n) ∼ (a_1 − X_1, ..., a_n − X_n).

As stated above, this requirement can be expressed using the joint distribution and the joint survivalfunctions.

Theorem 3.4.1. In the situation of the above definition, let H denote the joint distribution function and H̄ the joint survival function of X, respectively. Then X is radially symmetric about a if and only if

(3.9) H(a_1 + x_1, ..., a_n + x_n) = H̄(a_1 − x_1, ..., a_n − x_n) for all (x_1, ..., x_n) ∈ ℝ^n.

This theorem has an interesting corollary noted by Schmitz (2003) that allows us to check the radialsymmetry using copulas and univariate margins only.

Corollary 3.4.1. A random vector X = (X_1, ..., X_n) with copula C and univariate survival functions F̄_i, 1 ≤ i ≤ n, is radially symmetric about a = (a_1, ..., a_n) if and only if

1. all the X_i are symmetric about a_i, i.e.

∀ 1 ≤ i ≤ n : F̄_i(a_i − x_i) = F_i(a_i + x_i) for all x_i ∈ ℝ,

and

2. the survival copula and the underlying copula coincide, i.e.

(3.10) Ĉ(u) = C(u) for all u ∈ ran F_1 × ··· × ran F_n.

Consequently, if U = (U_1, ..., U_n) is a random vector with uniform margins and a copula C satisfying (3.10), then U is radially symmetric about (1/2, ..., 1/2).

Proof. See Schmitz (2003), Corollary 2.41. The last statement concerning U easily follows from the fact that the uniform distribution is always symmetric about 1/2.

In the light of this corollary, we introduce the following:

Definition 3.4.5. Let C be an n-copula in 𝒞_n and Ĉ its survival counterpart. Then C is called radially symmetric iff

Ĉ(u) = C(u) for all u ∈ I^n.
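This condition can be checked numerically on a grid. As an illustration (using two well-known Archimedean families that are not introduced in this chapter), the Frank copula is radially symmetric, whereas the Clayton copula is not:

```python
import math

# Numerical check of Definition 3.4.5 on two well-known Archimedean
# families (used here purely for illustration): the Frank copula is
# radially symmetric, the Clayton copula is not.

def frank(u, v, theta=3.0):
    e = math.expm1(-theta)
    return -math.log1p(math.expm1(-theta * u) * math.expm1(-theta * v) / e) / theta

def clayton(u, v, theta=2.0):
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def survival(C):
    return lambda u, v: u + v - 1.0 + C(1.0 - u, 1.0 - v)

grid = [i / 20 for i in range(1, 20)]
frank_gap = max(abs(survival(frank)(u, v) - frank(u, v))
                for u in grid for v in grid)
clayton_gap = max(abs(survival(clayton)(u, v) - clayton(u, v))
                  for u in grid for v in grid)
```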

3.5 Examples of Copulas

Throughout this section, we present some families of copulas as well as construction methods which have been proposed and widely used in the literature. Since we will focus on negative dependence later on, our examples are selected from this point of view. Further copula families can be found e.g. in the introductory books by Nelsen (1999), Joe (1997), Embrechts et al. (2002) or Kotz and Mari (2001).


3.5.1 Spherical and Elliptical Copulas

Copulas corresponding to spherical distributions are called spherical copulas. Spherical distributions form a family of distributions of symmetric and uncorrelated random vectors with mean zero and are hence, from this point of view, generalizations of the multivariate standard normal distribution. They are usually defined through the typical form of their characteristic function.

Definition 3.5.1. If the characteristic function of an n-dimensional random vector X has the form

(3.11) ψ(t) = φ(‖t‖²)

for some function φ : ℝ₊ → ℝ, then X is distributed according to a spherical distribution with characteristic generator φ,

X ∼ S_n(φ).

Example 3.1. A most representative example of a spherical distribution is the uniform distribution on the unit sphere, as can easily be seen from the form of its characteristic function. In the bivariate case, this distribution is also called "the circular uniform distribution". The copula of this distribution is given by (see Nelsen, 1999, Section 3.2.1):

(3.12) C_circ(u, v) =
  M(u, v),            if |u − v| > 1/2,
  W(u, v),            if |u + v − 1| > 1/2,
  (2u + 2v − 1)/4,    otherwise.

As can readily be seen from this expression, the copula is singular and thus does not possess a density.


Figure 3.1: Support of the copula of the circular uniform distribution.

Its support is a diamond inside the unit square. Moreover, C_circ is radially symmetric and satisfies C_circ(u, v) = C_circ(v, u) for any u and v in [0, 1].
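The piecewise formula (3.12) is straightforward to evaluate; a small sketch (ours) together with the Fréchet–Hoeffding bounds M and W:

```python
# Piecewise evaluation of the copula (3.12) of the circular uniform
# distribution, together with the Frechet-Hoeffding bounds M and W.

def M(u, v): return min(u, v)
def W(u, v): return max(u + v - 1.0, 0.0)

def C_circ(u, v):
    if abs(u - v) > 0.5:
        return M(u, v)
    if abs(u + v - 1.0) > 0.5:
        return W(u, v)
    return (2.0 * u + 2.0 * v - 1.0) / 4.0

# In the middle region C_circ agrees with the independence copula on the
# main diagonal point (1/2, 1/2): C_circ(1/2, 1/2) = 1/4.
```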

The following theorem gives some basic properties and characteristics of spherical distributions.

Theorem 3.5.1. (Fang et al. (1990, Theorems 2.1, 2.5 and 2.2) and Fang and Zhang (1990, Theorem 2.5.5)) Let X be an n-dimensional random vector with a spherical distribution. Then

1. X has a stochastic representation X ∼ R u^(n), where R ≥ 0 is a random variable and u^(n) a random vector distributed uniformly on the unit sphere in ℝ^n and independent of R;


2. X ∼ ΓX for all Γ ∈ O(n), O(n) being the set of n × n orthogonal matrices;

3. if X has a density f, then there exists a real function g such that

f(x) = g(‖x‖²) for all x ∈ ℝ^n.

Although uncorrelated, spherically distributed random vectors need not be independent. In fact, the multivariate standard normal distribution is the only one in this class whose components are independent (c.f. Fang and Zhang, 1990, Theorem 2.7.2).
With the above theorem, spherical distributions can be interpreted as mixtures of uniform distributions on spheres of differing radii. This is also reflected by the density, which, if it exists, is constant on these surfaces. We close our survey of spherical distributions with the following corollary:

Corollary 3.5.1. (Fang et al., 1990, Theorem 2.3) Suppose X ∼ R u^(n) ∼ S_n(φ) and P[X = 0] = 0. Then ‖X‖ ∼ R and X/‖X‖ ∼ u^(n). Moreover, ‖X‖ and X/‖X‖ are independent.

The distribution of X/‖X‖ does not depend on X, and thus it is often useful to think of X as having a multivariate standard normal distribution.
The family of elliptical distributions is a generalization of the multivariate normal distribution with mean µ and covariance Σ. Since the latter is an affine transformation of the multivariate standard normal, elliptical distributions are defined as affine transformations of spherical distributions.

Definition 3.5.2. If the characteristic function of an n-dimensional random vector X has the form exp(ı t^⊤ µ) φ(t^⊤ Σ t), where µ is an n × 1 vector and Σ a positive definite n × n matrix, we say that X is distributed according to an elliptical distribution with parameters µ, Σ and φ,

(3.13) X ∼ E_n(µ, Σ, φ).

Copulas corresponding to elliptical distributions are called elliptical copulas.

It should be mentioned that neither of the parameters Σ and φ is unique. But if X ∼ E_n(µ, Σ, φ) ∼ E_n(µ̃, Σ̃, φ̃), then µ = µ̃ and there exists a positive constant c such that Σ̃ = cΣ and φ̃(u) = φ(u/c) for all u (c.f. Fang et al., 1990, Theorem 2.15). Thus the parameter Σ can always be chosen⁴ in such a way that it corresponds to the covariance matrix of X. In the following theorem, some of the basic properties of elliptical distributions are given (for further details and proofs see Fang and Zhang (1990, Section 2.5) or Fang et al. (1990)):

Theorem 3.5.2. Let X be an n-dimensional random vector with X ∼ E_n(µ, Σ, φ). Then

1. X has a stochastic representation X ∼ µ + R A u^(n), where AA^⊤ = Σ, R ≥ 0 is a random variable and u^(n) a random vector distributed uniformly on the unit sphere in ℝ^n and independent of R;

2. if R u^(n) has a density f(x) = g(‖x‖²) and Σ is strictly positive definite, then X has a density h given by

h(x) = (1/√(det Σ)) g((x − µ)^⊤ Σ^(−1) (x − µ)).

The density is hence constant on ellipsoids.

3. Let B be a k × n matrix and b ∈ ℝ^k. Then

b + BX ∼ E_k(b + Bµ, BΣB^⊤, φ).

Thus any affine transformation of an elliptically distributed random vector is again elliptically distributed with the same characteristic generator φ.

⁴ See Fang et al. (1990, Section 2.5) for further details.


Remark 3.5.1. As is also well known (see e.g. Fang and Zhang (1990) or Fang et al. (1990)), conditionaldistributions of elliptical distributions are again elliptical, but, in general, with different generators.

With this theorem, it is clear that all univariate margins of an elliptical distribution have the same generator, which, together with the mean vector and the covariance matrix, uniquely determines the entire distribution. This is one of the main reasons why elliptical distributions are so popular in practice. On the other hand, there exist various situations where having margins of the same type turns out to be rather restrictive (see e.g. Embrechts et al. (2001) and Embrechts et al. (2002)). In such situations a compromise modeling using the copula approach may be helpful: fitting different types of margins into an elliptical copula may relax the model restrictions and yet preserve the dependence structure of the elliptical distributions.
However reasonable, this approach again has its drawbacks: in general, elliptical copulas have to be determined directly from (3.2) of Sklar's theorem, i.e. from

C(u_1, ..., u_n) = H(F_1^(−1)(u_1), F_2^(−1)(u_2), ..., F_n^(−1)(u_n)),

which in most cases does not lead to a closed and handy expression. Here we present two of the most popular members of the family of elliptical copulas.

Example 3.2. The Gaussian or normal copula is given by

(3.14) C^Ga_Σ(u_1, ..., u_n) = ∫_{−∞}^{Φ^(−1)(u_1)} ··· ∫_{−∞}^{Φ^(−1)(u_n)} [1/√((2π)^n det Σ)] exp(−(1/2) t^⊤ Σ^(−1) t) dt_1 ··· dt_n,

where Σ is a positive definite correlation matrix. The Gaussian copula emerges from (3.2) by using the multivariate normal distribution with correlation matrix Σ and N(0, 1)-distributed margins. In the bivariate case, (3.14) turns into

(3.15) C^Ga_ρ(u_1, u_2) = ∫_{−∞}^{Φ^(−1)(u_1)} ∫_{−∞}^{Φ^(−1)(u_2)} [1/(2π√(1 − ρ²))] exp(−(s² − 2ρst + t²)/(2(1 − ρ²))) ds dt,

where −1 < ρ < 1 is the correlation coefficient of the bivariate normal distribution. With Corollary 3.1.1, the Gaussian copula can also be re-written in the following way:

(3.16) C^Ga_ρ(u_1, u_2) = ∫_0^{u_1} ∫_0^{u_2} [1/√(1 − ρ²)] exp(−ρ(ρΦ^(−1)(s)² − 2Φ^(−1)(s)Φ^(−1)(t) + ρΦ^(−1)(t)²)/(2(1 − ρ²))) ds dt.

If ρ tends to ±1, then C^Ga_ρ converges to the Fréchet–Hoeffding bounds, i.e. to M and W, respectively (c.f. Johnson and Kotz, 1972, Chapter 36, Section 4).


Figure 3.2: Contour plots of the Gaussian copula densities for ρ = 0.95, 0.6, 0.1 (top row) and ρ = −0.1, −0.6, −0.95 (bottom row).

For an algorithm for random variate generation, see for example Embrechts et al. (2001).
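The usual generator referred to there draws a multivariate normal vector via a Cholesky factor of Σ and maps each component through Φ; a standard-library sketch (ours, with a naive Cholesky routine for illustration):

```python
import math
import random
from statistics import NormalDist

# Sketch of the usual random variate generator for the Gaussian copula
# C^Ga_Sigma: draw Z ~ N(0, Sigma) via a Cholesky factor A of Sigma and
# return (Phi(Z_1), ..., Phi(Z_n)).

def cholesky(sigma):
    """Lower-triangular A with A A^T = sigma (naive, for illustration)."""
    n = len(sigma)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(a[i][k] * a[j][k] for k in range(j))
            if i == j:
                a[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                a[i][j] = (sigma[i][j] - s) / a[j][j]
    return a

def gaussian_copula_sample(sigma, rng):
    n = len(sigma)
    a = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [sum(a[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    phi = NormalDist().cdf
    return [phi(xi) for xi in x]

rng = random.Random(42)
sigma = [[1.0, 0.7], [0.7, 1.0]]
sample = gaussian_copula_sample(sigma, rng)
```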

Example 3.3. Another well-known copula constructed using (3.2) is the t-copula,

(3.17) C^t_{ν,R}(u_1, ..., u_n) = t^n_{ν,R}(t_ν^(−1)(u_1), ..., t_ν^(−1)(u_n)),

where R denotes the (positive definite) correlation matrix (for ν > 2, and the shape parameter otherwise) and t^n_{ν,R} the joint distribution function of an n-variate t_ν-distribution with mean 0, ν degrees of freedom and t_ν-distributed margins. In the bivariate case the t-copula can be rewritten as

(3.18) C^t_{ν,ρ}(u_1, u_2) = ∫_{−∞}^{t_ν^(−1)(u_1)} ∫_{−∞}^{t_ν^(−1)(u_2)} [1/(2π√(1 − ρ²))] (1 + (s² − 2ρst + t²)/(ν(1 − ρ²)))^(−(ν+2)/2) ds dt,

where again the parameter −1 < ρ < 1 is the usual linear correlation coefficient if ν > 2. Again, if ρ passes to ±1, the t-copula converges to M and W, respectively (c.f. Johnson and Kotz (1972, Chapter 37)). For further details as well as an algorithm for random variate generation see e.g. Embrechts et al. (2001).

3.5.2 Archimedean Copulas

In this section, we introduce the family of Archimedean copulas. The previously discussed ellipticalcopulas result from elliptical distributions, which have a pictorial geometric and stochastic interpre-tation yet no close algebraic expression. With Archimedean copulas, the situation is opposite. SinceC(u, v) can also be viewed as a binary operation on the unit square, it has its own algebraic meaning.As such, copulas also appeared in connection with probabilistic metric spaces introduced by Menger(1942). In such spaces the triangular inequality is described using the so called t-norms, which turnedout to be associative if and only if they are Archimedean copulas (for an introduction on this nowadays


40 CHAPTER 3. COPULAS IN STATISTICAL SETTINGS

scarcely studied subject see Schweizer (1991) and further literature given therein).

In the following, we will define Archimedean copulas, starting with the bivariate case, and discuss some of their basic properties. As we will see, although members of this class do not possess a stochastic representation in general, they have convenient expressions, and many of their properties can be studied with greater ease than is the case in general. Moreover, numerous parametric subfamilies have been considered so far. For our purposes, families which have the Frechet-Hoeffding lower bound as a limiting case will be of particular interest. Hence, we will restrict the examples to those which comply with this condition.

Definition and Theorem 3.5.1. (Nelsen, 1999, Definition 4.4.1 and Theorem 4.1.4) Let ϕ be a continuous, strictly decreasing function from [0, 1] to [0, ∞] such that ϕ(1) = 0. The pseudo-inverse of ϕ is the function $\varphi^{[-1]} : [0, \infty] \to [0, 1]$ given by

(3.19) $\varphi^{[-1]}(t) = \begin{cases} \varphi^{-1}(t), & 0 \le t \le \varphi(0), \\ 0, & \varphi(0) \le t \le \infty. \end{cases}$

The function defined according to

(3.20) $C(u, v) = \varphi^{[-1]}\bigl(\varphi(u) + \varphi(v)\bigr)$

is a copula if and only if ϕ is convex. C is then called an Archimedean copula with generator ϕ. If ϕ(0) = ∞, the generator is called strict and the copula a strict Archimedean copula.
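Construction (3.20), together with the clamping in (3.19), translates into code almost verbatim. The sketch below (function names are ours) instantiates it with the Clayton generator from Example 3.5, which is strict for θ > 0 so that the pseudo-inverse coincides with the ordinary inverse:

```python
def archimedean_copula(phi, phi_inv, phi_at_zero=float("inf")):
    """Build C(u, v) = phi^[-1](phi(u) + phi(v)) from a generator phi.

    phi must be continuous, strictly decreasing and convex with phi(1) = 0;
    the pseudo-inverse (3.19) maps arguments beyond phi(0) to 0.
    """
    def pseudo_inv(t):
        return phi_inv(t) if t <= phi_at_zero else 0.0
    return lambda u, v: pseudo_inv(phi(u) + phi(v))

# Clayton generator with theta = 2 (strict, since phi(t) -> infinity as t -> 0):
theta = 2.0
clayton = archimedean_copula(
    phi=lambda t: (t ** -theta - 1.0) / theta,
    phi_inv=lambda t: (1.0 + theta * t) ** (-1.0 / theta),
)
```

Since any positive multiple of a generator generates the same copula, rescaling `phi` (with `phi_inv` adjusted accordingly) leaves `clayton` unchanged.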

As in the case of elliptical copulas, the generator of an Archimedean copula is not unique. In fact, for any positive constant c, cϕ is also a generator of C. The following theorem summarizes some of the typical properties of members of this copula class.

Theorem 3.5.3. (Nelsen, 1999, Theorem 4.1.5) Let C be an Archimedean copula with generator ϕ. Then

1. C is symmetric ; i.e., C(u, v) = C(v, u) for all u, v in the unit square;

2. C is associative ; i.e., C(C(u, v), w) = C(u, C(v, w)) for all u, v and w in the unit square;

3. the diagonal section of C satisfies δC(u) < u for all u in (0, 1).

Remark 3.5.2. It can also be shown that, conversely, any associative copula satisfying δC(u) < u for all u in (0, 1) is Archimedean (see Ling (1965)).

In the next chapters, two aspects will play an important role: concordance ordering and the Frechet-Hoeffding bounds as limiting cases. Therefore, we quote two theorems concerning this subject.

Theorem 3.5.4. (Nelsen, 1999, Theorem 4.4.2 and Corollaries 4.4.5 and 4.4.6) Let C1 and C2 be Archimedean copulas generated, respectively, by ϕ1 and ϕ2. Then

1. $C_1 \prec C_2$ if and only if $\varphi_1 \circ \varphi_2^{[-1]}$ is subadditive, i.e.,

$\varphi_1 \circ \varphi_2^{[-1]}(x + y) \le \varphi_1 \circ \varphi_2^{[-1]}(x) + \varphi_1 \circ \varphi_2^{[-1]}(y)$ for all $x, y \in [0, \infty)$;

2. if $\varphi_1/\varphi_2$ is nondecreasing on (0, 1), then $C_1 \prec C_2$;

3. if both generators are continuously differentiable on (0, 1) and $\varphi_1'/\varphi_2'$ is nondecreasing there, then $C_1 \prec C_2$.


3.5. EXAMPLES OF COPULAS 41

Theorem 3.5.5. (Nelsen, 1999, Theorems 4.4.7 and 4.4.8) Let Θ be some subset of $\mathbb{R}$ and let $\{C_\theta \mid \theta \in \Theta\}$ be a family of Archimedean copulas with differentiable generators $\varphi_\theta$. Then, for any a in the closure of Θ,

1. $\lim_{\theta \to a} C_\theta = W$ if and only if $\lim_{\theta \to a} \varphi_\theta(s)/\varphi_\theta'(t) = s - 1$ for all $s, t \in (0, 1)$;

2. $\lim_{\theta \to a} C_\theta = \Pi$ if and only if $\lim_{\theta \to a} \varphi_\theta(s)/\varphi_\theta'(t) = t \ln s$ for all $s, t \in (0, 1)$;

3. $\lim_{\theta \to a} C_\theta = M$ if and only if $\lim_{\theta \to a} \varphi_\theta(t)/\varphi_\theta'(t) = 0$ for all $t \in (0, 1)$.
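To illustrate part 3, take the generator of the Clayton family from Example 3.5 below, $\varphi_\theta(t) = (t^{-\theta} - 1)/\theta$; a short computation gives

```latex
\varphi_\theta'(t) = -t^{-\theta-1},
\qquad
\frac{\varphi_\theta(t)}{\varphi_\theta'(t)}
  = -\frac{t^{-\theta} - 1}{\theta\, t^{-\theta-1}}
  = \frac{t^{\theta+1} - t}{\theta}
  \;\longrightarrow\; 0
  \quad (\theta \to \infty),\qquad t \in (0, 1),
```

so that the Clayton family converges to M as θ → ∞, in accordance with its limiting cases (3.24).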

In the following, we present two examples of Archimedean copulas which will be needed in furtherchapters: the Frank family and the Clayton family.

Example 3.4. Frank family. The Frank family of copulas, discovered by Frank (1979) when investigating the algebraic structure of binary operations on distribution functions (cf. also Frank (1991)), is given by

(3.21) $C^{Fra}_\theta(u, v) = -\frac{1}{\theta} \ln\left(1 + \frac{(e^{-\theta u} - 1)(e^{-\theta v} - 1)}{e^{-\theta} - 1}\right), \quad \theta \in (-\infty, \infty) \setminus \{0\},$

with pointwise limits of $C^{Fra}_\theta$ for θ converging to −∞, 0 and ∞:

(3.22) $C^{Fra}_{-\infty} = W, \quad C^{Fra}_0 = \Pi, \quad C^{Fra}_\infty = M.$

This family, generated by

$\varphi^{Fra}_\theta(t) = -\ln \frac{e^{-\theta t} - 1}{e^{-\theta} - 1},$

is also positively ordered, i.e. $C^{Fra}_{\theta_1} \prec C^{Fra}_{\theta_2}$ iff $\theta_1 < \theta_2$, and contains the only Archimedean copulas which are radially symmetric (cf. Definition 3.4.5). Frank copulas are also absolutely continuous; their density contour plots take the following shape:


Figure 3.3: Contour plots of the Frank copula densities for θ = −10, −4 and −0.5 (top row) and θ = 0.5, 4 and 10 (bottom row).

Example 3.5. Clayton family

The Clayton family, introduced by Clayton (1978), is given by

(3.23) $C^{Cla}_\theta(u, v) = \max\bigl(\bigl[u^{-\theta} + v^{-\theta} - 1\bigr]^{-1/\theta}, 0\bigr), \quad \theta \in [-1, \infty) \setminus \{0\}.$

The limiting cases are

(3.24) $C^{Cla}_{-1} = W, \quad C^{Cla}_0 = \Pi, \quad C^{Cla}_\infty = M.$

The Clayton copulas, generated by

$\varphi^{Cla}_\theta(t) = \frac{1}{\theta}\bigl(t^{-\theta} - 1\bigr),$

are also absolutely continuous (cf. Nelsen (1999, Example 4.5)) and positively ordered (cf. Nelsen (1999, Example 4.13)).

Multivariate Extensions

In the multivariate case, there is no unique way to extend the class of Archimedean copulas. The commonly considered construction uses the same idea as in the bivariate case, i.e.

(3.25) $C^n(\mathbf{u}) = \varphi^{[-1]}\bigl(\varphi(u_1) + \dots + \varphi(u_n)\bigr).$

Such functions are serial iterates of the bivariate copulas, since

$C^n(\mathbf{u}) = C\bigl(C^{n-1}(u_1, \dots, u_{n-1}), u_n\bigr).$


Unfortunately, this procedure does not always work; recall for instance the function $W^n$, which fails to be a copula for n ≥ 3 but can be constructed using the above technique with ϕ(t) = 1 − t. Therefore, we have to focus on appropriate requirements on the generator ϕ in order to obtain a multivariate copula from (3.25). One possible way involves the derivatives of the pseudo-inverse of the generator.

Theorem 3.5.6. Let ϕ be a continuous, strictly decreasing function from I to [0, ∞] such that ϕ(0) = ∞ and ϕ(1) = 0, and let $\varphi^{-1}$ denote the inverse of ϕ. If $C^n$ is the function from $I^n$ to I given by (3.25), then $C^n$ is an n-copula for all n ≥ 2 if and only if $\varphi^{-1}$ is completely monotonic on [0, ∞), i.e. if it is continuous there and has derivatives of all orders which alternate in sign:

$(-1)^k \frac{d^k}{dt^k} \varphi^{-1}(t) \ge 0$

for all t in (0, ∞) and all $k \in \mathbb{N}_0$.

Example 3.6. If θ is restricted to (0, ∞), the generator of the Frank family is completely monotonic (cf. Nelsen, 1999, Example 4.22) and the multivariate extension is given by

(3.26) $C^{Fra}_\theta(\mathbf{u}) = -\frac{1}{\theta} \ln\left(1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1) \cdots (e^{-\theta u_n} - 1)}{(e^{-\theta} - 1)^{n-1}}\right), \quad \theta > 0.$

Example 3.7. The generator of the Clayton family is strict for θ > 0 and completely monotonic on[0,∞) (cf. Nelsen (1999, Example 4.21.)). Hence, the n-dimensional Clayton family is given by

(3.27) $C^{Cla}_\theta(\mathbf{u}) = \bigl(u_1^{-\theta} + u_2^{-\theta} + \dots + u_n^{-\theta} - n + 1\bigr)^{-1/\theta}, \quad \theta > 0.$
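The n-dimensional Clayton copula (3.27) admits a well-known sampling scheme via the Marshall-Olkin frailty construction: if V is Gamma(1/θ, 1)-distributed and $E_1, \dots, E_n$ are independent unit exponentials, then $U_i = (1 + E_i/V)^{-1/\theta}$ has joint distribution function (3.27). A minimal sketch (the function name is ours):

```python
import random

def sample_clayton(theta, n, size, seed=2):
    """Draw vectors (U_1, ..., U_n) from the n-dimensional Clayton
    copula (3.27), theta > 0, via the Marshall-Olkin frailty method:
    V ~ Gamma(1/theta, 1), E_i ~ Exp(1), U_i = (1 + E_i/V)^(-1/theta).
    """
    rng = random.Random(seed)
    out = []
    for _ in range(size):
        v = rng.gammavariate(1.0 / theta, 1.0)
        out.append(tuple(
            (1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
            for _ in range(n)
        ))
    return out

clayton_sample = sample_clayton(theta=2.0, n=3, size=800)
```

The construction works because $(1+s)^{-1/\theta}$ is the Laplace transform of the Gamma(1/θ, 1) distribution and, at the same time, the inverse of the (rescaled) Clayton generator $t^{-\theta} - 1$.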

In both examples, the multivariate extensions lead only to copulas which are greater than Π. The following corollary shows that this is unfortunately always the case with copulas obtained from (3.25).

Corollary 3.5.2. If the inverse $\varphi^{-1}$ of a strict generator ϕ of an Archimedean copula C is completely monotonic, then $C \succ \Pi$.

Proof. See Nelsen (1999, Corollary 4.6.3.).

However simple, the extension procedure (3.25) has its drawbacks. By means of the above corollary, it yields only families whose members are greater than the independence copula and hence cannot describe negative dependence (cf. Chapter 4). Furthermore, as can immediately be seen from (3.25), all k-margins of the extended Archimedean copulas are also Archimedean with the same generator and hence identical. In addition, the extension does not bring in any more parameters; the number of parameters involved does not increase with n (in fact, such families possess in general no more than one or two parameters) and the space left for modeling becomes rather tight.

3.5.3 Algebraically Constructed Copulas

In this section, we present a family of copulas which is based on the idea of measuring "dependence" in contingency tables using the cross-product, or odds, ratio. Random vectors (X, Y) whose distributions belong to the so-called Plackett family of bivariate distributions introduced by Plackett (1965) distinguish themselves by an odds ratio,

$\frac{\mathrm{P}[X \le x, Y \le y]\,\mathrm{P}[X > x, Y > y]}{\mathrm{P}[X \le x, Y > y]\,\mathrm{P}[X > x, Y \le y]},$


independent of (x, y). Letting θ denote this odds ratio, this characteristic leads to the Plackett family of copulas:

(3.28) $C^{Pla}_\theta(u, v) = \frac{\bigl[1 + (\theta - 1)(u + v)\bigr] - \sqrt{\bigl[1 + (\theta - 1)(u + v)\bigr]^2 - 4uv\theta(\theta - 1)}}{2(\theta - 1)}$ for $\theta > 0$, $\theta \ne 1$,

and

$C^{Pla}_\theta(u, v) = uv$ for $\theta = 1$.

As θ tends to 0 and ∞ respectively, the limits of the Plackett family are the Frechet-Hoeffding bounds W and M respectively. Moreover, this family is also positively ordered and radially symmetric. Since the odds ratio can be estimated directly from the data, this family has been widely used in practice and discussed in the literature (for further details, see Nelsen (1999, Section 3.3.1) and references given therein). With (3.28) it is straightforward to see that members of this family are absolutely continuous. The following graphics illustrate the contour plots of the Plackett copula densities for different values of θ:

Figure 3.4: Contour plots of the Plackett copula densities for θ = 0.1, 0.2 and 0.5 (top row) and θ = 1.5, 4 and 20 (bottom row).

3.5.4 Shuffles of M

We conclude the examples of copulas by mentioning one more construction method, which is of greattheoretical use. The so called shuffles of M were introduced by Mikusinski et al. (1991) as copulasconstructed in four steps:


”...First, start with the mass distribution of M, which, recall, has its unit mass spreaduniformly on that diagonal of the unit square having positive slope. Second, cut the unitsquare vertically into a finite number of strips. Third, shuffle the strips with perhaps someof them flipped around their vertical axes of symmetry. Fourth, reassemble the strips toreform the unit square. This construction produces the mass distribution of a copula calleda shuffle of M...(Mikusinski et al., 1991, pp. 98)”

Definition 3.5.3. (Nelsen, 1999, Section 3.2.3) Let n be a positive integer and $J = \{J_1, J_2, \dots, J_n\}$ a finite partition of [0, 1] into n closed subintervals. Furthermore, let $\pi = (\pi(1), \pi(2), \dots, \pi(n))$ denote a permutation of $S_n = \{1, 2, \dots, n\}$ and $\omega : S_n \to \{-1, 1\}$ a function where ω(i) is −1 or 1 according to whether or not the strip $J_i \times [0, 1]$ is flipped. The resulting shuffle of M is denoted by $M(n, J, \pi, \omega)$. A shuffle of M with ω ≡ 1 is called a straight shuffle and a shuffle of M with ω ≡ −1 is called a flipped shuffle. If the width of each $J_i$ equals 1/n, the shuffle is called a regular shuffle.

The following graphic illustrates a regular shuffle of M,

$C = M\bigl(6, \{[0, 1/6], [1/6, 2/6], \dots, [5/6, 1]\}, (2, 4, 1, 6, 3, 5), (-1, 1, 1, 1, -1, 1)\bigr).$

Figure 3.5: A regular shuffle of M.
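The four construction steps can be encoded as a piecewise-linear map T on [0, 1] whose graph is the support of the shuffle; a pair (U, T(U)) with U uniform then has the shuffle as its copula. The sketch below handles regular shuffles only, and its indexing convention (slot j of the reassembled square holds original strip perm[j], with flips indexed by slot) is one of several possible:

```python
import random

def regular_shuffle_map(perm, flips):
    """Return the support map T of a regular shuffle of M on n strips.

    Convention (ours): slot j of the reassembled unit square holds the
    original strip perm[j]; flips[j] = -1 flips that strip around its
    vertical axis of symmetry.
    """
    n = len(perm)
    def T(u):
        j = min(int(u * n), n - 1)   # slot containing u
        i = perm[j] - 1              # original strip, 0-based
        t = u - j / n                # offset within the slot
        return i / n + t if flips[j] == 1 else (i + 1) / n - t
    return T

# The regular shuffle of Figure 3.5:
T = regular_shuffle_map(perm=(2, 4, 1, 6, 3, 5), flips=(-1, 1, 1, 1, -1, 1))
rng = random.Random(0)
shuffle_pairs = [(u, T(u)) for u in (rng.random() for _ in range(1000))]
```

Because T is a measure-preserving bijection built from pieces of slope ±1, both coordinates of (U, T(U)) are uniform, as a copula requires.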

One of the most important results concerning shuffles of M is the following one, noted by Mikusinski et al. (1991):

Theorem 3.5.7. (Mikusinski et al., 1991, Theorem 3.1) The shuffles of M are dense in the set of all copulas endowed with the supremum norm, i.e., for any ε > 0 and any copula C there exists a shuffle of M, denoted by $C_\varepsilon$, such that

(3.29) $\sup_{u, v \in [0, 1]} |C_\varepsilon(u, v) - C(u, v)| < \varepsilon.$

Remark 3.5.3. In particular, by means of Sklar's Theorem and the above result, any bivariate joint distribution function of mutually independent random variables can be approximated arbitrarily closely by a joint distribution function of completely dependent$^5$ random variables with the same marginals. Hence, it would be experimentally impossible to distinguish one situation from the other (cf. Nelsen (1999, Section 3.2.3) and Mikusinski et al. (1991, Section 3)).

5Complete or perfect dependence will be handled in the upcoming Section 4.1.


Chapter 4

Dependence Concepts

After the original paper by Sklar (1959) had been published, it took a while until the statistical community became aware of possible applications of copulas. Things first began to change in the mid-seventies, when Berthold Schweizer came across Renyi's axiomatic definition of a measure of dependence. It struck him then, as is common knowledge today, that copulas are a very powerful tool for describing dependence between random variables. Together with Wolff he obtained measures of dependence based on copulas and the Lp-distance. Thereafter the interest in copulas rose, as various scholars discovered that many dependence concepts can be expressed in terms of copulas alone. For example, it was noted by Nelsen (cf. Nelsen (1991) and further literature given therein) that, for continuous random variables, the well-known measures of association, Kendall's tau and Spearman's rho, are functions of the corresponding copula alone.

The aim of this chapter is to give a brief introduction to the interaction between copulas and dependence structures. We start with a description of perfect dependence, i.e. a counterpart to independence, and proceed with nonparametric concepts such as quadrant and tail dependence, and measures of association. However, there are still two major difficulties: for one, the relationship between copulas and dependence mostly holds if and only if the marginals are continuous; secondly, the results consider predominantly the bivariate case, as possible multivariate extensions are either not clear or not possible at all. This fact will also affect the contents of this chapter. Where possible, we formulate the dependence concepts in n dimensions, but, especially when considering measures of association, we restrict ourselves to the bivariate case and mention the higher dimensions only briefly. Except for the first section, we also assume the marginals to be continuous throughout this chapter, dedicating the upcoming one to the non-continuous case. For further results on the relationship between copulas and dependence, see e.g. Nelsen (1999), Joe (1997), Schweizer (1991) or Embrechts et al. (2002) and the literature given therein.

4.1 Perfect Dependence

If we want to describe or measure dependence between random variables, we should know the mutual counterpart to independence. On the one hand, independence is a clearly defined property and, as is well known, can be characterized in terms of copulas as follows:

Theorem 4.1.1. Let $X = (X_1, \dots, X_n)$ be a random vector. Then $X_1, \dots, X_n$ are independent if and only if the independence copula $\Pi^n$ belongs to $C_X$.


Yet the question remains what is to be expected on the other side. By means of the Frechet-Hoeffding inequality (2.9), we know that copulas (and hence the dependence structure) are bounded by W and M. Thus, the Frechet-Hoeffding bounds seem to provide a natural description of the strongest dependence possible. On the other hand, there exist alternative approaches in the literature (cf. Hoeffding (1942)) considering two random variables to be "perfectly" dependent if and only if one is a.s. a (measurable) function of the other. In this section, we revisit the Frechet-Hoeffding bounds and outline some of the fallacies connected with the rather theoretical alternative approach. As is often the case with copulas, things become more complicated and far less well known once the dimension exceeds two. Therefore, we concentrate heavily on the bivariate case and carry over only a part of the results into the multivariate one.

4.1.1 Bivariate Perfect Dependence

First, we will give a probabilistic interpretation of the Frechet-Hoeffding bounds. To do so, we firstneed to define the concepts involved.

Definition 4.1.1. (Nelsen, 1999, Definition 2.5.1) A subset A of $\mathbb{R}^2$ is called nondecreasing if for any (x, y) and (u, v) in A, x < u implies y ≤ v. It is called nonincreasing if for any (x, y) and (u, v) in A, x < u implies y ≥ v.

Theorem 4.1.2. (Nelsen, 1999, Theorems 2.5.4 and 2.5.5) Let X and Y be random variables with joint distribution function H. Then

1. the Frechet-Hoeffding upper bound M belongs to $C_H$ if and only if the support of H is a nondecreasing subset of $\mathbb{R}^2$;

2. the Frechet-Hoeffding lower bound W is an element of $C_H$ if and only if the support of H is a nonincreasing subset of $\mathbb{R}^2$.

Remark 4.1.1. Under the hypothesis of the above theorem and the assumption that both margins are continuous, the support of H has no horizontal or vertical line segments. Thus the underlying copula is the Frechet-Hoeffding upper bound if and only if X and Y are almost surely increasing functions of each other. In particular, if F and G denote the corresponding marginal distribution functions,

$X = F^{-1} \circ G(Y)$ a.s. and $Y = G^{-1} \circ F(X)$ a.s.

Similarly, if X and Y are continuous, $C_H = W$ if and only if X and Y are a.s. decreasing functions of each other. In particular,

$X = F^{-1} \circ (1 - G)(Y)$ a.s. and $Y = G^{-1} \circ (1 - F)(X)$ a.s.
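These relations can be checked numerically: a comonotonic pair is obtained by feeding one uniform variable into both quantile functions, a countermonotonic pair by feeding Z and 1 − Z. In the sketch below the margins (an Exp(1) law and the law with distribution function G(y) = √y on [0, 1]) are chosen arbitrarily for illustration:

```python
import math
import random

F_inv = lambda p: -math.log(1.0 - p)   # Exp(1) quantile function
F = lambda x: 1.0 - math.exp(-x)       # Exp(1) distribution function
G_inv = lambda p: p * p                # quantile function of G(y) = sqrt(y)

rng = random.Random(3)
z = [rng.random() for _ in range(1000)]

# Comonotonic (copula M) and countermonotonic (copula W) pairs:
comonotonic = [(F_inv(t), G_inv(t)) for t in z]
countermonotonic = [(F_inv(t), G_inv(1.0 - t)) for t in z]
```

Each comonotonic pair satisfies Y = G⁻¹ ∘ F(X) exactly, as in Remark 4.1.1, while the countermonotonic sample exhibits a strictly decreasing relationship.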

Besides the above geometric approach, there also exists a stochastic one.

Theorem 4.1.3. (Embrechts et al., 2001, Theorem 2) Let (X, Y) be a random vector with one of the copulas W or M included in $C_{(X,Y)}$. Then there exist two monotonic functions $g, h : \mathbb{R} \to \mathbb{R}$ and a real-valued random variable Z so that

(4.1) (X,Y ) ∼ (g(Z), h(Z)),

with g increasing and h decreasing in the former case and with both increasing in the latter. Theconverse result is also true.


With this in mind, the following definition can be made:

Definition 4.1.2. (Yaari, 1987) X and Y are said to be countermonotonic, if a (possible) copula of(X,Y ) is W , and they are said to be comonotonic, if a (possible) copula of (X,Y ) is M.

Thus it is natural to call X and Y perfectly dependent if either W or M belongs to $C_{(X,Y)}$. If this is the case, we also say that X and Y are maximally negatively and positively dependent, respectively.

We conclude this subsection with some remarks on general functional dependence. To avoid unnecessary complications, we will, for the rest of this subsection, deal with absolutely continuous random variables only. Suppose now that X and Y are random variables satisfying

(4.2) X = f(Y ) a.s.

for some (measurable) function f . To describe the kind of fallacies hidden behind this approach, wefirst need the following definition.

Definition 4.1.3. (Mikusinski et al., 1991) A function f is called strongly piecewise strictly monotone if and only if there exist $a_i \in \mathbb{R}$, $i = 1, \dots, n$ and $b_j \in \mathbb{R}$, $j = 1, \dots, m$ with

$-\infty = a_1 < a_2 < \dots < a_n = \infty$ and $-\infty = b_1 < b_2 < \dots < b_m = \infty$

such that for any $i = 1, \dots, n-1$ there exists exactly one $j_i$ such that

$\mathrm{ran}\, f|_{[a_i, a_{i+1}]} \subseteq [b_{j_i}, b_{j_i+1}]$ and $f|_{[a_i, a_{i+1}]}$ is strictly monotone,

and $j_i \ne j_k$ whenever $1 \le i \ne k \le n-1$.

Now, it can be shown that if the function f in (4.2) is strongly piecewise strictly monotone, thenthe copula belonging to X and Y will have a particular form.

Theorem 4.1.4. (Mikusinski et al., 1991) Let (X,Y ) be a pair of continuous real-valued randomvariables. Then the copula of (X,Y ) is a shuffle of M if and only if X and Y are strongly piecewisestrictly monotone functions of each other.

Since shuffles of M are dense in the set of all copulas, we have the following

Corollary 4.1.1. (Mikusinski et al., 1991) Given any ε > 0 and any random variables X, Y with joint distribution function H and copula $C_H$, there exist random variables $X^* \sim X$ and $Y^* \sim Y$ with joint distribution function $H^*$ and copula $C_{H^*}$, and a strongly piecewise strictly monotone function f such that $X^* = f(Y^*)$ and

$\|H - H^*\|_\infty < \varepsilon$ and $\|C_H - C_{H^*}\|_\infty < \varepsilon$,

where $\|\cdot\|_\infty$ denotes the supremum norm.

Therefore, any measure of dependence based on a distance between the distribution functions or copulas that is equivalent to $\|\cdot\|_\infty$ will not be able to detect whether this kind of functional dependence occurs or not. However, there surely exist numerous situations in practice where a certain kind of functional dependence is of interest. Although the task of determining the copula of random variables related to each other via (4.2) is not easy in general, several special cases have been investigated in the literature so far. One such case is the situation when X and Y, with distribution functions F and G respectively, are related via

$Y = f(X)$ or $X = g(Y)$ a.s.


where f and g are nondecreasing functions on $\mathbb{R}$ satisfying

$(f \circ g)(y) < y$ whenever $0 < G(y) < 1$, and $(g \circ f)(x) < x$ whenever $0 < F(x) < 1$.

In this case, their copula is a so-called generalized hairpin, and vice versa (for further details see Mikusinski et al. (1991)).

4.1.2 Multivariate Perfect Dependence

In the multivariate case, the Frechet-Hoeffding lower bound fails to be a copula in the first place. Therefore, we only provide the multivariate generalization of the aforementioned geometric interpretation of M (Theorem 4.1.2). Again, we need a definition first.

Definition 4.1.4. A subset A of $\mathbb{R}^n$ is called nondecreasing if for any x and y in A,

(4.3) $\forall\, 1 \le k \le n : x_k < y_k \Rightarrow x_j \le y_j,\ 1 \le j \ne k \le n.$

The following Lemma characterizes nondecreasing sets in a way which will lead us to the desiredprobabilistic interpretation of the Frechet-Hoeffding upper bound.

Lemma 4.1.1. A subset A of $\mathbb{R}^n$ is nondecreasing if and only if for any x in $\mathbb{R}^n$ there exists a k such that

(4.4) $\forall\, u \in A : u_k \le x_k \Rightarrow u_j \le x_j,\ 1 \le j \ne k \le n.$

Proof. For the "if" part, suppose the lemma does not hold although A is nondecreasing, i.e. that there exists an x in $\mathbb{R}^n$ such that, for any k, $1 \le k \le n$, there exists a $u^k$ in A with

$u^k_k \le x_k \;\wedge\; \exists\, j_k \ne k : x_{j_k} < u^k_{j_k}.$

If we fix such a sequence $u^1, \dots, u^n$, there always exist $u^{k_1}$ and $u^{k_2}$ among its members which are in the following position with regard to x:

$u^{k_1}_{k_1} \le x_{k_1} \wedge x_{k_2} < u^{k_1}_{k_2}$ and $u^{k_2}_{k_2} \le x_{k_2} \wedge x_{k_3} < u^{k_2}_{k_3}.$

In this situation we get $u^{k_2}_{k_2} \le x_{k_2} < u^{k_1}_{k_2}$ and hence $u^{k_2}_{k_2} < u^{k_1}_{k_2}$. Because the vectors $u^{k_1}$ and $u^{k_2}$ lie in a nondecreasing set, it follows that $u^{k_2} \le u^{k_1}$.

If we could now find a sequence of vectors $u^{k_1}, u^{k_2}, \dots, u^{k_r}$ chained together in a "closed circle" in the sense of

$u^{k_1}_{k_1} \le x_{k_1} \wedge x_{k_2} < u^{k_1}_{k_2}$ and $u^{k_2}_{k_2} \le x_{k_2} \wedge x_{k_3} < u^{k_2}_{k_3}$ and $\dots$ and $u^{k_r}_{k_r} \le x_{k_r} \wedge x_{k_1} < u^{k_r}_{k_1},$

we would get $u^{k_1} \ge u^{k_2} \ge \dots \ge u^{k_r}$, in particular $u^{k_1} \ge u^{k_r}$. But this is a contradiction because, on the other hand, we have $u^{k_1}_{k_1} \le x_{k_1} < u^{k_r}_{k_1}$ and thus $u^{k_1} \not\ge u^{k_r}$.

Hence it only remains to show that such a circle indeed exists. For this purpose, we transform this quest into a search for a (directed) cycle in a directed graph $G = (V, \vec{E})$ defined as follows:

$V := \{1, \dots, n\}, \qquad (v_i, v_j) \in \vec{E} \Leftrightarrow u^{v_i}_{v_i} \le x_{v_i} \wedge x_{v_j} < u^{v_i}_{v_j}.$


Now, since any (directed) cycle in this graph yields the circle we wish to find, we have to show that one exists. We prove this by starting a directed walk at $v_1 = 1$. From this position on, let $p := v_1, v_2, \dots, v_s$ be the longest possible path visiting no vertex more than once. Because at least one edge leads out of every vertex and p cannot be extended, the edge leaving $v_s$ must lead into one of the vertices already included in p. Hence, there must exist a cycle.

For the "only if" part, assume that A is not nondecreasing. Thus, we can find two points a and b in A such that

$\exists\, k^* \ne j^* : a_{k^*} < b_{k^*} \wedge a_{j^*} > b_{j^*}.$

Now if we examine the position of $\bigl(\frac{a_1+b_1}{2}, \dots, \frac{a_n+b_n}{2}\bigr)$, we recognize that there exists no k fitting (4.4), because for any k we have either $a_k \le b_k$ or $a_k > b_k$. The former case yields $a_k \le \frac{a_k+b_k}{2}$ on the one hand and $a_{j^*} > \frac{a_{j^*}+b_{j^*}}{2}$ on the other, and hence a contradiction with (4.4) (for u = a). In the latter case, the situation is analogous: we have $b_k \le \frac{a_k+b_k}{2}$ and simultaneously $b_{k^*} > \frac{a_{k^*}+b_{k^*}}{2}$, so that (4.4) again fails (for u = b).

With this lemma, the following can be shown:

Theorem 4.1.5. Let X = (X1, . . . , Xn) be a random vector with arbitrary marginals. Then theFrechet-Hoeffding upper bound Mn is included in CX if and only if the support of the joint distributionfunction of X is a nondecreasing set.

Proof. Throughout the proof, let H denote the joint distribution function and $F_k$ the marginal distribution functions. Then for any $1 \le k \le n$ and any x in $\mathbb{R}^n$,

$F_k(x_k) = \mathrm{P}[X_k \le x_k] = \mathrm{P}[X \le x] + \mathrm{P}\Bigl[\,\bigcup_{j=1,\, j \ne k}^{n} \{X_k \le x_k,\ X_j > x_j\}\Bigr] = H(x) + \mathrm{P}\Bigl[\,\bigcup_{j=1,\, j \ne k}^{n} \{X_k \le x_k,\ X_j > x_j\}\Bigr].$

Hence, $C_X$ is the Frechet-Hoeffding upper bound $M^n$ if and only if, for some k,

$\mathrm{P}\Bigl[\,\bigcup_{j=1,\, j \ne k}^{n} \{X_k \le x_k,\ X_j > x_j\}\Bigr] = 0.$

In this case we have for the support of H

$\mathrm{supp}(H) \cap \bigl\{u \mid u_k \le x_k \text{ and } u_j > x_j \text{ for some } j,\ 1 \le j \ne k \le n\bigr\} = \emptyset,$

and hence supp(H) is a nondecreasing set by means of Lemma 4.1.1.

Remark 4.1.2. Under the hypothesis of the above theorem and the assumption that all the univariatemargins are continuous, the underlying copula is the Frechet-Hoeffding upper bound if and only if everyXk is almost surely an increasing function of the remaining Xj ’s . This follows from the fact thatdistribution functions with continuous margins have a.s. no constant segments in direction parallel toeither of the axes.

As mentioned before, W fails to be a copula if the dimension exceeds two. However, when dealingwith multivariate dependence, one possible approach is to look at the dependencies of the bivariatemargins. From this point of view, we may speak of ”perfect” dependence whenever all bivariate marginshave this property. Hence, we will be interested in copulas with all bivariate margins equal to eitherW or M. The following theorem is a straightforward generalization of Theorem 4.1.3:


Theorem 4.1.6. Let $X = (X_1, X_2, \dots, X_n)$ be a random vector with continuous margins and a copula $C_X$ having all bivariate margins equal to either W or M. Then there exist a real-valued random variable Z and monotonic functions $g_1, g_2, \dots, g_n$, $g_i : \mathrm{ran}\, Z \to \mathbb{R}$, so that

(4.5) $X \sim (g_1(Z), g_2(Z), \dots, g_n(Z))$,

with $g_i$ increasing and $g_j$ decreasing (or $g_i$ decreasing and $g_j$ increasing) if the corresponding bivariate margin is W, and with both increasing or both decreasing if the corresponding bivariate margin is M. The converse result is also true$^1$.

In order to show this theorem, we will need the following result due to Dall’Aglio:

Lemma 4.1.2. (Dall’Aglio, 1959, Theorem VII and IX). Suppose X1, X2 and X3 are continuousrandom variables with distribution functions F1, F2 and F3. Then

1. if (X1, X2) ∼ M(F1, F2) and (X1, X3) ∼ M(F1, F3), then (X2, X3) ∼ M(F2, F3);

2. if (X1, X2) ∼ W(F1, F2) and (X1, X3) ∼ W(F1, F3), then (X2, X3) ∼ M(F2, F3).

Proof. [Theorem 4.1.6] The converse result follows readily from Theorem 4.1.3. Indeed, if Z is a real-valued random variable and $g_i, g_j$ are monotonic functions on ran Z, then for $g_i$ increasing this theorem yields $(X_i, X_j) \sim W(F_i, F_j)$ if $g_j$ is decreasing and $(X_i, X_j) \sim M(F_i, F_j)$ if $g_j$ is increasing. Now suppose that $g_i$ is decreasing. Then, by means of Proposition 3.1.2, the copula $C_{ij}$ of $(X_i, X_j)$ is given by

$C_{ij}(u, v) = v - \bar{C}_{ij}(1 - u, v),$

where $\bar{C}_{ij}$ denotes the copula corresponding to $(-X_i, X_j)$. Because $-g_i$ is increasing, the results just derived yield: if $g_j$ is increasing or decreasing, respectively, then $\bar{C}_{ij}$ is the Frechet-Hoeffding upper or lower bound, respectively. Since

$v - M(1-u, v) = \begin{cases} v + u - 1, & \text{if } 1-u \le v, \\ 0, & \text{otherwise,} \end{cases}$ and $v - W(1-u, v) = \begin{cases} u, & \text{if } v - u \ge 0, \\ v, & \text{otherwise,} \end{cases}$

we indeed have $(X_i, X_j) \sim W(F_i, F_j)$ if $g_j$ is increasing and $(X_i, X_j) \sim M(F_i, F_j)$ if $g_j$ is decreasing.

For the "only if" part, suppose X is a random vector satisfying the assumptions of the theorem. Then $Z := F_1(X_1)$ is a uniformly distributed random variable and, for any $i = 2, \dots, n$,

$X_i = F_i^{-1}(T_i(Z))$ a.s.,

where $T_i(Z) \equiv Z$ if $X_1$ and $X_i$ are comonotonic and $T_i(Z) \equiv 1 - Z$ otherwise (cf. Remark 4.1.1). Hence,

$X \sim \tilde{X} := \bigl(F_1^{-1}(Z), F_2^{-1}(T_2(Z)), \dots, F_n^{-1}(T_n(Z))\bigr).$

With $g_i := F_i^{-1} \circ T_i$ and $g_1 := F_1^{-1}$, the only thing left to prove is that the relationship between the monotonicity of the $g_i$ and the bivariate margins is indeed as stated in the theorem. To this end, we denote the bivariate margins by $X_{ij}$. Since $F_i^{-1}(z)$ is increasing and $F_i^{-1}(1 - z)$ decreasing, the statement is true for all margins $X_{1j}$, $j = 2, \dots, n$. Now consider a margin $X_{ij}$ for some $i, j \ne 1$ and suppose $X_{ij} \sim M(F_i, F_j)$. Then, by means of Dall'Aglio's Lemma 4.1.2, it follows that either $X_{1k} \sim M(F_1, F_k)$ for $k = i, j$ or $X_{1k} \sim W(F_1, F_k)$ for $k = i, j$. Hence, we have $g_i, g_j$

$^1$The resulting multivariate distribution, whose bivariate margins are either comonotonic or countermonotonic, is called an extremal distribution. Note that in $\mathbb{R}^n$ there are $2^{n-1}$ such distributions with fixed (univariate) marginals. For further information and references dealing with the linear correlation of such distributions, see Embrechts et al. (2002).


increasing in the former case, and $g_i, g_j$ decreasing in the latter. Finally, if $X_{ij} \sim W(F_i, F_j)$, then Dall'Aglio's Lemma yields either $X_{1i} \sim M(F_1, F_i)$ and $X_{1j} \sim W(F_1, F_j)$ (which implies $g_i$ increasing and $g_j$ decreasing) or $X_{1i} \sim W(F_1, F_i)$ and $X_{1j} \sim M(F_1, F_j)$ (which implies $g_i$ decreasing and $g_j$ increasing). Hence, the proof is complete.

Remark 4.1.3. Note that for given marginals there exist precisely 2^{n−1} distributions of X such that all bivariate margins of the corresponding copula are either M or W.

4.2 Nonparametric Dependence Concepts

In this section, we focus on how dependence can be described nonparametrically, i.e. in terms of the joint distribution function. In addition, we will also see how these concepts can be expressed using copulas. Again, the dependence concepts themselves, as well as the role played by n-copulas in their study, are far less well understood in the multivariate case. However, many of the bivariate dependence properties can be naturally extended to the multivariate case. Therefore, if sensible multivariate extensions exist and are not unnecessarily complicated, we will present the general case. Something similar is true about discontinuities of the marginals. Since this will be the subject of the following chapter, our point here will be to focus on the dependence concepts as such and the essential ideas behind them. Thus, if necessary, we will restrict the observations to the case when all univariate margins are continuous.

4.2.1 Orthant Dependence

First, we will examine the weakest dependence concept, the so-called orthant dependence, which is a multivariate generalization of quadrant dependence (cf. Nelsen (1999, Section 5.2.1)).

Definition 4.2.1. (Lehmann, 1966) Let X = (X1, . . . , Xn) be an n-dimensional random vector.

1. X is positively lower orthant dependent (PLOD) if for all x = (x_1, . . . , x_n) in R^n,

(4.6) P[X ≤ x] ≥ ∏_{i=1}^{n} P[X_i ≤ x_i].

2. X is positively upper orthant dependent (PUOD) if for all x = (x_1, . . . , x_n) in R^n,

(4.7) P[X > x] ≥ ∏_{i=1}^{n} P[X_i > x_i].

3. X is positively orthant dependent (POD) if both (4.6) and (4.7) hold.

Negative lower orthant dependence (NLOD), negative upper orthant dependence (NUOD) and negativeorthant dependence (NOD) are defined analogously, by reversing the sense of the inequalities in (4.6)and (4.7).
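For a finite discrete distribution, conditions (4.6) and (4.7) can be verified by direct enumeration. The following sketch does this for a small bivariate probability mass function; the pmf itself is a made-up illustrative example, not taken from the text.

```python
from itertools import product

# A toy bivariate pmf on {0,1}^2 with positive dependence (hypothetical example):
pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def joint_cdf(x):
    # P[X1 <= x1, X2 <= x2]
    return sum(p for (a, b), p in pmf.items() if a <= x[0] and b <= x[1])

def joint_surv(x):
    # P[X1 > x1, X2 > x2]
    return sum(p for (a, b), p in pmf.items() if a > x[0] and b > x[1])

def marg_cdf(i, xi):
    return sum(p for k, p in pmf.items() if k[i] <= xi)

def marg_surv(i, xi):
    return sum(p for k, p in pmf.items() if k[i] > xi)

grid = list(product([0, 1], repeat=2))  # checking on the support suffices here

# (4.6): F(x) >= prod_i F_i(x_i) for all x
plod = all(joint_cdf(x) >= marg_cdf(0, x[0]) * marg_cdf(1, x[1]) - 1e-12 for x in grid)
# (4.7): P[X > x] >= prod_i P[X_i > x_i] for all x
puod = all(joint_surv(x) >= marg_surv(0, x[0]) * marg_surv(1, x[1]) - 1e-12 for x in grid)
pod = plod and puod
```

For this pmf both inequalities hold at every point of the support, so the toy vector is POD.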

In other words, X is PLOD and PUOD, respectively, if the joint distribution function is stochastically smaller and greater, respectively, than the joint distribution function corresponding to independence. Now, according to the lemma below, the orthant dependence concept can also be expressed in terms of copulas only.


Lemma 4.2.1. Under the assumptions of the above definition and the condition that all margins of X are continuous,

1. X is PLOD iff C_X ≻ π, and NLOD iff C_X ≺ π;

2. X is PUOD iff C̄_X ≻ π̄, and NUOD iff C̄_X ≺ π̄ (with C̄ denoting the corresponding survival function).

This lemma, together with Corollary 3.5.2, yields an important consequence for multivariate Archimedean copulas:

Corollary 4.2.1. If C is a multivariate Archimedean copula constructed via (3.25), then C is positively lower orthant dependent.

Remark 4.2.1. In the bivariate case, (4.6) and (4.7) are equivalent and hence the concepts of lower and upper orthant dependence coincide. POD and NOD, respectively, are then defined by either of the equations. By means of the above lemma,

• X is POD iff C_X ≻ π;

• X is NOD iff C_X ≺ π.

In higher dimensions (as can easily be seen, for example, with the sieve formula), the lower and upper orthant dependence concepts are no longer the same.

With the orthant dependence concept in mind, the following definition of multivariate concordance can be made:

Definition 4.2.2. Let C_1 and C_2 be n-copulas, and let C̄_1 and C̄_2 denote their survival counterparts.

1. C_1 is more PLOD than C_2 if C_1 ≻ C_2;

2. C_1 is more PUOD than C_2 if C̄_1 ≻ C̄_2;

3. C_1 is more POD than C_2, or C_1 is more concordant than C_2, if C_1 ≻ C_2 and C̄_1 ≻ C̄_2.

If n = 2, we say that C_1 is more concordant than C_2 if C_1 ≻ C_2.

Example 4.1. With Examples 3.4, 3.5, 3.2 and the section on Plackett's distributions, we have for the copula families mentioned earlier, i.e. the Frank family C_θ^{Fra}, θ ∈ (−∞, ∞), the Clayton family C_θ^{Cla}, θ ∈ [−1, 0) ∪ (0, ∞), the Plackett family C_θ^{Pla}, θ ∈ (0, ∞), and the Gauss family C_ρ^{Ga}, ρ ∈ (−1, 1), that two members satisfy C_{θ1} ≻ C_{θ2} (and C_{ρ1} ≻ C_{ρ2}, respectively) if and only if θ_1 ≥ θ_2 (and ρ_1 ≥ ρ_2, respectively).

4.2.2 Tail Dependence

In this section, we present the concept of tail dependence as given in Nelsen (1999, Definition 5.6.3) (this definition has appeared earlier, however; see e.g. Nelsen (1999, Section 5.6)). In the following, if A and B are nonempty disjoint subsets of {1, 2, . . . , n}, we denote by X_A and X_B the vectors (X_i | i ∈ A) and (X_i | i ∈ B), respectively. By a phrase such as "nondecreasing in x" we mean nondecreasing in each component.

Definition 4.2.3. Let X = (X_1, X_2, . . . , X_n) be an n-dimensional random vector, and let the sets A and B partition {1, 2, . . . , n}.

1. X_B is left tail decreasing in X_A, LTD(X_B | X_A), if P[X_B ≤ x_B | X_A ≤ x_A] is nonincreasing in x_A for all x_B;


2. X_B is right tail increasing in X_A, RTI(X_B | X_A), if P[X_B > x_B | X_A > x_A] is nondecreasing in x_A for all x_B.

Remark 4.2.2. In the bivariate case, i.e. for a random vector (X, Y), there are only two ways to choose A and B. Hence, we say

• Y is left tail decreasing (LTD(Y|X)) in X, if P[Y ≤ y | X ≤ x] is a nonincreasing function of x for all y,

• X is left tail decreasing (LTD(X|Y)) in Y, if P[X ≤ x | Y ≤ y] is a nonincreasing function of y for all x,

• Y is right tail increasing (RTI(Y|X)) in X, if P[Y > y | X > x] is a nondecreasing function of x for all y,

• X is right tail increasing (RTI(X|Y)) in Y, if P[X > x | Y > y] is a nondecreasing function of y for all x.

With the above remark in mind, we can define the following kind of dependence:

Definition 4.2.4. Let X and Y be random variables. If LTD(Y|X), LTD(X|Y), RTI(Y|X) or RTI(X|Y) holds, then X and Y are called positively tail dependent.

In the bivariate case, it can be shown that tail dependence is stronger than quadrant (i.e. orthant)dependence:

Theorem 4.2.1. (Nelsen, 1999, Theorem 5.2.4) Let X and Y be positively tail dependent random variables. Then they are also positively quadrant dependent.

Remark 4.2.3. As also mentioned in Nelsen (1999), the converse of this theorem is not true, i.e. tail dependence is not implied by quadrant dependence.

We close this subsection by expressing tail monotonicity in terms of copulas. Again, this statement can only be made for continuous random variables (cf. next chapter).

Theorem 4.2.2. (Nelsen, 1999, Theorem 5.2.5) Let X and Y be continuous random variables with copula C. Then

1. LTD(Y|X) holds if and only if for any v in [0, 1], C(u, v)/u is nonincreasing in u;

2. RTI(Y|X) holds if and only if for any v in [0, 1], [1 − u − v + C(u, v)]/(1 − u) is nondecreasing in u, or equivalently, if and only if Ĉ(1 − u, 1 − v)/(1 − u) is nondecreasing in u, where Ĉ denotes the survival copula of C.

LTD(X|Y) and RTI(X|Y) can be expressed analogously, by interchanging u and v in the above statements.
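Criterion 1 of the theorem is easy to check numerically on a grid. The following sketch does so for the Clayton copula (a standard example of an LTD family; the parameter value and grid resolution are arbitrary choices of ours).

```python
# Clayton copula with theta > 0, a standard example of an LTD family
def clayton(u, v, theta=2.0):
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

us = [i / 100 for i in range(1, 100)]
vs = [j / 10 for j in range(1, 10)]

# Theorem 4.2.2(1): LTD(Y|X) holds iff u -> C(u, v)/u is nonincreasing for every v
ltd = all(
    clayton(u1, v) / u1 >= clayton(u2, v) / u2 - 1e-12
    for v in vs
    for u1, u2 in zip(us, us[1:])
)
```

For the Clayton family with θ > 0 one has C(u, v)/u = (1 + (v^{−θ} − 1)u^θ)^{−1/θ}, which is indeed decreasing in u, so the grid check succeeds.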

4.2.3 Likelihood Ratio Dependence and Other Concepts

Other nonparametric dependence concepts, mainly based on conditional probabilities, are possible, for example stochastic monotonicity and corner set monotonicity. As we won't need them further, we refer again to Nelsen (1999) and the further references given therein. We close this section with an alternative dependence concept, based not on the joint distribution function but on the joint probability density function instead. The so-called likelihood ratio dependence² is defined as follows:

² For further details and references on this dependence concept, see Nelsen (1999, Section 5.6).


Definition 4.2.5. Let X be a continuous n-dimensional random vector with joint probability density function f_X. Then X is positively likelihood ratio dependent if its joint probability density function is multivariate totally positive of order two (MTP2), i.e.

f_X(x ∨ y) f_X(x ∧ y) ≥ f_X(x) f_X(y)

for all x and y in R^n, where x ∧ y and x ∨ y denote the component-wise minimum and maximum, respectively.
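The MTP2 inequality can be spot-checked pointwise. A minimal sketch, using the standard bivariate normal density with nonnegative correlation (a classical example of an MTP2 density; the test points are arbitrary):

```python
from itertools import product
from math import exp, pi, sqrt

RHO = 0.5  # correlation; rho >= 0 makes the bivariate normal density MTP2

def f(x, y):
    # standard bivariate normal density with correlation RHO
    z = (x * x - 2 * RHO * x * y + y * y) / (1 - RHO * RHO)
    return exp(-z / 2) / (2 * pi * sqrt(1 - RHO * RHO))

pts = [(-1.0, 0.5), (0.0, -0.5), (1.0, 1.0), (0.5, -1.0), (-0.5, 0.0)]

# f(x v y) * f(x ^ y) >= f(x) * f(y), with component-wise max and min
tp2 = all(
    f(max(x1, x2), max(y1, y2)) * f(min(x1, x2), min(y1, y2))
    >= f(x1, y1) * f(x2, y2) - 1e-15
    for (x1, y1), (x2, y2) in product(pts, pts)
)
```

Here MTP2 follows from ∂² log f / ∂x ∂y = ρ/(1 − ρ²) ≥ 0, so every pairwise check passes.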

4.3 Measures of Association

In this section, we will give a brief introduction to measures of association, i.e. functions summarizing the dependence between random variables in a single number. Again, numerous concepts have been proposed in the bivariate case. When there are more than two variables involved, the situation is far less explored and often questionable. Therefore, the outline of this section will be as follows: we will focus on two variables first and list some of the typical problems occurring in the multivariate case at the end. In addition, we will make the continuity restriction, i.e., throughout this section, we assume all univariate marginals involved to be continuously distributed. How the concepts introduced below could be extended to the general case will be handled in the next chapter. When measuring dependence, we have to make one important decision right at the beginning: either we wish the measure to distinguish between negative and positive dependence (measure of concordance), or not (measure of dependence).

4.3.1 Measures of Dependence

We begin our survey with a summary of the properties any suitable measure of dependence should possess (these still differ somewhat in the literature, however).

Definition 4.3.1. (Schweizer and Wolff, 1981) Let L(Ω) denote the set of all real-valued and continuously distributed random variables on some probability space (Ω, A, P). A real-valued function δ operating on L(Ω), i.e. δ : L(Ω) × L(Ω) → R, satisfying

A1. δ(X, Y) = δ(Y, X) (symmetry),

A2. 0 ≤ δ(X, Y) ≤ 1 (normalization),

A3. δ(X, Y) = 0 if and only if X and Y are independent,

A4. δ(X, Y) = 1 if and only if X and Y are either comonotonic or countermonotonic,

A5. if f and g are strictly monotone on ran X and ran Y, respectively, then δ(f(X), g(Y)) = δ(X, Y) (invariance),

A6. if (X_n, Y_n) are pairs of continuous random variables converging in law to (X, Y) (X and Y again being continuous), then lim_{n→∞} δ(X_n, Y_n) = δ(X, Y),

is called a measure of dependence.

This definition differs slightly from the one given in Schweizer and Wolff (1981) (cf. also Embrechts et al. (2002)): Schweizer and Wolff include one additional axiom concerning the bivariate normal distribution, i.e., if the joint distribution of X and Y is bivariate normal, then δ should be a strictly increasing function of the absolute value of the corresponding correlation coefficient ϱ. We skip this requirement for two reasons. For one, such a function has not been determined for the dependence


measures proposed in the literature so far, except for a few. On the other hand, the correlation coefficient is a natural dependence parameter not only for the bivariate normal distribution (cf. elliptical distributions). The above axioms are also a modified version of those introduced by Rényi (1959). Perhaps the major difference between his conditions and those stated above is A4. Originally, Rényi required δ to equal one if either X = f(Y) or Y = g(X) for some Borel-measurable functions f and g. But as has been shown by various scholars (including Rényi himself), this axiom is too strong (cf. Schweizer and Wolff (1981) and the further literature given therein). As to general functional dependence, there exists one other approach, by Hoeffding (1942). He too states axioms for a "measure of dependence" and develops the concept of "c-Abhängigkeit" (c-dependence), a kind of dependence based on conditional expectations which is, in a way, a theoretical opposite of independence. However, Hoeffding's concepts nowadays seem to be of rather theoretical use only.

When searching for measures which comply with the above axioms, copulas prove themselves to be an extremely useful tool. If the measure of dependence depends on X and Y through their copula C only, i.e. δ(X, Y) = δ(C), then A5 is satisfied whenever f and g are increasing, and is easy to handle otherwise by Propositions 3.1.1 and 3.1.2. In addition, the third and the fourth axiom can also be re-written in terms of copulas. As to the last axiom, under the continuity restriction, it is straightforward that (X_n, Y_n) converges weakly to (X, Y) if and only if the corresponding copulas C_n converge point-wise to the copula of (X, Y) (cf. Theorem 5.4.1). Hence, if we focus on the set of all bivariate copulas, C_2, and find a metric d which is defined there, then, under some mild conditions, the normalized distance between the underlying copula and the independence copula yields a measure of dependence.

Proposition 4.3.1. If (X, Y) is a random vector with continuous marginals and copula C, and d is a metric on the set of all bivariate copulas C_2 satisfying

1. d(W, π) and d(M, π) are equal;

2. d(C, π), as a function of C, is maximal if and only if C equals W or M;

3. for any C ∈ C_2, d(C, π) = d(C^⊤, π) with C^⊤(u, v) = C(v, u);

4. for any C ∈ C_2, d(C, π) = d(Ĉ, π) with Ĉ(u, v) = v − C(1 − u, v);

5. for any sequence of copulas C_n converging point-wise to C, d(C_n, π) converges to d(C, π) (which is the case especially if the point-wise convergence C_n → C implies d(C_n, C) → 0),

then the normalized distance from independence, i.e. the function δ_d(C) given by

(4.8) δ_d(X, Y) = δ_d(C) = d(C, π) / d(M, π),

yields a measure of dependence.

Proof. Since the copula of (Y, X) is C^⊤, the symmetry condition is obviously satisfied by 3. Furthermore, with 2., d(C, π) ≤ d(M, π) and hence the range of δ_d is indeed [0, 1]. Since d is a metric, d(C, π) = 0 if and only if C = π, and hence A3 holds. By means of 2., the numerator of δ_d is maximal if and only if C equals either M or W, which, together with 1., yields A4. Because copulas are invariant under strictly increasing transformations and the metric satisfies 4.,

(†) δ_d(X, Y) = δ_d(f(X), g(Y)) for f strictly monotone and g increasing


by means of Propositions 3.1.1 and 3.1.2. If g is decreasing, then

δ_d(X, Y) = δ_d(Y, X) = δ_d(g(Y), X) = δ_d(X, g(Y)) = δ_d(f(X), g(Y)),

where the first and third equalities use symmetry and the second and fourth use (†). Finally, A6 follows straightforwardly with 5.

Remark 4.3.1. Note that since

(4.9) v − W(1 − u, v) = v − max(1 − u + v − 1, 0) = v − max(v − u, 0) = min(u, v) = M(u, v)

holds (the last equality follows by distinguishing the cases u ≤ v and u > v), the first requirement is implied by the fourth and hence can be omitted.

Still, Proposition 4.3.1 is not of much use as long as we do not find a metric on C_2 which meets its requirements. The following result will allow us to recognize that the L_p-distance is one such metric.

Lemma 4.3.1. Let C and Ĉ be bivariate copulas related to each other by

Ĉ(u, v) = v − C(1 − u, v) for any (u, v) ∈ I².

Then, for any (u, v) in I²,

(4.10) M(u, v) − C(u, v) = Ĉ(1 − u, v) − W(1 − u, v).

Proof. The right-hand side equals

Ĉ(1 − u, v) − W(1 − u, v) = v − C(1 − (1 − u), v) − W(1 − u, v) = M(u, v) − C(u, v),

where the last equality follows from (4.9).

Example 4.2. Measures of Dependence Based on the L_p-distance

For 1 ≤ p < ∞, the function δ_p(C) given by

(4.11) δ_p(C) = ‖C − π‖_p / ‖M − π‖_p,

where ‖ · ‖_p denotes the L_p-norm, is a measure of dependence by means of Proposition 4.3.1. Indeed, condition 1. is satisfied by (4.10), since π̂(1 − u, v) = π(1 − u, v) for π̂(u, v) := v − π(1 − u, v). Moreover, since M^⊤ = M and π^⊤ = π, 3. holds as well. Finally, 2. is a straightforward consequence of the Fréchet–Hoeffding inequality, as is 5. by means of the Lebesgue Dominated Convergence Theorem. For 4., see Schweizer and Wolff (1981) and references given therein. δ_p has also been introduced and studied by Schweizer and Wolff (1981), who showed directly that it satisfies all the axioms required of a measure of dependence. Moreover, for p = 1, 2, the normalizing constants ‖M − π‖_p have been determined, yielding

(4.12) δ_1(C) = 12 ∫₀¹ ∫₀¹ |C(u, v) − uv| du dv,

(4.13) δ_2(C) = (90 ∫₀¹ ∫₀¹ (C(u, v) − uv)² du dv)^{1/2}.

Note that the quantity δ2 has also been introduced independently by Blum et al. (1961) and Hoeffding(1940a).
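The quantities (4.12) and (4.13) are easy to approximate by a midpoint-rule grid sum. A minimal sketch for the Farlie–Gumbel–Morgenstern (FGM) family, chosen here as a hypothetical example because C − π has a simple closed form:

```python
n = 200      # grid resolution for a midpoint-rule double integral
theta = 0.5  # FGM parameter, a hypothetical choice in [-1, 1]

def C(u, v):
    # FGM copula: C(u,v) - pi(u,v) = theta * u*v*(1-u)*(1-v)
    return u * v + theta * u * v * (1 - u) * (1 - v)

pts = [(i + 0.5) / n for i in range(n)]
h2 = 1.0 / (n * n)

# (4.12) and (4.13), approximated on the grid
delta1 = 12 * sum(abs(C(u, v) - u * v) for u in pts for v in pts) * h2
delta2 = (90 * sum((C(u, v) - u * v) ** 2 for u in pts for v in pts) * h2) ** 0.5

# for the FGM family with theta >= 0 these work out to theta/3 and theta/sqrt(10)
```

Both grid values agree with the closed-form expressions to within the quadrature error of the midpoint rule.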


Example 4.3. Measures of Dependence Based on the L_∞-distance

In the same paper, Schweizer and Wolff (1981) also investigated the measure based on the L_∞-distance. Determining the normalizing constant yields the quantity κ(C),

κ(C) = 4 sup_{(u,v)∈I²} |C(u, v) − uv|.

However, κ is not a measure of dependence in our sense, since it fails to satisfy the fourth axiom (see Schweizer and Wolff (1981) and references given therein).

Since any dependence measure given by (4.8) is based on a distance between the corresponding copula and the independence copula, it can also be used for testing independence (cf. e.g. Blum et al. (1961), Tjøstheim (1996), Deheuvels (1981) and Deheuvels (1979)). However, the estimation of such quantities is generally complex³ and hence their use in practical applications is limited.

4.3.2 Measures of Concordance

If a quantity for measuring the association between two random variables distinguishes between positive and negative dependence, then it is referred to as a measure of concordance (in this context, two pairs (x_1, x_2) and (y_1, y_2) (with distinct components) are said to be concordant if (x_1 − y_1)(x_2 − y_2) > 0 and discordant otherwise). In order to describe concordance, we can proceed similarly as with dependence measures, i.e. provide an axiomatic definition by altering some of the dependence measure axioms. Since W and M represent the maximum achievable negative and positive dependence, respectively, it seems reasonable to relax condition A2 and interpret negative and positive values as "negative" and "positive" dependence, respectively. Furthermore, we should require the concordance measure to attain its minimum if and only if X and Y are countermonotonic, and its maximum if and only if X and Y are comonotonic. However, there is one major problem which must be taken into account:

Theorem 4.3.1. (Embrechts et al., 2002) Let X and Y denote real-valued random variables and T a strictly monotonic function on ran X. Then there is no quantity ρ satisfying simultaneously

• ρ(X, Y) = 0 whenever X and Y are independent,

• ρ(X, Y) = ρ(T(X), Y) whenever T is increasing and ρ(X, Y) = −ρ(T(X), Y) whenever T is decreasing.

It is now common to relax the first condition.

Definition 4.3.2. (Scarsini, 1984) Let L(Ω) denote the set of all real-valued continuous random variables on some probability space (Ω, A, P). A real-valued function ρ operating on L(Ω), i.e. ρ : L(Ω) × L(Ω) → R, satisfying

A1. ρ(X, Y) = ρ(Y, X) (symmetry),

A2. −1 ≤ ρ(X, Y) ≤ 1 (normalization),

A3. ρ(X, Y) = 0 if X and Y are independent,

A4. ρ(X, Y) = 1 if and only if X and Y are comonotonic, and ρ(X, Y) = −1 if and only if X and Y are countermonotonic,

A5. if T is strictly monotone on ran X, then

ρ(T(X), Y) = ρ(X, Y) if T is increasing, and ρ(T(X), Y) = −ρ(X, Y) if T is decreasing,

³ For some asymptotic and nonparametric results, see Tjøstheim (1996).


A6. if (X_n, Y_n) are pairs of (continuous) random variables converging in law to (X, Y) (X and Y being continuous), then lim_{n→∞} ρ(X_n, Y_n) = ρ(X, Y),

A7. if (X_1, Y_1) is stochastically smaller than (X_2, Y_2), i.e. if the corresponding copulas C_1 and C_2 satisfy C_1 ≺ C_2, then

ρ(X_1, Y_1) ≤ ρ(X_2, Y_2),

is called a measure of concordance.

Remark 4.3.2. Note that if ρ is a measure of concordance then, however reasonable it may seem, |ρ| is never a measure of dependence, because of axiom A3 and Theorem 4.3.1.

One way to obtain measures of concordance is to focus on differences Q of the form

Q = P[concordance] − P[discordance],

where concordance and discordance, respectively, are measured between (X, Y) and some other suitably chosen random vector (X*, Y*). As was the case with dependence measures, copulas play an important role here. As soon as the measure depends on X and Y through the copula only, the requirements A4, A5 and A7 will be particularly easy to handle. Therefore, we now focus on how concordance of random variables can be expressed in terms of copulas (recall that only continuous random variables will be considered throughout this section).

Theorem 4.3.2. (Nelsen, 1999, Theorem 5.1.1) Let (X_1, Y_1) and (X_2, Y_2) be independent and continuously distributed random vectors with common margins F (for X_1 and X_2) and G (for Y_1 and Y_2). Furthermore, let C_1 and C_2 denote the copulas (not necessarily equal) of (X_1, Y_1) and (X_2, Y_2), respectively. Let Q denote the difference between the probabilities of concordance and discordance of (X_1, Y_1) and (X_2, Y_2), i.e. let

(4.14) Q = P[(X_1 − X_2)(Y_1 − Y_2) > 0] − P[(X_1 − X_2)(Y_1 − Y_2) < 0].

Then Q depends on the random vectors through their copulas only; in particular,

(4.15) Q = Q(C_1, C_2) = 4 ∫₀¹ ∫₀¹ C_2(u, v) dC_1(u, v) − 1 = 4 ∫₀¹ ∫₀¹ C_1(u, v) dC_2(u, v) − 1.

With (4.15) it is straightforward to see that Q is symmetric, i.e. Q(C_1, C_2) = Q(C_2, C_1), and satisfies

Q(C_1, C_2) ≤ Q(C_1*, C_2*) whenever C_1 ≺ C_1* and C_2 ≺ C_2*.

Kendall’s Tau

By setting C_1 and C_2 equal, we obtain a widely studied quantity called Kendall's tau:

Definition 4.3.3. Let (X, Y) be a real-valued random vector and (X*, Y*) an independent copy of it. Then Kendall's tau for X and Y is defined by

(4.16) ρ_τ(X, Y) = P[(X − X*)(Y − Y*) > 0] − P[(X − X*)(Y − Y*) < 0].

With Theorem 4.3.2, we have the following expression for Kendall’s tau:

Corollary 4.3.1. Let X and Y be continuous random variables with copula C. Then Kendall's tau for X and Y is given by

(4.17) ρ_τ(X, Y) = ρ_τ(C) = 4 ∫∫_{I²} C(u, v) dC(u, v) − 1.
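When the copula has a density c, (4.17) becomes 4 ∫∫ C(u, v) c(u, v) du dv − 1 and can be approximated on a grid. A sketch for the FGM family (a hypothetical example with a simple closed-form density; ρ_τ = 2θ/9 is the known value for this family):

```python
n = 400
theta = 0.5  # FGM parameter (hypothetical choice)

def C(u, v):
    # FGM copula, used here because its density is available in closed form
    return u * v + theta * u * v * (1 - u) * (1 - v)

def c(u, v):
    # copula density of the FGM family
    return 1 + theta * (1 - 2 * u) * (1 - 2 * v)

pts = [(i + 0.5) / n for i in range(n)]
h2 = 1.0 / (n * n)

# (4.17): rho_tau = 4 * int int C dC - 1, evaluated by a midpoint rule
tau = 4 * sum(C(u, v) * c(u, v) for u in pts for v in pts) * h2 - 1
```

The grid value reproduces 2θ/9 up to quadrature error.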


According to (4.17), ρ_τ can also be viewed as 4 E[C(U, V)] − 1, where (U, V) is a random vector with uniform margins and copula C. The following theorem shows that Kendall's tau is indeed in agreement with the concordance measure axioms:

Theorem 4.3.3. (Embrechts et al., 2002) Kendall's tau, ρ_τ, is a measure of concordance in the sense of Definition 4.3.2.

Example 4.4. Kendall's tau for Archimedean copulas

Let X and Y be continuous random variables with an Archimedean copula C generated by ϕ. Then Kendall's tau for X and Y is given by

(4.18) ρ_τ(C) = 1 + 4 ∫₀¹ ϕ(t)/ϕ′(t) dt.

For a proof of this useful statement, see Nelsen (1999, Corollary 5.1.4). For example, for members of the Clayton family C_θ^{Cla} with θ ∈ [−1, ∞) \ {0}, we have (cf. Nelsen (1999, Example 5.4)):

ρ_τ(C_θ^{Cla}) = θ / (θ + 2).
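Formula (4.18) is easy to evaluate numerically for a given generator. A sketch for the Clayton generator ϕ(t) = (t^{−θ} − 1)/θ (the parameter value is an arbitrary choice), comparing the integral against the closed form θ/(θ + 2):

```python
theta = 2.0  # Clayton parameter (hypothetical choice)

def phi(t):
    # Clayton generator phi(t) = (t^(-theta) - 1)/theta
    return (t ** (-theta) - 1) / theta

def phi_prime(t):
    return -t ** (-theta - 1)

n = 10000
pts = [(i + 0.5) / n for i in range(n)]

# (4.18): rho_tau = 1 + 4 * int_0^1 phi(t)/phi'(t) dt, by a midpoint rule
tau = 1 + 4 * sum(phi(t) / phi_prime(t) for t in pts) / n
# should agree with theta/(theta + 2)
```

Here ϕ(t)/ϕ′(t) = −(t − t^{θ+1})/θ is bounded on (0, 1), so the midpoint rule converges quickly.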

Example 4.5. Kendall's tau for elliptical copulas

Let X and Y be continuous random variables distributed according to an elliptical distribution, i.e. (X, Y) ∼ E_n(µ, ϱ, φ), with ϱ being the correlation coefficient of X and Y. Then

(4.19) ρ_τ(X, Y) = (2/π) arcsin ϱ.

This result, due to Lindskog et al. (2001), is a generalization of the well-known result on how Kendall's tau and the linear correlation coefficient are related for bivariate normal distributions.

Spearman’s Rho

Another measure of concordance can be obtained by setting C_2 = π in (4.15). As we will see later, this quantity equals the well-known Spearman's rank correlation coefficient.

Definition 4.3.4. Let (X, Y) be a random vector, and let X* and Y* be independent copies of X and Y, respectively. Furthermore, let X* and Y* themselves be independent. Then Spearman's rho for X and Y is defined by

(4.20) ρ_S(X, Y) = 3 (P[(X − X*)(Y − Y*) > 0] − P[(X − X*)(Y − Y*) < 0]).

Remark 4.3.3. The factor 3 in the above definition is a normalization constant. It is due to the fact that the difference of probabilities in (4.20) belongs to [−1/3, 1/3] for continuous random variables (cf. Nelsen (1999, Example 5.1)).

As is the case with Kendall's tau, ρ_S can be expressed in terms of copulas only.

Corollary 4.3.2. Let X and Y be continuous random variables with copula C. Then Spearman's rho for X and Y is given by

(4.21) ρ_S(X, Y) = ρ_S(C) = 12 ∫∫_{I²} C(u, v) du dv − 3 = 12 ∫∫_{I²} [C(u, v) − uv] du dv.
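Formula (4.21) lends itself to a direct grid approximation. A sketch for the FGM family (a hypothetical example; ρ_S = θ/3 is the known closed form for this family):

```python
n = 300
theta = 0.5  # FGM parameter (hypothetical choice)

def C(u, v):
    # FGM copula as a simple test family
    return u * v + theta * u * v * (1 - u) * (1 - v)

pts = [(i + 0.5) / n for i in range(n)]

# (4.21): rho_S = 12 * int int (C - pi) du dv, by a midpoint rule
rho_s = 12 * sum(C(u, v) - u * v for u in pts for v in pts) / (n * n)
# closed form for the FGM family: rho_S = theta/3
```

Since C − π = θ u v (1 − u)(1 − v), the integral factorizes into (1/6)² per axis, confirming θ/3.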


Hence, Spearman’s rho can be interpreted as the volume between the copula C and the independenceon one hand, and as a volume under the graph of the copula scaled to lie in the interval [−1, 1] on theother. Furthermore, ρS for a pair of continuous random variables X and Y (with distribution functionsF and G, respectively) can be re-written as

ρS =E(UV ) − E(U) E(V )√

VarU√

VarV,

with U = F (X) and V = G(Y ). Hence, ρS is identical with the common Spearman’s rank correlationcoefficient. Again, Spearman’s rho is a measure of concordance in the sense of our axiomatic definition.

Theorem 4.3.4. (Embrechts et al., 2002) Spearman's rho, ρ_S, is a measure of concordance in the sense of Definition 4.3.2.

Although both Kendall’s tau and Spearman’s rho are constructed similarly, their values for thesame pair of random variables can be quite different. In addition, there is no function which wouldrelate one measure to the other in general. For example Kendall’s tau is invariant in a class of ellipticaldistributions with fixed correlation coefficient. But on the contrary, this is no longer true for Spearman’srho (for a counterexample, see Hult and Lindskog (2001)). However, there exist several inequalitieswhich estimate the difference between Kendall’s tau and Spearman’s rho. For more information onthis subject, see Nelsen (1999, Section 5.1.3.) and references given therein.

Distance-based Measures of Concordance

When discussing dependence measures, we showed how they can be generated from a (suitable) metric on the set C_2. It is now possible to use a similar approach in order to obtain concordance measures.

Proposition 4.3.2. If d is a metric on the set of all bivariate copulas, C_2, satisfying

1. d(W, π) and d(M, π) are equal;

2. for any C_1 and C_2 in C_2, d(C_1, C_2) ≤ d(M, W);

3. for any C ∈ C_2, d(C, W) = d(C^⊤, W) and d(C, M) = d(C^⊤, M) with C^⊤(u, v) = C(v, u);

4. for any C ∈ C_2, d(C, W) = d(Ĉ, M) and d(C, M) = d(Ĉ, W) with Ĉ(u, v) = v − C(1 − u, v);

5. for any sequence of copulas C_n converging point-wise to C, d(C_n, W) converges to d(C, W) and d(C_n, M) to d(C, M) (which is the case especially when C_n → C implies d(C_n, C) → 0);

6. for any C_1 and C_2,

d(C_1, W) ≤ d(C_2, W) and d(C_1, M) ≥ d(C_2, M)

whenever C_1 ≺ C_2,

then the quantity ρ_d(C) given by

(4.22) ρ_d(C) = [d(C, W) − d(C, M)] / d(M, W)

is a measure of concordance in the sense of Definition 4.3.2.


Proof. If (X, Y) has copula C, then the copula of (Y, X) is C^⊤ and hence A1 is satisfied by 3. Secondly, the triangle inequality yields

|d(C, M) − d(C, W)| ≤ d(M, W)

and thus the range of ρ_d is indeed [−1, 1]. If X and Y are independent then, because of 1., ρ_d is zero. In addition, if C = M or W, respectively, then obviously ρ_d = 1 or −1, respectively. On the other hand, if C ≠ M then, because d is a metric, d(C, M) > 0 and hence, by means of 6., ρ_d(C) < 1. Similarly, if C ≠ W then d(C, W) > 0 and hence, by means of 6., ρ_d(C) > −1. If T is strictly increasing, then the copula of (T(X), Y) is again C, which in turn implies ρ_d(T(X), Y) = ρ_d(X, Y). On the other hand, if T is decreasing, then the copula of (T(X), Y) is Ĉ(u, v) = v − C(1 − u, v) by means of Proposition 3.1.2, and ρ_d(T(X), Y) = ρ_d(Ĉ) equals −ρ_d(C) because of 4. Finally, A6 and A7 follow straightforwardly with 5. and 6., respectively.

Remark 4.3.4. Note that since

(4.23) π̂(u, v) = v − π(1 − u, v) = v − (1 − u)v = uv = π(u, v)

holds, the first requirement of Proposition 4.3.2 is implied by the fourth and hence can be omitted. Furthermore, Proposition 4.3.2 does not require the second assumption. The reason why we included it here is that the metric d should also yield a measure which possesses a meaningful interpretation. Hence, it seems reasonable to require that the "distance" between any two copulas does not exceed the "distance" between positive and negative dependence.

Hence, ρ_d is based on the difference between the distances from the two Fréchet–Hoeffding bounds. But, in order to make use of Proposition 4.3.2, we need to find a fitting metric on C_2. As was the case with dependence measures, it is possible to show, by means of Lemma 4.3.1, that the L_p-distance is again a suitable choice.

Example 4.6. Measures of Concordance Based on the L_p-distance

With Lemma 4.3.1 and Proposition 4.3.2, it is now easy to see that any L_p-distance, 1 ≤ p < ∞, i.e.

d(C_1, C_2) = ‖C_1 − C_2‖_p = (∫∫_{I²} |C_1(u, v) − C_2(u, v)|^p du dv)^{1/p},

yields a measure of concordance ρ_p(C),

ρ_p(C) = [‖W − C‖_p − ‖M − C‖_p] / ‖M − W‖_p.

1. For any (u, v) in I², by means of Lemma 4.3.1,

M(u, v) − π(u, v) = π̂(1 − u, v) − W(1 − u, v) = v − uv − W(1 − u, v) = v(1 − u) − W(1 − u, v) = π(1 − u, v) − W(1 − u, v),

and hence ‖W − π‖_p = ‖M − π‖_p (here π̂(u, v) := v − π(1 − u, v), and the substitution u ↦ 1 − u leaves the L_p-norm unchanged).

2. For any C_1 and C_2 in C_2 and any (u, v) in the unit square,

|C_1(u, v) − C_2(u, v)| ≤ M(u, v) − W(u, v)

and hence 2. is satisfied.

3. Condition 4. is satisfied by means of Lemma 4.3.1.


4. Conditions 3. and 6. are trivial, whereas 5. follows with the Lebesgue Dominated Convergence Theorem.

We will now show that for the L_1-distance, this approach yields Spearman's rho. For this purpose, we will need the following three integrals, which follow as special cases of the upcoming Theorem 4.3.5:

(4.24) ∫∫_{I²} M(u, v) du dv = 1/3,

(4.25) ∫∫_{I²} π(u, v) du dv = 1/4,

(4.26) ∫∫_{I²} W(u, v) du dv = 1/6.

With those results, we have on the one hand ‖M − W‖_1 = 1/3 − 1/6 = 1/6. On the other, ‖C − W‖_1 − ‖C − M‖_1 simplifies as follows:

‖C − W‖_1 − ‖C − M‖_1 = ∫∫_{I²} (C(u, v) − W(u, v)) du dv − ∫∫_{I²} (M(u, v) − C(u, v)) du dv
= ∫∫_{I²} (2C(u, v) − W(u, v) − M(u, v)) du dv
= ∫∫_{I²} 2C(u, v) du dv − (1/3 + 1/6)
= 2 (∫∫_{I²} C(u, v) du dv − 1/4) = 2 ∫∫_{I²} (C(u, v) − π(u, v)) du dv.

And hence,

(4.27) ρ_{L_1} = [‖C − W‖_1 − ‖C − M‖_1] / ‖M − W‖_1 = 12 ∫∫_{I²} (C(u, v) − π(u, v)) du dv = ρ_S.

A similar idea is also used by another well-known concordance measure, the so-called Gini's gamma; see Nelsen (1999, Section 5.1.4) for further details. We close this section with one more family of concordance measures, obtained by Scarsini (1984), featuring Spearman's rho and Blomqvist's medial correlation coefficient⁴ as special cases.

Example 4.7. Scarsini’s Measures of Concordance

Let ψ be a bounded, monotone and odd function defined on [−1/2, 1/2]. Then δψ(X,Y ) given by

(4.28) δψ(X, Y) = k ∫₀¹ ∫₀¹ ψ(u − 1/2) ψ(v − 1/2) dCXY(u, v),  k = 1 / ∫₀¹ ψ²(u − 1/2) du,

is a measure of concordance in the sense of Definition 4.3.2 (for a proof, see Scarsini (1984, Theorem 4)). Now, ψ(x) ≡ x yields Spearman's rho and ψ(x) ≡ sign(x) Blomqvist's medial correlation coefficient. But neither Kendall's tau nor Gini's gamma can be expressed according to (4.28) (cf. Scarsini (1984)).

4For a definition and further references on this quantity, see e.g. Nelsen (1999).
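Formula (4.28) can be illustrated numerically: for the comonotonic copula M the measure dCXY concentrates on the diagonal v = u, and for W on the antidiagonal v = 1 − u, so δψ reduces to a one-dimensional integral. A minimal sketch under these concentration assumptions (the normalization for ψ(x) = x comes out as k ≈ 12, matching Spearman's rho):

```python
# For C = M, dC_XY lives on the diagonal v = u; for C = W, on v = 1 - u.
# delta_psi then reduces to a one-dimensional integral, evaluated here by
# the midpoint rule. Illustrative sketch only.
N = 100_000
h = 1.0 / N

def scarsini_delta(psi, antidiagonal=False):
    k_inv = 0.0   # int_0^1 psi(u - 1/2)^2 du  (reciprocal of k in (4.28))
    num = 0.0     # diagonal (or antidiagonal) version of the double integral
    for i in range(N):
        u = (i + 0.5) * h
        v = 1.0 - u if antidiagonal else u
        k_inv += psi(u - 0.5) ** 2 * h
        num += psi(u - 0.5) * psi(v - 0.5) * h
    return num / k_inv

psi_rho = lambda x: x                    # yields Spearman's rho (k = 12)
psi_beta = lambda x: (x > 0) - (x < 0)   # yields Blomqvist's coefficient

k_rho = 1.0 / (sum(((i + 0.5) * h - 0.5) ** 2 for i in range(N)) * h)
vals = [scarsini_delta(p, f) for p in (psi_rho, psi_beta) for f in (False, True)]
# vals: delta for M and W under each psi, i.e. [1, -1, 1, -1] up to rounding
```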


4.3. MEASURES OF ASSOCIATION 65

4.3.3 Multivariate Extensions

There are several possibilities for carrying over concordance and dependence measures to higher dimensions. On the one hand, we can concentrate on the bivariate margins and examine their dependence using the dependence concepts and measures mentioned above. Thereafter, it is surely possible to proceed analogously as with multivariate correlation, i.e. to store the resulting quantities in an n × n dependence or concordance matrix.
On the other hand, multivariate counterparts of several of the aforementioned measures exist. The axioms can be formulated for multivariate random vectors5, but this is not as straightforward as it may seem. The main difficulty is due to the fact that the Frechet-Hoeffding lower bound is no longer a copula. Hence, we have to alter axiom A4, i.e. determine in which cases the measure attains its bounds (if ever).

Theoretically, the construction principle of distance-based dependence and concordance measures cananalogously be used in the multivariate case. But this approach is again not without inconsistencies.One of the reasons why the multivariate case is so ambiguous is given by the following result.

Theorem 4.3.5. If n ≥ 2, then

∫_{Iⁿ} Mn(u) du₁ du₂ ··· duₙ = 1/(n + 1),  (4.29)

∫_{Iⁿ} πn(u) du₁ du₂ ··· duₙ = 1/2ⁿ,  (4.30)

∫_{Iⁿ} Wn(u) du₁ du₂ ··· duₙ = 1/(n + 1)!.  (4.31)

Proof. The proof of this theorem uses standard calculus and can be found in the Appendix on page 139.

This theorem has an important consequence, which, among other things, shows that the set C of all copulas "shrinks" with rising dimension.

Corollary 4.3.3. For n ≥ 3,

(4.32) ∫_{Iⁿ} (Mn(u) − πn(u)) du₁ du₂ ··· duₙ ≠ ∫_{Iⁿ} (πn(u) − Wn(u)) du₁ du₂ ··· duₙ.

Furthermore,

lim_{n→∞} ∫_{Iⁿ} (Mn(u) − Wn(u)) du₁ du₂ ··· duₙ = 0,  (4.33)

lim_{n→∞} [∫_{Iⁿ} (πn(u) − Wn(u)) du₁ du₂ ··· duₙ] / [∫_{Iⁿ} (Mn(u) − πn(u)) du₁ du₂ ··· duₙ] = 0.  (4.34)

Proof. Theorem 4.3.5 yields:

∫_{Iⁿ} (Mn(u) − πn(u)) du₁ du₂ ··· duₙ = 1/(n + 1) − 1/2ⁿ,

∫_{Iⁿ} (πn(u) − Wn(u)) du₁ du₂ ··· duₙ = 1/2ⁿ − 1/(n + 1)!.

5See e.g. Wolff (1981)


Hence, (4.32) and (4.33) follow straightforwardly. (4.34) is straightforward as well:

lim_{n→∞} [1/2ⁿ − 1/(n + 1)!] / [1/(n + 1) − 1/2ⁿ] = lim_{n→∞} [(n + 1)/2ⁿ − 1/n!] / [1 − (n + 1)/2ⁿ] = 0.
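The two integral differences from the proof can also be tabulated exactly with rational arithmetic, making the decay of the ratio in (4.34) visible. A small sketch:

```python
# Exact rational evaluation of the two integral differences in the proof of
# Corollary 4.3.3, and of the ratio appearing in (4.34).
from fractions import Fraction
from math import factorial

def diffs(n):
    d_upper = Fraction(1, n + 1) - Fraction(1, 2 ** n)             # int (M_n - pi_n)
    d_lower = Fraction(1, 2 ** n) - Fraction(1, factorial(n + 1))  # int (pi_n - W_n)
    return d_upper, d_lower

pairs = {n: diffs(n) for n in (2, 3, 5, 10, 20)}
ratios = {n: float(d_lower / d_upper) for n, (d_upper, d_lower) in pairs.items()}
# for n = 2 both differences equal 1/12; the ratio then decays rapidly to 0
```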

By means of this corollary, the "distance" between Mn and πn is much bigger than the distance between πn and Wn. Hence, none of the Lp measures, 1 ≤ p ≤ ∞, will satisfy the condition

• d(W, π) and d(M, π) are equal.

With (4.32), a suitable metric on C will possibly not meet this requirement either (and if it does, it is a question whether it possesses a meaningful statistical interpretation). Hence, the definition of a concordance measure as a distance between the Frechet-Hoeffding bounds may be questionable. Furthermore, the standardization in (4.8) is not that clear either. However, since d(M, π) would be bigger than d(W, π), it is still common to construct the dependence measures according to (4.8). For example, the L1-distance yields the n-dimensional version of Schweizer and Wolff's dependence measure δ1:

(4.35) δⁿ₁(C) = [2ⁿ(n + 1)] / [2ⁿ − (n + 1)] ∫_{Iⁿ} |C(u) − u₁u₂···uₙ| du₁ du₂ ··· duₙ.

See Wolff (1981) for more details.

4.3.4 Tail Dependence Index

The tail dependence index is a popular quantity for measuring dependence, especially in insurance and finance. Since it captures dependencies in the upper-right-quadrant tail and the lower-left-quadrant tail of a bivariate distribution, respectively, it is of particular use when studying the dependence structure of extreme values.

Definition 4.3.5. Let X and Y be continuous random variables with distribution functions F and Grespectively. The coefficient of upper tail dependence λU and the coefficient of lower tail dependenceλL are given by

λU = lim_{u↑1} P[Y > G⁻¹(u) | X > F⁻¹(u)],  (4.36)

λL = lim_{u↓0} P[Y < G⁻¹(u) | X < F⁻¹(u)],  (4.37)

provided the limits exist.

It can be seen that these quantities are again based on the underlying copula. Since the conditionalprobabilities can be expressed in terms of copulas only, we have the following

Lemma 4.3.2. (Embrechts et al., 2002) If C is a bivariate copula such that the upper and lower tail dependence indices exist, then

(4.38) λU = lim_{u↑1} [1 − 2u + C(u, u)] / (1 − u) = lim_{u↓0} Ĉ(u, u)/u

and

(4.39) λL = lim_{u↓0} C(u, u)/u,

where Ĉ denotes the survival copula of C.


From the above lemma it clearly follows that the coefficient of upper tail dependence of C is the coefficient of lower tail dependence of the survival copula Ĉ and vice versa. Hence, if C is radially symmetric, λL and λU coincide. Because the tail dependence index depends on X and Y through the corresponding copula only, the axioms A1 and A2 of Definition 4.3.1 are satisfied (provided the tail dependence index exists). The tail dependence index also remains invariant under strictly increasing transformations according to Proposition 3.1.1. Since both λU and λL depend on C only locally, A3 and A4 are not necessarily true. A3 holds in one direction, i.e. if X and Y are independent, then they are upper and lower tail independent. The contrary, however, is not true, as can be seen from the following example.

Example 4.8. Tail Dependence of Gaussian and t-copulas. It is well-known (see e.g. Embrechts et al. (2002) and references given therein) that Gaussian copulas show no dependence in the tail:

λU(C^Ga_ρ) = 2 lim_{x→∞} Φ̄(x √(1 − ρ)/√(1 + ρ)) = 0 for ρ < 1,

where Φ̄ denotes the survival function of the standard normal distribution. With t-copulas, the situation is different. It can be shown (see again Embrechts et al. (2002) and references mentioned therein) that

λU(C^t_ρ) = 2 t̄_{ν+1}(√(ν + 1) √(1 − ρ)/√(1 + ρ)) for ρ > −1,

where t̄_{ν+1} denotes the survival function of the t distribution with ν + 1 degrees of freedom. Hence, t-copulas are tail dependent. The different behavior in the tail also makes it possible to distinguish between the Gaussian and t-copula families.

As to axiom A4, for M both the upper and the lower tail dependence index equal 1. For W, however, λU and λL are zero. This also shows that neither of the tail dependence indices is invariant under monotone transformations in general.
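As an illustration of the limit representation (4.39), one can evaluate C(u, u)/u for small u for a copula with known lower tail dependence; here the Clayton copula (cf. Example 3.5), for which λL = 2^(−1/θ) is a standard fact, serves as a sketch:

```python
# Numerical illustration of (4.39) with the Clayton copula (cf. Example 3.5)
# C_theta(u, v) = (u**-theta + v**-theta - 1)**(-1/theta), theta > 0, whose
# lower tail dependence index is known to be 2**(-1/theta).
def clayton(u, v, theta):
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

theta = 2.0
approx = [clayton(u, u, theta) / u for u in (1e-2, 1e-4, 1e-6)]
exact = 2.0 ** (-1.0 / theta)
# approx approaches exact (about 0.7071) as u decreases towards 0
```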

4.3.5 Linear Correlation

Another popular quantity for measuring dependence between two random variables is Pearson's linear correlation coefficient, defined as follows:

Definition 4.3.6. Let X and Y be random variables such that Var(X) and Var(Y) exist and are greater than zero. The linear correlation coefficient is given by

(4.40) %(X, Y) = Cov(X, Y) / √(Var(X) Var(Y)).

It is well-known that

• % is zero for independent random variables;

• % ∈ [−1, 1] and equals ±1 if and only if X and Y are perfectly linearly dependent, i.e. Y = aX + b almost surely for some a, b ∈ R, a ≠ 0 (it is worth noting that in the case of perfect linear dependence the distribution of Y is fully determined by that of X);

• % is invariant under increasing linear transformations, i.e. if X* = aX + b and Y* = cY + d for some b, d ∈ R and a, c ∈ (0, ∞), then %(X, Y) = %(X*, Y*).

It is easily seen that % does not depend on the underlying copula alone and is therefore influenced by the marginal distributions as well. However, the following result due to Hoeffding (1940a) suggests that the role played by copulas in this setting will nevertheless be important:


Theorem 4.3.6. Hoeffding's Lemma. Let (X, Y) be a bivariate random vector with joint distribution function H, copula6 C and marginal distribution functions F and G respectively. Furthermore, assume that the expectations E|XY|, E|X| and E|Y| are all finite. Then the covariance between X and Y can be expressed in the following way:

(4.41) Cov(X, Y) = ∫_{R²} (H(x, y) − F(x)G(y)) dx dy = ∫_{R²} (C(F(x), G(y)) − π(F(x), G(y))) dx dy.

Proof. For a proof, see e.g. Dhaene and Goovaerts (1996).
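Hoeffding's representation (4.41) is easy to verify numerically in a simple case: for comonotonic uniform marginals, H(x, y) = min(x, y), F(x)G(y) = xy, and the integral must equal Cov(U, U) = Var(U) = 1/12. A minimal midpoint-rule sketch:

```python
# Midpoint-rule check of Hoeffding's lemma (4.41) for comonotonic uniform
# marginals: the double integral of H - FG over [0,1]^2 should equal
# Cov(U, U) = Var(U) = 1/12. Illustrative sketch only.
N = 500
h = 1.0 / N
cov = sum(min((i + 0.5) * h, (j + 0.5) * h) - ((i + 0.5) * h) * ((j + 0.5) * h)
          for i in range(N) for j in range(N)) * h * h
```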

With this representation, the following properties of % follow:

Theorem 4.3.7. (Hoeffding, 1940a), (Embrechts et al., 2002) Let (X, Y) be a bivariate random vector with fixed marginal distribution functions F and G respectively, and unspecified copula. Furthermore, suppose that 0 < Var(X), Var(Y) < ∞ holds. Then

1. The set of all possible correlations is a closed interval [%min, %max] with %min < 0 < %max.

2. If C1 and C2 are copulas, then the relation C1 ≺ C2 yields %C1(X, Y) ≤ %C2(X, Y) (where %Ci(X, Y) denotes the correlation coefficient of X and Y if the underlying copula is Ci);

3. The extremal correlations %min and %max are attained if and only if X and Y are countermonotonic and comonotonic respectively.

4. %max = 1 if and only if X and Y are increasing linear transformations of each other, and %min = −1 if and only if X and Y are decreasing linear transformations of each other.

Proof. The second statement follows from Hoeffding's lemma and the Frechet-Hoeffding inequality; the proof of the remaining results can be found in Embrechts et al. (2002).

Remark 4.3.5. Note that %max is not necessarily equal to −%min. To see this, assume X and Y are Bernoulli distributed random variables, i.e. X ∼ B(p) and Y ∼ B(q). As we will calculate in Example 7.3, the correlation coefficient equals

%M = √(p(1 − q))/√(q(1 − p)) if p ≤ q, and √(q(1 − p))/√(p(1 − q)) otherwise;

%W = −√(pq)/√((1 − p)(1 − q)) if p ≤ 1 − q, and −√((1 − p)(1 − q))/√(pq) otherwise,

if X and Y are comonotonic and countermonotonic respectively. Setting p = 1/4 and q = 2/3 yields:

%M = %max = √(1/4 · 1/3)/√(2/3 · 3/4) = √(1/12)/√(6/12) = √(1/6),

%W = %min = −√(1/4 · 2/3)/√(1/3 · 3/4) = −√(2/12)/√(3/12) = −√(2/3),

so that we clearly have %max ≠ −%min.
However, %max(X, Y) corresponds to −%min(X, −Y). Indeed, by the properties of linear correlation, we

6If X and Y are arbitrary, then C means some copula in CH .


have on the one hand that %(X, −Y) is equal to −%(X, Y). On the other, if X and Y are comonotonic, then (e.g. according to Theorem 4.1.3) X and −Y are countermonotonic and hence

%min(X, −Y) = %(X, −Y) = −%(X, Y) = −%max(X, Y).
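The numbers in Remark 4.3.5 can be reproduced exactly: for Bernoulli marginals, the comonotonic and countermonotonic couplings give P[X = 1, Y = 1] equal to min(P[X=1], P[Y=1]) and max(P[X=1] + P[Y=1] − 1, 0), respectively. A small sketch:

```python
# Exact reproduction of the extremal correlations in Remark 4.3.5 for
# X ~ B(p), Y ~ B(q) with F(0) = p = 1/4 and G(0) = q = 2/3.
from fractions import Fraction
from math import sqrt

p, q = Fraction(1, 4), Fraction(2, 3)
px1, py1 = 1 - p, 1 - q                       # P[X = 1] = 3/4, P[Y = 1] = 1/3
var_x, var_y = px1 * (1 - px1), py1 * (1 - py1)

def corr(p11):
    # correlation of the Bernoulli pair when P[X = 1, Y = 1] = p11
    return float(p11 - px1 * py1) / sqrt(float(var_x * var_y))

rho_max = corr(min(px1, py1))                 # comonotonic coupling
rho_min = corr(max(px1 + py1 - 1, 0))         # countermonotonic coupling
# rho_max = sqrt(1/6) whereas rho_min = -sqrt(2/3), so rho_max != -rho_min
```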

In view of Theorem 4.3.7, it is now obvious that % does not satisfy all axioms for a concordance measure; in fact it satisfies only A1, A2, A3 and A7, whereas A4 and A5 hold only for linear transformations. A6 does not hold either (cf. Schweizer and Wolff (1981)).

With Sklar's Theorem, it is now possible to obtain families of distributions (by choosing a suitable copula family) such that the correlation coefficient ranges over [%min, %max]. Moreover, some of them have the property that their parameter and % are simply related. We will mention the following few7.

Example 4.9. The Frechet and Mardia family. Let α, β be in [0, 1] with α + β ≤ 1. The two-parameter family of copulas

(4.42) C^Fre_{α,β}(u, v) = αM(u, v) + (1 − α − β)π(u, v) + βW(u, v)

is called the Frechet family. This family includes the Frechet-Hoeffding bounds as well as the independence copula and yields, upon setting α = 0 and β > 0, a one-parameter family with negative correlation, and, upon setting β = 0 and α > 0, a one-parameter family with positive correlation. In both cases the correlation is a function of the corresponding parameters, i.e. %α,β = α%max + β%min.

Another one-parameter family can be built by substituting α and β according to α = θ²(1 − θ)/2 and β = θ²(1 + θ)/2 for θ ∈ [−1, 1]:

(4.43) C^Mar_θ(u, v) = [θ²(1 − θ)/2] M(u, v) + (1 − θ²) π(u, v) + [θ²(1 + θ)/2] W(u, v).

This so-called Mardia family (cf. Nelsen (1999), p. 12) again includes the Frechet-Hoeffding bounds as well as the independence copula. The correlation of its members ranges over the whole interval [%min, %max] and is given by %θ = [θ²(1 − θ)/2] %max + [θ²(1 + θ)/2] %min. Note that the members of neither the Frechet nor the Mardia family are ordered or possess densities.

Example 4.10. The Frank family (cf. Example 3.4), the Plackett family (cf. Section 3.5.3), the Clayton family (cf. Example 3.5) as well as the Gauss and t-copula families (cf. Examples 3.2 and 3.3) are all positively ordered and include (possibly as limiting cases) the Frechet-Hoeffding bounds. Hence, the correlation coefficient ranges over [%min, %max] (and attains the bounds at least asymptotically). Furthermore, if C1 and C2 are members of one of the aforementioned families satisfying C1 ≺ C2, then %C1(X, Y) ≤ %C2(X, Y). However, the way the correlation coefficient depends on the copula parameters is in general not easy to determine (if a closed-form expression exists at all). Note that even if the Gauss or the t-copula family is chosen and the marginals are not normally or t-distributed respectively, the correlation coefficient and the copula parameter ρ are not necessarily equal (or related by a closed-form expression).

7Note that the upcoming Frechet and Mardia families are special examples of mixture distributions. Such distributions are in general characterized by a simple relation between the linear correlation coefficient and the copula parameters. For more details on this subject, see e.g. Embrechts et al. (2002).


Chapter 5

Multivariate Discrete Distributions

In the previous chapter, we saw that many important dependence concepts and measures of association can be expressed in terms of the corresponding copula only, and are thus independent of the marginal distributions. Most of the results, however, have been stated under the assumption that the (one-dimensional) marginals are continuous. This is, unfortunately, also the case with most of the literature written on copulas. As we will soon see, things not only become more complicated in the non-continuous case; it even becomes doubtful whether it is possible to determine the dependence structure without involving the marginals at all.

In this chapter, we will discuss dependence concepts and measures of association when discontinuities are allowed. We also mention the interrelation between the weak convergence of the joint distribution function and the underlying copula. Our survey will be based on Marshall (1996), one of the rare papers dealing with this topic. Because of the ambiguities connected with non-continuity, we will focus heavily on just one special case: bivariate random variables with discrete marginals and finite support.

Notation 5.0.1. Let H be a bivariate distribution function with discrete marginals F and G both having a finite support. Then the supports will be denoted by {ξ1, . . . , ξm} and {η1, . . . , ηn}, respectively, with

a < ξ1 < ξ2 < · · · < ξm and c < η1 < η2 < · · · < ηn, a, c ∈ R.

Moreover, we set ξ0 := a and η0 := c and denote by hij, i = 1, . . . , m and j = 1, . . . , n, the joint probability densities, i.e., the mass assigned by H to the rectangle (ξi−1, ξi] × (ηj−1, ηj].

When the marginals are not necessarily continuous, the underlying copula is not unique. Recall that for an n-dimensional vector X and an n-dimensional joint distribution function H, we denoted by CX and CH, respectively, the class of all possible copulas. Thus, when investigating the role played by copulas in the "non-continuous" case, a natural starting point is to focus on just this class. In order to do so, there is one important case which should be considered first: the bivariate Bernoulli distribution.

5.1 Bivariate Bernoulli Distribution

To get an impression of where we are heading, we start with the simplest case, bivariate discrete distributions with Bernoulli distributed margins.

Suppose F and G are distribution functions of two possibly dependent Bernoulli distributed random variables X and Y with

F(0) = p and G(0) = q.

Furthermore, assume that the random vector (X, Y) has a distribution function Hpq and a copula class Cpq. The joint distribution of X and Y is then given by

P(X = 0, Y = 0) = H(0, 0) = C(F(0), G(0)) = C(p, q),
P(X = 0, Y = 1) = H(0, 1) − H(0, 0) = C(F(0), G(1)) − C(F(0), G(0)) = p − C(p, q),
P(X = 1, Y = 0) = H(1, 0) − H(0, 0) = C(F(1), G(0)) − C(F(0), G(0)) = q − C(p, q),
P(X = 1, Y = 1) = 1 − C(p, q) − (p − C(p, q)) − (q − C(p, q)) = 1 − p − q + C(p, q),   (5.1)

for any C ∈ Cpq. In this case, the underlying copula is uniquely determined in a single point of the interior of the unit square, and Cpq consists of all copulas in C satisfying C(p, q) = H(0, 0). Hence, there exist numerous copulas which lead to one and the same joint distribution.
On the one hand, this has the consequence that the value of C(p, q) represents the only possibility for the copula to influence the joint distribution. On the other, Cpq depends crucially on the marginal distributions through the point (p, q) = (F(0), G(0)), where any member of Cpq must take one and the same value.
Although the class Cpq is quite large, it is possible to determine its upper and lower bound (note that these are not necessarily the Frechet-Hoeffding bounds, because C(p, q) need not coincide with either M(p, q) or W(p, q)).

Theorem 5.1.1. (Nelsen, 1999, Theorem 3.2.2) Let C be a copula and suppose C(p, q) = θ, where (p, q) is in (0, 1)² and θ satisfies max(p + q − 1, 0) ≤ θ ≤ min(p, q). Then

(5.2) CL(u, v) ≤ C(u, v) ≤ CU(u, v) for any u, v ∈ [0, 1],

where CU and CL are shuffles of M defined by

(5.3) CU = M(4, {[0, θ], [θ, p], [p, p + q − θ], [p + q − θ, 1]}, (1, 3, 2, 4), (1, 1, 1, 1)),

(5.4) CL = M(4, {[0, p − θ], [p − θ, p], [p, 1 − q + θ], [1 − q + θ, 1]}, (4, 2, 3, 1), (−1, −1, −1, −1)).

Since CL(p, q) = CU(p, q) = θ, the bounds are best-possible.


Figure 5.1: Supports of CU (a) and CL (b).
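In the Bernoulli case, the whole joint law (5.1) is thus parameterized by the single number θ = C(p, q), which by the Frechet-Hoeffding inequality may range over [W(p, q), M(p, q)]. A minimal sketch (the function name is illustrative only):

```python
# The bivariate Bernoulli law (5.1) as a function of the single value
# theta = C(p, q); theta may range over [W(p,q), M(p,q)] by the
# Frechet-Hoeffding inequality.
def bernoulli_joint(p, q, theta):
    assert max(p + q - 1.0, 0.0) <= theta <= min(p, q)
    return {(0, 0): theta,
            (0, 1): p - theta,           # P(X = 0, Y = 1)
            (1, 0): q - theta,           # P(X = 1, Y = 0)
            (1, 1): 1.0 - p - q + theta}

joint = bernoulli_joint(0.25, 2 / 3, 0.2)
# a valid probability distribution: nonnegative entries summing to one
```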


5.2 The Class CX

Throughout this section, we consider, for the sake of simplicity, bivariate joint distribution functions H with marginals F and G. According to Sklar's Theorem, there exists at least one copula, uniquely determined on ranF × ranG, satisfying H(x, y) = C(F(x), G(y)) for all (x, y) (i.e. CH is never empty). As is well known, one such copula can be constructed by a standard linear interpolation of the unique subcopula. Because it will play a crucial role in the upcoming results, we will start our survey by introducing this so-called standard extension. Thereafter, we will focus on how big CH possibly is and see that an extension of Theorem 5.1.1 is possible.

5.2.1 The Standard Extension

Theorem 5.2.1. Let F and G be marginal distribution functions of the joint distribution function H and S the unique subcopula with domain ranF × ranG satisfying H(x, y) = S(F(x), G(y)). Furthermore, let S* denote an extension of S to the closure of ranF × ranG. For any (u, v) ∈ I², let a1 and a2 denote, respectively, the greatest and the least point in the closure of ranF that satisfy a1 ≤ u ≤ a2, and let b1 and b2, respectively, be the greatest and the least point in the closure of ranG that satisfy b1 ≤ v ≤ b2. Then the function defined by

(5.5) CS(u, v) = (1 − λ)(1 − µ) S*(a1, b1) + (1 − λ)µ S*(a1, b2) + λ(1 − µ) S*(a2, b1) + λµ S*(a2, b2)

with

λ = (u − a1)/(a2 − a1) if a1 < a2, and λ = 1 if a1 = a2;

µ = (v − b1)/(b2 − b1) if b1 < b2, and µ = 1 if b1 = b2,

is a copula coinciding with S* on its domain. CS is called the standard extension copula of S.

Proof. See Nelsen (1999, Lemma 2.3.5).
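The standard extension (5.5) is straightforward to implement for discrete marginals with finite support. The following sketch builds the subcopula from a matrix of joint probabilities (an arbitrary illustrative example, not taken from the text) and interpolates bilinearly:

```python
# Sketch of the standard extension (5.5) for finite discrete supports: build
# the subcopula S on the grid (F(xi_i), G(eta_j)) from an illustrative pmf
# matrix h, then interpolate bilinearly between adjacent grid points.
import bisect

h = [[0.10, 0.20],
     [0.25, 0.05],
     [0.15, 0.25]]
m, n = len(h), len(h[0])
F = [0.0] + [sum(h[k][l] for k in range(i + 1) for l in range(n)) for i in range(m)]
G = [0.0] + [sum(h[k][l] for k in range(m) for l in range(j + 1)) for j in range(n)]
S = [[sum(h[k][l] for k in range(i) for l in range(j)) for j in range(n + 1)]
     for i in range(m + 1)]             # S[i][j] = H(xi_i, eta_j)

def C_S(u, v):
    # locate the grid cell with F[i-1] <= u <= F[i] and G[j-1] <= v <= G[j]
    i = min(max(bisect.bisect_left(F, u), 1), m)
    j = min(max(bisect.bisect_left(G, v), 1), n)
    lam = (u - F[i - 1]) / (F[i] - F[i - 1])
    mu = (v - G[j - 1]) / (G[j] - G[j - 1])
    return ((1 - lam) * (1 - mu) * S[i - 1][j - 1] + (1 - lam) * mu * S[i - 1][j]
            + lam * (1 - mu) * S[i][j - 1] + lam * mu * S[i][j])
```

At the grid points C_S reproduces the subcopula values, and it has uniform margins in between.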

Remark 5.2.1. If both margins have finite supports, i.e., {ξ1, ξ2, . . . , ξm} and {η1, η2, . . . , ηn}, respectively (cf. Notation 5.0.1), then the above extension is absolutely continuous w.r.t. λ². Since

∂u∂v CS(u, v) = [C(F(ξi), G(ηj)) − C(F(ξi−1), G(ηj)) − C(F(ξi), G(ηj−1)) + C(F(ξi−1), G(ηj−1))] / ([F(ξi) − F(ξi−1)][G(ηj) − G(ηj−1)])

for any u in (F(ξi−1), F(ξi)) and v in (G(ηj−1), G(ηj)), 1 ≤ i ≤ m, 1 ≤ j ≤ n, the density is given by

c(u, v) = Σ_{i=1}^{m} Σ_{j=1}^{n} 1{u ∈ (F(ξi−1), F(ξi)]} 1{v ∈ (G(ηj−1), G(ηj)]} Δ_{(F(ξi−1),G(ηj−1))}^{(F(ξi),G(ηj))} CS / [(F(ξi) − F(ξi−1))(G(ηj) − G(ηj−1))]   (5.6a)

= Σ_{i=1}^{m} Σ_{j=1}^{n} 1{u ∈ (F(ξi−1), F(ξi)]} 1{v ∈ (G(ηj−1), G(ηj)]} P[X = ξi, Y = ηj] / (P[X = ξi] P[Y = ηj])   (5.6b)

= Σ_{i=1}^{m} Σ_{j=1}^{n} 1{u ∈ (F(ξi−1), F(ξi)]} 1{v ∈ (G(ηj−1), G(ηj)]} PCS((F(ξi−1), F(ξi)] × (G(ηj−1), G(ηj)]) / λ²((F(ξi−1), F(ξi)] × (G(ηj−1), G(ηj)])   (5.6c)

With (5.6b) it is easy to see that CS is the independence copula π if and only if X and Y are independent. If, however, X and Y are either comonotonic or countermonotonic, CS cannot coincide with either M or W, since CS is never singular.


Propositions 3.1.1 and 3.1.2 described how copulas and subcopulas react to transformations of the marginals. The following proposition shows how such transformations alter the standard extension copula CS.

Proposition 5.2.1. Let X and Y be discrete random variables with finite support and standard extension copula CS. Then the standard extension copula C⊤S of (Y, X) is given by

(5.7) C⊤S(u, v) = CS(v, u) for all u, v ∈ [0, 1].

Furthermore, let T be a continuous and strictly monotone transformation on ranX. Then

1. if T is increasing, the standard extension copula of (T(X), Y) is equal to CS;

2. if T is decreasing, the standard extension copula of (T(X), Y) is given by

v − CS(1 − u, v) for all u, v ∈ [0, 1].

Proof. The proof is based upon standard calculus and is therefore given in the Appendix on page140.

We close this section with an alternative construction of the standard extension copula. Again, suppose that F and G are discrete with a finite support (recall Notation 5.0.1). In order to transform the joint distribution function into a continuous one, the jumps of H can be smoothed out by a linear interpolation:

H̃(x, y) = H(ξi−1, ηj−1) + [(x − ξi−1)/(ξi − ξi−1)] (H(ξi, ηj−1) − H(ξi−1, ηj−1)) + [(y − ηj−1)/(ηj − ηj−1)] (H(ξi−1, ηj) − H(ξi−1, ηj−1)) + [(x − ξi−1)(y − ηj−1)] / [(ξi − ξi−1)(ηj − ηj−1)] hij,

for ξi−1 < x ≤ ξi, ηj−1 < y ≤ ηj (i = 1, . . . , m; j = 1, . . . , n).

As is well-known, this smoothed distribution function has marginals given by

F̃(x) = F(ξi−1) + [(x − ξi−1)/(ξi − ξi−1)] (F(ξi) − F(ξi−1)), for ξi−1 < x ≤ ξi (i = 1, . . . , m),

G̃(y) = G(ηj−1) + [(y − ηj−1)/(ηj − ηj−1)] (G(ηj) − G(ηj−1)), for ηj−1 < y ≤ ηj (j = 1, . . . , n),

which are exactly the marginal distribution functions smoothed out in the same way as the joint distribution function. Now, since the standard extension CS results from exactly the same smoothing process, we can expect the following:

Proposition 5.2.2. With the above assumptions, the unique copula corresponding to the smoothed distribution function H̃ is the standard extension copula CS, i.e.

(5.8) H̃(x, y) = CS(F̃(x), G̃(y)) for all x, y ∈ R,

where F̃ and G̃ denote the correspondingly smoothed marginals.

Proof. This statement can be proved by straightforward calculus and is again given in the Appendix on page 142.


5.2.2 Carley’s Extensions

Throughout this section, we consider a bivariate discrete distribution H with discrete marginals F and G both having a finite support. Besides Notation 5.0.1, we will set ai := F(ξi) and bj := G(ηj).
Recall that in the interior of the unit square, the corresponding copula is uniquely determined only in the points

(F(ξi), G(ηj)), i = 0, . . . , m and j = 0, . . . , n.

Furthermore, hij is equal to the mass any C ∈ CH assigns to the rectangle (F(ξi−1), F(ξi)] × (G(ηj−1), G(ηj)] for i = 1, . . . , m and j = 1, . . . , n. In this situation, CH can be bounded as follows:

Theorem 5.2.2. (Carley, 2002, Theorem 1, Theorem 2) Let F and G be discrete distributions with finite support and joint distribution function H. Then, for any C in CH,

(5.9) CL(u, v) ≤ C(u, v) ≤ CU(u, v) for any u, v ∈ [0, 1],

where CU, CL are shuffles of M defined by

(5.10) CU(u, v) = Σ_{i=1}^{m} Σ_{j=1}^{n} max(0, min(u − αij, v − βij, hij))

where

α11 = 0, β11 = 0,
α_{i j+1} = α_{ij} + h_{ij}, β_{i+1 j} = β_{ij} + h_{ij},
α_{i+1 1} = a_i, β_{1 j+1} = b_j,

and

(5.11) CL(u, v) = Σ_{i=1}^{m} Σ_{j=1}^{n} max(min(u − γij, hij) + min(v − δij, hij) − hij, 0),

with

γ_{1n} = 0, δ_{m1} = 0,
γ_{i n−j} = γ_{i n−j+1} + h_{i n−j+1}, δ_{m−i j} = δ_{m−i+1 j} + h_{m−i+1 j},
γ_{i+1 n} = a_i, δ_{m j+1} = b_j.

Because CU and CL belong to CH, the bounds are best possible.
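Carley's upper bound can be computed directly once the shifts αij and βij are written in closed form (cf. (5.12a)–(5.12b) below); since CU belongs to CH, it must reproduce the cumulative probabilities at the grid points (ai, bj). A sketch with an arbitrary illustrative pmf matrix (0-based indices):

```python
# Sketch of Carley's upper bound (5.10), with the shifts alpha_ij, beta_ij
# in closed form (cf. (5.12a)-(5.12b)); CU lies in C_H, so it reproduces
# the cumulative probabilities H at the grid points (a_i, b_j).
h = [[0.10, 0.20],
     [0.25, 0.05],
     [0.15, 0.25]]
m, n = len(h), len(h[0])

def alpha(i, j):   # mass preceding cell (i, j) in CU's row-wise ordering
    return (sum(h[k][l] for k in range(i) for l in range(n))
            + sum(h[i][l] for l in range(j)))

def beta(i, j):    # mass preceding cell (i, j) in CU's column-wise ordering
    return (sum(h[k][l] for k in range(m) for l in range(j))
            + sum(h[k][j] for k in range(i)))

def C_U(u, v):
    return sum(max(0.0, min(u - alpha(i, j), v - beta(i, j), h[i][j]))
               for i in range(m) for j in range(n))

a = [sum(h[k][l] for k in range(i + 1) for l in range(n)) for i in range(m)]
b = [sum(h[k][l] for k in range(m) for l in range(j + 1)) for j in range(n)]
H = lambda i, j: sum(h[k][l] for k in range(i + 1) for l in range(j + 1))
```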

Remark 5.2.2. If X and Y are comonotonic, i.e. if the Frechet-Hoeffding upper bound M is included in CXY, then, by means of (5.9) and the Frechet-Hoeffding inequality, Carley's extension CU coincides with M. Similarly, if X and Y are countermonotonic, then CL = W.

In order to see that Carley's extensions are indeed shuffles of M, we examine them more closely. For i = 1, . . . , m and j = 1, . . . , n, the indices from Theorem 5.2.2 can be re-written as follows:

α_{ij} = Σ_{k=1}^{i−1} Σ_{l=1}^{n} h_{kl} + Σ_{l=1}^{j−1} h_{il},  γ_{ij} = Σ_{k=1}^{i−1} Σ_{l=1}^{n} h_{kl} + Σ_{l=j+1}^{n} h_{il},   (5.12a)

β_{ij} = Σ_{k=1}^{m} Σ_{l=1}^{j−1} h_{kl} + Σ_{k=1}^{i−1} h_{kj},  δ_{ij} = Σ_{k=1}^{m} Σ_{l=1}^{j−1} h_{kl} + Σ_{k=i+1}^{m} h_{kj},   (5.12b)


and hence, for i = 1, . . . , m and j = 1, . . . , n, each interval [a_{i−1}, a_i] and [b_{j−1}, b_j], respectively, is divided into n and m subintervals, respectively, according to

a_{i−1} = α_{i1} ≤ α_{i2} ≤ · · · ≤ α_{in} ≤ a_i = α_{i+1 1} for CU,   (5.13a)

b_{j−1} = β_{1j} ≤ β_{2j} ≤ · · · ≤ β_{mj} ≤ b_j = β_{1 j+1} for CU,   (5.13b)

a_{i−1} = γ_{in} ≤ γ_{i n−1} ≤ · · · ≤ γ_{i1} ≤ a_i = γ_{i+1 n} for CL,   (5.13c)

b_{j−1} = δ_{mj} ≤ δ_{m−1 j} ≤ · · · ≤ δ_{1j} ≤ b_j = δ_{m j+1} for CL.   (5.13d)

Now, each summand in (5.10) is equal to

(5.14) max(0, min(u − α_{ij}, v − β_{ij}, h_{ij})) =
  0, if u < α_{ij} or v < β_{ij};
  min(u − α_{ij}, v − β_{ij}), if u ∈ [α_{ij}, α_{i j+1}] and v ≥ β_{ij}, or u ≥ α_{ij} and v ∈ [β_{ij}, β_{i+1 j}];
  h_{ij}, if u ≥ α_{i j+1} and v ≥ β_{i+1 j};

and each summand in (5.11) to

max(min(u − γ_{ij}, h_{ij}) + min(v − δ_{ij}, h_{ij}) − h_{ij}, 0) =
  0, if u < γ_{ij} or v < δ_{ij};
  (u − γ_{ij}) + (v − δ_{ij}) − h_{ij}, if u ∈ [γ_{ij}, γ_{i j−1}] and v ∈ [δ_{ij}, δ_{i−1 j}];
  u − γ_{ij}, if u ∈ [γ_{ij}, γ_{i j−1}] and v ≥ δ_{i−1 j};
  v − δ_{ij}, if u ≥ γ_{i j−1} and v ∈ [δ_{ij}, δ_{i−1 j}];
  h_{ij}, if u ≥ γ_{i j−1} and v ≥ δ_{i−1 j}.

In either case, the summand describes the mass placed by the extension into the intersection of [0, u] × [0, v] with (a_{i−1}, a_i] × (b_{j−1}, b_j]. Hence, the part of the support of CU which lies in (a_{i−1}, a_i] × (b_{j−1}, b_j] is the line segment connecting the points (α_{ij}, β_{ij}) and (α_{i j+1}, β_{i+1 j}). Analogously, the intersection of the support of CL and the rectangle (a_{i−1}, a_i] × (b_{j−1}, b_j] consists of the line segment connecting the points (γ_{ij}, δ_{i−1 j}) and (γ_{i j−1}, δ_{ij}):


Figure 5.2: Intersections of the supports of CU (a) and CL (b) and the rectangle (ai−1, ai] × (bj−1, bj ].

Page 77: Dependence of Non-Continuous Random Variables · Dependence of Non-Continuous Random Variables Von der Fakult at f ur Mathematik und Naturwissenschaften der Carl von Ossietzky Universit

5.3. DEPENDENCE CONCEPTS 77

Moreover, because CU and CL belong to CH, the mass placed by either of the extensions into the rectangle (a_{i−1}, a_i] × (b_{j−1}, b_j] must be equal to h_{ij}. On the other hand, by means of (5.12),

α_{i j+1} − α_{ij} = β_{i+1 j} − β_{ij} = γ_{i j−1} − γ_{ij} = δ_{i−1 j} − δ_{ij} = h_{ij}.

Hence, no mass lies in the shaded regions in Figure 5.2 (a) and (b). From this it is obvious that Carley's extensions are indeed shuffles of M.
We conclude the introduction of Carley's extensions with one final observation (for a proof, see Carley (2002)):

Corollary 5.2.1. (Carley, 2002, Theorem 2) Let S be a subcopula and CL and CU the corresponding Carley's extensions. Furthermore, let S̄ be the subcopula given by

S̄(u, v) := v − S(1 − u, v)

and C̄L and C̄U Carley's extensions corresponding to S̄. Then C̄U and C̄L satisfy

(5.15) C̄L(u, v) = v − CU(1 − u, v) for any u, v ∈ [0, 1]

and

(5.16) C̄U(u, v) = v − CL(1 − u, v) for any u, v ∈ [0, 1].

The above-mentioned extensions concern only the bivariate case. At first sight, it may seem possible to extend the strategy to higher dimensions. This, unfortunately, is not true. In the first place, since Wn fails to be a copula if n ≥ 3, the minimum will not be a copula in general. But as illustrated in Carley (2002), even for n = 3, obtaining a closed formula for the bounds seems to be quite difficult (if it is possible at all).

5.3 Dependence Concepts

As we saw in Section 4.2, various dependence concepts, such as orthant or tail dependence, are properties of the corresponding copula alone if the marginals are continuous. Since the copula is not uniquely determined on the whole unit n-cube in the general case, these results cannot be extended, i.e. not all copulas in CH reflect the dependence structure of H. However, we will show in this section that for orthant and tail dependence at least one member of CH does so: the standard extension copula.

Theorem 5.3.1. If H is positive or negative quadrant dependent, respectively, then the standard extension copula CS satisfies CS ≻ π and CS ≺ π, respectively.

Proof. See Marshall (1996, Proposition 2.1.).

Moreover, with the definition of the standard extension, this result can be extended in the followingway:

Proposition 5.3.1. Let H1 and H2 be bivariate distribution functions with common marginals (not necessarily continuous). Then

H1(x, y) ≤ H2(x, y) for any x, y ∈ R

if and only if C1S ≺ C2S, where CkS denotes the standard extension copula of Hk, k = 1, 2.


The next theorem shows that the standard extension copula also reflects tail dependence of thecorresponding joint distribution.

Theorem 5.3.2. Let X and Y be random variables, not necessarily continuous, and let CS be thestandard extension copula of the corresponding unique subcopula S. Then

1. Y is left tail decreasing in X if and only if, for any v in I, CS(u, v)/u is nonincreasing in u;

2. X is left tail decreasing in Y if and only if, for any u in I, CS(u, v)/v is nonincreasing in v;

3. Y is right tail increasing in X if and only if, for any v in I, ĈS(1 − u, 1 − v)/(1 − u) is nonincreasing in 1 − u;

4. X is right tail increasing in Y if and only if, for any u in I, ĈS(1 − u, 1 − v)/(1 − v) is nonincreasing in 1 − v;

where ĈS denotes the survival copula corresponding to CS.

Proof. Since all four statements can be proved along the same lines, we show just the first one here. By definition, Y is left tail decreasing in X if P[Y ≤ y | X ≤ x] is a nonincreasing function of x for all y. Now, this conditional probability can be re-written according to

(†)  P[Y ≤ y | X ≤ x] = H(x, y)/F(x) = S(F(x), G(y))/F(x) = CS(F(x), G(y))/F(x).

Hence, because CS(u, v)/u nonincreasing in u for all v implies S(u, v)/u nonincreasing in u on ranF for all v ∈ ranG, and because F is nondecreasing, the "if" part of the statement follows. For the "only if" part, we have to show that LTD(Y|X) yields

(‡)  ∀ v ∈ [0, 1] : 0 ≤ u1 ≤ u2 ≤ 1 ⟹ CS(u1, v)/u1 ≥ CS(u2, v)/u2.

Since F is nondecreasing, (‡) follows from (†) for all (u1, v), (u2, v) ∈ ranF × ranG and, consequently, for all (u1, v), (u2, v) in the closure of ranF × ranG. As to the remaining points (u, v) in I², we have to distinguish several cases:

1. v ∈ ranG: If u1 and u2 are numbers in the unit interval such that u1 ≤ u2, then there are again several situations which have to be discussed separately:

(a) u1 ∈ ranF and u2 ∉ ranF. Here the least and the greatest point a1 and a2 in ranF which satisfy a1 ≤ u2 ≤ a2 also comply with u1 ≤ a1 ≤ u2 ≤ a2. Hence we have

CS(u2, v)/u2 = [ (a2 − u2)/(a2 − a1) · CS(a1, v) + (u2 − a1)/(a2 − a1) · CS(a2, v) ] / u2
            = (a2 − u2)/(a2 − a1) · (a1/u2) · CS(a1, v)/a1 + (u2 − a1)/(a2 − a1) · (a2/u2) · CS(a2, v)/a2.

Since CS(u, v)/u is nonincreasing in u on ranF for any v ∈ ranG, we have

CS(u2, v)/u2 ≤ [ (a2 − u2)/(a2 − a1) · (a1/u2) + (u2 − a1)/(a2 − a1) · (a2/u2) ] · CS(u1, v)/u1 = CS(u1, v)/u1,

since the term in brackets equals 1.


(b) u1 6∈ ranF and u2 ∈ ranF . Here the least and the greatest point in ranF which satisfya1 ≤ u1 ≤ a2 also comply with a1 ≤ u1 ≤ a2 ≤ u2. By the same argument as in thepreceding case,

CS(u1, v)

u1≥ CS(u2, v)

u2.

(c) Finally, suppose that neither u1 nor u2 lie in ranF. The greatest and the least points in ranF which fulfill a1 ≤ u1 ≤ a2 and a∗1 ≤ u2 ≤ a∗2, respectively, satisfy either u1 ≤ a2 ≤ a∗1 ≤ u2 or a1 = a∗1 and a2 = a∗2. In the former case the statement immediately follows from the cases discussed previously,

CS(u2, v)/u2 ≤ CS(a∗1, v)/a∗1 ≤ CS(u1, v)/u1.

In the latter, we show that CS(u2, v)/u2 − CS(u1, v)/u1 is not greater than zero:

CS(u2, v)/u2 − CS(u1, v)/u1
  = 1/((a2 − a1)u1u2) · ( [u1(a2 − u2) − u2(a2 − u1)] CS(a1, v) + [u1(u2 − a1) − u2(u1 − a1)] CS(a2, v) )
  = 1/((a2 − a1)u1u2) · ( a2(u1 − u2) CS(a1, v) + a1(u2 − u1) CS(a2, v) )
  = (u2 − u1)a1a2/((a2 − a1)u1u2) · ( CS(a2, v)/a2 − CS(a1, v)/a1 ) ≤ 0,

since the first factor is nonnegative and the difference in parentheses is nonpositive.

2. v ∉ ranG: Similarly, if u1 and u2 are in the unit interval such that u1 ≤ u2, then we have to distinguish the following cases:

(a) u2 ∈ ranF . Then

CS(u2, v)

u2=

b2 − v

b2 − b1︸ ︷︷ ︸≥0

CS(u2, b1)

u2+v − b1b2 − b1︸ ︷︷ ︸

≥0

CS(u2, b2)

u2,

where b1 and b2 denote the least and the greatest element of ranG, respectively, satisfyingb1 ≤ v ≤ b2. With the previously discussed cases we have that CS(u, v)/u is nonincreasingin u for all v ∈ ranG, in particular for v = b1 as well as for v = b2. Hence we get

CS(u2, v)/u2 ≤ (b2 − v)/(b2 − b1) · CS(u1, b1)/u1 + (v − b1)/(b2 − b1) · CS(u1, b2)/u1 = CS(u1, v)/u1.

The last equation follows straightforwardly if u1 ∈ ranF and with (5.5) otherwise.

(b) If u1 ∈ ranF , then, along the same lines as in the case before, we get

CS(u1, v)

u1≥ CS(u2, v)

u2.

(c) If neither u1 nor u2 lie in ranF, the least and the greatest elements a1, a2 and a∗1, a∗2 in ranF satisfying a1 ≤ u1 ≤ a2 and a∗1 ≤ u2 ≤ a∗2, respectively, are related either by a1 ≤ u1 ≤ a2 ≤ a∗1 ≤ u2 ≤ a∗2 or by a1 = a∗1 and a2 = a∗2. In the former situation we have with the previous cases

CS(u2, v)/u2 ≤ CS(a2, v)/a2 ≤ CS(u1, v)/u1.

In the latter,

CS(u2, v)/u2 − CS(u1, v)/u1
  = 1/((a2 − a1)(b2 − b1)u1u2) · [ CS(a1, b1)((b2 − v)(a2 − u2)u1 − (b2 − v)(a2 − u1)u2)
      + CS(a1, b2)((v − b1)(a2 − u2)u1 − (v − b1)(a2 − u1)u2)
      + CS(a2, b1)((b2 − v)(u2 − a1)u1 − (b2 − v)(u1 − a1)u2)
      + CS(a2, b2)((v − b1)(u2 − a1)u1 − (v − b1)(u1 − a1)u2) ]
  = (b2 − v)(u2 − u1)/((a2 − a1)(b2 − b1)u1u2) · [ a1 CS(a2, b1) − a2 CS(a1, b1) ]
      + (v − b1)(u2 − u1)/((a2 − a1)(b2 − b1)u1u2) · [ a1 CS(a2, b2) − a2 CS(a1, b2) ]
  = (b2 − v)(u2 − u1)a1a2/((a2 − a1)(b2 − b1)u1u2) · [ CS(a2, b1)/a2 − CS(a1, b1)/a1 ]
      + (v − b1)(u2 − u1)a1a2/((a2 − a1)(b2 − b1)u1u2) · [ CS(a2, b2)/a2 − CS(a1, b2)/a1 ],

where both factors in front of the brackets are nonnegative. Since both CS(u, b1)/u and CS(u, b2)/u are nonincreasing in u, it follows that

CS(a2, b1)/a2 − CS(a1, b1)/a1 ≤ 0  and  CS(a2, b2)/a2 − CS(a1, b2)/a1 ≤ 0,

and hence

CS(u2, v)/u2 − CS(u1, v)/u1 ≤ 0.

5.4 Weak Convergence

As is well known, a sequence (Xn, Yn) of bivariate real-valued random vectors converges weakly to a random vector (X, Y) if and only if the corresponding joint distribution functions Hn converge point-wise to the joint distribution function H of (X, Y) at all continuity points of H. Hence we have:

Theorem 5.4.1. Let (Xn, Yn) be bivariate random vectors with continuous margins converging weakly to (X, Y). Further suppose that the margins of (X, Y) are continuous. Then, if Cn denotes the copula of (Xn, Yn) and C the copula of (X, Y), respectively,

(5.17)  ∀ (u, v) ∈ I² : Cn(u, v) → C(u, v) for n → ∞.

Vice versa, if Xn and Yn are continuous random variables with distribution functions Fn and Gn, respectively, satisfying

L(Xn) → L(X) and L(Yn) → L(Y),


for some random variables X and Y with continuous distribution functions F and G, respectively, then for any sequence Cn of copulas converging point-wise to some copula C, the random vector

(Xn, Yn) ∼ Cn(Fn, Gn)

converges weakly to

(X, Y) ∼ C(F, G).

Unfortunately, the continuity of the limiting random variables X and Y cannot be omitted.

Theorem 5.4.2. There exists a sequence of bivariate random vectors (Xn, Yn) with continuous margins and a random vector (X, Y) such that

L(Xn, Yn) → L(X,Y )

and C(Xn,Yn) does not converge point-wise.

We prove this theorem by constructing such a sequence.

Example 5.1. Let Xn and Yn be continuous random variables distributed identically according to a uniform distribution on [1 − 1/n, 1]. The corresponding distribution function will be denoted by Fn. Furthermore, let Hn be a joint distribution of (Xn, Yn) defined by

(5.18)  Hn(x, y) = M(Fn(x), Fn(y)) if n is even, and Hn(x, y) = W(Fn(x), Fn(y)) if n is odd.

Clearly, the sequence of the corresponding copulas, featuring alternately the Fréchet-Hoeffding upper and lower bound, does not converge point-wise. But on the other hand, the distribution functions

Fn(x) = 0 for x ≤ 1 − 1/n,  Fn(x) = (x − 1 + 1/n)/(1/n) for x ∈ [1 − 1/n, 1],  Fn(x) = 1 for x > 1

converge point-wise to the distribution function of the Dirac distribution δ(1). Indeed, if x is smaller than 1, an n can always be found such that x ∉ [1 − 1/n, 1], and hence Fn(x) vanishes from this n on. By a similar argument, Hn converges point-wise to the distribution function of the bivariate Dirac distribution δ((1, 1)). Indeed, for a point (x, y) we have

Hn(x, y) = 0 for (x, y) ∈ (−∞, 1 − 1/n) × R ∪ R × (−∞, 1 − 1/n),
Hn(x, y) = Fn(x) for (x, y) ∈ (1 − 1/n, 1) × [1, ∞),
Hn(x, y) = Fn(y) for (x, y) ∈ [1, ∞) × (1 − 1/n, 1),
Hn(x, y) = M(Fn(x), Fn(y)) or W(Fn(x), Fn(y)) for (x, y) ∈ (1 − 1/n, 1)²,
Hn(x, y) = 1 for (x, y) ∈ [1, ∞) × [1, ∞),

so that if (x, y) does not lie in [1, ∞) × [1, ∞), an n can always be found such that (x, y) belongs to either (−∞, 1 − 1/n) × R or R × (−∞, 1 − 1/n). Hence, Hn(x, y) equals zero for this n and any higher. We thus have L(Xn, Yn) → δ((1, 1)), although the corresponding copulas fail to converge.
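The example can be replayed numerically. The following sketch (hypothetical code; M and W are written out explicitly) shows Hn vanishing at a fixed point below (1, 1) while the copula values at (1/2, 1/2) keep oscillating:

```python
# Numerical illustration of Example 5.1: H_n converges pointwise to the
# Dirac cdf at (1, 1), while the copula alternates between M and W.

def F(n, x):
    # cdf of the uniform distribution on [1 - 1/n, 1]
    if x <= 1 - 1 / n:
        return 0.0
    if x >= 1:
        return 1.0
    return n * (x - 1) + 1

def H(n, x, y):
    u, v = F(n, x), F(n, y)
    return min(u, v) if n % 2 == 0 else max(u + v - 1.0, 0.0)

# pointwise convergence of H_n at a fixed point below (1, 1):
vals = [H(n, 0.9, 0.95) for n in (2, 5, 10, 20, 100)]
print(vals[-1])            # 0.0 once 1 - 1/n > 0.95

# ...but the copula values at (1/2, 1/2) keep oscillating:
copulas = [min(0.5, 0.5) if n % 2 == 0 else max(0.5 + 0.5 - 1.0, 0.0)
           for n in range(2, 8)]
print(copulas)             # [0.5, 0.0, 0.5, 0.0, 0.5, 0.0]
```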

The above example also reveals the reason why the sequence of the corresponding copulas is not necessarily convergent. On the one hand, the situation is caused by the fact that if the limiting margins are no longer continuous, no unique copula exists. On the other hand, the example shows how much influence the marginals can have upon the weak convergence: the ranges of the marginal distribution functions change with n in general, and need not agree with (or converge to) the ranges of the limiting marginal distribution functions. Nevertheless, if the marginal distribution functions are discrete, the following holds.


Theorem 5.4.3. Let Xn and Yn be random variables with discrete distribution functions Fn and Gn, respectively, converging weakly to some random variables X and Y with (not necessarily continuous) distribution functions F and G. Furthermore, suppose that the supports of Fn and Gn do not change with n. Then, if the joint distribution function Hn of (Xn, Yn) converges to the distribution function H of (X, Y) at any continuity point of H, the standard extensions of the corresponding subcopulas Sn converge point-wise to the standard extension of the subcopula S of H.

Proof. Throughout the proof, let the support of Fn and Gn be denoted by

ξ0 ≤ ξ1 ≤ ξ2 ≤ . . . , and η0 ≤ η1 ≤ η2 ≤ . . . ,

respectively. Then, as is well known, the distribution of X is discrete with the same support as Xn, and Xn converges weakly to X if and only if lim_{n→∞} P[Xn = ξi] = P[X = ξi]. With Yn, the situation is similar, i.e. Yn converges weakly to Y if and only if P[Yn = ηj] tends to P[Y = ηj]. Consequently, Fn(ξi) converges to F(ξi) and Gn(ηj) to G(ηj) for any i and j. Moreover, (Xn, Yn) converges weakly to (X, Y) if and only if, for any i and j,

(5.19)  P[Xn = ξi, Yn = ηj] → P[X = ξi, Y = ηj].

Now, we consider a point (u, v) ∈ (0, 1)² and denote by a1 and a2 the least and the greatest element in ranF satisfying a1 ≤ u ≤ a2, and by b1 and b2 the least and the greatest element in ranG fulfilling b1 ≤ v ≤ b2. Without loss of generality, suppose that a1 ≠ a2 and b1 ≠ b2. In this case, we clearly have a1 = F(ξi), a2 = F(ξi+1), b1 = G(ηj) and b2 = G(ηj+1) for some i and j. Since Fn(ξi) converges to F(ξi) and Gn(ηj) to G(ηj), an N can be found such that, for all n ≥ N,

an1 = Fn(ξi) ≤ u ≤ Fn(ξi+1) = an2  and  bn1 = Gn(ηj) ≤ v ≤ Gn(ηj+1) = bn2.

Moreover, for any such n, an1 and an2 are the least and the greatest element in ranFn, respectively, satisfying an1 ≤ u ≤ an2. Similarly, bn1 and bn2 are the least and the greatest element in ranGn, respectively, fulfilling bn1 ≤ v ≤ bn2. Hence, the standard extension of Sn, n ≥ N, in the point (u, v) is given by

CnS(u, v) = 1/((bn2 − bn1)(an2 − an1)) · [ (an2 − u)(bn2 − v) Sn(an1, bn1) + (an2 − u)(v − bn1) Sn(an1, bn2) + (u − an1)(bn2 − v) Sn(an2, bn1) + (u − an1)(v − bn1) Sn(an2, bn2) ].

Now, according to (5.19), Sn(ank, bnl) converges to S(ak, bl) for k, l = 1, 2, from which the theorem easily follows.
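For finite supports, the standard extension used in this proof is plain bilinear interpolation of the subcopula between neighbouring points of ranFn and ranGn. A self-contained sketch (hypothetical helper and example pmf, not from the thesis):

```python
from bisect import bisect_left

def standard_extension(h):
    """C_S(u, v) for a joint pmf matrix h (rows: x-support, cols: y-support)."""
    m, n = len(h), len(h[0])
    F = [0.0]                                   # 0 = F(xi_0) < ... < F(xi_m) = 1
    for row in h:
        F.append(F[-1] + sum(row))
    G = [0.0]
    for j in range(n):
        G.append(G[-1] + sum(h[i][j] for i in range(m)))
    # subcopula on the grid: S[i][j] = H(xi_i, eta_j)
    S = [[sum(h[k][l] for k in range(i) for l in range(j))
          for j in range(n + 1)] for i in range(m + 1)]

    def C(u, v):
        i = min(max(bisect_left(F, u), 1), m)   # F[i-1] <= u <= F[i]
        j = min(max(bisect_left(G, v), 1), n)
        lu = (u - F[i - 1]) / (F[i] - F[i - 1]) # bilinear weights
        lv = (v - G[j - 1]) / (G[j] - G[j - 1])
        return ((1 - lu) * (1 - lv) * S[i - 1][j - 1] + (1 - lu) * lv * S[i - 1][j]
                + lu * (1 - lv) * S[i][j - 1] + lu * lv * S[i][j])
    return C

C = standard_extension([[0.2, 0.1], [0.1, 0.6]])
print(C(0.3, 0.3))    # grid point: equals H(xi_1, eta_1) = 0.2
print(C(1.0, 0.4))    # boundary: C(1, v) = v, up to float rounding
```

On the grid the helper reproduces the subcopula exactly; between grid points it interpolates linearly in each coordinate, which is the formula for CnS above.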

5.5 Measures of Association

At the end of their paper, Schweizer and Wolff (1981) suggest that, if the margins of X are not necessarily continuous, one can work with one of the copulas from CX in order to obtain dependence measures. However, if we construct measures of dependence or concordance by choosing some arbitrary member of CX, we have to be aware of the fact that the quantity will probably depend on this choice. But recall that there also exist measures of association which depend on the copula only in points of ranF × ranG. Hence, these quantities remain constant on CX on the one hand, but are dependent on the marginals on the other. One such quantity is, for example, the linear correlation coefficient.

In this section, we focus on Schweizer and Wolff's idea and show that for discrete distributions with finite support, the standard extension copula seems to be a natural choice. On the other hand, we will also see that, unfortunately, this procedure yields quantities which depend on the marginal distribution functions.


5.5.1 Basic Characteristics

Before we start to think about Schweizer and Wolff's approach, we have to revisit the properties demanded of concordance and dependence measures. Because of Proposition 3.1.1, we have to alter the fifth axiom in the following sense:

A5∗. (dependence) If f and g are strictly monotone and continuous on ranX and ranY, respectively, then δ(f(X), g(Y)) = δ(X, Y) (invariance);

and

A5∗. (concordance) If T is strictly monotone and continuous on ranX, then

ρ(T(X), Y) = ρ(X, Y) if T is increasing, and ρ(T(X), Y) = −ρ(X, Y) if T is decreasing.

Another axiom which has to be handled with care, for dependence as well as for concordance measures, is A6. For if we base the measure on (some) underlying copula, we must be aware of Theorem 5.4.2. The result given there concerns a sequence of continuous random variables, but it can also be extended to the case when Xn and Yn are arbitrary.

Secondly, in the general case there exists neither a measure of concordance nor a measure of dependence which can be determined from the copula alone. This unfortunate result is due to the following lemma:

Lemma 5.5.1. (Marshall, 1996, Proposition 2.3) Let C be a copula and r, s in (0, 1). Then there exist r∗ and s∗ in (0, 1) and a copula C∗ such that

(5.20)  C(r, s) = C∗(r, s) and C∗(r∗, s∗) = r∗ · s∗.

This causes the following:

Theorem 5.5.1. (Marshall, 1996) Let H be a set of bivariate distributions which includes those with arbitrary Bernoulli marginals. Furthermore, let δ and ρ be any measure of dependence and concordance, respectively, defined for any H ∈ H. Suppose that H ∈ H and H = C(F, G)¹ implies δH = δC and ρH = ρC, respectively, i.e., the measure depends only on the copula of H. Then δ and ρ are constant.

Proof. Let r, s be arbitrary numbers in (0, 1) and F and G be Bernoulli distributions with F(0) = r and G(0) = s. Then the measure in question depends only on H(0, 0) = C(r, s) (cf. Section 5.1). Therefore, we have for the copula C∗ from Lemma 5.5.1 that δC = δC∗ and ρC = ρC∗, respectively. But on the other hand, with r∗ and s∗ satisfying (5.20) and a pair of Bernoulli distributions F∗ and G∗ such that F∗(0) = r∗ and G∗(0) = s∗, we have C∗(F∗, G∗) = Π(F∗, G∗). Since this distribution is assumed to belong to H, we have δC∗ = δΠ and ρC∗ = ρΠ. Hence δ and ρ are constant².

Therefore, we must expect that the measure of association will generally depend on the marginal distributions. This in turn has another unpleasant consequence. To see this, assume X and Y are random variables and T is some strictly monotone and continuous transformation on ranX. If the distributions of X and Y are continuous, then there exists a kind of symmetry between the dependence of (X, Y) and the dependence of (T(X), Y): on the one hand, if T is increasing, then

¹ C being a copula in CH.

² This proof has already been given in Marshall (1996). We quote it here simply because we believe it provides a better understanding of the subject considered.


(X, Y) and (T(X), Y) have the same copula. On the other hand, if T is decreasing, then the copulas of (X, Y) and (T(X), Y) are related by (cf. Proposition 3.1.2)

C(T(X),Y)(u, v) = v − C(X,Y)(1 − u, v).

In particular,

C(X,Y) = M ⟺ C(T(X),Y) = W  and  C(X,Y) = W ⟺ C(T(X),Y) = M.

In the general case, however, things are no longer that simple: we have to deal with a whole class of copulas here, plus take into account the fact that T possibly alters the range of the distribution function of T(X). If T is increasing, then the subcopulas of (X, Y) and (T(X), Y) coincide and hence the classes C(X,Y) and C(T(X),Y) are equal. If, on the other hand, T is decreasing, then the following holds:

Proposition 5.5.1. Suppose X and Y are random variables and T a strictly decreasing and continuous transformation on ranX. Then C(T(X),Y) = ψ(C(X,Y)), where ψ is a bijective mapping on the set of all copulas C2, given by

(5.21)  ψ : C2 → C2, C ↦ ψ(C), where ψ(C)(u, v) := v − C(1 − u, v) for any u, v ∈ [0, 1].

Moreover, if X and Y are discrete with finite support and if CS and C̃S denote the standard extension copulas of (X, Y) and (T(X), Y), respectively, then

(5.22)  ψ(CS) = C̃S.

In addition, if C̃L and C̃U denote Carley's extension copulas of (T(X), Y), then

(5.23)  ψ(CU) = C̃L and ψ(CL) = C̃U,

where CL and CU are Carley's extension copulas of (X, Y).

Proof. First, if C ≠ C∗ are copulas, then there exists a point (u, v) such that C(u, v) ≠ C∗(u, v). But this immediately yields ψ(C) ≠ ψ(C∗), since

ψ(C)(1 − u, v) = v − C(u, v) ≠ v − C∗(u, v) = ψ(C∗)(1 − u, v),

and thus ψ is injective. In addition,

(5.24)  ψ(ψ(C))(u, v) = v − ψ(C)(1 − u, v) = v − [v − C(1 − (1 − u), v)] = C(u, v),

and hence ψ is surjective. Now, by means of Proposition 3.1.2, the subcopula of (T(X), Y) satisfies S(T(X),Y)(u, v) = v − S(X,Y)(1 − u, v). Moreover, if u is in ranFT(X), there exists a t such that

u = P[T(X) ≤ t] = P[X ≥ T⁻¹(t)] = 1 − FX(T⁻¹(t)−) ⟺ 1 − u = FX(T⁻¹(t)−) ∈ ranFX.

Hence, (1 − u, v) is in ranFX × ranFY and consequently

ψ(C)(u, v) = v − C(1 − u, v) = v − S(X,Y)(1 − u, v) = S(T(X),Y)(u, v)

for any (u, v) ∈ ranFT(X) × ranFY. Hence, C(T(X),Y) ⊇ ψ(C(X,Y)). But since T⁻¹ is continuous and decreasing on ranT(X), this also implies ψ(C(T(X),Y)) ⊆ C(X,Y). Because ψ is bijective, from this we have ψ(ψ(C(T(X),Y))) ⊆ ψ(C(X,Y)), which together with (5.24) yields C(T(X),Y) ⊆ ψ(C(X,Y)). Finally, (5.22) follows with Proposition 5.2.1 and (5.23) with Corollary 5.2.1.


Remark 5.5.1. Note that (5.24) implies ψ⁻¹ ≡ ψ. Moreover,

(5.25)  ψ(Π)(u, v) = v − (1 − u)v = uv = Π(u, v),

and thus, for the independence copula, ψ(Π) = Π. This especially implies

Π ∈ C(X,Y) ⟺ Π ∈ C(T(X),Y).
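Properties (5.24) and (5.25), together with ψ(M) = W, are easy to verify numerically; a small sketch (hypothetical code):

```python
def psi(C):
    """The mapping (5.21): psi(C)(u, v) = v - C(1 - u, v)."""
    return lambda u, v: v - C(1 - u, v)

Pi = lambda u, v: u * v                  # independence copula
M = lambda u, v: min(u, v)               # Frechet-Hoeffding upper bound
W = lambda u, v: max(u + v - 1.0, 0.0)   # Frechet-Hoeffding lower bound

pts = [(i / 10, j / 10) for i in range(11) for j in range(11)]
assert all(abs(psi(Pi)(u, v) - Pi(u, v)) < 1e-12 for u, v in pts)    # (5.25)
assert all(abs(psi(psi(M))(u, v) - M(u, v)) < 1e-12 for u, v in pts) # (5.24)
assert all(abs(psi(M)(u, v) - W(u, v)) < 1e-12 for u, v in pts)      # M <-> W
print("psi fixes Pi, psi is an involution, psi(M) = W")
```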

To recognize the pitfalls caused by Proposition 5.5.1, we will need the following notation:

Notation 5.5.1. Let X and Y be comonotonic random variables with subcopula S. Then the standard extension copula will be denoted by MS. Similarly, if X and Y are countermonotonic, WS will stand for the corresponding standard extension copula.

Remark 5.5.2. If X and Y are comonotonic [countermonotonic] random variables with subcopula S and T some strictly decreasing and continuous transformation on ranX, then T(X) and Y are countermonotonic [comonotonic] (cf. Theorem 4.1.3) and, if we denote their subcopula by S̃, Proposition 5.5.1 yields:

(5.26)  WS̃ = ψ(MS) and MS̃ = ψ(WS).

Now, suppose d is a metric on the set of all bivariate copulas C2. In order to define a concordance or dependence measure based on d, it is meaningful to require

(5.27)  d(C1, C2) = d(ψ(C1), ψ(C2)) (invariance) and d(C1, C2) = d(ϕ(C1), ϕ(C2)) (symmetry),

where ϕ is a function on the set of all copulas given by ϕ(C)(u, v) = C(v, u). Now, if we choose to measure dependence between the random variables X and Y with subcopula S by means of the standard extension copula, then we have by (5.25) and (5.26)

(5.28)  d(MS, Π) = d(WS̃, Π) and d(WS, Π) = d(MS̃, Π),

where S̃ is the subcopula corresponding to (T(X), Y). But this does not guarantee d(MS, Π) = d(WS, Π). Hence, the quantities

ρd(CS) := [d(CS, WS) − d(CS, MS)] / d(MS, WS)  and  δd(S) := d(CS, Π) / d(MS, Π)

generally fail to satisfy A3 of Definition 4.3.2 and A4 of Definition 4.3.1, respectively. In the light of this and the results presented in the next sections, it seems necessary to reconsider the extreme values of both δ and ρ (axiom A4) according to

A4∗. (dependence) δ(X, Y) = 1 if and only if Y = T(X) a.s. for some strictly monotone and continuous transformation T on ranX;

A4∗. (concordance) ρ(X, Y) = 1 if and only if Y = T(X) a.s. for some strictly increasing and continuous transformation T on ranX, and ρ(X, Y) = −1 if and only if Y = T(X) a.s. for some strictly decreasing and continuous transformation T on ranX.

Remark 5.5.3. Note that if X and Y are continuous, A4∗ and A4 coincide by means of Remark 4.1.1.


For the rest of this section, we restrict our observations to discrete distributions with finite support. Before we proceed with the construction of concordance and dependence measures, we close this general introduction by noting the following:

Remark 5.5.4. Consider discrete random variables X and Y with finite supports and a strictly monotone, continuous transformation T on ranX. The distribution function of T(X), F̃ say, will then clearly not be the same as the distribution function of X. Moreover, the way F̃ depends on F will depend on the kind of monotonicity of T. For, if T is increasing, the support of T(X) is given by

ξ̃1 = T(ξ1) < ξ̃2 = T(ξ2) < · · · < ξ̃m = T(ξm).

Because p̃i := P[T(X) = T(ξi)] is equal to P[X = ξi], F̃(ξ̃i) = F(ξi). Especially, ran F̃ = ranF and the sets {p1, . . . , pm} and {p̃1, . . . , p̃m} are the same. If, on the other hand, T is decreasing, then the support of T(X) is given by

ξ̃1 = T(ξm) < ξ̃2 = T(ξm−1) < · · · < ξ̃m = T(ξ1)

and the distribution function F̃ satisfies

F̃(ξ̃i) = P[T(X) ≤ T(ξm−i+1)] = P[X ≥ ξm−i+1] = 1 − F(ξm−i).

Especially, p̃j = pm−j+1 and the set {p̃1, . . . , p̃m} is again identical with {p1, . . . , pm}. But, in this case, ran F̃ is not necessarily equal to ranF. However, if a quantity depends on the marginals through the corresponding probability densities only, it will be invariant under strictly monotone and continuous transformations T.
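The remark can be illustrated in a few lines (hypothetical pmf): a decreasing T reverses the probability masses, so any quantity built from the pmf alone is unchanged, while the range of the cdf is not:

```python
p = [0.5, 0.3, 0.2]                   # hypothetical pmf of X on xi_1 < xi_2 < xi_3

def cdf_range(pmf):
    """ranF for a pmf, rounded to tame float noise."""
    out, s = [], 0.0
    for w in pmf:
        s += w
        out.append(round(s, 10))
    return out

p_dec = p[::-1]                       # pmf of T(X) for strictly decreasing T
print(sorted(p) == sorted(p_dec))     # True: same multiset of masses
print(cdf_range(p), cdf_range(p_dec)) # [0.5, 0.8, 1.0] vs [0.2, 0.5, 1.0]
```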

5.5.2 Measures of Concordance

As mentioned above, for any C ∈ C(X,Y) and any concordance measure ρ based entirely on copulas, we can obtain a "concordance measure" for X and Y. For example, if ρ is Kendall's tau ρτ, this procedure generates the following quantities:

ρτ(C) = 4 ∫₀¹ ∫₀¹ C(u, v) dC(u, v) − 1.

Obviously, ρτ will in general not be invariant within the class C(X,Y). However, for discrete margins with finite support, we know that the least and the greatest element of C(X,Y) are given by the Carley extensions. Hence, under some mild conditions, it is possible to estimate how much influence the choice of the copula has on the quantity ρ.

Proposition 5.5.2. Let X and Y be random variables with discrete distributions with finite support and CL and CU the corresponding Carley extensions. Furthermore, assume ρ is a measure of concordance fulfilling axiom A7 of Definition 4.3.2. Then, for any C ∈ C(X,Y),

ρ(C) ∈ [ρmin, ρmax],

where ρmin = ρ(CL) and ρmax = ρ(CU).

Proof. The statement follows straightforwardly with Theorem 5.2.2.

As we will see later, [ρmin, ρmax] can be quite large. Therefore, we now focus on finding a member of C(X,Y) which would provide a good "representative".

When discussing concordance measures, our starting point was Theorem 4.3.2. Hence, we begin with an analogue for the discrete case³.

³ Recall Notation 5.0.1.


Theorem 5.5.2. Let (X1, Y1) and (X2, Y2) be independent random vectors with common margins F (for X1 and X2) and G (for Y1 and Y2) which are both discrete with finite support. Furthermore, let H1 and H2 denote the joint distribution functions (not necessarily equal) of (X1, Y1) and (X2, Y2), respectively. Let Q denote the difference between the probabilities of concordance and discordance of (X1, Y1) and (X2, Y2), i.e. let

Q = P[(X1 − X2)(Y1 − Y2) > 0] − P[(X1 − X2)(Y1 − Y2) < 0].

Then

(5.29)  Q = Q(H1, H2) = Σ_{i=1}^m Σ_{j=1}^n h2ij [H1(ξi, ηj) + H1(ξi, ηj−1) + H1(ξi−1, ηj) + H1(ξi−1, ηj−1) − 1]

with h2ij = P[X2 = ξi, Y2 = ηj]. Moreover, Q can be expressed in terms of the standard extension copulas C1S and C2S according to

(5.30)  Q = 4 ∫₀¹ ∫₀¹ C2S(u, v) dC1S(u, v) − 1 = 4 ∫₀¹ ∫₀¹ C1S(u, v) dC2S(u, v) − 1.

Proof. First, note that the probabilities of concordance and discordance can be re-written as follows:

(5.31)  P[(X1 − X2)(Y1 − Y2) > 0] = P[X1 < X2, Y1 < Y2] + P[X1 > X2, Y1 > Y2],
(5.32)  P[(X1 − X2)(Y1 − Y2) < 0] = P[X1 < X2, Y1 > Y2] + P[X1 > X2, Y1 < Y2].

For the quantities appearing on the right-hand side, we have

(5.33)  P[X1 < X2, Y1 < Y2] = Σ_{i=1}^m Σ_{j=1}^n P[X1 < ξi, Y1 < ηj] h2ij = Σ_{i=1}^m Σ_{j=1}^n H1(ξi−1, ηj−1) h2ij

and, along the same lines,

(5.34a)  P[X1 > X2, Y1 > Y2] = Σ_{i=1}^m Σ_{j=1}^n (1 − F(ξi) − G(ηj) + H1(ξi, ηj)) h2ij,
(5.34b)  P[X1 < X2, Y1 > Y2] = Σ_{i=1}^m Σ_{j=1}^n (F(ξi−1) − H1(ξi−1, ηj)) h2ij,
(5.34c)  P[X1 > X2, Y1 < Y2] = Σ_{i=1}^m Σ_{j=1}^n (G(ηj−1) − H1(ξi, ηj−1)) h2ij.

Hence, the difference between the probability of concordance and the probability of discordance equals

Q = Σ_{i=1}^m Σ_{j=1}^n h2ij [H1(ξi, ηj) + H1(ξi, ηj−1) + H1(ξi−1, ηj) + H1(ξi−1, ηj−1) − 1]
  + Σ_{i=1}^m Σ_{j=1}^n h2ij [2 − F(ξi) − G(ηj) − F(ξi−1) − G(ηj−1)].


So, in order to show (5.29), we only have to prove that the second term on the right-hand side is equal to zero. Indeed, with P[Xk = ξi] = pi and P[Yk = ηj] = qj, we have

Σ_{i=1}^m Σ_{j=1}^n h2ij [2 − F(ξi) − G(ηj) − F(ξi−1) − G(ηj−1)]
 = 2 − Σ_{i,j} h2ij F(ξi) − Σ_{i,j} h2ij G(ηj) − Σ_{i,j} h2ij F(ξi−1) − Σ_{i,j} h2ij G(ηj−1)
 = 2 − Σ_{i=1}^m pi F(ξi) − Σ_{j=1}^n qj G(ηj) − Σ_{i=1}^m pi F(ξi−1) − Σ_{j=1}^n qj G(ηj−1)
 = 2 − Σ_{i=1}^m [F(ξi) − F(ξi−1)][F(ξi) + F(ξi−1)] − Σ_{j=1}^n [G(ηj) − G(ηj−1)][G(ηj) + G(ηj−1)]
 = 2 − Σ_{i=1}^m [F(ξi)]² + Σ_{i=1}^m [F(ξi−1)]² − Σ_{j=1}^n [G(ηj)]² + Σ_{j=1}^n [G(ηj−1)]²
 = 2 − [F(ξm)]² − [G(ηn)]² = 2 − 1 − 1 = 0,

since the two sums telescope.

In order to show (5.30), recall that the standard extension copula C2S has a density given by (cf. (5.6))

Σ_{i=1}^m Σ_{j=1}^n 1{u ∈ (F(ξi−1), F(ξi)]} 1{v ∈ (G(ηj−1), G(ηj)]} · h2ij / ((F(ξi) − F(ξi−1))(G(ηj) − G(ηj−1))).

Hence, ∫₀¹ ∫₀¹ C1S(u, v) dC2S(u, v) evaluates as follows:

∫₀¹ ∫₀¹ C1S(u, v) dC2S(u, v) = Σ_{i=1}^m Σ_{j=1}^n h2ij / ((F(ξi) − F(ξi−1))(G(ηj) − G(ηj−1))) · I,

where I denotes the inner integral ∫ from F(ξi−1) to F(ξi) ∫ from G(ηj−1) to G(ηj) of C1S(u, v) du dv.

With (5.5), I is equal to

I = (F(ξi) − F(ξi−1))(G(ηj) − G(ηj−1))/4 · [C1S(F(ξi−1), G(ηj−1)) + C1S(F(ξi), G(ηj−1)) + C1S(F(ξi−1), G(ηj)) + C1S(F(ξi), G(ηj))].

This yields straightforwardly

∫₀¹ ∫₀¹ C1S(u, v) dC2S(u, v) = (1/4) Σ_{i=1}^m Σ_{j=1}^n h2ij [C1S(F(ξi−1), G(ηj−1)) + C1S(F(ξi), G(ηj−1)) + C1S(F(ξi−1), G(ηj)) + C1S(F(ξi), G(ηj))]
 = (1/4) Σ_{i=1}^m Σ_{j=1}^n h2ij [H1(ξi−1, ηj−1) + H1(ξi, ηj−1) + H1(ξi−1, ηj) + H1(ξi, ηj)]

and, because C1S and C2S can obviously be interchanged in the above calculations, (5.30) follows.


Now, the difference between the probabilities of concordance and discordance can be bounded by certain quantities depending on the marginal distribution functions. First, the Fréchet-Hoeffding inequality immediately yields

Corollary 5.5.1. Under the hypothesis of Theorem 5.5.2, for any given H2,

(5.35)  Qmin ≤ Q(H1, H2) ≤ Qmax,

where Qmax and Qmin are equal to Q for (X1, Y1) comonotonic and countermonotonic, respectively.

Because H1 is evaluated in the points (F(ξi), G(ηj)) only, the bounds Qmax and Qmin will in general depend on the marginal distribution functions through these points. Moreover, calculating Qmin and Qmax is not simple⁴ and, in general, no closed formula is obtained. Therefore, we provide other bounds for Q which are easy to calculate and, as we will soon see, sharp in certain situations. First, note that the probabilities involved in the definition of Q can also be expressed in terms of the joint probability densities hkij = P[Xk = ξi, Yk = ηj], k = 1, 2. Indeed, with (5.33) and (5.34) we have

P[X1 < X2, Y1 < Y2] = Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} h1kl h2ij,
P[X1 > X2, Y1 > Y2] = Σ_{i=1}^m Σ_{j=1}^n Σ_{k=i+1}^m Σ_{l=j+1}^n h1kl h2ij,
P[X1 < X2, Y1 > Y2] = Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=j+1}^n h1kl h2ij,
P[X1 > X2, Y1 < Y2] = Σ_{i=1}^m Σ_{j=1}^n Σ_{k=i+1}^m Σ_{l=1}^{j−1} h1kl h2ij.

Hence, Q can be re-written as

(5.36)  Q = Σ_{i=1}^m Σ_{j=1}^n h2ij [ Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} h1kl + Σ_{k=i+1}^m Σ_{l=j+1}^n h1kl − Σ_{k=1}^{i−1} Σ_{l=j+1}^n h1kl − Σ_{k=i+1}^m Σ_{l=1}^{j−1} h1kl ].

Theorem 5.5.3. Under the hypothesis of Theorem 5.5.2,

(5.37)  |Q| ≤ √(1 − Σ_{i=1}^m pi²) · √(1 − Σ_{j=1}^n qj²),

where pi and qj denote P[X1 = ξi] and P[Y1 = ηj], respectively.

Proof. By defining

aik = 1 if k < i, 0 if k = i, −1 if k > i,  and  bjl = 1 if l < j, 0 if l = j, −1 if l > j,

(5.36) can be re-written as

Q = Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^m Σ_{l=1}^n h2ij h1kl aik bjl,

⁴ This will be illustrated in Section 7.3.


and, according to the Cauchy-Schwarz inequality,

(5.38)  |Q| ≤ √(Σ_{i,j,k,l} h2ij h1kl (aik)²) · √(Σ_{i,j,k,l} h2ij h1kl (bjl)²).

If we now focus on the quantities appearing on the right-hand side, we get

Σ_{i,j,k,l} h2ij h1kl (aik)² = Σ_{i=1}^m Σ_{j=1}^n h2ij Σ_{k=1}^m (aik)² Σ_{l=1}^n h1kl = Σ_{i=1}^m Σ_{j=1}^n h2ij Σ_{k=1}^m (aik)² pk
 = Σ_{i=1}^m Σ_{j=1}^n h2ij F(ξi−1) + Σ_{i=1}^m Σ_{j=1}^n h2ij Σ_{k=i+1}^m pk
 = Σ_{i=1}^m F(ξi−1) Σ_{j=1}^n h2ij + Σ_{k=1}^m Σ_{j=1}^n Σ_{i=1}^{k−1} h2ij pk
 = Σ_{i=1}^m F(ξi−1) pi + Σ_{k=1}^m Σ_{i=1}^{k−1} pi pk = 2 Σ_{i=1}^m pi F(ξi−1).

Since

(5.39)  Σ_{i=1}^m [F(ξi) − F(ξi−1)][F(ξi) + F(ξi−1)] = Σ_{i=1}^m [F(ξi)]² − Σ_{i=1}^m [F(ξi−1)]² = [F(ξm)]² = 1

on the one hand and

Σ_{i=1}^m [F(ξi) − F(ξi−1)][F(ξi) + F(ξi−1)] = Σ_{i=1}^m pi F(ξi−1) + Σ_{i=1}^m pi F(ξi)
 = Σ_{i=1}^m pi F(ξi−1) + Σ_{i=1}^m pi F(ξi−1) + Σ_{i=1}^m (pi)² = 2 Σ_{i=1}^m pi F(ξi−1) + Σ_{i=1}^m (pi)²

on the other,

(5.40)  2 Σ_{i=1}^m pi F(ξi−1) = 1 − Σ_{i=1}^m (pi)².

Along the same lines, Σ_{i,j,k,l} h2ij h1kl (bjl)² is equal to 1 − Σ_{j=1}^n (qj)², which concludes the proof.
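The bound (5.37) is easy to probe numerically. The following sketch (hypothetical code) draws random pmfs, takes H1 = H2, and checks |Q| against the bound; the comonotonic example at the end attains it, in line with Corollary 5.5.2 below:

```python
import random
from math import sqrt

random.seed(1)

def Q(h1, h2):
    """Q = E[sgn((X1 - X2)(Y1 - Y2))] for pmf matrices h1, h2."""
    m, n = len(h1), len(h1[0])
    sgn = lambda a: (a > 0) - (a < 0)
    return sum(h1[k][l] * h2[i][j] * sgn(k - i) * sgn(l - j)
               for i in range(m) for j in range(n)
               for k in range(m) for l in range(n))

def random_pmf(m, n):
    w = [[random.random() for _ in range(n)] for _ in range(m)]
    s = sum(map(sum, w))
    return [[x / s for x in row] for row in w]

for _ in range(100):
    h = random_pmf(3, 4)
    p = [sum(row) for row in h]
    q = [sum(h[i][j] for i in range(3)) for j in range(4)]
    bound = sqrt(1 - sum(x * x for x in p)) * sqrt(1 - sum(x * x for x in q))
    assert abs(Q(h, h)) <= bound + 1e-12   # (5.37) with H1 = H2

# the comonotonic case attains the bound (cf. Corollary 5.5.2):
hc = [[0.5, 0.0], [0.0, 0.5]]
print(Q(hc, hc))   # 0.5 = 1 - sum p_i^2
```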

Corollary 5.5.2. Under the assumptions of Theorem 5.5.2 and the condition that Y1 = T(X1) a.s. for some strictly monotone and continuous transformation T on ranX1,

1. Q(H1, H1) = 1 − Σ_{i=1}^m pi² if T is increasing,

2. Q(H1, H1) = −1 + Σ_{i=1}^m pi² if T is decreasing.

Proof. First, assume $T$ to be increasing. If $Y_1 = T(X_1)$, then the support of $Y_1$ is given by

$$\eta_1 = T(\xi_1) < \eta_2 = T(\xi_2) < \cdots < \eta_m = T(\xi_m).$$

Furthermore, $P[Y_1 = T(\xi_i)]$ is equal to $P[X_1=\xi_i]$ and hence $G(T(\xi_i))$ coincides with $F(\xi_i)$. In particular, $\operatorname{ran}G = \operatorname{ran}F$ and, by means of Theorem 5.5.3, $|Q| \le 1-\sum_{i=1}^{m}p_i^2$. Moreover, $X_1$ and $Y_1$ are comonotonic and $h^1_{ij}$ is given by

$$h^1_{ij} = P[X_1=\xi_i,\, Y_1=T(\xi_j)] = \begin{cases} p_i & \text{if } i=j,\\ 0 & \text{otherwise.}\end{cases}$$


5.5. MEASURES OF ASSOCIATION 91

Hence, by (5.29),

$$\begin{aligned}
Q &= \sum_{i=1}^{m} p_i\Big[\min\big(F(\xi_i),G(T(\xi_i))\big) + 2\min\big(F(\xi_{i-1}),G(T(\xi_i))\big) + \min\big(F(\xi_{i-1}),G(T(\xi_{i-1}))\big) - 1\Big]\\
&= \sum_{i=1}^{m} p_i\Big[\min\big(F(\xi_i),F(\xi_i)\big) + 2\min\big(F(\xi_{i-1}),F(\xi_i)\big) + \min\big(F(\xi_{i-1}),F(\xi_{i-1})\big) - 1\Big]\\
&= \sum_{i=1}^{m} p_i\big[F(\xi_i)+3F(\xi_{i-1})-1\big] = \sum_{i=1}^{m}\big[F(\xi_i)^2 - F(\xi_{i-1})^2\big] + 2\sum_{i=1}^{m}p_iF(\xi_{i-1}) - 1\\
&\overset{(5.39),(5.40)}{=} 1 - \sum_{i=1}^{m} p_i^2.
\end{aligned}$$

Next, suppose $T$ is decreasing. Then, with Remark 5.5.4, the support of $Y_1$ is given by

$$\eta_1 = T(\xi_m) < \eta_2 = T(\xi_{m-1}) < \cdots < \eta_m = T(\xi_1)$$

and the set $\{q_1,\dots,q_m\}$ is identical with $\{p_1,\dots,p_m\}$. Hence, by means of Theorem 5.5.3, $|Q| \le 1-\sum_{i=1}^{m}p_i^2$ again holds. Furthermore, $G(\eta_i) = 1-F(\xi_{m-i})$. But note that $\operatorname{ran}F$ is not necessarily equal to $\operatorname{ran}G$ in this case. The joint probabilities are given by

$$h^1_{ij} = P[X_1=\xi_i,\, Y_1=T(\xi_{m-j+1})] = \begin{cases} p_i & \text{if } j=m-i+1,\\ 0 & \text{otherwise,}\end{cases}$$

and so, by (5.29),

$$\begin{aligned}
Q &= \sum_{i=1}^{m} p_i\Big[\max\big(F(\xi_i)+G(\eta_{m-i+1})-1,\,0\big) + \max\big(F(\xi_{i-1})+G(\eta_{m-i+1})-1,\,0\big)\\
&\qquad\quad + \max\big(F(\xi_i)+G(\eta_{m-i})-1,\,0\big) + \max\big(F(\xi_{i-1})+G(\eta_{m-i})-1,\,0\big) - 1\Big]\\
&= \sum_{i=1}^{m} p_i\Big[\max\big(F(\xi_i)-F(\xi_{i-1}),\,0\big) + \max\big(F(\xi_{i-1})-F(\xi_{i-1}),\,0\big)\\
&\qquad\quad + \max\big(F(\xi_i)-F(\xi_i),\,0\big) + \max\big(F(\xi_{i-1})-F(\xi_i),\,0\big) - 1\Big]\\
&= \sum_{i=1}^{m} p_i\big[F(\xi_i)-F(\xi_{i-1})-1\big] = \sum_{i=1}^{m} p_i^2 - \sum_{i=1}^{m} p_i = -1 + \sum_{i=1}^{m} p_i^2.
\end{aligned}$$
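Both cases of the corollary, and the bound of Theorem 5.5.3, lend themselves to a direct numerical check via the sign representation of $Q$ used in the proof of Theorem 5.5.3. The following Python sketch (the marginal distribution `p` is an arbitrary illustrative choice) places all mass on the diagonal and on the antidiagonal, respectively:

```python
import math

def Q(h1, h2):
    # Re-written form of (5.36): Q = sum_{i,j,k,l} h2_ij * h1_kl * a_ik * b_jl,
    # with a_ik = sgn(i - k) and b_jl = sgn(j - l).
    sgn = lambda t: (t > 0) - (t < 0)
    m, n = len(h1), len(h1[0])
    return sum(h2[i][j] * h1[k][l] * sgn(i - k) * sgn(j - l)
               for i in range(m) for j in range(n)
               for k in range(m) for l in range(n))

p = [0.2, 0.3, 0.5]                      # illustrative marginal distribution
# comonotonic case (T increasing): all mass on the diagonal
h_inc = [[p[i] if j == i else 0.0 for j in range(3)] for i in range(3)]
# countermonotonic case (T decreasing): all mass on the antidiagonal;
# the second marginal is then a permutation of p, so the bound (5.37) is unchanged
h_dec = [[p[i] if j == 2 - i else 0.0 for j in range(3)] for i in range(3)]

bound = math.sqrt(1 - sum(t * t for t in p)) * math.sqrt(1 - sum(t * t for t in p))
print(Q(h_inc, h_inc), Q(h_dec, h_dec), bound)
```

In both cases the bound (5.37) is attained: $Q$ evaluates to $\pm\big(1-\sum_i p_i^2\big) = \pm 0.62$ here.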

Kendall’s Tau

If $(X,Y)$ is a random vector whose marginals have discrete distributions with finite support, we mentioned earlier in this section that $\rho_\tau(C)$ belongs to $[\rho_\tau(C_L),\rho_\tau(C_U)]$ for Carley's extensions $C_L$ and $C_U$. In this case, this interval can be evaluated as follows:

Theorem 5.5.4. Suppose $(X,Y)$ is a random vector whose marginals have discrete distributions with finite support. Furthermore, let $h_{ij}$ denote $P[X=\xi_i,\,Y=\eta_j]$. Then

$$\rho_\tau(C_U) = 1 - 4\sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{i-1}\sum_{l=j+1}^{n} h_{ij}h_{kl},\tag{5.41a}$$

$$\rho_\tau(C_L) = -1 + 4\sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{i-1}\sum_{l=1}^{j-1} h_{ij}h_{kl}.\tag{5.41b}$$


Proof. The proof of the above statements is given in the Appendix on page 143.

Example 5.2. Imagine X and Y both have Bernoulli distributions, i.e. $X\sim B(p)$ and $Y\sim B(q)$ for some $p,q\in(0,1)$. In this case we have

$$\rho_\tau(C_U) = 1 - 4h_{10}h_{01} \overset{(5.1)}{=} 1 - 4\big(p-C(p,q)\big)\big(q-C(p,q)\big),$$
$$\rho_\tau(C_L) = 4h_{11}h_{00} - 1 \overset{(5.1)}{=} 4\,C(p,q)\big(1-p-q+C(p,q)\big) - 1.$$

Recall also that the supports of $C_U$ and $C_L$ are illustrated in Figure 5.1. Suppose now $p=q=1/2$. Then, if X and Y are comonotonic, we have $C(p,q)=1/2$ and hence

$$\rho_\tau(C_U) = 1 - 4\cdot 0 = 1 \qquad\text{and}\qquad \rho_\tau(C_L) = 4\cdot\tfrac14 - 1 = 0.$$

Analogously, if X and Y are countermonotonic, $C(p,q)=0$ and hence

$$\rho_\tau(C_U) = 1 - 4\cdot\tfrac14 = 0 \qquad\text{and}\qquad \rho_\tau(C_L) = 4\cdot 0 - 1 = -1.$$

And, what is even more unpleasant, if X and Y are independent,

$$\rho_\tau(C_U) = 1 - 4(p-pq)(q-pq) = 1 - 4pq(1-p)(1-q) = 1 - 4\cdot\tfrac14\cdot\tfrac14 = \tfrac34,$$
$$\rho_\tau(C_L) = 4pq(1-p-q+pq) - 1 = 4pq(1-p)(1-q) - 1 = 4\cdot\tfrac14\cdot\tfrac14 - 1 = -\tfrac34.$$

Hence, even if X and Y are perfectly dependent or independent, respectively, some choices of a copula from $\mathcal{C}_{XY}$ can be very misleading.
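The formulas (5.41a) and (5.41b) are easy to evaluate mechanically; the following Python sketch (indices shifted to start at 0) reproduces the independent Bernoulli case of Example 5.2:

```python
def tau_CU(h):
    # (5.41a): 1 - 4 * sum over k < i, l > j of h[i][j] * h[k][l]
    m, n = len(h), len(h[0])
    s = sum(h[i][j] * h[k][l]
            for i in range(m) for j in range(n)
            for k in range(i) for l in range(j + 1, n))
    return 1 - 4 * s

def tau_CL(h):
    # (5.41b): -1 + 4 * sum over k < i, l < j of h[i][j] * h[k][l]
    m, n = len(h), len(h[0])
    s = sum(h[i][j] * h[k][l]
            for i in range(m) for j in range(n)
            for k in range(i) for l in range(j))
    return -1 + 4 * s

# independent Bernoulli marginals with p = q = 1/2: h_ij = 1/4
h_indep = [[0.25, 0.25], [0.25, 0.25]]
print(tau_CU(h_indep), tau_CL(h_indep))  # 3/4 and -3/4
```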

Next, in the light of the previous section, we focus on the standard extension copula. If $(X,Y)$ is a random vector whose marginals have discrete distributions with finite support and $(X^*,Y^*)$ is its independent copy, we have with Theorem 5.5.2

$$P[(X-X^*)(Y-Y^*)>0] - P[(X-X^*)(Y-Y^*)<0] = 4\int_0^1\!\!\int_0^1 C_S(u,v)\,dC_S(u,v) - 1,$$

where $C_S$ denotes the standard extension copula corresponding to $(X,Y)$. With Theorem 5.5.3 in mind, we can now define the following quantity.

Definition 5.5.1. Let $(X,Y)$ be a random vector whose marginals have discrete distributions with finite support and let the probabilities $P[X=\xi_i]$ and $P[Y=\eta_j]$ be denoted by $p_i$ and $q_j$, respectively. Then the discrete version of Kendall's tau, $\rho_\tau^*$, is defined by

$$\rho_\tau^* = \frac{1}{\sqrt{1-\sum_{i=1}^{m}p_i^2}\,\sqrt{1-\sum_{j=1}^{n}q_j^2}}\left[4\int_0^1\!\!\int_0^1 C_S(u,v)\,dC_S(u,v) - 1\right],\tag{5.42}$$

where $C_S$ denotes the standard extension copula belonging to $(X,Y)$.

This quantity shows many properties similar to those of Kendall's tau for continuous distributions. It is clearly symmetric and bounded by $-1$ and $1$. Since $C_S = \pi$ if X and Y are independent, $\rho_\tau^* = 0$ in such a case. Furthermore, with Lemma 5.3.1, if $(X,Y)$ is stochastically smaller than $(X',Y')$, then $\rho_\tau^*(X,Y) \ge \rho_\tau^*(X',Y')$. Concerning weak convergence, we have, under the hypothesis of Theorem 5.4.3, $\rho_\tau^*(X_n,Y_n)\to\rho_\tau^*(X,Y)$. In addition, by means of Corollary 5.5.2, $\rho_\tau^*$ is $1$ if Y is a.s. a continuous, strictly increasing transformation of X and, similarly, $\rho_\tau^* = -1$ if Y is a.s. a continuous, strictly decreasing transformation of X. Moreover, by means of Lemma B.2.1, Proposition 3.1.2 and Remark 5.5.4, $\rho_\tau^*(X,Y) = -\rho_\tau^*(T(X),Y)$ for any continuous and strictly decreasing transformation T on $\operatorname{ran}X$. Also, since the subcopula corresponding to $(X,Y)$ is invariant under continuous, strictly increasing transformations, $\rho_\tau^*$ will also have this property and hence A4* follows. Hence, $\rho_\tau^*$ satisfies axioms A1-A3, A4*, A5*, A7 and a modified version of A6 of a measure of concordance.

As we will see in the chapter on empirical copulas, $\rho_\tau^*$ can also be justified by the fact that if H, F, and G are empirical distribution functions, it coincides with the sample version of Kendall's tau proposed in the literature.

Spearman’s Rho

With Spearman's rho, we proceed similarly as with Kendall's tau. Again, $\rho_S$ is in general not invariant within the class $\mathcal{C}(X,Y)$ and

$$\rho_S(C) \in [\rho_S(C_L), \rho_S(C_U)]$$

for Carley's extensions $C_L$ and $C_U$.

Example 5.3. Assume X and Y are Bernoulli distributed random variables, i.e. $X\sim B(p)$ and $Y\sim B(q)$ for some $p,q\in(0,1)$. As can be found in the Appendix on page 145,

$$\rho_S(C_U) = 1 - 6h_{01}h_{10}(h_{01}+h_{10}),\tag{5.43a}$$
$$\rho_S(C_L) = 6h_{00}h_{11}(h_{00}+h_{11}) - 1,\tag{5.43b}$$

with $h_{ij} := P[X=i,\,Y=j]$. Now imagine that $p=q=1/2$ and recall the expressions for $h_{ij}$ given in (5.1). If X and Y are comonotonic, we get $h_{00} = M(p,q) = 1/2$, $h_{11} = 1/2$ and $h_{10}=h_{01}=0$ and consequently

$$\rho_S(C_U) = 1 - 6\cdot 0 = 1 \qquad\text{and}\qquad \rho_S(C_L) = 6\cdot\tfrac12\cdot\tfrac12\cdot 1 - 1 = \tfrac64 - 1 = \tfrac12.$$

If on the other hand X and Y are countermonotonic, we get $h_{00} = 0 = h_{11}$ and $h_{10}=h_{01}=1/2$, which in turn implies

$$\rho_S(C_U) = 1 - 6\cdot\tfrac12\cdot\tfrac12\cdot 1 = 1 - \tfrac64 = -\tfrac12 \qquad\text{and}\qquad \rho_S(C_L) = 6\cdot 0 - 1 = -1.$$

And finally, if X and Y are independent,

$$\rho_S(C_U) = 1 - 6p(1-q)q(1-p)(p+q-2pq) = 1 - 6\cdot\tfrac1{16}\cdot\tfrac12 = \tfrac{13}{16},$$
$$\rho_S(C_L) = 6pq(1-p)(1-q)(1-p-q+2pq) - 1 = 6\cdot\tfrac1{16}\cdot\tfrac12 - 1 = -\tfrac{13}{16}.$$

Thus we get a similar situation as with Kendall's tau: a "wrong" choice of a copula from $\mathcal{C}_{XY}$ can yield quite misleading results.

Consider now random vectors $(X,Y)$ and $(X^*,Y^*)$, the latter being independent of $(X,Y)$ and $X^*$ and $Y^*$ being independent and copies of X and Y, respectively. Then, with Theorem 5.5.2, the difference between the corresponding probabilities of concordance and discordance is given by

$$Q = 4\int_0^1\!\!\int_0^1 C_S(u,v)\,du\,dv - 1 = 4\int_0^1\!\!\int_0^1 uv\,dC_S(u,v) - 1,\tag{5.44}$$


where $C_S$ is the standard extension copula corresponding to $(X,Y)$. Now, if we denote by $\rho_S$ the Spearman's rho corresponding to the standard extension copula, i.e.,

$$\rho_S := 3Q,\tag{5.45}$$

we obtain the following:

Theorem 5.5.5. Let $(X,Y)$ be a random vector with marginal distributions F and G. Furthermore, let F and G be discrete with finite supports. Then

$$\rho_S = 12\left[\sum_{i=1}^{m}\sum_{j=1}^{n} h_{ij}\left(\frac{F(\xi_i)+F(\xi_{i-1})}{2}\cdot\frac{G(\eta_j)+G(\eta_{j-1})}{2} - \frac14\right)\right].\tag{5.46}$$

Proof. (5.46) can be shown by a direct calculation. To do so, recall also that the standard extension copula has a Lebesgue density given by (5.6b).

$$\begin{aligned}
\rho_S &= 12\int_0^1\!\!\int_0^1 uv\,dC_S(u,v) - 3\\
&= 12\sum_{i=1}^{m}\sum_{j=1}^{n}\int_{F(\xi_{i-1})}^{F(\xi_i)}\int_{G(\eta_{j-1})}^{G(\eta_j)} uv\cdot\frac{h_{ij}}{\big(F(\xi_i)-F(\xi_{i-1})\big)\big(G(\eta_j)-G(\eta_{j-1})\big)}\,du\,dv - 3\\
&= 12\sum_{i=1}^{m}\sum_{j=1}^{n}\frac{h_{ij}}{4}\big(F(\xi_i)+F(\xi_{i-1})\big)\big(G(\eta_j)+G(\eta_{j-1})\big) - 3\\
&= 12\left[\sum_{i=1}^{m}\sum_{j=1}^{n} h_{ij}\left(\frac{F(\xi_i)+F(\xi_{i-1})}{2}\cdot\frac{G(\eta_j)+G(\eta_{j-1})}{2} - \frac14\right)\right].
\end{aligned}$$

In order to examine $\rho_S$, the following identity will turn out to be useful:

$$\sum_{i=1}^{m}\sum_{j=1}^{n} h_{ij}\big(F(\xi_i)+F(\xi_{i-1})\big) = \sum_{i=1}^{m}\big(F(\xi_i)-F(\xi_{i-1})\big)\big(F(\xi_i)+F(\xi_{i-1})\big) = \sum_{i=1}^{m}\big(F(\xi_i)\big)^2 - \sum_{i=0}^{m-1}\big(F(\xi_i)\big)^2 = 1.\tag{5.47}$$

In the first place, it implies

$$\rho_S = 3\sum_{i=1}^{m}\sum_{j=1}^{n} h_{ij}\big[F(\xi_i)+F(\xi_{i-1})-1\big]\big[G(\eta_j)+G(\eta_{j-1})-1\big]$$

and hence $\rho_S$ is identical with the measure of dependence $\rho$ for discontinuous random variables considered by Hoeffding (1940b). He pointed out that $\rho$ equals twelve times the covariance of certain discrete random variables. However, since he was using another standardization, our approach will slightly differ from his. Here, we define $\tilde X$ and $\tilde Y$ by:

$$P\left[\tilde X = \frac{F(\xi_i)+F(\xi_{i-1})}{2}\right] = F(\xi_i)-F(\xi_{i-1}) =: p_i, \quad i=1,\dots,m,$$
$$P\left[\tilde Y = \frac{G(\eta_j)+G(\eta_{j-1})}{2}\right] = G(\eta_j)-G(\eta_{j-1}) =: q_j, \quad j=1,\dots,n,$$
$$P\left[\tilde X = \frac{F(\xi_i)+F(\xi_{i-1})}{2},\; \tilde Y = \frac{G(\eta_j)+G(\eta_{j-1})}{2}\right] = h_{ij}, \quad i=1,\dots,m,\; j=1,\dots,n.$$


With (5.47) it is easy to see that the expectations $E\tilde X$ and $E\tilde Y$ are both equal to $1/2$. Unfortunately, however, their variances are neither equal to one nor independent of the marginal distribution functions F and G.

Lemma 5.5.2. With the above notation, the variances of $\tilde X$ and $\tilde Y$ are given by

$$\operatorname{Var}\tilde X = \frac{1-\sum_{i=1}^{m}p_i^3}{12},\tag{5.48}$$
$$\operatorname{Var}\tilde Y = \frac{1-\sum_{j=1}^{n}q_j^3}{12}.\tag{5.49}$$

Proof. The proof is again based on a standard calculation and is therefore given in the Appendix on page 147.

This lemma has the following consequence for $\rho_S$:

$$\rho_S = \operatorname{Corr}(\tilde X,\tilde Y)\cdot\sqrt{\Big(1-\sum_{i=1}^{m}p_i^3\Big)\Big(1-\sum_{j=1}^{n}q_j^3\Big)}\tag{5.50}$$

and hence

$$|\rho_S| \le \sqrt{\Big(1-\sum_{i=1}^{m}p_i^3\Big)\Big(1-\sum_{j=1}^{n}q_j^3\Big)} < 1.\tag{5.51}$$

This means that even if $X = Y$, $\rho_S$ will never be one. In fact, in this case we get $\tilde X = \tilde Y$ and hence $\operatorname{Corr}(\tilde X,\tilde Y) = 1$. Therefore,

$$\rho_S = 1 - \sum_{i=1}^{m}\big(F(\xi_i)-F(\xi_{i-1})\big)^3.\tag{5.52}$$

Motivated by these results, we can correct $\rho_S$ similarly as Hoeffding did and obtain

Definition 5.5.2. Let $(X,Y)$ be a random vector whose marginals have discrete distributions with finite support and let the probabilities $P[X=\xi_i]$ and $P[Y=\eta_j]$ be denoted by $p_i$ and $q_j$, respectively. Then the discrete version of Spearman's rho, $\rho_S^*$, is defined by

$$\rho_S^* = \operatorname{Corr}(\tilde X,\tilde Y) = \frac{\rho_S}{\sqrt{\Big(1-\sum_{i=1}^{m}p_i^3\Big)\Big(1-\sum_{j=1}^{n}q_j^3\Big)}}.\tag{5.53}$$
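Formulas (5.46) and (5.53) translate directly into code. The following Python sketch checks (5.52) and the normalization of the above definition in the case $Y = X$ a.s. (the marginal `p` is an arbitrary illustrative choice):

```python
import math

def rho_S(h):
    # (5.46), with F, G the cumulative marginals of the joint pmf h
    m, n = len(h), len(h[0])
    p = [sum(h[i]) for i in range(m)]
    q = [sum(h[i][j] for i in range(m)) for j in range(n)]
    F = [0.0] + [sum(p[:i + 1]) for i in range(m)]
    G = [0.0] + [sum(q[:j + 1]) for j in range(n)]
    return 12 * sum(h[i][j] * ((F[i + 1] + F[i]) / 2 * (G[j + 1] + G[j]) / 2 - 0.25)
                    for i in range(m) for j in range(n))

def rho_S_star(h):
    # (5.53): correct rho_S by the factor depending on the marginal atoms
    m, n = len(h), len(h[0])
    p = [sum(h[i]) for i in range(m)]
    q = [sum(h[i][j] for i in range(m)) for j in range(n)]
    norm = math.sqrt((1 - sum(t**3 for t in p)) * (1 - sum(t**3 for t in q)))
    return rho_S(h) / norm

p = [0.2, 0.3, 0.5]
h_diag = [[p[i] if i == j else 0.0 for j in range(3)] for i in range(3)]  # Y = X a.s.
print(rho_S(h_diag), rho_S_star(h_diag))  # (5.52): 1 - sum p_i^3 = 0.84, and rho*_S = 1
```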

Clearly, this measure of concordance can achieve the limits $\pm 1$: it is equal to one if X is a strictly increasing and continuous transformation of Y and to $-1$ if X is a strictly decreasing and continuous transformation of Y. Since the numerator is based on the standard extension copula, $\rho_S^*$ will inherit many nice properties from the Spearman's rho for continuous distribution functions. Since $C_S = \pi$ if X and Y are independent, $\rho_S^*$ will be zero in such a case. Lemma 5.3.1 guarantees that if $(X,Y)$ is stochastically smaller than $(X',Y')$, then $\rho_S^*(X,Y) \ge \rho_S^*(X',Y')$. Moreover, under the hypothesis of Theorem 5.4.3, $\rho_S^*(X_n,Y_n)\to\rho_S^*(X,Y)$. Also, by means of Proposition 5.2.1, $\rho_S^*(X,Y)$ is equal to $\rho_S^*(Y,X)$, and, for any continuous, strictly increasing transformation T, $\rho_S^*(X,Y)$ and $\rho_S^*(T(X),Y)$


are the same. If, on the other hand, T is a strictly decreasing and continuous transformation on $\operatorname{ran}X$, the same proposition yields

$$\int_0^1\!\!\int_0^1 \bar C_S(u,v)\,du\,dv = \int_0^1\!\!\int_0^1 \big[v - C_S(u,v)\big]\,du\,dv = \frac12 - \int_0^1\!\!\int_0^1 C_S(u,v)\,du\,dv,$$

where $\bar C_S(u,v)$ denotes the standard extension copula of $(T(X),Y)$. Hence, with Remark 5.5.4, $\rho_S^*(T(X),Y)$ is equal to $-\rho_S^*(X,Y)$. In other words, $\rho_S^*$ satisfies the axioms A1-A3, A4*, A5*, A7 and a modified version of A6 of a measure of concordance.

Moreover, $\rho_S^*$ shows an interesting behavior when the distributions involved, H, F and G, are empirical distributions, as we will see in the chapter dealing with empirical copulas.

5.5.3 Distance-based Measures

Further quantities for measuring concordance and dependence can be obtained from the $L_p$-distance, $1\le p\le\infty$, by choosing any copula from $\mathcal{C}_H$ (cf. Examples 4.2 and 4.6), or more generally, with Propositions 4.3.1 and 4.3.2. If the marginals are discrete with finite support, then the influence of the copula choice on such quantities can again be estimated by the measures resulting from Carley's extensions.

In the light of the previous results on Kendall's tau and Spearman's rho, the standard extension copula seems to be a reasonable choice. However, simply replacing C by $C_S$ will in general yield a quantity which does not satisfy all the axioms required. In particular, it will not necessarily attain $\pm 1$. The normalization will most certainly depend on the marginal distributions, and will probably not be easy to obtain. Another difficulty caused by choosing the standard extension copula is the fact that even if X and Y are comonotonic and countermonotonic, respectively, $C_S$ will equal neither M nor W. Consequently, the role played by the Fréchet-Hoeffding bounds in Propositions 4.3.1 and 4.3.2 should be reconsidered. One possibility, which seems reasonable at first sight, is to work with the standard extensions of the subcopulas coinciding with the Fréchet-Hoeffding bounds instead, i.e. with $M_S$ and $W_S$ (recall Notation 5.5.1). This approach yields the following quantities:

$$\rho_d(C_S) := \frac{d(C_S,W_S) - d(C_S,M_S)}{d(M_S,W_S)} \qquad\text{and}\qquad \delta_d(C_S) := \frac{d(C_S,\pi)}{d(M_S,\pi)}.\tag{†}$$

Despite the fact that they both depend on the marginal distributions (which, in the light of Theorem 5.5.1, is not unexpected), they have one other drawback: as we already mentioned at the beginning of Section 5.5,

$$d(\pi,M_S) = d(\pi,W_S)\tag{‡}$$

is no longer implied by (5.27) in general. In particular, it is not valid for the $L_1$-distance, as the following example shows.

Example 5.4. Let $F_p$ and $G_q$ denote Bernoulli distributions, i.e. $F_p = B(p)$ and $G_q = B(q)$. Furthermore, let the joint distribution function $H_{pq}$ be equal to $C(F_p,G_q)$ for a copula C. In that case, $H_{pq}$ is given by (5.1), i.e. by the following joint probability densities:

$$h_{00} = C(p,q), \qquad h_{10} = q - C(p,q),$$
$$h_{01} = p - C(p,q), \qquad h_{11} = 1 - p - q + C(p,q).$$


In particular, by means of Theorem 5.5.2,

$$\int_0^1\!\!\int_0^1 C_S\,du\,dv = \frac{Q}{4} + \frac14,$$

where Q denotes the difference between the probabilities of concordance and discordance corresponding to Spearman's rho (cf. (5.44)). Now, with (6.12), Q can be calculated as follows:

$$\begin{aligned}
Q &= h_{11}pq + h_{00}(1-p)(1-q) - h_{10}p(1-q) - h_{01}(1-p)q\\
&= \big[1-p-q+C(p,q)\big]pq + C(p,q)(1-p)(1-q) - \big[q-C(p,q)\big]p(1-q) - \big[p-C(p,q)\big](1-p)q\\
&= p\big[q - pq - q^2 + qC(p,q) - q + C(p,q) + q^2 - qC(p,q)\big] + (1-p)\big[C(p,q) - qC(p,q) - pq + qC(p,q)\big]\\
&= p\big[-pq + C(p,q)\big] + (1-p)\big[C(p,q) - pq\big] = C(p,q) - pq.
\end{aligned}$$

From this we have, since $\int_0^1\!\int_0^1 uv\,du\,dv = 1/4$,

$$\int_0^1\!\!\int_0^1 \big[M_S(u,v)-uv\big]\,du\,dv = \frac{M(p,q)-pq}{4} \qquad\text{and}\qquad \int_0^1\!\!\int_0^1 \big[uv-W_S(u,v)\big]\,du\,dv = \frac{pq-W(p,q)}{4}.$$

Now, setting $p=q=2/3$ yields

$$\int_0^1\!\!\int_0^1 \big[M_S(u,v)-uv\big]\,du\,dv = \frac{2}{36} \qquad\text{and}\qquad \int_0^1\!\!\int_0^1 \big[uv-W_S(u,v)\big]\,du\,dv = \frac{1}{36}.$$

Consequently, for the $L_1$-distance $\|\cdot\|_1$,

$$\|M_S - \pi\|_1 \ne \|W_S - \pi\|_1 \quad\text{in general.}$$

Therefore, this way of extending Propositions 4.3.1 and 4.3.2 does not seem to be entirely satisfactory, unless we additionally demand (‡). In the light of the above example, however, it may be questionable whether such a metric will be suitable for practical purposes or have a meaningful interpretation.
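The asymmetry exhibited in Example 5.4 amounts to a one-line computation; the following Python sketch reproduces the two integrals for $p = q = 2/3$ using the closed forms derived above:

```python
# Example 5.4: with Bernoulli(p) and Bernoulli(q) marginals,
# int int [M_S - uv] du dv = (M(p,q) - pq)/4 and int int [uv - W_S] du dv = (pq - W(p,q))/4.
p = q = 2 / 3
M = min(p, q)              # Frechet-Hoeffding upper bound evaluated at (p, q)
W = max(p + q - 1, 0.0)    # Frechet-Hoeffding lower bound evaluated at (p, q)
d_upper = (M - p * q) / 4
d_lower = (p * q - W) / 4
print(d_upper, d_lower)    # 2/36 versus 1/36
```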

Remark 5.5.5. It may be worth noticing that the above kind of "asymmetry" between the distances of the independence copula to the Fréchet-Hoeffding bounds bears certain similarities to the fact that $\varrho_{\max} \ne -\varrho_{\min}$ for the linear correlation coefficient (cf. Remark 4.3.5).

In addition, inspired by the results concerning the discrete versions of Kendall's tau and Spearman's rho, we could try to replace the bounds $M_S$ and $W_S$ in (†) by $M_S^*$ and $W_S^*$, respectively, where $M_S^*$ and $W_S^*$ stand for the standard extension copulas corresponding to the case when $Y = g(X)$ a.s. for a strictly increasing and decreasing function g, respectively. The measures obtained in this way should then satisfy the modified axiom A4* and, under some minor assumptions, comply with the remaining demands as well.

We close this chapter with one more remark on the research done on this subject so far. Hoeffding (1940b) derived a version of $\delta_2$, the dependence measure based on the $L_2$-distance, for discrete random variables with finite support. It can easily be seen that his measure results from $\delta_2$ by using the standard extension copula. From his calculations, it is also clear how difficult the necessary standardization is. It is worth noting that the measure obtained by Hoeffding does attain 1 if the marginals are strictly monotone and continuous transformations of each other, but does not, in general, satisfy axiom A4.


Chapter 6

Empirical Copulas

In the previous chapters, we proposed several dependence and concordance measures. However, we did not consider their estimation. This is, in fact, far more ambiguous than it may seem at first sight. Since this topic is beyond the scope of this thesis, we will not discuss it in greater detail here¹. But because it shows the quantities $\rho_\tau^*$ and $\rho_S^*$, i.e. our versions of Kendall's tau and Spearman's rho for discrete distributions with finite support, in quite an interesting light, we will briefly mention one such estimation procedure in this chapter: the empirical copulas. These functions were introduced as "empirical dependence functions" by Deheuvels (1979), who used them for the construction of nonparametric tests of independence (see also Deheuvels (1981)).

6.1 Basic Properties

If there are no ties in the observations, then an empirical copula is given as follows:

Definition 6.1.1. (Deheuvels, 1979) Let $\{(x_k,y_k)\}_{k=1}^{n}$ denote a sample of size n from a bivariate continuous distribution and suppose there are no coincident $x_i$'s and $y_j$'s, respectively. The empirical subcopula is the function $S_n$ on $\{i/n \mid 0\le i\le n\}^2$ given by

$$S_n\Big(\frac{i}{n},\frac{j}{n}\Big) = \frac{\text{number of pairs } (x,y) \text{ in the sample such that } x\le x_{(i)} \text{ and } y\le y_{(j)}}{n}\tag{6.1}$$

for $i,j\ne 0$ and by zero otherwise, where $x_{(i)}$ and $y_{(j)}$, $1\le i,j\le n$, denote the order statistics from the sample. The empirical copula $C_n$ is defined as a standard extension of $S_n$. The empirical copula frequency $c_n$ is given by

$$c_n\Big(\frac{i}{n},\frac{j}{n}\Big) = \frac{\text{number of pairs } (x,y) \text{ in the sample such that } x = x_{(i)} \text{ and } y = y_{(j)}}{n}.\tag{6.2}$$
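Definition 6.1.1 can be implemented by direct counting; the following Python sketch evaluates $S_n$ on the grid for a small no-ties sample (the sample itself is an arbitrary illustration):

```python
def empirical_subcopula(sample):
    """S_n of Definition 6.1.1 on the grid {i/n}, assuming no ties."""
    n = len(sample)
    xs = sorted(x for x, _ in sample)   # order statistics x_(1) <= ... <= x_(n)
    ys = sorted(y for _, y in sample)

    def S(i, j):                        # value at the grid point (i/n, j/n)
        if i == 0 or j == 0:
            return 0.0
        return sum(1 for x, y in sample if x <= xs[i - 1] and y <= ys[j - 1]) / n

    return S

sample = [(1, 2), (2, 1), (3, 3)]
S = empirical_subcopula(sample)
print(S(1, 1), S(2, 2), S(3, 3))  # 0, 2/3, 1
```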

Remark 6.1.1. Note that there exist several versions of Definition 6.1.1. One other possibility is to define an empirical copula via

$$C_n^*(u,v) = \frac{\text{number of pairs } (x,y) \text{ in the sample such that } x\le x_{(\lfloor nu\rfloor)} \text{ and } y\le y_{(\lfloor nv\rfloor)}}{n}$$

for $(u,v)\in I^2$. In this case, $C_n^*$ equals the empirical distribution function of the transformed observations $(R(x_k)/n, R(y_k)/n)$, where $R(x_k)$ and $R(y_k)$ denote the ranks of the observations $x_k$ and $y_k$, respectively. Another possibility is to set

$$\bar C_n(u,v) = H_n\big(F_n^{-1}(u), G_n^{-1}(v)\big), \quad u,v\in[0,1],$$

where $H_n$, $F_n$ and $G_n$ stand for the empirical distribution functions corresponding to the sample. But neither $C_n^*$ nor $\bar C_n$ is continuous, and hence neither is a proper copula; this is the reason why we chose to use the above definition. The important point, however, is that each of the versions $C_n$, $C_n^*$ and $\bar C_n$ extends $S_n$, and they are asymptotically equal.

¹ A reference on this subject is e.g. Tjøstheim (1996).

Remark 6.1.2. Note that if there are no ties present, as assumed in Definition 6.1.1, then, for any i, $c_n\big(\frac{i}{n},\frac{j}{n}\big)$ equals $1/n$ for precisely one j and is zero otherwise. And, similarly, for any j, $c_n\big(\frac{i}{n},\frac{j}{n}\big)$ equals $1/n$ for precisely one i and vanishes otherwise. In other words, the observed value of y can be uniquely determined by the observed value of x, and hence there exists a unique functional relationship between the components of the observations. This in turn implies that if we describe the dependence between the components of the sample by means of some dependence concept or measure which is extreme whenever the variables are related by $X = f(Y)$ a.s. for an arbitrary (measurable) function², then this dependence concept will attain its maximal value. This is another reason why the general functional relationship is, in general, not a suitable concept of perfect dependence.

From now on, we will occasionally allow ties in both components. To handle this, we will use the following notation:

Notation 6.1.1. Let $\{(x_k,y_k)\}_{k=1}^{n}$ denote a sample of size n from a bivariate distribution. Assume that there are r and s distinct values of $x_i$ and $y_j$ in the sample, respectively. These values will be denoted by

$$\xi_1 < \xi_2 < \cdots < \xi_r \qquad\text{and}\qquad \eta_1 < \eta_2 < \cdots < \eta_s.$$

It is worth noting that if there are no ties present, $\xi_i$ equals the i-th order statistic $x_{(i)}$ and, analogously, $\eta_j = y_{(j)}$. The number of observations $(x,y)$ in the sample with $x = \xi_k$, $k=1,\dots,r$, will be denoted by $u_k$. Similarly, for $l=1,\dots,s$, $v_l$ will stand for the number of observations with $y=\eta_l$. Finally, by $w_{kl}$ we will denote the number of observations in the sample equal to $(\xi_k,\eta_l)$.

Remark 6.1.3. Note that if i is an index satisfying

$$u_1+\cdots+u_{k-1} < i \le u_1+\cdots+u_k \ \text{ for } k=2,\dots,r \qquad\text{or}\qquad 1\le i\le u_1 \ \text{ for } k=1,$$

then the order statistic $x_{(i)}$ is equal to $\xi_k$. Analogously, $y_{(j)}$ corresponds to $\eta_l$, where l is an index satisfying

$$v_1+\cdots+v_{l-1} < j \le v_1+\cdots+v_l \ \text{ for } l=2,\dots,s \qquad\text{or}\qquad 1\le j\le v_1 \ \text{ for } l=1.$$

Now, the empirical distribution functions $F_n$, $G_n$, $H_n$ corresponding to the sample $\{(x_k,y_k)\}_{k=1}^{n}$ are given as follows:

$$F_n(x) = \frac{\text{number of pairs } (x_k,y_k) \text{ with } x_k\le x}{n} = \sum_{k=1}^{n}\mathbf{1}_{[x_k,\infty)}(x)\,\frac1n = \sum_{i=1}^{r}\mathbf{1}_{[\xi_i,\infty)}(x)\,\frac{u_i}{n},\tag{6.3a}$$

$$G_n(y) = \frac{\text{number of pairs } (x_k,y_k) \text{ with } y_k\le y}{n} = \sum_{k=1}^{n}\mathbf{1}_{[y_k,\infty)}(y)\,\frac1n = \sum_{j=1}^{s}\mathbf{1}_{[\eta_j,\infty)}(y)\,\frac{v_j}{n},\tag{6.3b}$$

$$H_n(x,y) = \frac{\text{number of pairs } (x_k,y_k) \text{ with } x_k\le x \text{ and } y_k\le y}{n} = \sum_{k=1}^{n}\mathbf{1}_{[x_k,\infty)}(x)\,\mathbf{1}_{[y_k,\infty)}(y)\,\frac1n = \sum_{i=1}^{r}\sum_{j=1}^{s}\mathbf{1}_{[\xi_i,\infty)}(x)\,\mathbf{1}_{[\eta_j,\infty)}(y)\,\frac{w_{ij}}{n}.\tag{6.3c}$$

² Such a quantity is, for example, Pearson's contingency coefficient.

If there are ties in the observations, then there exist several possibilities of defining the ranks of the observations. We will need the following two:

Definition 6.1.2. Let $\{(x_k,y_k)\}_{k=1}^{n}$ denote a sample of size n with not necessarily distinct observations. Then the ranks $R(x_k)$ and $R(y_k)$, respectively, are given as follows:

$$R(x_k) := \sum_{i=1}^{n}\mathbf{1}(x_i\le x_k), \qquad R(y_k) := \sum_{i=1}^{n}\mathbf{1}(y_i\le y_k).\tag{6.4}$$

Furthermore, let i and j be such that $x_k=\xi_i$ and $y_k=\eta_j$. Then the average ranks $\bar R(x_k)$ and $\bar R(y_k)$ of $x_k$ and $y_k$ are defined by

$$\bar R(x_k) = \begin{cases} \dfrac{1+\cdots+u_1}{u_1} = \dfrac{u_1+1}{2} & \text{if } i=1,\\[2mm] \displaystyle\sum_{l=1}^{i-1}u_l + \dfrac{u_i+1}{2} & \text{otherwise,}\end{cases} \qquad\quad \bar R(y_k) = \begin{cases} \dfrac{1+\cdots+v_1}{v_1} = \dfrac{v_1+1}{2} & \text{if } j=1,\\[2mm] \displaystyle\sum_{l=1}^{j-1}v_l + \dfrac{v_j+1}{2} & \text{otherwise.}\end{cases}\tag{6.5}$$
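Both rank notions of Definition 6.1.2 can be sketched as follows; for the average ranks, the term $\sum_{l<i}u_l$ of (6.5) is obtained by counting strictly smaller observations:

```python
def ranks(v):
    # (6.4): R(v_k) = #{ i : v_i <= v_k }
    return [sum(1 for x in v if x <= t) for t in v]

def average_ranks(v):
    # (6.5): sum of multiplicities of strictly smaller values plus (u_i + 1)/2
    less = [sum(1 for x in v if x < t) for t in v]    # sum_{l < i} u_l
    ties = [sum(1 for x in v if x == t) for t in v]   # multiplicity u_i
    return [l + (u + 1) / 2 for l, u in zip(less, ties)]

x = [1, 1, 2]
print(ranks(x), average_ranks(x))  # [2, 2, 3] and [1.5, 1.5, 3.0]
```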

Remark 6.1.4. Note that the empirical distribution functions satisfy

$$F_n(x_k) = \frac{R(x_k)}{n} \qquad\text{and}\qquad G_n(y_k) = \frac{R(y_k)}{n}.$$

Moreover, if no ties are present, $R(x_{(i)}) = i$ as well as $R(y_{(j)}) = j$ and hence

$$F_n(x_{(i)}) = \frac{i}{n} \qquad\text{and}\qquad G_n(y_{(j)}) = \frac{j}{n}.$$

Remark 6.1.5. With (6.3), the average ranks of the observations can be re-written as

$$\bar R(x_k) = n\Big(F_n(\xi_{i-1}) + \frac{u_i}{2n}\Big) + \frac12 \qquad\text{and}\qquad \bar R(y_k) = n\Big(G_n(\eta_{j-1}) + \frac{v_j}{2n}\Big) + \frac12\tag{6.6}$$

with $F_n(\xi_0) = G_n(\eta_0) := 0$ and i, j such that $x_k=\xi_i$ and $y_k=\eta_j$. If no ties are present, then the ranks and the average ranks coincide, i.e.,

$$\bar R(x_{(i)}) = n\Big(F_n(x_{(i-1)}) + \frac{1}{2n}\Big) + \frac12 = \frac{\big(2(i-1)+1\big)+1}{2} = i = R(x_{(i)})$$

and, analogously,

$$\bar R(y_{(j)}) = n\Big(G_n(y_{(j-1)}) + \frac{1}{2n}\Big) + \frac12 = \frac{\big(2(j-1)+1\big)+1}{2} = j = R(y_{(j)}).$$


With the above definitions, we are now in a position to define empirical copulas in the general case, when ties in the observations are allowed.

Definition 6.1.3. Let $\{(x_k,y_k)\}_{k=1}^{n}$ denote a sample of size n from a bivariate distribution, with all observations not necessarily distinct. The empirical subcopula is the function $S_n$ on $\{0, R(x_{(1)})/n, \dots, R(x_{(n)})/n\} \times \{0, R(y_{(1)})/n, \dots, R(y_{(n)})/n\}$ given by

$$S_n\Big(\frac{R(x_{(i)})}{n}, \frac{R(y_{(j)})}{n}\Big) = \frac{\text{number of pairs } (x,y) \text{ in the sample such that } x\le x_{(i)} \text{ and } y\le y_{(j)}}{n}\tag{6.7}$$

and by zero otherwise. The empirical copula, $C_n$, is defined as the standard extension of $S_n$. The empirical copula frequency $c_n$ is given by

$$c_n\Big(\frac{R(x_{(i)})}{n}, \frac{R(y_{(j)})}{n}\Big) = \frac{\text{number of pairs } (x,y) \text{ in the sample such that } x = x_{(i)} \text{ and } y = y_{(j)}}{n}.\tag{6.8}$$

With Remark 6.1.4, the above definition coincides with Definition 6.1.1 if no ties are present. Moreover, if k and l are such that $x_{(i)} = \xi_k$ and $y_{(j)} = \eta_l$,

$$c_n\Big(\frac{R(x_{(i)})}{n}, \frac{R(y_{(j)})}{n}\Big) = \frac{w_{kl}}{n}\tag{6.9}$$

and

$$C_n\Big(\frac{R(x_{(i)})}{n}, \frac{R(y_{(j)})}{n}\Big) = \sum_{k^*=1}^{k}\sum_{l^*=1}^{l}\frac{w_{k^*l^*}}{n},\tag{6.10}$$

which in turn implies

Lemma 6.1.1. Let $\{(x_k,y_k)\}_{k=1}^{n}$ denote a sample of size n from a bivariate distribution, with all observations not necessarily distinct. Furthermore, let $H_n$, $F_n$ and $G_n$ denote the corresponding distribution functions. Then $S_n$ is the (unique) subcopula of $H_n$; in particular, for any $1\le i,j\le n$,

$$H_n(x_i,y_j) = S_n\big(F_n(x_i), G_n(y_j)\big).$$

Proof. First, the following relationship holds:

$$x_i = x_{(R(x_i))} \qquad\text{and}\qquad y_j = y_{(R(y_j))}.$$

Next, if k and l are such that

$$x_{(R(x_i))} = \xi_k \qquad\text{and}\qquad y_{(R(y_j))} = \eta_l,$$

then, with (6.3c) and (6.10),

$$H_n(x_i,y_j) = \sum_{k^*=1}^{k}\sum_{l^*=1}^{l}\frac{w_{k^*l^*}}{n} = S_n\Big(\frac{R\big(x_{(R(x_i))}\big)}{n}, \frac{R\big(y_{(R(y_j))}\big)}{n}\Big) = S_n\Big(\frac{R(x_i)}{n}, \frac{R(y_j)}{n}\Big) = S_n\big(F_n(x_i), G_n(y_j)\big),$$

where the last equation follows with Remark 6.1.4.
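The identity of Lemma 6.1.1 can be verified numerically on a tied sample; the following Python sketch evaluates both sides at every observed point (the sample is an arbitrary illustration):

```python
def Hn(sample, x, y):
    return sum(1 for a, b in sample if a <= x and b <= y) / len(sample)

def Fn(sample, x):
    return sum(1 for a, _ in sample if a <= x) / len(sample)

def Gn(sample, y):
    return sum(1 for _, b in sample if b <= y) / len(sample)

def Sn(sample, u, v):
    # empirical subcopula at a grid point (R(x_i)/n, R(y_j)/n):
    # with nu = R(x_i), S_n(u, v) counts pairs below the order statistics x_(nu), y_(nv)
    n = len(sample)
    xs = sorted(a for a, _ in sample)
    ys = sorted(b for _, b in sample)
    return Hn(sample, xs[round(u * n) - 1], ys[round(v * n) - 1])

sample = [(1, 1), (1, 2), (2, 1)]   # ties in both components
ok = all(abs(Hn(sample, x, y) - Sn(sample, Fn(sample, x), Gn(sample, y))) < 1e-12
         for x, y in sample)
print(ok)
```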


Remark 6.1.6. Note that the empirical copula $C_n$ coincides with the empirical distribution function (restricted to the unit square) of the transformed observations

$$(x_k,y_k) \mapsto \Big(\frac{R(x_k)}{n}, \frac{R(y_k)}{n}\Big)$$

on the set $\{0, R(x_{(1)})/n, \dots, R(x_{(n)})/n\} \times \{0, R(y_{(1)})/n, \dots, R(y_{(n)})/n\}$. Hence, (5.5) yields that the empirical copula is a rank statistic. This also implies that various theorems concerning the weak convergence of empirical processes apply to empirical copulas (see e.g. Vaart and Wellner (1996), Vaart (1998) as well as Deheuvels (1981)). In particular, according to Deheuvels (1981) or Vaart and Wellner (1996, Section 3.9.4.4), in the continuous case, the empirical copula converges weakly to a mean-zero Gaussian process and $\sqrt{n}(C_n - C)$ converges weakly to a function of a tight Brownian bridge.

Remark 6.1.7. It is also worth noticing that $C_n$ equals the standard extension copula of $H_n$ and, by means of Theorem 5.2.2, coincides with the unique copula corresponding to the smoothed version of the empirical distribution function, $H_n$.

Remark 6.1.8. If we sum up the observations $\{(x_k,y_k)\}_{k=1}^{n}$ into a contingency table, i.e. a table with r rows and s columns and the number of observations equal to $(\xi_i,\eta_j)$ placed in the ij-cell for any row i and column j, then the entry in the ij-cell equals n times the empirical copula frequency $c_n$ evaluated at the point

$$\Big(\frac{\sum_{k=1}^{i}u_k}{n}, \frac{\sum_{l=1}^{j}v_l}{n}\Big)$$

by means of (6.9) and Remark 6.1.3.
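The correspondence between the contingency table and $n\,c_n$ described in Remark 6.1.8 can be sketched as follows:

```python
from collections import Counter

def contingency_table(sample):
    # r x s table of the counts w_kl of Notation 6.1.1; each entry equals
    # n * c_n evaluated at the corresponding grid point
    xs = sorted(set(a for a, _ in sample))
    ys = sorted(set(b for _, b in sample))
    w = Counter(sample)
    return [[w[(x, y)] for y in ys] for x in xs]

sample = [(1, 1), (1, 2), (2, 1)]
table = contingency_table(sample)
print(table)  # [[1, 1], [1, 0]]
```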

6.2 Kendall's Tau and Spearman's Rho for Empirical Distributions

In this section, we make use of the results of the preceding chapter. Indeed, empirical distribution functions are special cases of discrete distributions with finite support, and therefore the theory developed before applies to this case. The supports of the marginal empirical distribution functions are given by

$$\xi_1 < \xi_2 < \cdots < \xi_r \qquad\text{and}\qquad \eta_1 < \eta_2 < \cdots < \eta_s$$

for $F_n$ and $G_n$, respectively. For the sake of simplicity, we further set $\xi_0 := \xi_1 - 1$ and $\eta_0 := \eta_1 - 1$ as well as

$$p_i := \frac{u_i}{n}, \qquad q_j := \frac{v_j}{n} \qquad\text{and}\qquad h_{ij} = \frac{w_{ij}}{n}.$$

We will now justify the discrete versions of Kendall's tau and Spearman's rho, $\rho_\tau^*$ and $\rho_S^*$, by showing that they are equal to the sample versions of Kendall's tau and Spearman's rho proposed in the literature (for definitions and references, see e.g. Kraemer (1998)).

Theorem 6.2.1. Let $\{(x_k,y_k)\}_{k=1}^{n}$ denote a sample of size n from a bivariate distribution H. Then $\rho_\tau^*$ corresponding to the empirical distribution function $H_n$ equals the sample version of Kendall's tau,

$$\tau = \frac{\#[\text{concordant pairs}] - \#[\text{discordant pairs}]}{\sqrt{\binom{n}{2} - u}\,\sqrt{\binom{n}{2} - v}},\tag{6.11}$$

where

$$u = \sum_{k=1}^{r}\binom{u_k}{2}, \qquad v = \sum_{l=1}^{s}\binom{v_l}{2}.$$


Proof. First, because of

$$\begin{aligned}
P[X_1<X_2,\, Y_1<Y_2] &= \sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{i-1}\sum_{l=1}^{j-1} h^1_{kl}h^2_{ij},\\
P[X_1>X_2,\, Y_1>Y_2] &= \sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=i+1}^{m}\sum_{l=j+1}^{n} h^1_{kl}h^2_{ij} = \sum_{k=1}^{m}\sum_{l=1}^{n}\sum_{i=1}^{k-1}\sum_{j=1}^{l-1} h^1_{kl}h^2_{ij},\\
P[X_1<X_2,\, Y_1>Y_2] &= \sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{i-1}\sum_{l=j+1}^{n} h^1_{kl}h^2_{ij} = \sum_{i=1}^{m}\sum_{l=1}^{n}\sum_{k=1}^{i-1}\sum_{j=1}^{l-1} h^1_{kl}h^2_{ij},\\
P[X_1>X_2,\, Y_1<Y_2] &= \sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=i+1}^{m}\sum_{l=1}^{j-1} h^1_{kl}h^2_{ij} = \sum_{k=1}^{m}\sum_{j=1}^{n}\sum_{i=1}^{k-1}\sum_{l=1}^{j-1} h^1_{kl}h^2_{ij},
\end{aligned}$$

the difference between the probabilities of concordance and discordance between two independent random vectors $(X_1,Y_1)$ and $(X_2,Y_2)$ with common marginals and (not necessarily equal) joint probability density functions $h^1$ and $h^2$ can be re-written as

$$Q = \sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{i-1}\sum_{l=1}^{j-1}\big[h^1_{kl}h^2_{ij} + h^1_{ij}h^2_{kl} - h^1_{kj}h^2_{il} - h^1_{il}h^2_{kj}\big].\tag{6.12}$$

In our situation, when there are r and s distinct observations of X and Y, respectively,

$$\rho_\tau^* = 2\sum_{i=1}^{r}\sum_{j=1}^{s}\sum_{k=1}^{i-1}\sum_{l=1}^{j-1}\big[h_{kl}h_{ij} - h_{kj}h_{il}\big]\bigg/\sqrt{1-\sum_{i=1}^{r}p_i^2}\,\sqrt{1-\sum_{j=1}^{s}q_j^2}.\tag{6.13}$$

Since $h_{ij}h_{kl}$ equals $w_{ij}w_{kl}/n^2$ if $(\xi_i,\eta_j)$ and $(\xi_k,\eta_l)$ are in the sample [concordant pair] and $h_{il}h_{kj}$ equals $w_{il}w_{kj}/n^2$ if $(\xi_i,\eta_l)$ and $(\xi_k,\eta_j)$ are in the sample [discordant pair], the numerator in (6.13) equals

$$\frac{2}{n^2}\big(\#[\text{concordant pairs}] - \#[\text{discordant pairs}]\big).$$

In addition, we look closer at the quantities (n²/2)(1 − ∑_{i=1}^r p_i²). If no ties are present, then r = n, ξ(i) = x(i) for i = 1, . . . , n and p_i = 1/n. Therefore,

\[ \frac{n^2}{2}\Bigl[1 - \sum_{i=1}^{r} p_i^2\Bigr] = \frac{n^2}{2}\Bigl[1 - \sum_{i=1}^{n} \frac{1}{n^2}\Bigr] = \frac{n^2}{2}\Bigl[1 - \frac{1}{n}\Bigr] = \frac{n^2(n-1)}{2n} = \binom{n}{2}. \]

Otherwise, p_i = u_i/n and we get

\[ \frac{n^2}{2}\Bigl[1 - \sum_{i=1}^{r} \frac{u_i^2}{n^2}\Bigr] = \frac{n^2}{2}\Bigl[1 - \sum_{i=1}^{n} \frac{1}{n^2} - \sum_{i=1}^{r} \frac{u_i^2 - u_i}{n^2}\Bigr] = \Bigl[\frac{n(n-1)}{2} - \sum_{i=1}^{r} \frac{u_i(u_i - 1)}{2}\Bigr] = \binom{n}{2} - \underbrace{\sum_{i=1}^{r} \binom{u_i}{2}}_{u}. \]

Since (n²/2)(1 − ∑_{j=1}^s q_j²) can be re-written similarly, the theorem follows.

Theorem 6.2.2. Let {(x_k, y_k)}_{k=1}^n denote a sample of size n from a bivariate distribution H. Then ρ∗S corresponding to the empirical distribution function Hn equals the sample version of Spearman's rho,

(6.14) \[ \rho = \frac{\sum_{k=1}^{n} \bigl(R(x_k) - \overline{R}_x\bigr)\bigl(R(y_k) - \overline{R}_y\bigr)}{\sqrt{\sum_{k=1}^{n} \bigl(R(x_k) - \overline{R}_x\bigr)^2 \sum_{k=1}^{n} \bigl(R(y_k) - \overline{R}_y\bigr)^2}}, \]


where \(\overline{R}_x\) and \(\overline{R}_y\) are given by

\[ \overline{R}_x = \frac{1}{n} \sum_{i=1}^{n} R(x_i), \qquad \overline{R}_y = \frac{1}{n} \sum_{j=1}^{n} R(y_j). \]

Proof. To start, note that, by definition of the average ranks,

\[ \overline{R}_x = \frac{1}{n} \sum_{k=1}^{r} u_k \cdot \frac{\bigl(\sum_{i=1}^{k-1} u_i + 1\bigr) + \cdots + \bigl(\sum_{i=1}^{k-1} u_i + u_k\bigr)}{u_k} = \frac{1}{n} \sum_{i=1}^{n} i = \frac{n+1}{2}. \]

Along the same lines, \(\overline{R}_y = (n+1)/2\) and hence \(\overline{R}_y = \overline{R}_x\). Now, with (5.53), ρ∗S is equal to

\[ \rho^*_S = \operatorname{Corr}(X, Y), \]

where X and Y are random variables with expectations 1/2 given by

\[ P\Bigl[X = \frac{F(\xi_i) + F(\xi_{i-1})}{2}\Bigr] = p_i, \quad i = 1, \ldots, r, \]
\[ P\Bigl[Y = \frac{G(\eta_j) + G(\eta_{j-1})}{2}\Bigr] = q_j, \quad j = 1, \ldots, s, \]
\[ P\Bigl[X = \frac{F(\xi_i) + F(\xi_{i-1})}{2},\; Y = \frac{G(\eta_j) + G(\eta_{j-1})}{2}\Bigr] = h_{ij}, \quad i = 1, \ldots, r,\; j = 1, \ldots, s. \]

Now, with (6.6),

\[ \frac{F(\xi_i) + F(\xi_{i-1})}{2} - \frac{1}{2} = F(\xi_{i-1}) + \frac{p_i}{2} - \frac{1}{2} = \Bigl(\frac{R(x_k)}{n} - \frac{1}{2n} - \frac{1}{2}\Bigr) = \frac{1}{n}\bigl(R(x_k) - \overline{R}_x\bigr) \]

for any x_k with x_k = ξi. Similarly,

\[ \frac{G(\eta_j) + G(\eta_{j-1})}{2} - \frac{1}{2} = \Bigl(\frac{R(y_l)}{n} - \frac{1}{2n} - \frac{1}{2}\Bigr) = \frac{1}{n}\bigl(R(y_l) - \overline{R}_y\bigr) \]

for any y_l such that y_l = ηj. Because there are exactly u_i such x_k's, v_j such y_l's and finally w_{ij} observations equal to (ξi, ηj),

\[ \operatorname{Cov}(X, Y) = \sum_{i=1}^{r}\sum_{j=1}^{s} h_{ij} \Bigl(\frac{F(\xi_i) + F(\xi_{i-1})}{2} - \frac{1}{2}\Bigr)\Bigl(\frac{G(\eta_j) + G(\eta_{j-1})}{2} - \frac{1}{2}\Bigr) \]
\[ = \frac{1}{n} \sum_{i=1}^{r}\sum_{j=1}^{s} w_{ij} \Bigl(\frac{F(\xi_i) + F(\xi_{i-1})}{2} - \frac{1}{2}\Bigr)\Bigl(\frac{G(\eta_j) + G(\eta_{j-1})}{2} - \frac{1}{2}\Bigr) \]
\[ = \frac{1}{n^3} \sum_{k=1}^{n} \bigl(R(x_k) - \overline{R}_x\bigr)\bigl(R(y_k) - \overline{R}_y\bigr) \]

and

\[ \operatorname{Var}(X) = \sum_{i=1}^{r} p_i \Bigl(\frac{F(\xi_i) + F(\xi_{i-1})}{2} - \frac{1}{2}\Bigr)^2 = \frac{1}{n^3} \sum_{k=1}^{n} \bigl(R(x_k) - \overline{R}_x\bigr)^2, \]
\[ \operatorname{Var}(Y) = \sum_{j=1}^{s} q_j \Bigl(\frac{G(\eta_j) + G(\eta_{j-1})}{2} - \frac{1}{2}\Bigr)^2 = \frac{1}{n^3} \sum_{k=1}^{n} \bigl(R(y_k) - \overline{R}_y\bigr)^2. \]


Hence,

\[ \rho^*_S = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{\sum_{k=1}^{n} \bigl(R(x_k) - \overline{R}_x\bigr)\bigl(R(y_k) - \overline{R}_y\bigr)}{\sqrt{\sum_{k=1}^{n} \bigl(R(x_k) - \overline{R}_x\bigr)^2 \sum_{k=1}^{n} \bigl(R(y_k) - \overline{R}_y\bigr)^2}}. \]

If there are no ties in the observations, Theorem 6.2.2 simplifies a little. Moreover, in such a case, there exists a nice relationship between ρ∗S and the empirical copula.

Theorem 6.2.3. Suppose that in the above situation Hn, Fn and Gn are empirical distribution functions belonging to a sample (x_k, y_k) of size n from a continuous bivariate distribution. Furthermore, suppose there are no coincident x_i's and y_j's, respectively. Finally, for 1 ≤ i ≤ n let R_i = j whenever (x_(i), y_(j)) is an element of the sample. Then

(6.15) \[ \rho^*_S = \frac{12}{n(n^2 - 1)} \Bigl[ \sum_{i=1}^{n} i R_i - \frac{n(n+1)^2}{4} \Bigr]. \]

Moreover,

(6.16) \[ \rho^*_S = \frac{12}{n^2 - 1} \sum_{i=1}^{n}\sum_{j=1}^{n} \Bigl[ C_n\Bigl(\frac{i}{n}, \frac{j}{n}\Bigr) - \frac{i}{n} \cdot \frac{j}{n} \Bigr]. \]

Proof. The first statement is an immediate consequence of the well-known fact that, if no ties in the observations are present, the right-hand side of (6.14) reduces to the right-hand side of (6.15). For (6.16), see Nelsen (1999, Theorem 5.5.2).
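For untied data, (6.15) needs only the permutation R_1, . . . , R_n. A quick sketch (Python; the function name is mine):

```python
def spearman_no_ties(R):
    """Formula (6.15): R[i-1] is the rank R_i of the y-value paired
    with the i-th smallest x-value; no ties are assumed."""
    n = len(R)
    s = sum((i + 1) * r for i, r in enumerate(R))
    return 12.0 / (n * (n * n - 1)) * (s - n * (n + 1) ** 2 / 4.0)
```

The identity permutation gives 1 and the reversing permutation gives −1, the two extreme cases.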

We close this chapter with one final remark on the use of the results derived above. Theorems 6.2.1 and 6.2.2 not only justify the discrete versions of Kendall's tau and Spearman's rho, but also show, conversely, that the sample versions of these quantities are functions of the empirical copula (or its standard extension). Hence, results on weak convergence (cf. Remark 6.1.6) can be used when investigating the asymptotic properties of ρ∗S and ρ∗τ. Moreover, since these measures of concordance are zero in the case of independence, they can alternatively be used for testing. This idea was already used by Deheuvels (1979) (cf. also Deheuvels (1981)), who derived distribution-free tests of independence for continuous random variables based upon Kendall's tau and Spearman's rho. There is also some work done when the observations are subject to censoring. Dabrowska (1996) showed that her multivariate extension of the Kaplan-Meier estimator can also be expressed in terms of the empirical copula, and obtained weak convergence properties as well as tests of independence, which turned out to be generalizations to censored data of the aforementioned results by Deheuvels.


Chapter 7

Modeling Multivariate Distributions with Copulas

Throughout this chapter, we consider the following task:

Generate a bivariate joint distribution H whose marginals F and G each belong to a certain class, say F and G, for example the class of Bernoulli, binomial or Poisson distributions,

and focus on a modeling approach based on copulas. There exists much literature on the subject of multivariate distributions with given (mostly continuous) marginals. By contrast, far less has been done for the reversed situation, i.e. when the copula is fixed and the marginals are arbitrary (or from a certain class). An exception here is again the paper by Marshall (1996); some results regarding this topic can also be found in Joe (1997).

The outline of this chapter is the following: first, we consider the generation of multivariate distributions using Sklar's Theorem and investigate their dependence properties, especially when the marginals involved are not necessarily continuous. Thereafter, we pay attention to one special case: multivariate distributions with given marginals and the maximum negative dependence possible. The results will then be illustrated with the Poisson distribution in the upcoming chapter.

In the preceding sections, we obtained measures of concordance and dependence for discrete random variables. However, there was one particular quantity which we excluded from our observations: the linear correlation coefficient ϱ. As we saw earlier in this thesis, ϱ does not satisfy all axioms demanded of a proper measure of concordance. But this dependence concept still has other properties which make it favorable in various setups, e.g. when elliptical distributions are involved. It will therefore also be the aim of the upcoming sections to investigate the linear correlation coefficient when the marginals are not necessarily continuous, especially discrete.

7.1 Copula Models

As we mentioned before, one way to obtain multivariate distributions with given marginals is to use Sklar's Theorem. In particular, any given C in C generates a family HC of multivariate distributions according to

(7.1) \[ \mathcal{H}_C = \bigl\{ H_C \mid H_C = C(F_1, \ldots, F_n),\; F_i \in \mathcal{F}_i \bigr\}. \]


In the bivariate case, this approach leads to

(7.2) \[ \mathcal{H}_C = \bigl\{ H_C \mid H_C = C(F, G),\; F \in \mathcal{F} \text{ and } G \in \mathcal{G} \bigr\}. \]

The classes F_i are given and usually stand for some family of univariate distributions, such as Poisson or binomial. Since copulas can show different behavior in different parts of their domain, the members of HC will in general also be quite different. Thus the following question arises:

What properties will the family HC have for a fixed C, when Fi are some known (arbitrary)classes of univariate distributions?

We close this introduction by mentioning two special classes, the second of which will be of particular use later on:

(7.3a) \[ \mathcal{H}_M = \bigl\{ H_M \mid H_M = M(F, G),\; F \in \mathcal{F} \text{ and } G \in \mathcal{G} \bigr\}, \]
(7.3b) \[ \mathcal{H}_W = \bigl\{ H_W \mid H_W = W(F, G),\; F \in \mathcal{F} \text{ and } G \in \mathcal{G} \bigr\}. \]

In the light of the previous results, members of these classes have maximally positively and negatively dependent marginals, respectively.

7.1.1 Bivariate Bernoulli Distributions Revisited

To provide the necessary background for upcoming results, we start again with the simplest case, i.e. when F and G are Bernoulli distributions. Despite its simplicity, this situation already provides a good illustration of what is unfortunately the case in general settings: the question raised above is far more ambiguous than it would appear at first sight. We also mention that copula models do not lead to all distributions imaginable.

Suppose F and G are distribution functions of two possibly dependent Bernoulli distributed random variables X and Y with

\[ F(0) = p \quad \text{and} \quad G(0) = q. \]

Furthermore, assume that the random vector (X, Y) has a distribution function H_pq and a copula class C_pq.

We begin our investigations with the task of modeling bivariate Bernoulli distributions in general. Because of (5.1), we need to determine H_pq(0, 0) for any given p, q. If p and q range over the whole interval [0, 1], this can be done by choosing a function ψ : [0, 1]² → [0, 1] satisfying

(7.4) \[ W(p, q) \le \psi(p, q) \le M(p, q) \quad \text{for any } p, q \in [0, 1] \]

and setting H_pq(0, 0) = ψ(p, q). (7.4) is just the Fréchet-Hoeffding inequality and hence satisfied by any copula. Hence, setting ψ := C for some C ∈ C leads to a family of bivariate Bernoulli distributions, in particular to the model (7.2). Although it is straightforward with (7.4) that ψ is grounded and satisfies ψ(1, q) = q as well as ψ(p, 1) = p, the following example shows that ψ itself is not necessarily a copula (and hence, the model (7.2) does not yield all Bernoulli distributions imaginable).

Example 7.1. Let ψ : [0, 1]² → [0, 1] be the function given by

\[ \psi(p, q) = \begin{cases} M(p, q), & \text{if } p \le 1/2 \text{ and } q \le 1/2, \\ W(p, q), & \text{otherwise.} \end{cases} \]


Then ψ clearly satisfies (7.4). However, ψ is not 2-increasing, since, for any 0 < ε < 1/2,

\[ \psi\Bigl(\tfrac{1}{2}+\varepsilon, \tfrac{1}{2}-\varepsilon\Bigr) - \psi\Bigl(\tfrac{1}{2}-\varepsilon, \tfrac{1}{2}-\varepsilon\Bigr) - \psi\Bigl(\tfrac{1}{2}+\varepsilon, 0\Bigr) + \psi\Bigl(\tfrac{1}{2}-\varepsilon, 0\Bigr) \]
\[ = W\Bigl(\tfrac{1}{2}+\varepsilon, \tfrac{1}{2}-\varepsilon\Bigr) - M\Bigl(\tfrac{1}{2}-\varepsilon, \tfrac{1}{2}-\varepsilon\Bigr) = 0 - \Bigl(\tfrac{1}{2}-\varepsilon\Bigr) = \varepsilon - \tfrac{1}{2} < 0. \]
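The failure of 2-increasingness in Example 7.1 is easy to verify numerically. A sketch (Python; the names psi and volume are mine, not from the thesis):

```python
def psi(p, q):
    """Example 7.1: the upper bound M on [0, 1/2]^2, the lower bound W elsewhere."""
    if p <= 0.5 and q <= 0.5:
        return min(p, q)                 # M(p, q)
    return max(p + q - 1.0, 0.0)         # W(p, q)

def volume(f, a, b, c, d):
    """f-volume of the rectangle [a, b] x [c, d]; a negative value
    shows f is not 2-increasing."""
    return f(b, d) - f(a, d) - f(b, c) + f(a, c)
```

With ε = 1/4, the rectangle [1/4, 3/4] × [0, 1/4] has ψ-volume ε − 1/2 = −1/4, exactly as computed in the example.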

Since copulas can behave entirely differently in different parts of the unit square, (7.2) can lead to very different distributions for different values of the parameters p and q. The following example by Marshall (1996) introduces a copula which can generate bivariate distributions with comonotonic, countermonotonic and independent marginals at the same time.

Example 7.2. Let C be a singular copula having its mass uniformly distributed along the line segments joining (0, 0) to (1/3, 1/3), (2/3, 1/3) to (1, 2/3) and (1/3, 2/3) to (2/3, 1).

[Figure: the three supporting line segments of C in the unit square, with gridlines at 1/3 and 2/3 on both axes.]

This copula is especially interesting in two regions: on [0, 1/3] × [0, 1/3] it equals the Fréchet-Hoeffding upper bound M, and on [2/3, 1] × [2/3, 1] the Fréchet-Hoeffding lower bound W, since there we have C(u, v) = 1/3 + (u − 2/3) + (v − 2/3) = u + v − 1. Thus, if we fix this copula, we get for p and q in [0, 1/3] a perfect positive and for p and q in [2/3, 1] a perfect negative dependence.

Moreover, on the square [1/3, 2/3] × [1/3, 2/3] the copula is constant and equals 1/3. So if we consider the point (1/√3, 1/√3), we have

\[ C(1/\sqrt{3}, 1/\sqrt{3}) = 1/3 = (1/\sqrt{3}) \cdot (1/\sqrt{3}) = \pi(1/\sqrt{3}, 1/\sqrt{3}). \]

Hence, if p and q both equal 1/√3, the above copula yields independence.

In fact, a general statement can be made about the situation when a copula generates the Frechet-Hoeffding bounds and independence, respectively.


Theorem 7.1.1. Let C be a copula. Then

1. there exist non-degenerate marginals F and G for which C(F, G) equals M(F, G) if and only if there exists a square [0, a] × [1 − a, 1] or a square [1 − a, 1] × [0, a] where C puts no mass;

2. there exist non-degenerate marginals F and G for which C(F, G) equals W(F, G) if and only if there exists a square [0, a] × [0, a] or a square [1 − a, 1] × [1 − a, 1] where C puts no mass;

3. there exist non-degenerate marginals F and G for which C(F, G) equals π(F, G) if and only if there exists a point (u, v) in the interior of the unit square such that C(u, v) = u · v.

Proof. See Marshall (1996, Propositions 2.12. and 2.13.).

7.2 Dependence in Copula Models

Among the properties of HC for a fixed copula C, dependence is one of the most interesting. If the classes Fi consist of continuous distributions only, dependence concepts which do not involve the marginals will remain the same for all members of the class HC. Yet not all dependence concepts are of this kind; the linear correlation coefficient, for example, is not. In addition, as soon as the classes include discontinuous distributions, we can no longer expect a dependence concept to be invariant within the class HC.

However, some of the dependence concepts mentioned earlier are quite trivial to handle, since the joint distribution function inherits the relevant properties from the underlying copula. If, for example, C is positively and negatively orthant dependent, respectively, then all members of HC will have this property. The same is true for tail dependence.

Unfortunately, things are far more complex and challenging when considering dependence and concordance measures, as we will see in this section. First, we focus on the linear correlation coefficient ϱ, since this seems to be the only property of HC discussed in the literature so far. Thereafter, we consider two more concordance measures obtained earlier: Kendall's tau and Spearman's rho.

7.2.1 Linear Correlation

First, by means of Theorem 4.3.7, if C1 and C2 are copulas such that C1 ≺ C2, then the correlation coefficients, if they exist, satisfy

\[ \varrho(H_{C_1}) \le \varrho(H_{C_2}) \]

for any two members of HC1 and HC2 with common marginals. But, unfortunately, the value of ϱ is in general very much dependent on the marginals and hence not constant on the class HC.

Example 7.3. Let F and G stand for the classes of univariate Bernoulli distributions, F := {B(p) | p ∈ (0, 1)} and G := {B(q) | q ∈ (0, 1)}, respectively. Furthermore, let C be some arbitrary copula. The linear correlation coefficient for H_pq := C(F_p, G_q) is given as follows:

(7.5) \[ \varrho_C(H_{pq}) = \frac{1 - p - q + C(p, q) - (1 - p)(1 - q)}{\sqrt{pq(1 - p)(1 - q)}} = \frac{C(p, q) - pq}{\sqrt{pq(1 - p)(1 - q)}}. \]

For example, for C = M and C = W, respectively, we get

(7.6) \[ \varrho_M(H_{pq}) = \begin{cases} \dfrac{\sqrt{p(1-q)}}{\sqrt{q(1-p)}}, & \text{if } p \le q, \\[2mm] \dfrac{\sqrt{q(1-p)}}{\sqrt{p(1-q)}}, & \text{otherwise,} \end{cases} \qquad \varrho_W(H_{pq}) = \begin{cases} -\dfrac{\sqrt{pq}}{\sqrt{(1-p)(1-q)}}, & \text{if } p \le 1 - q, \\[2mm] -\dfrac{\sqrt{(1-p)(1-q)}}{\sqrt{pq}}, & \text{otherwise.} \end{cases} \]
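Formula (7.5) is straightforward to evaluate for any copula given as a function. A minimal sketch (Python; the names are mine), using the Fréchet-Hoeffding bounds as copulas:

```python
from math import sqrt

def rho_bernoulli(C, p, q):
    """Correlation (7.5) of a bivariate Bernoulli pair coupled by copula C."""
    return (C(p, q) - p * q) / sqrt(p * q * (1 - p) * (1 - q))

M = lambda u, v: min(u, v)              # Frechet-Hoeffding upper bound
W = lambda u, v: max(u + v - 1.0, 0.0)  # Frechet-Hoeffding lower bound
```

For p = q = 1/2, the bounds M and W give correlations 1 and −1, respectively, in agreement with (7.6) and with Example 7.5 below; for asymmetric p, q the extreme values are no longer attained.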


As mentioned above, copulas can look quite different in different parts of the unit square. As a consequence, the set of correlations attainable within HC will probably be large. In the following, we will focus on just this set, which we from now on denote by RC, i.e.,

(7.7) \[ R_C = \bigl\{ \varrho(H_C) \mid H_C \in \mathcal{H}_C \bigr\}. \]

First, Hoeffding's Lemma 4.3.6 yields the following:

Lemma 7.2.1. If C is positively and negatively quadrant dependent, respectively, then RC is a subset of [0, 1] and [−1, 0], respectively. Furthermore, if F and G contain Bernoulli distributions, then the reverse is also true.

Proof. The "if" part is straightforward. For the "only if" part concerning Bernoulli distributions, see Marshall (1996, Proposition 2.11).

Thus, if we choose a negatively quadrant dependent copula, all members of HC will have negative correlation by means of this lemma.

Example 7.4. If C belongs to the Gauss, t-, Frank, Plackett, Clayton, Mardia and Fréchet family, respectively, then, depending on the choice of the copula parameter, RC is a subset of either [0, 1] or [−1, 0] (cf. Examples 3.2, 3.3, 3.4, 3.5, 4.9 and Section 3.5.3). For example, if Cθ is a member of the Frank family, then θ > 0 will yield positively correlated distributions only. If, on the other hand, θ < 0, then all members of HCθ will have negative correlation. Moreover, as θ passes to ∞ or −∞, the correlations come arbitrarily close to their maximum and minimum bounds.

Now, under certain circumstances, RC is of a particular form (for a proof, see Marshall (1996)):

Theorem 7.2.1. (Marshall, 1996, Proposition 2.6.) If the classes F and G are convex, i.e. if for any F, F∗ in F and any G, G∗ in G,

\[ \alpha F + (1 - \alpha) F^* \in \mathcal{F}, \qquad \alpha G + (1 - \alpha) G^* \in \mathcal{G} \]

whenever α ∈ (0, 1), then RC is an interval.

Remark 7.2.1. The conditions of the above Theorem are satisfied especially when the classes F and Gcontain Bernoulli distributions.

Example 7.5. (Marshall, 1996) If F and G contain Bernoulli distributions, then

\[ R_W = [-1, 0), \qquad R_M = (0, 1]. \]

To see this, recall (7.5): if F and G are Bernoulli distributions with parameters p and q respectively,

\[ \varrho(H_C) = \frac{C(p, q) - pq}{\sqrt{pq(1 - p)(1 - q)}}. \]

Therefore, if we set q = p for C = W and q = 1 − p for C = M, respectively, (7.6) yields

\[ \varrho(H_W) = \begin{cases} -\dfrac{p}{1-p}, & p \le \tfrac{1}{2}, \\[2mm] -\dfrac{1-p}{p}, & p \ge \tfrac{1}{2}, \end{cases} \qquad \varrho(H_M) = \begin{cases} \dfrac{p}{1-p}, & p \le \tfrac{1}{2}, \\[2mm] \dfrac{1-p}{p}, & p \ge \tfrac{1}{2}. \end{cases} \]

Consequently, if p = 1/2, then ϱ(HW) = −1 and ϱ(HM) = 1. On the other hand, if p is arbitrarily close to zero, then ϱ(HW) and ϱ(HM) will be arbitrarily close to 0 as well.


This example now yields the following (although straightforward, the proof can again be found in Marshall (1996)):

Theorem 7.2.2. (Marshall, 1996, Proposition 2.7.) Suppose Bernoulli distributions are included in both classes F and G. Then,

1. if a > 0 is in RC, then (0, a] ⊂ RC;

2. if b < 0 is in RC, then [b, 0) ⊂ RC.

Remark 7.2.2. For other classes F and G, the above result remains valid iff values arbitrarily close to0 are included in RW and RM respectively.

We conclude this section with the remark that there exist situations when RC is equal to [−1, 1) and situations when RC is equal to (−1, 1] (cf. Marshall (1996, Example 2.9)). However, whether RC can be the entire interval [−1, 1] does not seem to be known yet.

7.2.2 Kendall’s Tau and Spearman’s Rho

As soon as the marginals fail to be continuous, any measure of association will in general depend on the marginal distributions, by means of Theorem 5.5.1. In the chapter on multivariate discrete distributions, we saw that this is the case even if the margins are comonotonic and countermonotonic, respectively. Therefore, if the classes F and G include discontinuous distributions, a quantity for measuring concordance or dependence will not necessarily be HC-invariant.

From now on, we focus on discrete distributions with finite support, i.e. we assume that the classes F and G include such distributions only, and examine the properties of the following sets TC and SC:

(7.8a) \[ T_C = \bigl\{ \rho^*_\tau(H_C) \mid H_C \in \mathcal{H}_C \bigr\}, \]
(7.8b) \[ S_C = \bigl\{ \rho^*_S(H_C) \mid H_C \in \mathcal{H}_C \bigr\}, \]

where ρ∗τ and ρ∗S are the discrete versions of Kendall's tau and Spearman's rho, respectively, as defined in Definitions 5.5.1 and 5.5.2. With the definitions of ρ∗τ and ρ∗S in mind, the following analogue of Lemma 7.2.1 holds:

Lemma 7.2.2. If C is positively and negatively quadrant dependent, respectively, then TC and SC are subsets of [0, 1] and [−1, 0], respectively. Furthermore, if the classes F and G are convex, i.e. if for any F, F∗ in F and any G, G∗ in G,

\[ \alpha F + (1 - \alpha) F^* \in \mathcal{F}, \qquad \alpha G + (1 - \alpha) G^* \in \mathcal{G} \]

whenever α ∈ (0, 1), then TC and SC are intervals.

Proof. The first part follows immediately by means of the fact that both ρ∗τ and ρ∗S satisfy the concordance measure axioms A3 and A7. To show that TC and SC are intervals under the above conditions, we can proceed similarly to Marshall (1996, Proposition 2.11.). Suppose ρ∗τ(C(F, G)) = a1 and ρ∗τ(C(F∗, G∗)) = a2 for some F, F∗ in F and G, G∗ in G. Furthermore, assume ρ∗S(C(F, G)) = b1 and ρ∗S(C(F∗, G∗)) = b2. Now, with (5.29), (5.30), (5.42) and (5.53), the functions

\[ \rho^*_\tau\bigl(C(\alpha F + (1-\alpha)F^*,\; \alpha G + (1-\alpha)G^*)\bigr) \quad \text{and} \quad \rho^*_S\bigl(C(\alpha F + (1-\alpha)F^*,\; \alpha G + (1-\alpha)G^*)\bigr) \]

are continuous in α on [0, 1], and hence any element of [a1, a2] and [b1, b2] belongs to TC and SC, respectively.

Now, the quite surprising thing is that the sets TC and SC share even more properties with RC .This is due to the following fact:


Lemma 7.2.3. Assume Fp and Gq are Bernoulli distributions, i.e. Fp = B(p) and Gq = B(q), respectively. Furthermore, let Hpq denote a joint distribution function with marginals Fp and Gq, and C some copula in CHpq. Then

\[ \rho^*_\tau(H_{pq}) = \rho^*_S(H_{pq}) = \varrho(H_{pq}) = \frac{C(p, q) - pq}{\sqrt{pq(1 - p)(1 - q)}}. \]

Hence, for bivariate Bernoulli distributions, Kendall's tau and Spearman's rho coincide with the linear correlation coefficient.

Proof. The statement can be shown straightforwardly. In order to do so, we first calculate the corresponding probability densities:

(7.9a) \[ h_{00} = C(p, q), \qquad h_{10} = q - C(p, q), \]
(7.9b) \[ h_{01} = p - C(p, q), \qquad h_{11} = 1 - p - q + C(p, q). \]

With (6.12), the difference between the corresponding probabilities of concordance and discordance can be re-written as follows:

\[ Q_\tau = 2 h_{00} h_{11} - 2 h_{10} h_{01} \quad \text{for Kendall's tau,} \]
\[ Q_S = h_{11}\, pq + h_{00} (1-p)(1-q) - h_{10}\, p(1-q) - h_{01} (1-p) q \quad \text{for Spearman's rho.} \]

Now QS = C(p, q) − pq according to Example 5.4 and, by some minor calculations,

\[ Q_\tau = 2\bigl[ C(p,q)\bigl(1 - p - q + C(p,q)\bigr) - \bigl(p - C(p,q)\bigr)\bigl(q - C(p,q)\bigr) \bigr] \]
\[ = 2\bigl[ C(p,q) - C(p,q)\,p - q\,C(p,q) + C(p,q)^2 - pq + p\,C(p,q) + q\,C(p,q) - C(p,q)^2 \bigr] = 2\bigl[ C(p,q) - pq \bigr]. \]

In addition,

\[ \sqrt{1 - p^2 - (1-p)^2} = \sqrt{1 - p^2 - 1 + 2p - p^2} = \sqrt{2p(1-p)}, \]
\[ \sqrt{1 - p^3 - (1-p)^3} = \sqrt{1 - p^3 - 1 + 3p - 3p^2 + p^3} = \sqrt{3p(1-p)}. \]

Hence, by (5.42), (5.53) and (5.45),

\[ \rho^*_\tau = \frac{Q_\tau}{\sqrt{1 - p^2 - (1-p)^2}\,\sqrt{1 - q^2 - (1-q)^2}} = \frac{2[C(p,q) - pq]}{2\sqrt{pq(1-p)(1-q)}} = \frac{C(p,q) - pq}{\sqrt{pq(1-p)(1-q)}}, \]
\[ \rho^*_S = \frac{3\, Q_S}{\sqrt{1 - p^3 - (1-p)^3}\,\sqrt{1 - q^3 - (1-q)^3}} = \frac{3[C(p,q) - pq]}{3\sqrt{pq(1-p)(1-q)}} = \frac{C(p,q) - pq}{\sqrt{pq(1-p)(1-q)}}. \]

The lemma now follows with (7.5).

Now, this example has the following consequence:

Corollary 7.2.1. Assume F and G contain Bernoulli distributions and let A stand for either TC or SC. Then

1. if A is a subset of [0, 1] and [−1, 0], respectively, then C is positively and negatively quadrant dependent, respectively;


2. if C = M, then A ⊇ (0, 1], and if C = W, then A ⊇ [−1, 0);

3. if a > 0 is in A, then (0, a] ⊂ A, and if a < 0 is in A, then [a, 0) ⊂ A.

Proof. 2. follows with Example 7.5, whereas the remaining statements can be shown along exactly the same lines as Propositions 2.7. and 2.11. in Marshall (1996).

Remark 7.2.3. The second statement of the above corollary also shows that, even if the marginals are comonotonic or countermonotonic, ρ∗τ and ρ∗S do not always attain their extreme values.

7.3 Minimum Correlated Discrete Bivariate Distributions

In this section, we will concentrate on one special case, i.e. when the classes F and G consist of discrete distributions with finite support and the copula equals the Fréchet-Hoeffding lower bound W. Throughout this section, we will also make use of Notation 5.0.1.

7.3.1 The Class HW

If we choose the Fréchet-Hoeffding lower bound W, then all members of HW will have the minimum correlation possible, by means of Theorem 4.3.7. However simple, this model has one unpleasant drawback: in general, there exists no closed formula for the resulting joint probabilities hij. To calculate them, we have to evaluate

\[ h_{ij} = \underbrace{\max[F(\xi_i) + G(\eta_j) - 1, 0]}_{A} - \underbrace{\max[F(\xi_{i-1}) + G(\eta_j) - 1, 0]}_{B} - \underbrace{\max[F(\xi_i) + G(\eta_{j-1}) - 1, 0]}_{C} + \underbrace{\max[F(\xi_{i-1}) + G(\eta_{j-1}) - 1, 0]}_{D}. \]

Hence, there are at most six cases which have to be considered (recall that F and G are nondecreasing):

1. A, B, C, D are all zero: hij is clearly zero;

2. B, C, D are zero: hij is equal to A, i.e. hij = F(ξi) + G(ηj) − 1;

3. C, D are zero: hij = A − B, i.e.,

\[ h_{ij} = F(\xi_i) + G(\eta_j) - 1 - F(\xi_{i-1}) - G(\eta_j) + 1 = F(\xi_i) - F(\xi_{i-1}); \]

4. B, D are zero: hij = A − C, i.e.,

\[ h_{ij} = F(\xi_i) + G(\eta_j) - 1 - F(\xi_i) - G(\eta_{j-1}) + 1 = G(\eta_j) - G(\eta_{j-1}); \]

5. D is zero: hij = A − B − C, i.e.,

\[ h_{ij} = F(\xi_i) + G(\eta_j) - 1 - F(\xi_{i-1}) - G(\eta_j) + 1 - F(\xi_i) - G(\eta_{j-1}) + 1 = 1 - F(\xi_{i-1}) - G(\eta_{j-1}); \]

6. none of A, B, C and D is zero: hij = A − B − C + D, i.e.,

\[ h_{ij} = 1 - G(\eta_{j-1}) - F(\xi_{i-1}) + F(\xi_{i-1}) + G(\eta_{j-1}) - 1 = 0. \]


These cases now yield the following:

(7.10) \[ h_{ij} = \begin{cases} 0, & 1 - G(\eta_j) \le 1 - G(\eta_{j-1}) \le F(\xi_{i-1}) \le F(\xi_i), \\ 1 - F(\xi_{i-1}) - G(\eta_{j-1}), & 1 - G(\eta_j) \le F(\xi_{i-1}) \le 1 - G(\eta_{j-1}) \le F(\xi_i), \\ F(\xi_i) - F(\xi_{i-1}), & 1 - G(\eta_j) \le F(\xi_{i-1}) \le F(\xi_i) \le 1 - G(\eta_{j-1}), \\ G(\eta_j) - G(\eta_{j-1}), & F(\xi_{i-1}) \le 1 - G(\eta_j) \le 1 - G(\eta_{j-1}) \le F(\xi_i), \\ F(\xi_i) + G(\eta_j) - 1, & F(\xi_{i-1}) \le 1 - G(\eta_j) \le F(\xi_i) \le 1 - G(\eta_{j-1}), \\ 0, & F(\xi_{i-1}) \le F(\xi_i) \le 1 - G(\eta_j) \le 1 - G(\eta_{j-1}). \end{cases} \]

As we will see with the Poisson distribution later on, the conditions in the above cases are not easily evaluated, i.e., there exists no general recipe for determining which of them occurs for a given pair (i, j).
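Numerically, though, the case distinction can be bypassed altogether, since hij is just the second difference of W over the grid of marginal cdf values. A sketch (Python; the names are mine, not from the thesis):

```python
def w_joint_probs(Fc, Gc):
    """Joint probabilities h_ij of the countermonotonic coupling W(F, G).
    Fc[i] = F(xi_i) and Gc[j] = G(eta_j) are the cumulative marginal
    probabilities on the ordered supports (last entry 1.0)."""
    W = lambda u, v: max(u + v - 1.0, 0.0)
    r, s = len(Fc), len(Gc)
    H = [[0.0] * s for _ in range(r)]
    for i in range(r):
        for j in range(s):
            Fi0 = Fc[i - 1] if i > 0 else 0.0   # F(xi_{i-1}), with F(xi_0) = 0
            Gj0 = Gc[j - 1] if j > 0 else 0.0   # G(eta_{j-1}), with G(eta_0) = 0
            H[i][j] = W(Fc[i], Gc[j]) - W(Fi0, Gc[j]) - W(Fc[i], Gj0) + W(Fi0, Gj0)
    return H
```

For two fair Bernoulli marginals, Fc = Gc = [0.5, 1.0], this returns the anti-diagonal table [[0, 0.5], [0.5, 0]], the countermonotonic coupling.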

This, in turn, has fatal effects on the evaluation of the correlation coefficient, as well as on any other quantity based on expectation or the joint probability density. In such a situation, however, there exists at least a quite simple and fast algorithm which can be used for the calculation of such quantities, as we will soon see.

We conclude this subsection with one more remark on HW. It can be shown that its members not only have the minimum possible correlation, but also the minimum possible expectation E(ψ(X, Y)) if the function ψ meets certain demands. Suppose, in general, that ψ is some real-valued function and X and Y are random variables with given distribution functions F and G, respectively. Furthermore, consider the following task:

Generate a bivariate joint distribution H with marginals F and G such that Eψ(X, Y) is minimum possible.

As the upcoming theorem shows, this problem can again be solved using the Fréchet-Hoeffding bounds. The result has been obtained and proved independently by various authors (under more or less slightly different conditions). Regarding other versions and further details, we refer to Rachev and Rüschendorf (1998a) and the further literature given therein.

Theorem 7.3.1. (Cambanis et al., 1976) Let X and Y be random variables with arbitrary marginal distribution functions F and G, and let C1 and C2 be some copulas in C. Furthermore, let ECi ψ(X, Y) denote the expectation corresponding to the joint distribution function Ci(F(x), G(y)) of (X, Y). Moreover, let ψ be a right-continuous and 2-increasing (also called quasi-monotone) real function of two variables, i.e., a function satisfying

(7.11) \[ \psi(x, y) + \psi(x^*, y^*) \ge \psi(x, y^*) + \psi(x^*, y) \]

whenever x ≤ x∗ and y ≤ y∗. If one of the conditions

1. ψ is symmetric and both the expectations Eψ(X, X) and Eψ(Y, Y) are finite, or

2. Eψ(X, y0) and Eψ(x0, Y) are finite for some y0 and x0,

is satisfied, then the relation C1 ≺ C2 yields the inequality

(7.12) \[ E_{C_1} \psi(X, Y) \le E_{C_2} \psi(X, Y). \]

In particular, as soon as one of the above conditions holds,

(7.13) \[ E_W\, \psi(X, Y) \le E_C\, \psi(X, Y) \le E_M\, \psi(X, Y) \quad \text{for any } C \in \mathcal{C}. \]

Proof. Since C1(F(x), G(y)) ≤ C2(F(x), G(y)) is implied by C1 ≺ C2, and because W ≺ C ≺ M for any copula C, the theorem follows with Theorem 1 of Cambanis et al. (1976).


7.3.2 The North-West Corner Rule

In order to see how the joint probabilities hij can be evaluated, we have to look at the search for a joint distribution with minimum possible correlation from a different point of view. If the marginals F and G are given, then, in particular, the probabilities p1, . . . , pm and q1, . . . , qn are known. Moreover, they each sum up to one and hence satisfy ∑_{i=1}^m pi = ∑_{j=1}^n qj. In order to obtain the minimum correlation possible, we are searching for the joint probabilities hij, i.e., for mn non-negative real numbers for which

\[ \sum_{i=1}^{m}\sum_{j=1}^{n} \psi(\xi_i, \eta_j)\, h_{ij} \quad \text{is minimum possible.} \]

Besides, the hij have to be consistent with the margins considered, i.e. the non-negative numbers hij must satisfy

(7.14) \[ \sum_{i=1}^{m} h_{ij} = q_j \quad \text{and} \quad \sum_{j=1}^{n} h_{ij} = p_i \]

(this in particular implies that each hij is at most one). But this problem coincides with the simple transportation problem, well known from linear optimization and graph theory. Indeed, if we interpret pi as a "number of products" manufactured in "ξi", qj as a "number of products" needed in "ηj", and finally ψ(ξi, ηj) as the cost of "shipping" one product from ξi to ηj, then each set of mn non-negative numbers hij consistent with (7.14) can be viewed as a "shipment strategy", in the sense that hij gives the "number" of products which have to be transported from "ξi" to "ηj". Moreover, the total number of items produced is equal to the total number of items demanded. In the light of this interpretation, the problem considered above is nothing but the search for a delivery strategy which minimizes the total transportation cost.

As is well known from linear optimization, if ψ satisfies¹

(7.15) \[ \psi(\xi_i, \eta_j) + \psi(\xi_{i^*}, \eta_{j^*}) \ge \psi(\xi_i, \eta_{j^*}) + \psi(\xi_{i^*}, \eta_j) \]

whenever j ≥ j∗ and i ≥ i∗, then the hij can be calculated using the following procedure (cf. Gass (1969)):

Algorithm 7.3.1. The North-West Corner Rule (Hoffman, 1963)

Step 1: Set k = 1 and l = n.

Step 2: Set hkl := min(pk, ql).

Step 3: Replace pk by pk − hkl and ql by ql − hkl.

Step 4: If k = m and l = 1, stop. Otherwise,

• if k < m, set k = k + 1 and go to step 2,

• if k = m and l > 1, set l = l − 1 and k = 1 and go to step 2.
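Algorithm 7.3.1 can be sketched as follows (Python; the function name is mine). The scan order is simplified to the standard greedy form, advancing the row once its remaining mass pk is exhausted and otherwise moving one column west; this produces the same table as Steps 1-4, since a row whose mass is already used up only contributes zero cells there:

```python
def north_west_corner(p, q):
    """North-west corner rule (Algorithm 7.3.1): couple masses p_1..p_m
    (rows) with q_1..q_n (columns), starting at cell (1, n)."""
    p, q = list(p), list(q)              # work on copies; they get consumed
    m, n = len(p), len(q)
    H = [[0.0] * n for _ in range(m)]
    k, l = 0, n - 1                      # the "north-west" corner: row 1, column n
    while k < m and l >= 0:
        h = min(p[k], q[l])              # Step 2: ship as much as possible
        H[k][l] = h
        p[k] -= h                        # Step 3: update remaining masses
        q[l] -= h
        if p[k] <= 1e-12:                # row exhausted: move to the next row
            k += 1
        else:                            # column exhausted: move one column west
            l -= 1
    return H
```

For p = q = (0.5, 0.5) the rule returns [[0, 0.5], [0.5, 0]], i.e. exactly the countermonotonic table of Section 7.3.1, as Lemma 7.3.1 below predicts for ψ(ξ, η) = ξη.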

Remark 7.3.1. If the inequality in (7.15) is reversed, i.e. if

(7.16) \[ \psi(\xi_i, \eta_j) + \psi(\xi_{i^*}, \eta_{j^*}) \le \psi(\xi_i, \eta_{j^*}) + \psi(\xi_{i^*}, \eta_j) \]

holds whenever j ≥ j∗ and i ≥ i∗, then (7.16) is called the Monge condition and the minimization problem the Monge-Kantorowich transportation problem, for Gaspard Monge discovered the essential idea behind its solution already in the eighteenth century (cf. Monge (1781)):

1Note that this requirement holds especially when ψ is 2-increasing (or, alternatively, quasi-monotone).


"In order to minimize the total distance traveled, the routes from x to y and from x∗ to y∗ must not intersect ..." (cf. Hoffman, 1963, p. 317)

Kantorowich then re-discovered this problem and formulated it in a new and abstract setting. This gave birth to a whole theory of so-called mass transportation, which has been widely studied in the literature on various levels of abstraction. For more details, we refer to the work of Rachev and Rüschendorf (cf. Rachev and Rüschendorf (1998a) and Rachev and Rüschendorf (1998b) as well as further references given therein). The greedy algorithm which solves the Monge-Kantorowich problem has been considered and proved correct by A. J. Hoffman (cf. Hoffman (1963)). His algorithm finds a minimum if the numbers ψ(ξi, ηj) can be rearranged in a Monge sequence, i.e. provided with new indices such that the Monge condition is satisfied. If (7.15) holds and we store the transportation costs in the following matrix,

                       North
        ψ(ξ1, ηn)     ψ(ξ2, ηn)     · · ·   ψ(ξm, ηn)
West    ψ(ξ1, ηn−1)   ψ(ξ2, ηn−1)   · · ·   ψ(ξm, ηn−1)   East
           ...            ...       . . .      ...
        ψ(ξ1, η1)     ψ(ξ2, η1)     · · ·   ψ(ξm, η1)
                       South

one such sequence starts in the upper left corner (north-west), i.e. with ψ(ξ1, ηn), continues rightwards along the first row, i.e. with ψ(ξi, ηn), i = 2, . . . , m, goes back to the entry which is now the farthest to the upper left, i.e. ψ(ξ1, ηn−1), and so on. The last element is ψ(ξm, η1). Hence, the aforementioned north-west corner rule algorithm results.

Because, for any i, j ≥ 1,

ξiηj + ξi+1ηj+1 = ξiηj+1 + ξi(ηj − ηj+1) + ξi+1ηj + ξi+1(ηj+1 − ηj)
               = ξiηj+1 + ξi+1ηj + (ηj+1 − ηj)(ξi+1 − ξi) ≥ ξiηj+1 + ξi+1ηj,

the function ψ(ξi, ηj) := ξiηj satisfies (7.15) and hence

Lemma 7.3.1. Let F and G be discrete distributions with finite support. Then the joint distribution function given by the probability densities generated by Algorithm 7.3.1 minimizes the correlation coefficient.

Remark 7.3.2. In Lemma 7.3.1, "correlation coefficient" can obviously be replaced by E(ψ(X, Y)) whenever ψ satisfies (7.15). As noted by Cambanis et al. (1976), such functions are for example (x + y)², min(x, y) or f(x − y), where f is concave and continuous. For |x − y|^p, p ≥ 1, the inequality in (7.15) is reversed, but in such a case an algorithm analogous to 7.3.1 can be obtained (with the only difference that the Monge sequence in the sense of Hoffman (1963) would now start in the lower-right corner).

Now, the north-west corner rule Algorithm 7.3.1 does not directly work with the quantities ψ(ξi, ηj). It just determines the "optimal shipment amounts" hij from the numbers pi and qj alone. As the upcoming theorem shows, they coincide with the probabilities corresponding to the Fréchet-Hoeffding lower bound. In fact, a more general statement can be made.


Theorem 7.3.2. Suppose p1, . . . , pm and q1, . . . , qn are non-negative real numbers such that

(7.17) ∑_{i=1}^{m} pi = ∑_{j=1}^{n} qj = C.

Furthermore, let F(i) and G(j) be given by

F(0) = 0, G(0) = 0,
F(i) = ∑_{k=1}^{i} pk, i = 1, . . . , m,   G(j) = ∑_{l=1}^{j} ql, j = 1, . . . , n.

Then the quantities hij for i = 1, . . . , m and j = 1, . . . , n determined by the north-west corner rule Algorithm 7.3.1 satisfy

(7.18) hij = max(F(i) + G(j) − C, 0) − max(F(i−1) + G(j) − C, 0) − max(F(i) + G(j−1) − C, 0) + max(F(i−1) + G(j−1) − C, 0).

Corollary 7.3.1. If pi and qj are the marginal densities, then the hij are equal to the joint probability densities (7.10) corresponding to the Fréchet-Hoeffding lower bound W.
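The closed form (7.18) can be checked directly. The following sketch (function name and example marginals are our own) computes the cell masses from the cumulative sums and verifies on a small example that they are non-negative and reproduce the marginals:

```python
def lower_bound_masses(p, q):
    """h_ij from (7.18), built from the bound max(F(i) + G(j) - C, 0)."""
    C = sum(p)
    F, G = [0.0], [0.0]
    for x in p:
        F.append(F[-1] + x)          # F(0), ..., F(m)
    for x in q:
        G.append(G[-1] + x)          # G(0), ..., G(n)
    B = lambda u, v: max(u + v - C, 0.0)
    return [[B(F[i], G[j]) - B(F[i - 1], G[j])
             - B(F[i], G[j - 1]) + B(F[i - 1], G[j - 1])
             for j in range(1, len(q) + 1)]
            for i in range(1, len(p) + 1)]

p, q = [0.3, 0.7], [0.6, 0.4]
h = lower_bound_masses(p, q)         # row sums give p, column sums give q
```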

Proof. [Theorem 7.3.2] Throughout the proof, imagine that the hij, pi and qj are stored in the following matrix:

        p1      p2      · · ·   pm
 qn     h1n     h2n     · · ·   hmn
 qn−1   h1,n−1  h2,n−1  · · ·   hm,n−1
  ...     ...     ...   . . .     ...
 q1     h11     h21     · · ·   hm1

Now we prove the theorem by induction on m + n. First, suppose m = n = 1. Then, according to (7.17), p1 = q1 = C and hence h11 = C. Moreover,

max(F(1) + G(1) − C, 0) − max(F(0) + G(1) − C, 0) − max(F(1) + G(0) − C, 0) + max(F(0) + G(0) − C, 0)
= max(2C − C, 0) − 2 max(C − C, 0) + max(0 − C, 0) = C = h11.

Secondly, suppose m + n = N and (7.18) holds whenever m + n < N. Before we continue with the proof, however, we list several useful items which we will need throughout the calculations:


• Since F(m) is given by the sum of all pi, it must be equal to C. Furthermore, F(i) ≤ F(m) for all i, and hence

(7.19) max(G(0) + F(i) − C, 0) = max(F(i) − C, 0) = 0 for all i = 1, . . . , m;

• similarly, because G(n) = C and G(j) ≤ G(n) for all j,

(7.20) max(F(0) + G(j) − C, 0) = max(G(j) − C, 0) = 0 for all j = 1, . . . , n.

Now, the North-West Corner Rule Algorithm 7.3.1 starts with the calculation of h1n. Hence, according to Step 2, there are two cases to consider.

p1 ≤ qn: In this case, h1n = p1 and according to Step 3, p1 is replaced by 0 and qn by qn − p1. Therefore, regardless of pi and qj, i = 2, . . . , m and j = 1, . . . , n−1, h1k will be zero for all k = 1, . . . , n−1. Moreover, (7.18) applies for these quantities, because

max(F(1) + G(n) − C, 0) − max(F(0) + G(n) − C, 0) − max(F(1) + G(n−1) − C, 0) + max(F(0) + G(n−1) − C, 0)
(7.20)= max(F(1) + C − C, 0) − 0 − max(F(1) − (C − G(n−1)), 0) + 0 = p1 − max(p1 − qn, 0) = p1 = h1n

and, for k ≤ n− 1,

max(F(1) + G(k) − C, 0) − max(F(0) + G(k) − C, 0) − max(F(1) + G(k−1) − C, 0) + max(F(0) + G(k−1) − C, 0)
(7.20)= max(p1 + G(k) − C, 0) − max(p1 + G(k−1) − C, 0) = 0 − 0 = 0 = h1k,

because for k ≤ n − 1 we have p1 + G(k) − C ≤ qn + G(n−1) − C = G(n) − C = 0.

To see that (7.18) is valid for the remaining hij also, note that the above matrix will now look as follows:

        p1    p2      · · ·   pm
 qn     p1    h2n     · · ·   hmn
 qn−1   0     h2,n−1  · · ·   hm,n−1
  ...    ...    ...   . . .     ...
 q1     0     h21     · · ·   hm1

Therefore, the hij in the remaining columns (the shaded region of the original figure) must sum up to

C∗ := C − p1.

Since the North-West Corner Rule is of a recursive nature, the remaining hij can alternatively be calculated by applying the algorithm to the following input:

p∗1, . . . , p∗m−1, with p∗i = pi+1,
q∗1, . . . , q∗n, with q∗j = qj for j < n and q∗n = qn − p1,

and by setting

hij := h∗i−1,j.

So, because (m − 1) + n = N − 1, the induction assumption applies and (7.18) is valid (with C := C∗, F := F∗ and G := G∗) for the h∗ij and hence for the hij in the shaded region of the above matrix. In this situation, we clearly get

C∗ = C − p1

F ∗(i) = F (i+ 1) − p1, i = 0, . . . ,m− 1

G∗(j) = G(j), j = 0, . . . , n− 1

G∗(n) = G(n) − p1.

To see that (7.18) holds with C, F and G also, we first consider hij for j < n (and, naturally, i ≥ 2). Then

hij = h∗i−1,j = max(F∗(i−1) + G∗(j) − C∗, 0) − max(F∗(i−2) + G∗(j) − C∗, 0)
      − max(F∗(i−1) + G∗(j−1) − C∗, 0) + max(F∗(i−2) + G∗(j−1) − C∗, 0)
    = max(F(i) − p1 + G(j) − C + p1, 0) − max(F(i−1) − p1 + G(j) − C + p1, 0)
      − max(F(i) − p1 + G(j−1) − C + p1, 0) + max(F(i−1) − p1 + G(j−1) − C + p1, 0)
    = max(F(i) + G(j) − C, 0) − max(F(i−1) + G(j) − C, 0)
      − max(F(i) + G(j−1) − C, 0) + max(F(i−1) + G(j−1) − C, 0).

In addition, if j = n,

hin = h∗i−1,n = max(F∗(i−1) + G∗(n) − C∗, 0) − max(F∗(i−2) + G∗(n) − C∗, 0)
      − max(F∗(i−1) + G∗(n−1) − C∗, 0) + max(F∗(i−2) + G∗(n−1) − C∗, 0)
    = max(F(i) − p1 + G(n) − p1 − C + p1, 0) − max(F(i−1) − p1 + G(n) − p1 − C + p1, 0)
      − max(F(i) − p1 + G(n−1) − C + p1, 0) + max(F(i−1) − p1 + G(n−1) − C + p1, 0)
    = max(F(i) + G(n) − p1 − C, 0) − max(F(i−1) + G(n) − p1 − C, 0)
      − max(F(i) + G(n−1) − C, 0) + max(F(i−1) + G(n−1) − C, 0)
    (∗)= max(F(i) − p1, 0) − max(F(i−1) − p1, 0)
      − max(F(i) + G(n−1) − C, 0) + max(F(i−1) + G(n−1) − C, 0)
    = F(i) − p1 − F(i−1) + p1 − max(F(i) + G(n−1) − C, 0) + max(F(i−1) + G(n−1) − C, 0)
    = F(i) − F(i−1) − max(F(i) + G(n−1) − C, 0) + max(F(i−1) + G(n−1) − C, 0)
    (∗)= max(F(i) + G(n) − C, 0) − max(F(i−1) + G(n) − C, 0)
      − max(F(i) + G(n−1) − C, 0) + max(F(i−1) + G(n−1) − C, 0),

where (∗) holds because G(n) = C; the maxima in the intermediate step can then be dropped since F(i) ≥ F(1) = p1 for all i ≥ 1.

p1 > qn: In this case, the proof works along similar lines.

First, we have h1n = qn, and p1 is replaced by p1 − qn and qn by 0. Therefore, hkn will be zero for all k = 2, . . . , m. Moreover, (7.18) applies for these quantities, because

max(F(1) + G(n) − C, 0) − max(F(0) + G(n) − C, 0) − max(F(1) + G(n−1) − C, 0) + max(F(0) + G(n−1) − C, 0)
(7.20)= max(F(1) + C − C, 0) − 0 − max(F(1) − (C − G(n−1)), 0) + 0
= p1 − max(p1 − qn, 0) = p1 − p1 + qn = qn = h1n


and, for k ≥ 2,

max(F(k) + G(n) − C, 0) − max(F(k−1) + G(n) − C, 0) − max(F(k) + G(n−1) − C, 0) + max(F(k−1) + G(n−1) − C, 0)
= F(k) − F(k−1) − max(F(k) − qn, 0) + max(F(k−1) − qn, 0)
= F(k) − F(k−1) − F(k) + qn + F(k−1) − qn = 0 = hkn,

because F(i) − qn = (p1 − qn) + ∑_{l=2}^{i} pl ≥ 0 for all i ≥ 1. Moreover, in this case, the above matrix will be as follows:

        p1      p2      · · ·   pm
 qn     qn      0       · · ·   0
 qn−1   h1,n−1  h2,n−1  · · ·   hm,n−1
  ...     ...     ...   . . .     ...
 q1     h11     h21     · · ·   hm1

Here, the hij in the remaining rows (the shaded region of the original figure) must sum up to

C∗ := C − qn

and can again be calculated alternatively by applying the north-west corner rule to the following input:

p∗1, . . . , p∗m, with p∗1 = p1 − qn and p∗i = pi, i = 2, . . . , m,
q∗1, . . . , q∗n−1, with q∗j = qj,

and upon setting

hij := h∗ij.

So, because m + (n − 1) = N − 1, the induction assumption applies and (7.18) is valid (with C, F and G replaced by C∗, F∗ and G∗, respectively) for the h∗ij and hence for the hij in the shaded region of the above matrix. Moreover,

C∗ = C − qn

G∗(j) = G(j), j = 0, . . . , n− 1

F ∗(i) = F (i) − qn, i = 1, . . . ,m

F ∗(0) = 0.

Now, it can again be shown that (7.18) holds with C, F and G as well. For this purpose, we first consider hij for i > 1 (and, naturally, j < n). Then

hij = h∗ij = max(F∗(i) + G∗(j) − C∗, 0) − max(F∗(i−1) + G∗(j) − C∗, 0)
      − max(F∗(i) + G∗(j−1) − C∗, 0) + max(F∗(i−1) + G∗(j−1) − C∗, 0)
    = max(F(i) − qn + G(j) − C + qn, 0) − max(F(i−1) − qn + G(j) − C + qn, 0)
      − max(F(i) − qn + G(j−1) − C + qn, 0) + max(F(i−1) − qn + G(j−1) − C + qn, 0)
    = max(F(i) + G(j) − C, 0) − max(F(i−1) + G(j) − C, 0)
      − max(F(i) + G(j−1) − C, 0) + max(F(i−1) + G(j−1) − C, 0).


In addition, if i = 1,

h1j = h∗1j = max(F∗(1) + G∗(j) − C∗, 0) − max(F∗(0) + G∗(j) − C∗, 0)
      − max(F∗(1) + G∗(j−1) − C∗, 0) + max(F∗(0) + G∗(j−1) − C∗, 0)
    = max(F(1) − qn + G(j) − C + qn, 0) − max(G(j) − C + qn, 0)
      − max(F(1) − qn + G(j−1) − C + qn, 0) + max(G(j−1) − C + qn, 0)
    (∗)= max(F(1) + G(j) − C, 0) − max(F(1) + G(j−1) − C, 0)
    (7.20)= max(F(1) + G(j) − C, 0) − max(F(1) + G(j−1) − C, 0)
      − max(F(0) + G(j) − C, 0) + max(F(0) + G(j−1) − C, 0),

where (∗) holds because G(k) + qn − C ≤ G(n) − C = 0 for all k < n.

7.3.3 Discrete Distributions with Infinite Support

We close this chapter with the observation that the above procedure for calculating the probabilities corresponding to the Fréchet-Hoeffding lower bound copula can also be extended to the case when the supports of F and G are given by

ξ1 < ξ2 < ξ3 < . . . and η1 < η2 < η3 < . . . ,

respectively. Since F(ξi) and G(ηj) tend to one, it is surely possible to find n and m such that

(7.21) F (ξ1) +G(ηn) − 1 ≥ 0 and F (ξm) +G(η1) − 1 ≥ 0.

Hence, because F and G are nondecreasing, (7.10) yields:

(7.22a) h1j = G(ηj) − G(ηj−1) = qj for j > n,
(7.22b) hi1 = F(ξi) − F(ξi−1) = pi for i > m,
(7.22c) hij = 0 for i > 1 and j > n, or i > m and j > 1.

Therefore, the minimum covariance possible satisfies

Covmin(X, Y) = ∑_{i=m+1}^{∞} η1 ξi pi + ∑_{j=n+1}^{∞} ξ1 ηj qj + ∑_{i=1}^{m} ∑_{j=1}^{n} ξi ηj hij,

where hij are given by (7.10). Furthermore, the last term in the above equation satisfies

∑_{i=1}^{m} ∑_{j=1}^{n} ξi ηj hij = min_{h∗ij} ∑_{i=1}^{m} ∑_{j=1}^{n} ξi ηj h∗ij,

where the minimum is taken over all mn non-negative numbers satisfying

(7.23a) ∑_{j=1}^{n} h∗1j = F(ξ1) + G(ηn) − 1,   ∑_{i=1}^{m} h∗i1 = F(ξm) + G(η1) − 1,

(7.23b) ∑_{j=1}^{n} h∗ij = F(ξi) − F(ξi−1), i = 2, . . . , m,   ∑_{i=1}^{m} h∗ij = G(ηj) − G(ηj−1), j = 2, . . . , n.

Hence, the remaining probabilities hij can again be determined by the north-west corner rule. Moreover, the densities calculated in this way correspond to the Fréchet-Hoeffding lower bound.


Corollary 7.3.2. Assume that n and m are such that (7.21) holds. Then the densities hij determined according to (7.22) for i > m or j > n and by the north-west corner rule under the conditions (7.23) satisfy

(7.24) hij = W(F (ξi), G(ηj)) −W(F (ξi−1), G(ηj)) −W(F (ξi), G(ηj−1)) + W(F (ξi−1), G(ηj−1))

(with F (ξ0) = G(η0) := 0) for all i = 1, 2, . . . and j = 1, 2, . . . .

Proof. First, assume 1 ≤ i ≤ m and 1 ≤ j ≤ n. Then, by means of Theorem 7.3.2,

hij = max(F∗(i) + G∗(j) − C∗, 0) − max(F∗(i−1) + G∗(j) − C∗, 0)
      − max(F∗(i) + G∗(j−1) − C∗, 0) + max(F∗(i−1) + G∗(j−1) − C∗, 0)

with C∗ = 1 − (1 − F (ξm)) − (1 −G(ηn)) and

F∗(0) = 0, G∗(0) = 0,
F∗(i) = F(ξi) − (1 − F(ξm)), i = 1, . . . , m,   G∗(j) = G(ηj) − (1 − G(ηn)), j = 1, . . . , n.

But, since for any 1 ≤ k ≤ m and 1 ≤ l ≤ n,

F ∗(k) +G∗(l) − C∗ = F (ξk) +G(ηl) − 1

(7.24) is satisfied for any 2 ≤ i ≤ m and any 2 ≤ j ≤ n. In addition,

h1j = max(F∗(1) + G∗(j) − C∗, 0) − max(0 + G∗(j) − C∗, 0)
      − max(F∗(1) + G∗(j−1) − C∗, 0) + max(F∗(0) + G∗(j−1) − C∗, 0)
    (7.20)= max(F∗(1) + G∗(j) − C∗, 0) − max(F∗(1) + G∗(j−1) − C∗, 0)
    = max(F(ξ1) + G(ηj) − 1, 0) − max(F(ξ1) + G(ηj−1) − 1, 0)
    = max(F(ξ1) + G(ηj) − 1, 0) − max(F(ξ1) + G(ηj−1) − 1, 0)
      − max(F(ξ0) + G(ηj) − 1, 0) + max(F(ξ0) + G(ηj−1) − 1, 0)

and (7.24) holds for any h1j, 2 ≤ j ≤ n. By exactly the same arguments, (7.24) is also satisfied for any hi1, 1 ≤ i ≤ m. As to the remaining cases, because

F (ξk) +G(ηl) − 1 ≥ 0

for any k > m and l ≥ 1, as well as for any k ≥ 1 and l > n,

W(F (ξi), G(ηj)) −W(F (ξi−1), G(ηj)) −W(F (ξi), G(ηj−1)) + W(F (ξi−1), G(ηj−1)) = 0

whenever either i > m and 1 < j, or 1 < i and j > n, and thus (7.24) is again true. In addition, if j = 1 and i > m,

W(F(ξi), G(η1)) − W(F(ξi−1), G(η1)) − W(F(ξi), G(η0)) + W(F(ξi−1), G(η0))
= W(F(ξi), G(η1)) − W(F(ξi−1), G(η1)) = F(ξi) + G(η1) − 1 − F(ξi−1) − G(η1) + 1 = F(ξi) − F(ξi−1)

and, consequently, (7.24) is satisfied for j = 1 and i > m. Lastly, the case "i = 1 and j > n" can be shown along the same lines, and hence the corollary follows.


Chapter 8

Negatively Correlated Bivariate Poisson Distributions

In this chapter, we apply the previous results concerning negatively correlated distributions to the case when the marginal distributions are Poisson distributions. From now on, we will call any bivariate distribution with Poisson marginals a bivariate Poisson distribution (however, be aware of the fact that not every such distribution has a nice stochastic representation or shows properties similar to those of a univariate Poisson distribution; hence this definition is not always accepted in the literature).

The problem of constructing a bivariate Poisson distribution with negative correlation is an old one and has been studied by various authors, see e.g. Griffiths et al. (1979) and Nelsen (1987) and further references mentioned therein. Several constructions of such distributions have therefore been proposed in the literature, not all of which are based on copulas. Hence, the outline of this section is the following: first, we present several models considered so far, and focus on the copula modeling approach thereafter. We also illustrate the north-west corner rule algorithm and pay special attention to the explicit evaluation of the correlation coefficient.

8.1 Models

As has been stated in Theorem 4.3.7, the correlation coefficient for such distributions is bounded from below by ϱmin < 0, a quantity which is attained for members of the class HW. Therefore, there exist bivariate Poisson distributions that have negative correlation. However, probably the best known bivariate Poisson distribution (e.g. Campbell (1934) and Teicher (1954) as well as Johnson et al. (1997) and references given therein) has been derived as a limiting case of a bivariate binomial distribution. Its joint probability generating function takes the form

exp(λ1(t1 − 1) + λ2(t2 − 1) + a12(t1 − 1)(t2 − 1)).

This distribution can be interpreted in the following way (for further details and references to the corresponding work of Dwass and Teicher see Johnson et al. (1997)):

If Y1, Y2 and Y12 are independent random variables distributed according to P(λ1 − a12), P(λ2 − a12) and P(a12) respectively, then the random vector (Y1 + Y12, Y2 + Y12) has the aforementioned bivariate Poisson distribution.

Therefore, if X1 and X2 have the above joint distribution function,

Cov(X1, X2) = Var(Y12) = a12


and only positive correlation can be achieved in this way. Furthermore, this holds for all infinitely divisible bivariate Poisson distributions, since their joint probability generating functions are all of the above form. Another class of Poisson distributions can be constructed by mixing the aforementioned distribution with respect to the covariance; but although such distributions are no longer infinitely divisible, again only non-negative correlation is permissible (see Griffiths et al. (1979)). For further properties of such bivariate Poisson distributions, see Johnson et al. (1997) and Kocherlakota and Kocherlakota (1992).

Another way to obtain bivariate distributions is to generate the joint probabilities by

(8.1) P[X = i, Y = j] = P[X = i]P[Y = j][1 + αψ(i)ϕ(j)]

where ψ and ϕ are some functions from N0 to R with

(8.2) E(ψ(X)) = E(ϕ(Y )) = 0

and

(8.3) |ψ(i)| ≤ 1 and |ϕ(i)| ≤ 1 for any i ∈ N0.

Here, α is a parameter belonging to [−1, 1]. Alternatively, (8.3) can be skipped, but then constraints have to be imposed on the parameter α that guarantee 1 + αψ(i)ϕ(j) ≥ 0 for any i and j in N0. Regarding the correlation coefficient, (8.1) yields

Cov(X, Y) = α ∑_{i=1}^{∞} ∑_{j=1}^{∞} i j ψ(i) ϕ(j) P[X = i] P[Y = j] = α E(Xψ(X)) E(Y ϕ(Y)),

and hence

ϱ(X, Y) = α E(Xψ(X)) E(Y ϕ(Y)) / √(λ1 λ2),

where X ∼ P(λ1) and Y ∼ P(λ2). In the case when λ1 = λ2 =: λ and ϕ = ψ, we get

ϱ(X, Y) = α (E(Xψ(X)))² / λ,

and hence negative correlation whenever α < 0. This procedure was considered by Griffiths et al. (1979), who set

ψ(k) = ϕ(k) = (γ^k e^{1−γ} − 1) / (1 + e²)

with γ ∈ [−1, 1] and λ1 = λ2 = 1 and obtain a negatively correlated distribution whenever α < 0. A similar idea was used by Lee (1996) and Lakshminarayana et al. (1999), who choose

ψ(i) = e^{−i} − e^{−λ1(1−1/e)} and ϕ(j) = e^{−j} − e^{−λ2(1−1/e)}

and construct a bivariate Poisson distribution with P(λ1) and P(λ2) margins belonging to the so-called Sarmanov family of distributions. Again, upon setting α < 0 a negative correlation is achieved, which, however, tends to zero for increasing values of λ1 and λ2 (cf. Lakshminarayana et al. (1999)). Note that choosing the Farlie-Gumbel-Morgenstern family¹ of copulas can be viewed as a special case of the above construction by setting ψ(i) = 1 − 2F(i) and ϕ(j) = 1 − 2G(j).
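As a numerical sanity check of this construction (a sketch; the parameter values and the truncation point N are our own choices), one can verify the centering condition (8.2) for the Lee/Lakshminarayana choice of ψ and ϕ, and confirm that α < 0 indeed produces negative covariance under (8.1):

```python
import math

lam1, lam2, alpha = 2.0, 3.0, -0.5
N = 60  # truncation point; Poisson mass beyond N is negligible here

def pois(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def psi(i):  # psi(i) = e^{-i} - e^{-lam1 (1 - 1/e)}
    return math.exp(-i) - math.exp(-lam1 * (1 - 1 / math.e))

def phi(j):  # phi(j) = e^{-j} - e^{-lam2 (1 - 1/e)}
    return math.exp(-j) - math.exp(-lam2 * (1 - 1 / math.e))

# centering condition (8.2): E psi(X) = E phi(Y) = 0,
# since E e^{-X} = exp(-lam (1 - 1/e)) for X ~ P(lam)
e_psi = sum(psi(i) * pois(lam1, i) for i in range(N))
e_phi = sum(phi(j) * pois(lam2, j) for j in range(N))

# joint probabilities (8.1) and the resulting covariance
pmf = {(i, j): pois(lam1, i) * pois(lam2, j) * (1 + alpha * psi(i) * phi(j))
       for i in range(N) for j in range(N)}
exy = sum(i * j * pij for (i, j), pij in pmf.items())
cov = exy - lam1 * lam2   # negative here, since alpha < 0
```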

¹For a definition and references, see e.g. Nelsen (1999, pp. 68). As can be found therein, the Farlie-Gumbel-Morgenstern family is positively ordered and absolutely continuous w.r.t. the Lebesgue measure. However, it does not include the Fréchet-Hoeffding bounds, not even as limiting cases. Hence, its range of dependence is limited (cf. also Joe (1997)).


8.2 Copula Models

Here, however, we shall follow another scheme. As has already been pointed out e.g. by Joe (1997), we can obtain a bivariate Poisson distribution from (7.2) upon choosing a copula C and taking for F and G the distribution functions of P(λ1) and P(λ2), λ1, λ2 > 0.

Such models may not possess a stochastic interpretation, but they can cover a wide range of dependence, including the desired negative correlation. Furthermore, many parametric families of copulas can easily be simulated.

The copula approach yields an elegant solution especially when considering a bivariate Poisson distribution H with minimum correlation, as noted already by Griffiths et al. (1979) as well as Nelsen (1987). As mentioned before, the correlation coefficient achieves its minimum when choosing the Fréchet-Hoeffding lower bound W as the underlying copula. Therefore, if we denote by F and G the distribution functions of P(λ1) and P(λ2) respectively and by hij the probabilities assigned to (i − 1, i] × (j − 1, j], we get, upon setting

hij = W(F(i), G(j)) − W(F(i−1), G(j)) − W(F(i), G(j−1)) + W(F(i−1), G(j−1))
    = max(F(i) + G(j) − 1, 0) − max(F(i−1) + G(j) − 1, 0)
      − max(F(i) + G(j−1) − 1, 0) + max(F(i−1) + G(j−1) − 1, 0),

a bivariate Poisson distribution with P(λ1) and P(λ2) margins and the minimum possible (negative) correlation.

This procedure can alternatively be visualized by a two-dimensional analogue of the generation of random variables from a discrete distribution, using the Fréchet-Hoeffding lower bound in terms of a pair (U, 1 − U), where U is a uniformly U(0, 1)-distributed random variable. If we subdivide the x- and y-axis of the unit square by the values F(k) and G(k), k = 0, 1, 2, . . . , where F and G as above denote the univariate cumulative distribution functions of a bivariate Poisson distribution with P(λ1) and P(λ2) margins, then the unit square is divided into a countably infinite collection of rectangles Rij given by

Rij := (F(i−1), F(i)] × (G(j−1), G(j)] for i, j ≥ 0, with F(−1) = G(−1) := 0.

[Figure 8.1: Generation of maximum negatively dependent Poisson random variables; two unit squares with axes X and Y, subdivided by the values of F and G.]

Page 128: Dependence of Non-Continuous Random Variables · Dependence of Non-Continuous Random Variables Von der Fakult at f ur Mathematik und Naturwissenschaften der Carl von Ossietzky Universit

128 CHAPTER 8. NEGATIVELY CORRELATED BIVARIATE POISSON DISTRIBUTIONS

Generating a pair (U, 1 − U) then produces a random "point" in the unit square which, with probability one, hits exactly one of these rectangles Rij, whose upper coordinates F(i) and G(j) determine the values of the discrete random variables X = i and Y = j. These are, by construction, P(λ1)- and P(λ2)-distributed, respectively, and have minimal (negative) correlation. The graphs in Figure 8.1 show how this works in the situation of a pair (X, Y) with P(1) and P(2) margins (Figure 8.1 left) and P(3) and P(5) margins (Figure 8.1 right). The vertical lines here correspond to the subdivision of the x-axis according to F, the horizontal lines to the subdivision of the y-axis according to G. In the above example, we have U = 0.3, resulting in the simulated pair (X, Y) = (0, 3) [left] and (X, Y) = (2, 6) [right].
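The sampling scheme just described amounts to inverting F at U and G at 1 − U; a small sketch (function names are our own):

```python
import math

def poisson_quantile(lam, u):
    """Smallest k with F(k) >= u, where F is the Poisson(lam) cdf."""
    k, term = 0, math.exp(-lam)
    cdf = term
    while cdf < u:
        k += 1
        term *= lam / k      # next Poisson probability
        cdf += term
    return k

def countermonotone_pair(lam1, lam2, u):
    """(X, Y) coupled via the Frechet-Hoeffding lower bound:
    X = F^{-1}(u), Y = G^{-1}(1 - u) for one uniform draw u."""
    return poisson_quantile(lam1, u), poisson_quantile(lam2, 1 - u)

# right panel of Figure 8.1: P(3) and P(5) margins with U = 0.3
x, y = countermonotone_pair(3.0, 5.0, 0.3)   # gives (2, 6)
```

In a simulation one would supply u = random.random(); any fixed u in (0, 1) works for illustration.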

The above graphics also show that hij will be greater than zero if and only if the negative diagonal intersects the rectangle Rij. With (7.22) we have that at most a finite number of the probabilities hij, i, j ≥ 1, will have a nonzero value, i.e., from a certain n on, all mass will be stored in hi0 and h0j, respectively. Moreover, as we showed in the preceding chapter, the probability density function can be computed using the north-west corner rule. In this case, it can be formulated in the following way (cf. (7.22)):

Algorithm 8.2.1. Step 1: Find the smallest m and n such that

G(0) ≥ 1 − F (m) and F (0) ≥ 1 −G(n);

Step 2: Set hij = 0 for i > m, j > 0 and for i > 0, j > n;

Step 3: Set hi0 = (e^{−λ1}/i!) λ1^i for i > m and h0j = (e^{−λ2}/j!) λ2^j for j > n;

Step 4: Calculate the probabilities hij as solutions of the following transportation problem

Minimize ∑_{i=1}^{m} ∑_{j=1}^{n} i j hij, where

∑_{j=0}^{n} h0j = F(0) + G(n) − 1,   ∑_{i=0}^{m} hi0 = F(m) + G(0) − 1,

∑_{j=0}^{n} hij = (e^{−λ1}/i!) λ1^i, i = 1, . . . , m,   ∑_{i=0}^{m} hij = (e^{−λ2}/j!) λ2^j, j = 1, . . . , n,

using the north-west corner rule Algorithm 7.3.1.
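Since, by Corollary 7.3.2, the output of Algorithm 8.2.1 coincides with the cell masses of the Fréchet-Hoeffding lower bound W, the whole density and the minimal correlation can also be evaluated directly from W. The following sketch does this (the truncation N and all names are our own choices):

```python
import math

def poisson_cdf_table(lam, N):
    """[F(-1), F(0), ..., F(N)] for the Poisson(lam) distribution."""
    table = [0.0]
    term = math.exp(-lam)
    cdf = term
    table.append(cdf)
    for k in range(1, N + 1):
        term *= lam / k
        cdf += term
        table.append(cdf)
    return table

def min_correlation_poisson(lam1, lam2, N=80):
    """Correlation of the W-coupled bivariate Poisson pair, with
    h_ij = W(F(i),G(j)) - W(F(i-1),G(j)) - W(F(i),G(j-1)) + W(F(i-1),G(j-1))."""
    F = poisson_cdf_table(lam1, N)
    G = poisson_cdf_table(lam2, N)
    W = lambda u, v: max(u + v - 1.0, 0.0)
    exy = 0.0
    for i in range(1, N + 1):        # cells with i = 0 or j = 0 contribute 0
        for j in range(1, N + 1):
            hij = (W(F[i + 1], G[j + 1]) - W(F[i], G[j + 1])
                   - W(F[i + 1], G[j]) + W(F[i], G[j]))
            exy += i * j * hij
    cov = exy - lam1 * lam2          # E(X) = lam1, E(Y) = lam2
    return cov / math.sqrt(lam1 * lam2)

rho = min_correlation_poisson(1.0, 1.0)   # equals -2/e for lambda = 1, cf. Example 8.2
```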

Unfortunately, the structure of the above probability density function h will depend on the parameters λ1 and λ2 in a rather complex way, so that no closed formula is obtained. Generally, we can say that as the values of the parameters λ1 and λ2 increase, the number of cells (i − 1, i] × (j − 1, j] with nonzero probabilities rises and their positions move away from (0, 0).

To illustrate this, we now consider the case where the marginal distributions are equal, i.e., λ1 = λ2 =: λ, and calculate the densities hij for certain values of λ. Clearly, the first step of Algorithm 8.2.1 reduces to

Step 1∗: Find the smallest n such that

(8.4) F (0) ≥ 1 − F (n).


In order to find such an n, we have to check the inequality

2e^{−λ} + ∑_{k=1}^{n} (λ^k/k!) e^{−λ} ≥ 1

for different values of n. Clearly, if it is satisfied for some n, then the same is true for any n∗ greater than n. So, if n is the smallest possible number which meets the condition (8.4), we have

2e^{−λ} + ∑_{k=1}^{m} (λ^k/k!) e^{−λ} < 1 for all m < n and 2e^{−λ} + ∑_{k=1}^{m} (λ^k/k!) e^{−λ} ≥ 1 for all m ≥ n.

On the other hand, the following holds:

Lemma 8.2.1. The functions fmn : (0, ∞) → R,

(8.5) fmn(λ) := F(m) + F(n) − 1 = ∑_{k=0}^{m} (λ^k/k!) e^{−λ} + ∑_{k=0}^{n} (λ^k/k!) e^{−λ} − 1,

are decreasing. In addition, if λmn is a (positive) solution of fmn(λ) = 0, then

(8.6) F (m) + F (n) ≥ 1 for λ ∈ (0, λmn] and F (m) + F (n) < 1 for λ ∈ (λmn,∞).

Proof. The first part of the lemma can easily be shown by discussing the first-order derivatives, as is done in the Appendix on page 147; (8.6) is then its trivial consequence.

Remark 8.2.1. Note that fmn ≤ fm∗n∗ and hence² λmn ≤ λm∗n∗ as soon as m ≤ m∗ and n ≤ n∗. However, if either n > n∗ and m ≤ m∗, or n ≤ n∗ and m > m∗, there is in general no easy rule which would determine the order between λmn and λm∗n∗. As we will see in our illustration, the higher n is, the more λm∗n∗ have to be determined during Algorithm 8.2.1 (in the worst case λm∗n∗ for all n∗ = 1, . . . , n − 1 and m∗ = 1, . . . , n∗). Because the position of λ with respect to these λm∗n∗ will in turn determine the rectangles Rij of positive measure, this is precisely the reason why the relationship between the probability density function and the parameter of the Poisson marginals is so complex.

By means of Lemma 8.2.1, the functions

f0n(λ) := 2e^{−λ} + ∑_{k=1}^{n} (λ^k/k!) e^{−λ} − 1

are decreasing on (0, ∞). Hence, whether an n meets the requirements of Step 1∗ will depend on the value of the parameter λ, i.e. if λ0n in (0, ∞) is such that f0n(λ0n) = 0, then

2e^{−λ} + ∑_{k=1}^{n} (λ^k/k!) e^{−λ} > 1 for λ ∈ (0, λ0n) and 2e^{−λ} + ∑_{k=1}^{n} (λ^k/k!) e^{−λ} ≤ 1 for λ ∈ [λ0n, ∞).

For n = 0, λ00 is given by 2e^{−λ00} = 1, i.e. λ00 = ln 2. If n > 0, however, the equation f0n(λ) = 0 has to be solved numerically. For n = 1, . . . , 5, the solutions are summarized in the following table:

n      0        1        2        3        4        5
λ0n    0.6931   1.1462   1.5681   1.9762   2.3762   2.77093

²Note that since lim_{λ→0} fmn(λ) = 1 and lim_{λ→∞} fmn(λ) = −1, a (unique) solution λmn always exists.
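The critical parameters λmn can be reproduced with a simple bisection on fmn, which by the lemma is decreasing with a unique root (a sketch; the bracket and tolerance are our own choices):

```python
import math

def poisson_cdf(lam, k):
    """F(k) for the Poisson(lam) distribution."""
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k + 1):
        term *= lam / i
        cdf += term
    return cdf

def lambda_mn(m, n, lo=1e-9, hi=50.0, tol=1e-12):
    """Unique root of f_mn(lam) = F(m) + F(n) - 1 (Lemma 8.2.1)."""
    f = lambda lam: poisson_cdf(lam, m) + poisson_cdf(lam, n) - 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid      # f is decreasing, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# lambda_00 = ln 2; lambda_01, lambda_02 and lambda_11 should match the
# tabulated values 1.1462, 1.5681 and 1.6783
```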


We complete this illustration by calculating the probability density function for λ in the intervals (0, 0.6931], (0.6931, 1.1462] and (1.1462, 1.5681].

Example 8.1. If λ is in (0, 0.6931], then the n determined in Step 1∗ is zero. Hence, hij = 0 for i > 0, j > 0 and hi0 = h0i = (e^{−λ}/i!) λ^i for i > 0. In addition, according to Step 4,

h00 = 2F(0) − 1 = 2e^{−λ} − 1.

Example 8.2. For λ in (0.6931, 1.1462], we have n = 1 and hij = 0 for i > 1, j > 0 or i > 0, j > 1 according to Step 2, and hi0 = h0i = (e^{−λ}/i!) λ^i for i > 1 according to Step 3. If we now set

p0 = q0 := F(0) + F(1) − 1 = 2e^{−λ} + λe^{−λ} − 1 and p1 = q1 := F(1) − F(0) = λe^{−λ},

the remaining hij, i, j = 0, 1, can be determined by solving

Minimize h11, where
h00 + h01 = p0,   h00 + h10 = q0,
h10 + h11 = p1,   h01 + h11 = q1,

by the north-west corner rule. Because F(0) + F(1) − 1 − (F(1) − F(0)) = 2F(0) − 1 is negative for λ in (0.6931, 1.1462], Step 1 of the north-west corner rule Algorithm 7.3.1 yields

h01 = min(p0, q1) = min(F(0) + F(1) − 1, F(1) − F(0)) = F(0) + F(1) − 1 = 2e^{−λ} + λe^{−λ} − 1,

which in turn implies h00 = 0 and, by symmetry, h10 = F(0) + F(1) − 1. Replacing p0 by 0 and q1 by (F(1) − F(0)) − (F(0) + F(1) − 1) = 1 − 2F(0) leads finally to

h11 = min(p1, q1) = min(F(1) − F(0), 1 − 2F(0)) = 1 − 2F(0) = 1 − 2e^{−λ}.

Hence, in the unit square, only the following shaded regions will have positive measure:

[Figure: the unit square subdivided at F(0), F(1), F(2); the shaded rectangles with positive mass correspond to h01, h10 and h11.]

Example 8.3. If λ is in (1.1462, 1.5681], then the n determined in Step 1∗ is 2. In addition, according to Steps 2-3, hij = 0 for i > 2, j > 0 or i > 0, j > 2, and hi0 = h0i = (e^{−λ}/i!) λ^i for i > 2. Again, the remaining hij, i, j = 0, . . . , 2, can be determined by the north-west corner rule applied to

Minimize h11 + 2h21 + 2h12 + 4h22, where
h00 + h01 + h02 = p0,   h00 + h10 + h20 = q0,
h10 + h11 + h12 = p1,   h01 + h11 + h21 = q1,
h20 + h21 + h22 = p2,   h02 + h12 + h22 = q2,

with p0 = q0 = F(0) + F(2) − 1 and pi = qi = F(i) − F(i−1) for i = 1, 2. Because p0 − q2 is equal to F(0) + F(1) − 1 and hence negative (cf. (8.6)),

h02 = min(p0, q2) = p0 = F(0) + F(2) − 1 = e^{−λ}(2 + λ + λ²/2) − 1.

In addition, h01 = h00 = 0. Replacing p0 by zero and q2 by q2 − (F(0) + F(2) − 1) = 1 − F(1) − F(0) leads further to

h12 = min(p1, q2) = min(F(1) − F(0), 1 − F(1) − F(0)).

Now, p1 − q2 is equal to F(1) + F(1) − 1. In order to determine h12, we therefore have to calculate λ11 and check whether it lies in (1.1462, 1.5681]. By numerical methods, λ11 is equal to 1.6783 and hence F(1) + F(1) − 1 ≥ 0 for all λ considered in this example. Hence,

h12 = min(p1, q2) = q2 = 1 − F (1) − F (0) = 1 − e−λ(2 + λ)

and, consequently, h22 = 0. Replacing q2 by zero and p1 by F (1) + F (1) − 1 yields finally h11:

h11 = min(p1, q1) = min(F (1) + F (1) − 1, F (1) − F (0)) = F (1) + F (1) − 1 = 2e−λ(1 + λ) − 1.

The remaining hij are equal to hji by symmetry, i.e. h10 = 0, h20 = e−λ(2 + λ + λ²/2) − 1 and h21 = 1 − e−λ(2 + λ). Hence, the regions in the unit square with positive mass are the following:

[Diagram: the unit square with both axes marked at F(0), F(1), F(2) and 1; the rectangles carrying positive mass are shaded.]
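The computations of Examples 8.2 and 8.3 can be reproduced mechanically. The sketch below is a minimal illustration (function and variable names are my own); following the worked steps above, it applies the greedy rule with the column margins taken in reversed order (q_n first), which is what pushes the mass away from the main diagonal:

```python
import math

def poisson_cdf(k, lam):
    """F(k) = P[X <= k] for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def north_west_corner(p, q):
    """Greedy north-west corner rule: exhaust the upper-left cell of the
    transportation table with row margins p and column margins q."""
    m, n = len(p), len(q)
    p, q = list(p), list(q)
    h = [[0.0] * n for _ in range(m)]
    i = j = 0
    while i < m and j < n:
        h[i][j] = min(p[i], q[j])
        p[i] -= h[i][j]
        q[j] -= h[i][j]
        if p[i] <= q[j]:   # row exhausted (ties advance the row)
            i += 1
        else:              # column exhausted
            j += 1
    return h

# Example 8.3: lam in (1.1462, 1.5681], reduced margins with n = 2;
# the columns are passed in the order q2, q1, q0.
lam = 1.3
F = [poisson_cdf(k, lam) for k in range(3)]
p = [F[0] + F[2] - 1, F[1] - F[0], F[2] - F[1]]
h = north_west_corner(p, p[::-1])
```

For λ = 1.3 the returned table reproduces h02 = e−λ(2 + λ + λ²/2) − 1, h12 = 1 − e−λ(2 + λ) and h11 = 2e−λ(1 + λ) − 1, as derived above.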

For λ in (1.5681, 2.77093], we can proceed in a completely analogous manner. However, this region has to be subdivided further by the following, numerically determined λmn:

λ11 = 1.6783,   λ12 = 2.1559,   λ13 = 2.6083,   λ22 = 2.6741.

To complete our illustration, we present a graphic which shows those regions in the unit square which have positive measure for λ in (1.5681, 2.6083].


132 CHAPTER 8. NEGATIVELY CORRELATED BIVARIATE POISSON DISTRIBUTIONS

[Eight panels, each a unit square with X and Y axes from 0 to 1, showing the regions of positive mass for λ ∈ (0, 0.6931], (0.6931, 1.1461], (1.1461, 1.5681], (1.5681, 1.6783], (1.6783, 1.9762], (1.9762, 2.1559], (2.1559, 2.3762] and (2.3762, 2.6083].]

Figure 8.2: The north-west corner rule


Remark 8.2.2. If the marginals have different parameters, λ1 and λ2 say, then the calculations are analogous, although even more complex. The visualization of the rectangles Rij in the unit square which have non-zero mass is similar: again, the number of such rectangles rises with λ1 + λ2 and their position "moves away" from (0, 0). But clearly, the graphic will no longer be symmetric with respect to the main diagonal, because hij ≠ hji in general.

Remark 8.2.3. Griffiths et al. (1979) present a method for constructing bivariate Poisson distributions with negative correlation from an independent pair of Poisson distributed variables by shifting the probability mass while preserving the margins. It is worth noticing that the optimality of the north-west corner rule algorithm is based on the same idea of shifting mass away from the main diagonal, in the sense that this algorithm yields the maximum possible shift.

In a similar way, by choosing various other families of copulas, arbitrary joint Poisson distributions can be constructed. For instance, if we choose the Frank or the Plackett family, we get a one-parameter family of joint Poisson distributions which are negatively correlated for a proper choice of the copula parameter. Moreover, the minimum possible negative correlation can be achieved as a limiting case.

Figure 8.3: Generation of random variables with Poisson marginals and Frank copula.


Figure 8.3 shows a simulation of 100 points distributed according to a Frank copula with θ = −10, together with the grid generated by the distribution of a pair (X, Y) with P(1) and P(2) margins as in Figure 8.1 [left]. For instance, all simulated random points falling into the shaded rectangle generate the (same) pair (1, 2).
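This simulation can be sketched in a few lines (a minimal sketch with my own helper names; the Frank sampler uses the standard conditional-inversion formula, and the Poisson quantile function is a plain linear search):

```python
import math, random

def frank_sample(theta, rng):
    """Draw (u, v) from a Frank copula (theta != 0) by conditional inversion."""
    u, t = rng.random(), rng.random()
    v = -1.0 / theta * math.log(1 + t * (math.exp(-theta) - 1)
                                / (t + (1 - t) * math.exp(-theta * u)))
    return u, v

def poisson_quantile(u, lam):
    """Smallest k with P[X <= k] >= u for X ~ Poisson(lam)."""
    k, pmf = 0, math.exp(-lam)
    cdf = pmf
    while cdf < u and pmf > 1e-16:   # guard against float saturation of the cdf
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

rng = random.Random(42)
pairs = [frank_sample(-10.0, rng) for _ in range(100)]
# map the uniform pairs through the Poisson quantile functions (P(1), P(2) margins)
xy = [(poisson_quantile(u, 1.0), poisson_quantile(v, 2.0)) for u, v in pairs]
```

Each uniform pair falling into a rectangle of the grid produces the corresponding integer pair, exactly as described for the shaded rectangle above.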

8.3 Calculation of the Minimum Correlation

Although the correlation is the minimum possible for members of HW, it will depend both on λ1 and λ2 because the joint probability density does. Nelsen (1987) and Griffiths et al. (1979) use the following expression for the correlation coefficient (cf. Hoeffding's Lemma 4.3.6):

%W = ( Σ_{i=0}^∞ Σ_{j=0}^∞ P[X > i, Y > j] − λ1λ2 ) / √(λ1λ2)
   = ( −Σ_{i=0}^∞ Σ_{j=0}^∞ min(F(i) + G(j) − 1, 0) − λ1λ2 ) / √(λ1λ2).

Unfortunately, the above expression depends on the parameters λ1 and λ2 in a complex way and cannot, in general, be easily evaluated explicitly. Since neither hi0 nor h0j contribute to the covariance, this is the case even if only a finite number of the hij are involved. Therefore, we have to use numerical or indirect methods, such as the north-west corner rule algorithm, in order to obtain the minimum correlation.

Visually, hij, i, j ≥ 1, will provide a non-zero contribution to the covariance iff the negative diagonal intersects the rectangle Rij.

Figure 8.4: Areas in the unit square contributing to the correlation coefficient

The case of λ1 = λ2 (= λ, say) is again of particular interest here because the calculation of the corresponding minimal correlation %(λ) simplifies a little. The figure above shows the situation for


λ = 2; the numbers in the rectangles correspond to the values of X · Y. The shaded rectangles denote those areas which are involved in the calculation of the minimal (negative) correlation (cf. also Nelsen (1987) for some corresponding tabulated values). The following table summarizes an evaluation of %(λ) in functional form, which can be extracted from the calculations in the preceding Section 8.2, given piecewise as %n(λ) in the interval (λmn, λm∗n∗]:

n   %n(λ)                                                   λ in
1   −λ                                                      (0, λ00] = (0, 0.6931]
2   (1 − 2e−λ)/λ − λ                                        (λ00, λ01] = (0.6931, 1.1462]
3   (3 − (6 + 2λ)e−λ)/λ − λ                                 (λ01, λ02] = (1.1462, 1.5681]
4   (5 − (10 + 4λ + λ²)e−λ)/λ − λ                           (λ02, λ11] = (1.5681, 1.6783]
5   (6 − (12 + 6λ + λ²)e−λ)/λ − λ                           (λ11, λ03] = (1.6783, 1.9762]
6   (8 − (16 + 8λ + 2λ² + λ³/3)e−λ)/λ − λ                   (λ03, λ12] = (1.9762, 2.1559]
7   (10 − (20 + 12λ + 3λ² + λ³/3)e−λ)/λ − λ                 (λ12, λ04] = (2.1559, 2.3762]
8   (12 − (24 + 14λ + 4λ² + 2λ³/3 + λ⁴/12)e−λ)/λ − λ        (λ04, λ13] = (2.3762, 2.6083]

Surprisingly, the correlation %(λ) is not a decreasing function of λ.
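The first rows of the table, as well as the non-monotonicity, can be cross-checked against a brute-force evaluation of the minimum correlation (a sketch; helper names are my own, and the double series is truncated where its terms vanish):

```python
import math

def poisson_cdf(k, lam):
    """F(k) = P[X <= k] for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def rho_min(lam, n_max=50):
    """Minimum correlation of two Poisson(lam) variables via Hoeffding's lemma:
    E(XY) = sum_{i,j>=0} max(1 - F(i) - F(j), 0) for the lower bound W."""
    F = [poisson_cdf(k, lam) for k in range(n_max)]
    exy = sum(max(1 - F[i] - F[j], 0.0) for i in range(n_max) for j in range(n_max))
    return (exy - lam * lam) / lam

def rho_table(lam):
    """First three rows of the table above (n = 1, 2, 3)."""
    e = math.exp(-lam)
    if lam <= 0.6931:
        return -lam
    if lam <= 1.1462:
        return (1 - 2 * e) / lam - lam
    if lam <= 1.5681:
        return (3 - (6 + 2 * lam) * e) / lam - lam
    raise ValueError("only n = 1, 2, 3 are sketched here")
```

In particular rho_table(0.75) > rho_table(0.6931), illustrating that %(λ) is not monotone.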

[Plot: the minimum correlation rho (vertical axis, from −1 to 0) against lambda (horizontal axis, from 0 to 2.6).]

Figure 8.5: Minimum correlation coefficient plotted against λ

In Figure 8.5, a plot of %n(λ) in the intervals (λmn, λm∗n∗] given by the above table is shown. Note


that such a plot has already been provided by Griffiths et al. (1979), however on a purely numerical basis only, except for the first interval. To complete the mathematical analysis made so far, we shall give a simple elementary proof of the minimality of the functions %n(λ) for n = 1, 2, 3 here. Note that this is equivalent to proving that

E(XY) ≥  0                    for 0 < λ < λ1,
         1 − 2e−λ             for λ1 < λ < λ2,
         3 − (6 + 2λ)e−λ      for λ2 < λ < λ3.

Case 1: trivial.

Case 2: Let A0 := {X = 0} and B0 := {Y = 0}; then P(A0) = P(B0) = e−λ. Denoting by 1A the indicator random variable for the event A, we have XY ≥ 1A0^c 1B0^c = 1(A0∪B0)^c, and hence

E(XY) ≥ E(1(A0∪B0)^c) = P(A0^c ∩ B0^c) = 1 − P(A0 ∪ B0) ≥ 1 − P(A0) − P(B0) = 1 − 2e−λ.

Case 3: Let further A1 := {X = 1} and B1 := {Y = 1}; then XY ≥ 1(A0∪B0)^c + 1((A0∪A1)∪B0)^c + 1(A0∪(B0∪B1))^c and hence

E(XY) ≥ P((A0 ∪ B0)^c) + P(((A0 ∪ A1) ∪ B0)^c) + P((A0 ∪ (B0 ∪ B1))^c)
= 1 − P(A0 ∪ B0) + 1 − P((A0 ∪ A1) ∪ B0) + 1 − P(A0 ∪ (B0 ∪ B1))
≥ 3 − 3P(A0) − 3P(B0) − P(A1) − P(B1) = 3 − 6e−λ − 2λe−λ,

as stated.
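Both the indicator inequality of Case 3 and the resulting bounds can be spot-checked numerically (a sketch with my own helper name; exy_min evaluates E(XY) for the countermonotonic coupling via Hoeffding's formula):

```python
import math

def exy_min(lam, n=40):
    """E(XY) for the countermonotonic pair of Poisson(lam) variables:
    sum_{i,j>=0} max(1 - F(i) - F(j), 0)."""
    F, c = [], 0.0
    pmf = math.exp(-lam)
    for k in range(n):
        c += pmf
        F.append(c)
        pmf *= lam / (k + 1)
    return sum(max(1 - F[i] - F[j], 0.0) for i in range(n) for j in range(n))

# Case 3 indicator bound: XY >= 1[X>=1,Y>=1] + 1[X>=2,Y>=1] + 1[X>=1,Y>=2]
for x in range(6):
    for y in range(6):
        assert x * y >= (x >= 1 and y >= 1) + (x >= 2 and y >= 1) + (x >= 1 and y >= 2)
```

For λ in the first three intervals, exy_min reproduces the bounds 0, 1 − 2e−λ and 3 − (6 + 2λ)e−λ exactly, confirming that they are attained.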

The case of different parameters λ1 and λ2 can in principle be tackled in the same way, although more subtle cases have to be distinguished. For simplicity, we shall outline the analogues of Cases 1 and 2 only, i.e.

E(XY) ≥  0                       if e−λ1 + e−λ2 ≥ 1,
         1 − e−λ1 − e−λ2         if e−λ1 + e−λ2 ≤ 1 and
                                  min(e−λ1 + (1 + λ2)e−λ2, (1 + λ1)e−λ1 + e−λ2) ≥ 1,

which follows immediately along the lines of the proof above, or, in terms of the correlation %(λ1, λ2):

%(λ1, λ2) ≥  −√(λ1λ2)                                  if e−λ1 + e−λ2 ≥ 1,
             (1 − e−λ1 − e−λ2)/√(λ1λ2) − √(λ1λ2)       if e−λ1 + e−λ2 ≤ 1 and
                                  min(e−λ1 + (1 + λ2)e−λ2, (1 + λ1)e−λ1 + e−λ2) ≥ 1,

which indeed provides the lower attainable bounds for the correlation in the range specified (see also Griffiths et al. (1979) for tabulated values). The following picture shows a 3D plot of this function.


[Two 3D surface plots of this lower bound, shown from two viewpoints; horizontal axes x, y ∈ [0, 1], vertical axis from −0.8 to −0.2.]

Figure 8.6:

Naturally, not only negatively correlated Poisson random variables can be constructed or simulated this way. Choosing appropriate copulas (e.g. the Frank family with θ arbitrary), all possible dependencies and correlations can be achieved. For the problem of positive correlation with different λ1 and λ2 (the only non-trivial case), see e.g. Nelsen (1987) or Griffiths et al. (1979). In higher dimensions, we can use a similar setup, i.e. construct multivariate Poisson distributions by choosing a copula. However, as noted in Corollary 3.5.2, multivariate Archimedean copulas are always positive orthant dependent, and hence members of the class HC will have only positive correlation. Therefore, in order to obtain multivariate Poisson distributions with negative correlation, other copulas have to be used, such as the Gauss or t-copula.

The case of minimal (negative) correlation for identical parameters λ1 = λ2 (= λ, say) also plays a general role in the construction of negatively (and even positively) correlated Poisson distributed random variables with non-identical parameters. We close this chapter by showing how this can be done. Suppose for instance that (X, Y) is a Poisson distributed pair with margins P(λ) each, and V and W are further Poisson distributed random variables, independent of each other and of (X, Y), with parameters µ and ν, respectively. Then the random variables X + V and Y + W are also Poisson distributed, with parameters λ + µ and λ + ν respectively, and negative correlation

%(X + V, Y + W) = λ%(X, Y) / √((λ + µ)(λ + ν)) < 0.

If Z is another Poisson distributed random variable with parameter τ, independent of X, Y, V and W, then

%(X + V + Z, Y + W + Z) = (λ%(X, Y) + τ) / √((λ + µ + τ)(λ + ν + τ)),

which can even achieve positive values. This shows again that there are many ways to generate Poisson distributed random variables with the same parameters and the same correlation, but with different joint distributions. Likewise, if the pair (V, W) has itself a positive correlation % = %(V, W) = −%(X, Y) and we have µ = ν, then X + V and Y + W are indeed uncorrelated, but not independent. Note that such a pair (V, W) can easily be generated through the choice V = S + T, W = S + U with independent S, T, U being Poisson distributed with parameters

λS = %µ, λT = λU = (1 − %)µ.


Hence, if we keep the sum λ + µ constant, then there is obviously a continuum of possibilities to construct uncorrelated, but not independent, Poisson pairs X + V and Y + W with the same marginal Poisson distribution.
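The common-shock construction V = S + T, W = S + U can be sketched and checked by simulation (a minimal sketch; sampler and helper names are my own, and tolerances are loose statistical ones):

```python
import math, random

def poisson(lam, rng):
    """Poisson sampler by cdf inversion (fine for moderate lam)."""
    u, k = rng.random(), 0
    cdf = pmf = math.exp(-lam)
    while cdf < u and pmf > 1e-16:   # guard against float saturation
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

def corr(xs, ys):
    """Empirical Pearson correlation of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cxy / math.sqrt(vx * vy)

rng = random.Random(0)
mu, r = 2.0, 0.4                      # common margin P(mu), target correlation r
lam_S, lam_T = r * mu, (1 - r) * mu   # V = S + T, W = S + U as in the text
vw = [(s + t, s + u)
      for s, t, u in ((poisson(lam_S, rng), poisson(lam_T, rng), poisson(lam_T, rng))
                      for _ in range(20000))]
```

Both V and W are then Poisson with mean µ, and Cov(V, W) = Var S = %µ, so the empirical correlation settles near the target %.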


Appendix A

Proofs

A.1 Chapter 4

Proof of Theorem 4.3.5.

(4.29):

∫_0^1 ⋯ ∫_0^1 Mn(u1, …, un) dun ⋯ du1 = Σ_{i=1}^n ∫_0^1 ∫_{ui}^1 ⋯ ∫_{ui}^1 ui dun ⋯ du1 dui
= n ∫_0^1 ∫_{u1}^1 ⋯ ∫_{u1}^1 u1 dun ⋯ du2 du1
= n ∫_0^1 ∫_{u1}^1 ⋯ ∫_{u1}^1 u1(1 − u1) dun−1 ⋯ du2 du1
= n ∫_0^1 u1(1 − u1)^{n−1} du1 = n ∫_0^1 u1^{n−1}(1 − u1) du1
= n ∫_0^1 u1^{n−1} du1 − n ∫_0^1 u1^n du1 = n/n − n/(n+1) = 1/(n+1),

where the second-to-last line uses the symmetry ∫_0^1 u(1 − u)^{n−1} du = ∫_0^1 u^{n−1}(1 − u) du (substitute u ↦ 1 − u).

(4.30):

∫_0^1 ⋯ ∫_0^1 πn(u1, …, un) dun ⋯ du1 = ∫_0^1 ⋯ ∫_0^1 u1 ⋯ un dun ⋯ du1
= ∫_0^1 ⋯ ∫_0^1 (1/2) u1 ⋯ un−1 dun−1 ⋯ du1 = ⋯ = ∫_0^1 u1/2^{n−1} du1 = 1/2^n.

(4.31): First, note that the set where Wn is greater than zero can be characterized as follows:

u1 + · · · + un − n+ 1 ≥ 0 ⇐⇒ u1 ≥ n− 1 − (u2 + · · · + un)

& n− 1 − (u2 + · · · + un) ≤ 1 ⇐⇒ u2 ≥ n− 2 − (u3 + · · · + un)

& n− 2 − (u3 + · · · + un) ≤ 1 ⇐⇒ u3 ≥ n− 3 − (u4 + · · · + un)

...

& n− (k − 1) − (uk + · · · + un) ≤ 1 ⇐⇒ uk ≥ n− k − (uk+1 + · · · + un)

...

& n− (n− 1) − un ≤ 1 ⇐⇒ un ≥ 0


Hence, with Fubini's Theorem,

∫_0^1 ⋯ ∫_0^1 Wn(u1, …, un) du1 ⋯ dun
= ∫_0^1 ⋯ ∫_{n−k−(uk+1+⋯+un)}^1 ⋯ ∫_{n−1−(u2+⋯+un)}^1 ((u1 + ⋯ + un) − n + 1) du1 ⋯ dun

Substitution s1 := (u1 + ⋯ + un) − n + 1:

= ∫_0^1 ⋯ ∫_{n−k−(uk+1+⋯+un)}^1 ⋯ ∫_0^{−n+2+(u2+⋯+un)} s1 ds1 ⋯ dun
= ∫_0^1 ⋯ ∫_{n−k−(uk+1+⋯+un)}^1 ⋯ ∫_{n−2−(u3+⋯+un)}^1 (1/2)((u2 + ⋯ + un) − n + 2)² du2 ⋯ dun

Substitution s2 := (u2 + ⋯ + un) − n + 2:

= ∫_0^1 ⋯ ∫_{n−k−(uk+1+⋯+un)}^1 ⋯ ∫_{n−3−(u4+⋯+un)}^1 ∫_0^{−n+3+(u3+⋯+un)} (1/2) s2² ds2 du3 ⋯ dun
= ∫_0^1 ⋯ ∫_{n−k−(uk+1+⋯+un)}^1 ⋯ ∫_{n−3−(u4+⋯+un)}^1 (1/3!)((u3 + ⋯ + un) − n + 3)³ du3 ⋯ dun
⋮
= ∫_0^1 ⋯ ∫_{n−k−(uk+1+⋯+un)}^1 (1/k!)((uk + ⋯ + un) − n + k)^k duk ⋯ dun

Substitution sk := (uk + ⋯ + un) − n + k:

= ∫_0^1 ⋯ ∫_{n−(k+1)−(uk+2+⋯+un)}^1 ∫_0^{−n+(k+1)+(uk+1+⋯+un)} (1/k!) sk^k dsk duk+1 ⋯ dun
= ∫_0^1 ⋯ ∫_{n−(k+1)−(uk+2+⋯+un)}^1 (1/(k+1)!)((uk+1 + ⋯ + un) − n + (k+1))^{k+1} duk+1 ⋯ dun
⋮
= ∫_0^1 (1/n!)(−n + n + un)^n dun = ∫_0^1 (1/n!) un^n dun = 1/(n+1)!.
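The three values just derived, ∫Mn = 1/(n+1), ∫πn = 1/2^n and ∫Wn = 1/(n+1)!, can be spot-checked by Monte Carlo integration (a sketch; tolerances are loose statistical ones):

```python
import math, random

def mc_integral(f, n, draws=200_000, seed=3):
    """Crude Monte Carlo estimate of the integral of f over [0, 1]^n."""
    rng = random.Random(seed)
    return sum(f([rng.random() for _ in range(n)]) for _ in range(draws)) / draws

M = min                                        # M_n(u) = min(u_1, ..., u_n)
P = math.prod                                  # pi_n(u) = u_1 * ... * u_n
W = lambda u: max(sum(u) - len(u) + 1, 0.0)    # W_n(u)
```

For n = 3 the estimates land close to 1/4, 1/8 and 1/24 = 1/4!, as the proof predicts.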

A.2 Chapter 5

Proof of Proposition 5.2.1. Throughout the proof, we will make use of Notation 5.0.1 and, furthermore, set

ai := F(ξi) and bj := G(ηj) for i = 0, …, m, j = 0, …, n.

Now, the first part of the proposition is straightforward. The subcopula S⊤ of (Y, X) clearly satisfies S⊤(u, v) = S(v, u). Hence, if k ∈ {0, …, m − 1} and l ∈ {0, …, n − 1} are such that

bl ≤ u ≤ bl+1 and ak ≤ v ≤ ak+1,


then (assume, without loss of generality, that (u, v) ∉ ran F × ran G),

CS⊤(u, v) = 1/((ak+1 − ak)(bl+1 − bl)) [ (ak+1 − v)(bl+1 − u) S(ak, bl) + (v − ak)(bl+1 − u) S(ak+1, bl)
+ (ak+1 − v)(u − bl) S(ak, bl+1) + (v − ak)(u − bl) S(ak+1, bl+1) ]
= CS(v, u).

If T is strictly increasing and continuous on ran X, then the corresponding subcopula does not change by means of Proposition 3.1.1. The support of T(X) is given by

T(ξ1) < T(ξ2) < ⋯ < T(ξm).

This in turn implies

P[T(X) ≤ T(ξi)] = F(ξi) = ai,

and hence the standard extension copula of (T(X), Y) is CS. If, on the other hand, T is strictly decreasing, the support of the distribution function F∗ of T(X) changes according to

T(ξm) < T(ξm−1) < ⋯ < T(ξ1).

If we set a∗k := P[T(X) ≤ T(ξm−k+1)] for k = 1, …, m and a∗0 := 0 as well as ξm+1 := ξm + 1, we have

a∗k = P[X ≥ ξm−k+1] = 1 − F(ξm−k) = 1 − am−k for k = 0, …, m.

Now, with Proposition 3.1.2, the unique subcopula S∗ of (T(X), Y) is given by¹ S∗(u, v) = v − S(1 − u, v). Therefore, if k and l are such that

a∗k ≤ u ≤ a∗k+1 and bl ≤ v ≤ bl+1,

the standard extension copula CS∗ of (T(X), Y) is given by (assume, without loss of generality, that (u, v) ∉ ran F∗ × ran G)

CS∗(u, v) = 1/((a∗k+1 − a∗k)(bl+1 − bl)) [ (a∗k+1 − u)(bl+1 − v) S∗(a∗k, bl) + (u − a∗k)(bl+1 − v) S∗(a∗k+1, bl)
+ (a∗k+1 − u)(v − bl) S∗(a∗k, bl+1) + (u − a∗k)(v − bl) S∗(a∗k+1, bl+1) ]
= 1/((am−k − am−k−1)(bl+1 − bl)) [ (1 − am−k−1 − u)(bl+1 − v)(bl − S(am−k, bl))
+ (u − 1 + am−k)(bl+1 − v)(bl − S(am−k−1, bl))
+ (1 − am−k−1 − u)(v − bl)(bl+1 − S(am−k, bl+1))
+ (u − 1 + am−k)(v − bl)(bl+1 − S(am−k−1, bl+1)) ]
= A − B,

where

B = 1/((am−k − am−k−1)(bl+1 − bl)) [ ((1 − u) − am−k−1)(bl+1 − v) S(am−k, bl) + (am−k − (1 − u))(bl+1 − v) S(am−k−1, bl)
+ ((1 − u) − am−k−1)(v − bl) S(am−k, bl+1) + (am−k − (1 − u))(v − bl) S(am−k−1, bl+1) ]

¹ Note that ran F and ran G are closed sets.


and

A = 1/((am−k − am−k−1)(bl+1 − bl)) [ (1 − am−k−1 − u)(bl+1 − v) bl + (u − 1 + am−k)(bl+1 − v) bl
+ (1 − am−k−1 − u)(v − bl) bl+1 + (u − 1 + am−k)(v − bl) bl+1 ]
= 1/((am−k − am−k−1)(bl+1 − bl)) [ (bl+1 − v) bl (1 − am−k−1 − u + u − 1 + am−k) + (v − bl) bl+1 (1 − am−k−1 − u + u − 1 + am−k) ]
= 1/((am−k − am−k−1)(bl+1 − bl)) [ (bl+1 − v) bl (am−k − am−k−1) + (v − bl) bl+1 (am−k − am−k−1) ]
= 1/(bl+1 − bl) [ bl+1 bl − v bl + v bl+1 − bl+1 bl ] = 1/(bl+1 − bl) [ v (bl+1 − bl) ] = v.

Because

a∗k ≤ u ≤ a∗k+1 ⟺ 1 − am−k ≤ u ≤ 1 − am−k−1 ⟺ am−k−1 ≤ 1 − u ≤ am−k,

B is equal to CS(1 − u, v) and hence

CS∗(u, v) = A − B = v − CS(1 − u, v).

Proof of Proposition 5.2.2. The equation

H̃(x, y) = CS(F̃(x), G̃(y))

clearly holds for any x and y such that F̃(x) or G̃(y) equals zero or one, respectively. Furthermore, since CS extends the unique subcopula corresponding to H, and H̃(ξi, ηj), F̃(ξi) and G̃(ηj) agree with H(ξi, ηj), F(ξi) and G(ηj) for all i = 0, …, m and j = 0, …, n, respectively, the above equation holds at the points (ξi, ηj). On the other hand, for arbitrary x and y with ξ0 < x ≤ ξm and η0 < y ≤ ηn, we can find i and j such that

F̃(ξi−1) = F(ξi−1) < F̃(x) ≤ F̃(ξi) = F(ξi) and G̃(ηj−1) = G(ηj−1) < G̃(y) ≤ G̃(ηj) = G(ηj).

Hence we get for the right-hand side of (5.8)

CS(F̃(x), G̃(y)) = (1 − (F̃(x) − F(ξi−1))/(F(ξi) − F(ξi−1))) (1 − (G̃(y) − G(ηj−1))/(G(ηj) − G(ηj−1))) C(F(ξi−1), G(ηj−1))
+ (1 − (F̃(x) − F(ξi−1))/(F(ξi) − F(ξi−1))) ((G̃(y) − G(ηj−1))/(G(ηj) − G(ηj−1))) C(F(ξi−1), G(ηj))
+ ((F̃(x) − F(ξi−1))/(F(ξi) − F(ξi−1))) (1 − (G̃(y) − G(ηj−1))/(G(ηj) − G(ηj−1))) C(F(ξi), G(ηj−1))
+ ((F̃(x) − F(ξi−1))/(F(ξi) − F(ξi−1))) ((G̃(y) − G(ηj−1))/(G(ηj) − G(ηj−1))) C(F(ξi), G(ηj)).

Now, since

(F̃(x) − F(ξi−1))/(F(ξi) − F(ξi−1)) = 1/(F(ξi) − F(ξi−1)) ( F(ξi−1) + ((x − ξi−1)/(ξi − ξi−1))(F(ξi) − F(ξi−1)) − F(ξi−1) )
= 1/(F(ξi) − F(ξi−1)) ( ((x − ξi−1)/(ξi − ξi−1))(F(ξi) − F(ξi−1)) ) = (x − ξi−1)/(ξi − ξi−1)


and since C(F(ξi), G(ηj)) = H(ξi, ηj) for all i = 0, …, m and j = 0, …, n, the right-hand side equals

CS(F̃(x), G̃(y)) = (1 − (x − ξi−1)/(ξi − ξi−1))(1 − (y − ηj−1)/(ηj − ηj−1)) H(ξi−1, ηj−1)
+ (1 − (x − ξi−1)/(ξi − ξi−1))((y − ηj−1)/(ηj − ηj−1)) H(ξi−1, ηj)
+ ((x − ξi−1)/(ξi − ξi−1))(1 − (y − ηj−1)/(ηj − ηj−1)) H(ξi, ηj−1)
+ ((x − ξi−1)/(ξi − ξi−1))((y − ηj−1)/(ηj − ηj−1)) H(ξi, ηj)
= H(ξi−1, ηj−1) + ((x − ξi−1)/(ξi − ξi−1))(H(ξi, ηj−1) − H(ξi−1, ηj−1)) + ((y − ηj−1)/(ηj − ηj−1))(H(ξi−1, ηj) − H(ξi−1, ηj−1))
+ ((x − ξi−1)/(ξi − ξi−1))((y − ηj−1)/(ηj − ηj−1))(H(ξi, ηj) − H(ξi−1, ηj) − H(ξi, ηj−1) + H(ξi−1, ηj−1))
= H(ξi−1, ηj−1) + ((x − ξi−1)/(ξi − ξi−1))(H(ξi, ηj−1) − H(ξi−1, ηj−1)) + ((y − ηj−1)/(ηj − ηj−1))(H(ξi−1, ηj) − H(ξi−1, ηj−1))
+ ((x − ξi−1)/(ξi − ξi−1))((y − ηj−1)/(ηj − ηj−1)) hij.

Proof of Theorem 5.5.4. We first show (5.41a) and make use of Lemma B.2.1 thereafter, which will enable us to see that (5.41b) follows straightforwardly from (5.41a). So suppose X and Y are discrete random variables with finite supports, distribution functions F and G, respectively, and Carley's upper bound copula CU. Throughout the proof, we will make use of Notation 5.0.1 and set, as in Section 5.2.2, ai := F(ξi) and bj := G(ηj). To start with, recall (5.10), (5.14) and (5.12) and note that

(†) CU(αij, βij) = Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} hkl + Σ_{k=1}^{i−1} hkj + Σ_{l=1}^{j−1} hil.

Now if we set

ᾱij := αij − CU(αij, βij) and β̄ij := βij − CU(αij, βij)

for i = 1, …, m and j = 1, …, n, we have

(‡) βij − αij − β̄ij = βij − αij − βij + CU(αij, βij) = −αij + CU(αij, βij) = −ᾱij.

Moreover, if we denote by Rij the rectangle (ai−1, ai] × (bj−1, bj], i = 1, …, m and j = 1, …, n, we get

∫_0^1 ∫_0^1 CU(u, v) dCU(u, v) = Σ_{i=1}^m Σ_{j=1}^n ∫∫_{Rij} CU(u, v) dCU(u, v) =: Σ_{i=1}^m Σ_{j=1}^n Iij.

Now, on one hand, the part of the support of CU which lies within the rectangle Rij is the line segment connecting the points (αij, βij) and (αi j+1, βi+1 j); on the other, CU equals

CU(αij, βij) + min(u − αij, v − βij) = min(u − ᾱij, v − β̄ij)


for any u ∈ [αij, αi j+1] and v ∈ [βij, βi+1 j] (cf. Section 5.2.2). Consequently, the integral Iij evaluates as follows:

Iij = ∫_{αij}^{αi j+1} min(u − ᾱij, u + βij − αij − β̄ij) du (‡)= ∫_{αij}^{αi j+1} min(u − ᾱij, u − ᾱij) du
= (u − ᾱij)²/2 |_{αij}^{αi j+1} = (αi j+1 − ᾱij)²/2 − (αij − ᾱij)²/2.

Now, since αi j+1 = αij + hij (by means of (5.12)), Iij simplifies in the following way:

Iij = [h²ij + 2hij(αij − ᾱij) + (αij − ᾱij)²]/2 − (αij − ᾱij)²/2
= [h²ij + 2hij(αij − ᾱij)]/2 = h²ij/2 + hij CU(αij, βij)
(†)= h²ij/2 + hij [ Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} hkl + Σ_{k=1}^{i−1} hkj + Σ_{l=1}^{j−1} hil ].

Hence,

ρτ(CU) = 2 Σ_{i=1}^m Σ_{j=1}^n [ h²ij + 2( Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} hij hkl + Σ_{k=1}^{i−1} hij hkj + Σ_{l=1}^{j−1} hij hil ) ] − 1.

Furthermore, we have

(Σ_{i=1}^m Σ_{j=1}^n hij)² = Σ_{i=1}^m Σ_{j=1}^n ( h²ij + Σ_{k=1}^{i−1} Σ_{l=1}^{n} hij hkl + Σ_{k=i+1}^{m} Σ_{l=1}^{n} hij hkl + Σ_{l=1}^{j−1} hij hil + Σ_{l=j+1}^{n} hij hil )
= Σ_{i=1}^m Σ_{j=1}^n h²ij + Σ_{i=1}^m Σ_{j=1}^n ( Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} hij hkl + Σ_{k=1}^{i−1} hij hkj + Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} hij hkl + Σ_{l=1}^{j−1} hij hil )
+ Σ_{i=1}^m Σ_{j=1}^n ( Σ_{k=i+1}^{m} Σ_{l=1}^{j−1} hij hkl + Σ_{k=i+1}^{m} hij hkj + Σ_{k=i+1}^{m} Σ_{l=j+1}^{n} hij hkl + Σ_{l=j+1}^{n} hij hil )
= Σ_{i=1}^m Σ_{j=1}^n h²ij + Σ_{i=1}^m Σ_{j=1}^n ( Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} hij hkl + Σ_{k=1}^{i−1} hij hkj + Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} hij hkl + Σ_{l=1}^{j−1} hij hil )
+ Σ_{k=1}^m Σ_{l=1}^n ( Σ_{i=1}^{k−1} Σ_{j=l+1}^{n} hij hkl + Σ_{i=1}^{k−1} hil hkl + Σ_{i=1}^{k−1} Σ_{j=1}^{l−1} hij hkl + Σ_{j=1}^{l−1} hkj hkl )
= Σ_{i=1}^m Σ_{j=1}^n h²ij + 2 Σ_{i=1}^m Σ_{j=1}^n ( Σ_{k=1}^{i−1} Σ_{l=1}^{j−1} hij hkl + Σ_{k=1}^{i−1} hij hkj + Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} hij hkl + Σ_{l=1}^{j−1} hij hil ),

where the last step follows because the second bracket equals the first after interchanging the summation names (i, j) and (k, l).


which yields finally (5.41a):

ρτ(CU) = 2[ ( Σ_{i=1}^m Σ_{j=1}^n hij )² − 2 Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} hij hkl ] − 1
= 2 − 4 Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} hij hkl − 1   (since Σ_{i=1}^m Σ_{j=1}^n hij = 1)
= 1 − 4 Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} hij hkl.
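The square-expansion identity behind this last step is easy to check numerically for a generic joint density (a sketch; the index sets follow the sums in the expansion of (Σ hij)² above):

```python
import random

rng = random.Random(5)
m, n = 3, 4
h = [[rng.random() for _ in range(n)] for _ in range(m)]
tot = sum(map(sum, h))
h = [[x / tot for x in row] for row in h]          # normalize to a joint pmf

sq = sum(h[i][j] ** 2 for i in range(m) for j in range(n))
S1 = sum(h[i][j] * h[k][l] for i in range(m) for j in range(n)
         for k in range(i) for l in range(j))              # k < i, l < j
S2 = sum(h[i][j] * h[k][j] for i in range(m) for j in range(n)
         for k in range(i))                                # k < i, l = j
S3 = sum(h[i][j] * h[k][l] for i in range(m) for j in range(n)
         for k in range(i) for l in range(j + 1, n))       # k < i, l > j
S4 = sum(h[i][j] * h[i][l] for i in range(m) for j in range(n)
         for l in range(j))                                # k = i, l < j
```

With Σ hij = 1 the identity gives sq + 2(S1 + S2 + S3 + S4) = 1, and hence 2[sq + 2(S1 + S2 + S4)] − 1 = 1 − 4·S3, which is (5.41a).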

In addition, we show (5.41b) using Lemma B.2.1. With (5.15) we have CL = v − C̃U(1 − u, v), where C̃U is the Carley upper extension copula corresponding to (T(X), Y) for an arbitrary strictly decreasing and continuous transformation T on ran X, by means of Proposition 3.1.2. Hence, with (B.7),

ρτ(CL) = 2 − 4 ∫ C̃U dC̃U − 1 = −ρτ(C̃U).

If we now make use of what we have already proved for ρτ(C̃U), we get

ρτ(CL) = 4 Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} h̃ij h̃kl − 1,

where h̃ij are the joint probability densities corresponding to (T(X), Y). But according to Remark 5.5.4 we know that these densities look as follows:

h̃ij = P[T(X) = T(ξm−i+1), Y = ηj] = P[X = ξm−i+1, Y = ηj] = hm−i+1 j.

This finally leads to

ρτ(CL) = 4 Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} h̃ij h̃kl − 1 = 4 Σ_{i=1}^m Σ_{j=1}^n Σ_{k=1}^{i−1} Σ_{l=j+1}^{n} hm−i+1 j hm−k+1 l − 1
(i∗ := m − i + 1) = 4 Σ_{i∗=1}^m Σ_{j=1}^n Σ_{k=1}^{m−i∗} Σ_{l=j+1}^{n} hi∗j hm−k+1 l − 1
(k∗ := m − k + 1) = 4 Σ_{i∗=1}^m Σ_{j=1}^n Σ_{k∗=i∗+1}^{m} Σ_{l=j+1}^{n} hi∗j hk∗l − 1
= 4 Σ_{k∗=1}^m Σ_{l=1}^n Σ_{i∗=1}^{k∗−1} Σ_{j=1}^{l−1} hi∗j hk∗l − 1

which completes the proof.

Proof of (5.43) from Example 5.3. For Bernoulli margins, Carley's upper extension is a shuffle of M as given by (5.3) from Theorem 5.1.1 (with θ = h00, a = h00 + h01 and b = h00 + h10). Recall also Figure 5.1:

CU = M(4, {[0, h00], [h00, h00 + h01], [h00 + h01, h00 + h01 + h10], [h00 + h01 + h10, 1]}, (1, 3, 2, 4), (1, 1, 1, 1)).


Hence,

∫_0^1 ∫_0^1 uv dCU(u, v) = ∫_0^{h00} u² du + ∫_{h00}^{h00+h01} u(u − h00 + h00 + h10) du
+ ∫_{h00+h01}^{h00+h01+h10} u(u − h00 − h01 + h00) du + ∫_{h00+h01+h10}^{1} u² du
= ∫_0^1 u² du + ∫_{h00}^{h00+h01} u h10 du − ∫_{h00+h01}^{h00+h01+h10} u h01 du
= 1/3 + h10(h00 + h01)²/2 − h10 h00²/2 − h01(h00 + h01 + h10)²/2 + h01(h00 + h01)²/2.

Since

h10(h00 + h01)² − h10 h00² − h01(h00 + h01 + h10)² + h01(h00 + h01)²
= h10h00² + 2h10h00h01 + h10h01² − h10h00² − h01h00² − h01³ − h01h10² − 2h01²h00 − 2h01h00h10 − 2h01²h10 + h01h00² + h01³ + 2h01²h00
= −h01h10² − h01²h10 = −h01h10(h01 + h10),

we have

∫_0^1 ∫_0^1 uv dCU(u, v) = 1/3 − h01h10(h01 + h10)/2

and consequently

ρS(CU) = 3( 4 ∫_0^1 ∫_0^1 uv dCU(u, v) − 1 ) = 3( 4/3 − 2h01h10(h01 + h10) − 1 ) = 1 − 6h01h10(h01 + h10).

As to the lower extension CL, we have with (5.15) CL = v − C̃U(1 − u, v), where C̃U is the Carley upper extension corresponding to (T(X), Y) for an arbitrary strictly decreasing and continuous transformation T on ran X, by means of Proposition 3.1.2 and Corollary 5.2.1. In particular, C̃U is the upper extension of 1 − X and Y. Hence, with (B.8),

ρS(CL) = 3( 2 − 4 ∫ uv dC̃U(u, v) − 1 ) = −ρS(C̃U).

If we now make use of the formula just derived for ρS(C̃U), we get

ρS(CL) = 6h̃01h̃10(h̃01 + h̃10) − 1,

where h̃ij are the joint probability densities corresponding to (1 − X, Y). But according to Remark 5.5.4 we know that these densities look as follows:

h̃ij = P[1 − X = i, Y = j] = P[X = 1 − i, Y = j] = h1−i j, i, j = 0, 1.

This finally leads to

ρS(CL) = 6h̃01h̃10(h̃01 + h̃10) − 1 = 6h11h00(h11 + h00) − 1.
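The closed form for ρS(CU) can be verified for concrete Bernoulli joint densities by integrating uv along the support of the shuffle directly (a sketch; names are my own, and the piecewise map for v repeats the four segments used in the integral above):

```python
h00, h01, h10 = 0.3, 0.2, 0.1          # any joint pmf; h11 = 1 - h00 - h01 - h10

def seg(a, b, c):
    """Closed form of the integral of u*(u + c) over [a, b]."""
    return (b**3 - a**3) / 3 + c * (b**2 - a**2) / 2

a1, a2, a3 = h00, h00 + h01, h00 + h01 + h10
# support of C_U: v = u on [0, a1] and [a3, 1];
# v = u + h10 on [a1, a2]; v = u - h01 on [a2, a3]
euv = seg(0, a1, 0) + seg(a1, a2, h10) + seg(a2, a3, -h01) + seg(a3, 1, 0)
rho_upper = 3 * (4 * euv - 1)
```

For these values rho_upper agrees with 1 − 6h01h10(h01 + h10) to machine precision.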


Proof of Lemma 5.5.2. We show just formula (5.48), for (5.49) follows in a completely analogous way.

Var X = Σ_{i=1}^m pi (F(ξi) + F(ξi−1) − 1)²/4
= (1/4)[ Σ_{i=1}^m pi (F(ξi) + F(ξi−1))² − 2 Σ_{i=1}^m pi (F(ξi) + F(ξi−1)) + Σ_{i=1}^m pi ]
(5.47)= (1/4)[ Σ_{i=1}^m (F(ξi) − F(ξi−1))(F(ξi) + F(ξi−1))² − 1 ]
= (1/4)[ Σ_{i=1}^m (F(ξi)³ − F(ξi−1)³) − (1/3) Σ_{i=1}^m (F(ξi) − F(ξi−1))³ + (1/3) Σ_{i=1}^m (F(ξi)³ − F(ξi−1)³) − 1 ]
= (1/4)[ Σ_{i=1}^m F(ξi)³ − Σ_{i=0}^{m−1} F(ξi)³ − (1/3) Σ_{i=1}^m pi³ + (1/3) Σ_{i=1}^m F(ξi)³ − (1/3) Σ_{i=0}^{m−1} F(ξi)³ − 1 ]
= (1/4)[ 1/3 − (1/3) Σ_{i=1}^m pi³ ] = (1/12)[ 1 − Σ_{i=1}^m pi³ ].

A.3 Chapter 8

Proof of Lemma 8.2.1. First, if m = n = 0, then

f00(λ) = 2e−λ − 1.

The first-order derivative is given by f′00(λ) = −2e−λ and is thus negative for all λ. In particular, f00 is decreasing on its domain. Next, consider that either n or m is zero; without loss of generality m. Then

f0n(λ) = 2e−λ + Σ_{k=1}^n (λ^k/k!) e−λ − 1

and

f′0n(λ) = e−λ( −2 + Σ_{k=1}^n λ^{k−1}(k − λ)/k! )
= e−λ( −2 + Σ_{k=0}^{n−1} λ^k/k! − Σ_{k=1}^n λ^k/k! )
= e−λ( −2 + 1 − λ^n/n! ) = −e−λ( 1 + λ^n/n! ) < 0.


Again, this yields that f0n is decreasing. Finally, for n, m > 0, the first-order derivative is given by

f′mn(λ) = e−λ( −2 + Σ_{k=1}^m λ^{k−1}(k − λ)/k! + Σ_{k=1}^n λ^{k−1}(k − λ)/k! )
= e−λ( −2 + 1 − λ^m/m! + 1 − λ^n/n! )
= −e−λ( λ^m/m! + λ^n/n! ) < 0

and hence negative, which completes the proof.
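The closed form of the derivative can be checked against a central finite difference (a sketch; it assumes fmn(λ) = F(m) + F(n) − 1 for the Poisson(λ) cdf F, consistent with the expressions appearing in the proof):

```python
import math

def f(m, n, lam):
    """f_mn(lam) = F(m) + F(n) - 1 for the Poisson(lam) cdf F (assumed form)."""
    F = lambda r: sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(r + 1))
    return F(m) + F(n) - 1

def f_prime(m, n, lam):
    """Closed form from the proof: -e^{-lam} (lam^m/m! + lam^n/n!)."""
    return -math.exp(-lam) * (lam**m / math.factorial(m) + lam**n / math.factorial(n))
```

The numerical derivative matches the closed form, and sampling f at increasing λ confirms the monotone decrease.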


Appendix B

Results from Probability and

Measure

B.1 Differentiation of Lebesgue-Stieltjes Measures

This survey basically relies on the monograph by Munroe (1968). Other books on Lebesgue-Stieltjes integration dealing with this topic are e.g. Hahn and Rosenthal (1948) and Saks (1937).

In the following, let a cube denote an (arbitrary) interval in Rn with equal, nonzero sides.

Definition B.1.1. A sequence {Ik} of intervals in Rn converges to x ∈ Rn, in notation

Ik → x,

if x ∈ Ik for each k and

lim_{k→∞} λn(Ik) = 0.

A sequence {Ik} of intervals converging to x is called a regular sequence if there is a constant α > 0 such that for each k, there exists a cube Jk ⊃ Ik for which

λn(Ik)/λn(Jk) ≥ α.

The constant α is called a parameter of regularity for the sequence {Ik}.

Let F be a distribution function in Rn and µF the induced Lebesgue-Stieltjes measure.

Definition B.1.2. The upper derivative of µF at x is defined as

µ̄′F(x) = sup_{Jk→x} lim sup_k µF(Jk)/λn(Jk),

where the supremum is taken over all sequences {Jk} of closed cubes converging to x. Similarly, the lower derivative of µF at x is defined as

µ̲′F(x) = inf_{Jk→x} lim inf_k µF(Jk)/λn(Jk).

If the upper and lower derivative coincide and are finite, then µF is said to be differentiable at x. The derivative of µF at x is then defined by

µ′F(x) = µ̄′F(x) = µ̲′F(x).


Definition B.1.3. The regular upper and lower derivatives of µF at x are defined as

µ̄′∗F(x) = sup_{Ik→x} lim sup_k µF(Ik)/λn(Ik)

and

µ̲′∗F(x) = inf_{Ik→x} lim inf_k µF(Ik)/λn(Ik),

respectively, where the supremum and infimum are taken over all regular sequences {Ik} of closed intervals converging to x. If the upper and lower regular derivatives coincide and are finite, then µF is said to be regularly differentiable at x. The regular derivative of µF at x is then defined by

µ′∗F(x) = µ̄′∗F(x) = µ̲′∗F(x).

Definition B.1.4. The strong upper and lower derivatives of µF at x are defined as

D̄µF(x) = sup_{Ik→x} lim sup_k µF(Ik)/λn(Ik)

and

D̲µF(x) = inf_{Ik→x} lim inf_k µF(Ik)/λn(Ik),

respectively, where the supremum and infimum are taken over all sequences {Ik} of closed intervals converging to x with λn(Ik) > 0 for each k. If the upper and lower strong derivatives coincide and are finite, then µF is said to be strongly differentiable at x. The strong derivative of µF at x is then defined by

DµF(x) = D̄µF(x) = D̲µF(x).

Corollary B.1.1. The above derivatives are connected in the following way:

D̲µF(x) ≤ µ̲′∗F(x) ≤ µ̲′F(x) ≤ µ̄′F(x) ≤ µ̄′∗F(x) ≤ D̄µF(x).

Consequently, existence of the strong derivative implies the existence of the regular derivative, which in turn implies that of µ′F(x).
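These derivatives can be illustrated numerically. The sketch below is an illustration only (the one-dimensional distribution function F(x) = x³ is a hypothetical choice, not from the text): the ratios µF(Jk)/λ(Jk) over a shrinking sequence of intervals centered at x approach the derivative F′(x) = 3x².

```python
# Numerical illustration (hypothetical one-dimensional F): the ratios
# mu_F(J_k) / lambda(J_k) over intervals J_k shrinking to x approximate
# the derivative of the Lebesgue-Stieltjes measure mu_F, here F'(x) = 3x^2.
def F(x):
    return x ** 3

def mu_F(a, b):
    # mu_F((a, b]) = F(b) - F(a)
    return F(b) - F(a)

def derivative_along_shrinking_intervals(x, steps=20):
    h, ratio = 0.5, None
    for _ in range(steps):
        ratio = mu_F(x - h, x + h) / (2 * h)  # mu_F(J_k) / lambda(J_k)
        h /= 2.0
    return ratio

x = 0.7
print(derivative_along_shrinking_intervals(x))  # approaches 3 * x**2 = 1.47
```

For this smooth F the limit exists along every such sequence, so the plain, regular and strong derivatives all coincide, consistent with Corollary B.1.1.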

Theorem B.1.1. Lebesgue Decomposition Theorem in Rn

If µF is an everywhere finite Lebesgue-Stieltjes measure in Rn, then µF is differentiable a.e. and has a unique decomposition

µF = µAF + µSF,

where µAF and µSF are absolutely continuous and singular w.r.t. λn, respectively. Moreover,

µAF = ∫ µ′F dλn and (µSF)′ = 0 a.e.

Proof. See (Munroe, 1968, Section 6.34).

Theorem B.1.2. Let f be a bounded, measurable function in Rn. If f is integrable and µF = ∫ f dλn, then

DµF = f a.e.

Proof. See (Munroe, 1968, Section 6.35).


As noted by Munroe (1968, pp. 31), the existence of the strong derivative implies the existence of the double limit of

[ (F(a1 + h1, a2 + h2) − F(a1, a2 + h2))/h1 − (F(a1 + h1, a2) − F(a1, a2))/h1 ] / h2

for hi → 0, i = 1, 2. Hence, if additionally the inside limits, i.e. the first-order partial derivatives of F, exist in some neighborhood of (a1, a2), then the second-order partial derivatives

∂x1∂x2F and ∂x2∂x1F

exist and are equal (cf. Fichtenholz (1964, pp. 382)).
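The double limit above can be checked numerically. The sketch below is purely illustrative (F(x, y) = sin(x)·y² is a hypothetical smooth function, not from the text): the mixed difference quotient converges to the mixed partial ∂x1∂x2F(a1, a2) = cos(a1)·2a2.

```python
import math

def F(x, y):
    # hypothetical smooth function, standing in for a distribution function
    return math.sin(x) * y ** 2

def mixed_quotient(a1, a2, h1, h2):
    # the second-order mixed difference quotient from the text:
    # difference in h2 of the inner difference quotients in h1
    inner = lambda y: (F(a1 + h1, y) - F(a1, y)) / h1
    return (inner(a2 + h2) - inner(a2)) / h2

a1, a2 = 0.3, 0.8
exact = math.cos(a1) * 2 * a2       # d^2 F / dx1 dx2 at (a1, a2)
approx = mixed_quotient(a1, a2, 1e-5, 1e-5)
print(approx, exact)
```

Since F here is smooth, the order of differentiation does not matter and both mixed partials agree with the limiting quotient.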

B.2 Conditional Probabilities

Definition B.2.1. Let (Ω,A) and (Ω1,B) be measurable spaces. A function K : Ω × B → [0, 1] is called a Markov kernel from (Ω,A) to (Ω1,B) if it satisfies the following conditions:

(B.1) for each B ∈ B, the map ω ↦ K(ω,B), ω ∈ Ω, is A-measurable,

(B.2) for each ω ∈ Ω, the map B ↦ K(ω,B), B ∈ B, is a probability measure.

Moreover, if P denotes a probability measure on (Ω,A), C a σ-algebra included in A and X a measurable function from (Ω,A) to (X,X), then a Markov kernel K from (Ω,C) to (X,X) satisfying

for each B ∈ X, ω ↦ K(ω,B) is a version of E(1B(X)|C),

is called a conditional distribution of X given C, in notation PX|C(ω,B). In particular, if the σ-algebra C is generated by a measurable function Y from (Ω,A) to (Y,Y), i.e. C = Y⁻¹(Y), then the conditional distribution of X given C is referred to as the conditional distribution of X given Y and denoted by PX|Y(ω,B).

According to the factorization theorem (see Witting, 1985, chapter 1.6.3, or Billingsley, 1995, Section 33), the function PX|Y(·,B) can, for any given B ∈ X, be expressed as H(·,B) ∘ Y, where H(·,B) is a Y-measurable function from (Y,Y) to [0,1]. If the family H(·,B) can be chosen in such a way that the function (y,B) ↦ H(y,B) is a Markov kernel from (Y,Y) to (X,X), then the probability measure B ↦ H(y,B) is called a conditional distribution of X given Y = y, in notation H(y,B) = PX|Y=y(B). In the following theorem another conditional distribution is mentioned, P(X,Y)|Y=y. By this we mean the conditional distribution of (X,Y) given Y = y, i.e. the conditional kernel from (Y,Y) to (X × Y, X ⊗ Y) for a given y. Furthermore, PX|Y=y is a marginal distribution of P(X,Y)|Y=y, i.e.

PX|Y=y(B) = P(X,Y)|Y=y(B × Y), B ∈ X.
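Conditions (B.1) and (B.2) are easy to verify in a discrete toy example. The sketch below (hypothetical numbers, purely illustrative) takes a joint pmf of (X, Y) on {0, 1}² and checks that the map y ↦ PX|Y=y acts as a Markov kernel: each K(y, ·) is a probability measure, and the kernel disintegrates the joint law as p(x, y) = K(y, {x}) · pY(y).

```python
# Toy discrete illustration (hypothetical pmf): the conditional
# distributions P^{X|Y=y} form a Markov kernel y -> K(y, .)
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

# marginal pmf of Y
p_Y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

def K(y, x):
    # kernel value K(y, {x}) = P(X = x | Y = y)
    return joint[(x, y)] / p_Y[y]

# (B.2): for each y, K(y, .) is a probability measure
for y in (0, 1):
    assert abs(K(y, 0) + K(y, 1) - 1.0) < 1e-12

# disintegration: the kernel and the marginal recover the joint law
for (x, y), p in joint.items():
    assert abs(K(y, x) * p_Y[y] - p) < 1e-12

print("kernel checks passed")
```

Condition (B.1), measurability in y, is trivial here because the underlying spaces are finite.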

Theorem B.2.1. (Witting, 1985, Theorem 1.122) Let P be a probability measure on (Ω,A) and X and Y denote measurable functions

X : (Ω,A) → (X,X), Y : (Ω,A) → (Y,Y).

1. If (X,X) is a Euclidean space, then for any given y ∈ Y, there always exists a conditional distribution PX|Y=y of X given Y = y.

2. If the conditional distribution of X given Y = y exists, then

(a) (y,B) ↦ P(X,Y)|Y=y(B) := PX|Y=y({x : (x,y) ∈ B}), B ∈ X ⊗ Y, is a conditional probability of (X,Y) given Y = y.

(b) If g is an X ⊗ Y-measurable function and the expectation Eg(X,Y) exists, then

E(g(X,Y)|Y = y) = ∫X g(x,y) dPX|Y=y(x)  [PY]

and

∫A×B g(x,y) dP(X,Y)(x,y) = ∫B ∫A g(x,y) dPX|Y=y(x) dPY(y)  ∀A ∈ X, ∀B ∈ Y.

Theorem B.2.2. (Witting, 1985, Theorem 1.126) In the situation of the above theorem, let µ and ν be σ-finite measures over (X,X) and (Y,Y), respectively. Furthermore, assume P(X,Y) is absolutely continuous w.r.t. µ ⊗ ν with density p(X,Y) and let

pY(y) := ∫X p(X,Y)(x,y) dµ(x)

denote a marginal density of Y w.r.t. ν. Finally, define the conditional density pX|Y=y w.r.t. µ of X given Y = y by

pX|Y=y(x) := p(X,Y)(x,y) / pY(y)  if pY(y) > 0,

and by an arbitrary µ-density otherwise. Then the following holds:

1. For any y ∈ Y, there always exists a conditional distribution of X given Y = y. Furthermore, it can be given explicitly, i.e. if pY(y) > 0, then

(B.3) PX|Y=y(A) = ∫A pX|Y=y(x) dµ(x)  ∀A ∈ X,

(B.4) P(X,Y)|Y=y(B) = ∫{x : (x,y) ∈ B} pX|Y=y(x) dµ(x)  ∀B ∈ X ⊗ Y.

2. If g is an X ⊗ Y-measurable function and the expectation Eg(X,Y) exists, then

∫A×B g(x,y) p(X,Y)(x,y) d(µ ⊗ ν) = ∫B [∫A g(x,y) pX|Y=y(x) dµ(x)] pY(y) dν(y)  ∀A ∈ X, ∀B ∈ Y.
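For a concrete example, take µ = ν = Lebesgue measure on [0, 1] and the joint density p(x, y) = x + y (a hypothetical choice, not from the text). The sketch below checks numerically that the conditional density has total mass one, as (B.3) requires, and that the iterated integral of part 2 reproduces the double integral, using g(x, y) = xy.

```python
# Toy check of Theorem B.2.2 with p(x, y) = x + y on the unit square
# (a hypothetical density; midpoint-rule quadrature throughout).
def p_joint(x, y):
    return x + y

def p_Y(y):
    # marginal density: integral of (x + y) dx over [0, 1] = 1/2 + y
    return 0.5 + y

def p_cond(x, y):
    # conditional density p^{X|Y=y}(x) = p(x, y) / p_Y(y)
    return p_joint(x, y) / p_Y(y)

n = 400
grid = [(i + 0.5) / n for i in range(n)]

# (B.3)-style check: the conditional density integrates to one
mass = sum(p_cond(x, 0.3) for x in grid) / n

# part 2 with g(x, y) = x*y: iterated vs. direct double integral
iterated = sum(sum(x * y * p_cond(x, y) for x in grid) / n * p_Y(y)
               for y in grid) / n
direct = sum(sum(x * y * p_joint(x, y) for x in grid) / n
             for y in grid) / n
print(mass, iterated, direct)
```

Here mass is (up to quadrature error) 1, and both integrals agree with the exact value ∫∫ xy(x + y) dx dy = 1/3.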

Theorem B.2.3. Transformation Formula for Lebesgue Densities

Let X and Y be n-dimensional real-valued random vectors on a probability space (Ω,A,P). If the following assumptions

(i) the distribution of X is concentrated on an open subset X ∈ Bn, i.e. P[X ∈ X] = 1,

(ii) T : X → Rn is injective and continuously partially differentiable, with Jacobian T′,

(iii) Y ∼ T(X),

are all satisfied, then

1. If Y is absolutely continuous w.r.t. Lebesgue measure with λn-density fY, then X is absolutely continuous w.r.t. Lebesgue measure and its λn-density is a.s. given by

(B.5) fX(x) = fY(T(x)) · |det T′(x)| · 1X(x).

2. If X is absolutely continuous w.r.t. Lebesgue measure with λn-density fX and if the set

N := {x ∈ X : det T′(x) = 0}

has Lebesgue measure zero, then Y is absolutely continuous w.r.t. Lebesgue measure and its λn-density is a.s. given by

(B.6) fY(y) = fX(T⁻¹(y)) / |det T′(T⁻¹(y))| · 1T(X\N)(y).
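As a one-dimensional sanity check (a hypothetical example, not from the text), let X ~ Exp(1) on X = (0, ∞) and T(x) = eˣ, so Y = T(X) lives on (1, ∞); T is injective and T′(x) = eˣ never vanishes, so N is empty. Formula (B.6) then gives fY(y) = fX(ln y)/y = 1/y², which the sketch below confirms numerically.

```python
import math

def f_X(x):
    # Exp(1) density on (0, inf)
    return math.exp(-x)

def f_Y(y):
    # (B.6): f_X(T^{-1}(y)) / |T'(T^{-1}(y))| with T(x) = e^x, T^{-1}(y) = log y
    x = math.log(y)
    return f_X(x) / abs(math.exp(x))

# f_Y should be the density 1/y^2 on (1, inf); its integral over (1, 1000)
# is 1 - 1/1000 = 0.999 (midpoint quadrature below)
n = 200000
a, b = 1.0, 1000.0
total = sum(f_Y(a + (b - a) * (i + 0.5) / n) for i in range(n)) * (b - a) / n
print(total)

# pointwise agreement with 1/y^2
print(f_Y(2.0), 1 / 2.0 ** 2)
```

The same computation run in the other direction, starting from fY = 1/y², recovers fX via (B.5).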

Lemma B.2.1. Let C and C̄ be copulas related by C̄(u, v) = v − C(1 − u, v). Then

(B.7) ∫_0^1 ∫_0^1 C̄(u, v) dC̄(u, v) = 1/2 − ∫_0^1 ∫_0^1 C(u, v) dC(u, v),

(B.8) ∫_0^1 ∫_0^1 uv dC̄(u, v) = 1/2 − ∫_0^1 ∫_0^1 uv dC(u, v).

Proof. Suppose U and V are uniformly distributed random variables with joint distribution function C (restricted to the unit square). Then, by means of Proposition 3.1.2, Ū := 1 − U and V have joint distribution function C̄ (again, restricted to the unit square). Hence, by means of the Transformation Theorem for Measures (cf. Bauer (1992, Corollary 19.3)),

∫_0^1 ∫_0^1 C̄(u, v) dC̄(u, v) = ∫_{[0,1]²} C̄(Ū, V) dP^(Ū,V) = ∫_{[0,1]²} C̄(1 − U, V) dP^(U,V)

= ∫_0^1 ∫_0^1 C̄(1 − u, v) dC(u, v) = ∫_0^1 ∫_0^1 [v − C(1 − (1 − u), v)] dC(u, v)

= ∫_0^1 ∫_0^1 v dC(u, v) − ∫_0^1 ∫_0^1 C(u, v) dC(u, v)

= ∫_0^1 v dv − ∫_0^1 ∫_0^1 C(u, v) dC(u, v) = 1/2 − ∫_0^1 ∫_0^1 C(u, v) dC(u, v).

Analogously,

∫_0^1 ∫_0^1 uv dC̄(u, v) = ∫_{[0,1]²} ŪV dP^(Ū,V) = ∫_{[0,1]²} (1 − U)V dP^(U,V)

= ∫_0^1 ∫_0^1 (1 − u)v dC(u, v) = ∫_0^1 ∫_0^1 (v − uv) dC(u, v)

= ∫_0^1 ∫_0^1 v dC(u, v) − ∫_0^1 ∫_0^1 uv dC(u, v)

= ∫_0^1 v dv − ∫_0^1 ∫_0^1 uv dC(u, v) = 1/2 − ∫_0^1 ∫_0^1 uv dC(u, v).


List of Symbols

Rn n-dimensional real space.

Bn, Bn(S) Borel σ-algebra on Rn [on S].

λn Lebesgue measure on (Rn,Bn).

R̄n Extended n-dimensional real space, p. 15.

x Vector in Rn, p. 15.

X Real-valued random vector.

L(X) Distribution of a random vector X.

L(Xn) → L(X) Convergence in law.

(x,y] Half open n-box in Rn, p. 15.

In Unit n-cube in Rn, p. 16.

domH Domain of a function H , p. 16.

ranH Range of a function H , p. 16.

VH((a, b]), ∆baH H-volume of an n-box, p. 16.

Hk Univariate margin of an n-place function H, p. 16.

H Joint distribution function, p. 19.

S Subcopula, p. 19.

C Copula, p. 19.

Cn, C Set of all (n-dimensional) copulas, p. 19.

Sj1,...,jk , Cj1,...,jk k-margin of a subcopula [copula], p. 19.

πn, π Independence copula, p. 20.

Mn, M Fréchet-Hoeffding upper bound copula, p. 20.

Wn, W Fréchet-Hoeffding lower bound copula, p. 20.

C1 ≺ C2 Concordance ordering, p. 21.


AC Absolutely continuous component of a copula C, p. 22.

SC Singular component of a copula C, p. 22.

PC Probability measure in In generated by a copula C, p. 22.

U(0, 1) Uniform distribution on [0, 1], p. 28.

B(p) Univariate Bernoulli distribution with parameter p.

P(λ) Univariate Poisson distribution with parameter λ.

F (−1) Quasi-inverse of a univariate distribution function F .

F (x−) Left hand side limit of a univariate distribution function F in x.

F̄ Survival function, p. 33.

H̄ Joint survival function, p. 33.

CH Copula corresponding to a distribution function H , p. 28.

CX Copula corresponding to a random vector X, p. 28.

SH Unique subcopula corresponding to a distribution function H, p. 28.

SX Unique subcopula corresponding to a random vector X, p. 28.

S̄H Unique subcopula corresponding to a distribution function H extended onto ran F × ran G, p. 28.

S̄X Unique subcopula corresponding to a random vector X extended onto ran F × ran G, p. 28.

CH Class of all possible copulas corresponding to a distribution function H , p. 28.

CX Class of all possible copulas corresponding to a random vector X, p. 28.

δC(t) Diagonal section of a copula, p. 31.

cI|J(uI |uJ ) Conditional density of a copula, p. 32.

C̄ Survival copula, p. 33.

C̃ Dual of a copula, p. 33.

C* Co-copula, p. 33.

Sn(φ) Spherical distribution with generator φ, p. 36.

u(n) Random vector distributed uniformly on the unit sphere, p. 36.

En(µ,Σ, φ) Elliptical distribution, p. 37.

CGaΣ , CGaρ Gaussian copula, p. 38.

det Σ Determinant of a matrix Σ.

Ctν,R, Ctν,ρ t-copula, p. 39.


ϕ[−1] Pseudo-inverse of an Archimedean generator, p. 40.

CFraθ Frank copula, p. 41.

CClaθ Clayton copula, p. 42.

CFreα,β Fréchet copula, p. 69.

CMarθ Mardia copula, p. 69.

CPlaθ Plackett’s copula, p. 44.

M(n, J, π, ω) Shuffle of M, p. 45.

d∞ Supremum norm.

δ (abstract) measure of dependence, p. 56.

δd(C) Measure of dependence based on a metric d, p. 57.

‖ · ‖p Lp-norm, p. 58.

δp(C) Measure of dependence based on Lp-distance, p. 58.

κ(C) Measure of dependence based on L∞-distance, p. 59.

Q Difference between the probabilities of concordance and discordance, p. 60.

ρ (abstract) measure of concordance, p. 59.

ρd(C) Measure of concordance based on a metric d, p. 62.

ρp(C) Measure of concordance based on Lp-distance, p. 63.

ρτ (X,Y ) Kendall’s tau, p. 60.

ρS(X,Y ) Spearman’s rho, p. 61.

δψ(X,Y ) Scarsini’s measure of concordance, p. 64.

λU Coefficient of upper tail dependence, p. 66.

λL Coefficient of lower tail dependence, p. 66.

ϱ(X, Y) Linear correlation coefficient, p. 67.

Cpq Class of copulas corresponding to a bivariate Bernoulli distribution, p. 72.

CS Standard extension copula, p. 73.

CU , CL Carley’s extensions, p. 75.

ψ Transformation of a class of all possible copulas, p. 84.

MS Standard extension copula for comonotonic random variables, p. 85.

WS Standard extension copula for countermonotonic random variables, p. 85.

ρ∗τ Discrete version of Kendall’s tau, p. 92.


ρ∗S Discrete version of Spearman’s rho, p. 95.

Sn Empirical subcopula, p. 99.

Cn Empirical copula, p. 99.

x(i) Order statistics, p. 99.

cn Empirical copula frequency, p. 99.

Fn, Gn, Hn Empirical distribution function, p. 100.

R(xk) Rank of the observation xk, p. 101.

R(xk) Average rank of the observation xk , p. 101.

HC Family of distributions obtained from a copula model, p. 107.

RC Set of attainable correlation in a copula model, p. 111.

TC Set of attainable values of Kendall’s tau in a copula model, p. 112.

SC Set of attainable values of Spearman’s rho in a copula model, p. 112.

Index

Archimedean copula, see copula

Bernoulli distribution, see distribution
Blomquist's medial correlation coefficient, 64

Censored data, 106
Co-copula, 33
Comonotonic, 49
Concordance
    measure of, see measure
    multivariate, 54
Concordant, 59
Contingency table, 43, 103
Convergence
    of copulas, 80, 81
    of cubes, 149
    weak, 80–82, 103, 106
Copula, 19, 27
    t-, 39, 67, 111, 137
    absolutely continuous, 22, 25
    absolutely continuous component of, 22
    Archimedean, 39, 40, 61
        multivariate, 42, 43, 54
    Clayton, 42, 111
    density, 22, 29
    dual, 33
    elliptical, 37, 61
    empirical, see empirical copula
    Fréchet, 69, 111
    Fréchet-Hoeffding lower bound, 20, 48, 75, 110, 118, 122
    Fréchet-Hoeffding upper bound, 20, 23, 48, 51, 75, 110
    Frank, 41, 42, 111
    Gauss, 111, 137
    Gaussian, 38, 39, 67
    independence, 20, 110
    Mardia, 69, 111
    measure generated by, 22
    Plackett's, 44, 111
    radially symmetric, 35
    set of, 19, 21–23
    singular, 22, 25
    singular component of, 22
    spherical, 36
    support, 22
    survival, 33, 34
Corner set monotonicity, 55
Correlation coefficient, see linear correlation coefficient
Countermonotonic, 49
Cube, 149

Density
    conditional, 32, 152
    marginal, 28
    of a copula, see Copula
    of a joint distribution function, 29
Dependence
    functional, 49, 57, 100
    likelihood ratio, 55
    maximum negative, 49
    maximum positive, 49
    orthant, 53
    perfect, 49
    quadrant, 53, 55, 77, 111
    tail, 54, 55, 78
Difference
    between probabilities of concordance and discordance, 60, 87, 89, 93
Discordant, 59
Distance
    L1, 64, 66, 96, 97
    L∞, 59
    Lp, 58, 63
    measure based on, see measure
Distribution
    bivariate Bernoulli, 71, 72, 92, 93, 96, 108, 110, 111, 113

    bivariate Poisson, 125, 133, 137
        infinitely divisible, 126
        Teicher's, 125
        uncorrelated, 137
        with minimum correlation, 127
    circular uniform, 36
    conditional, 32, 151, 152
    elliptical, 37, 38
    function
        empirical, see empirical
    multivariate Poisson, 137
    Plackett's, 43
    Sarmanov, 126
    spherical, 36, 37

Empirical
    copula, 99, 102, 106
        frequency, 99, 102
    distribution function, 101, 102, 104
        multivariate, 101
    subcopula, 99, 102
Extension
    Carley's, 75, 77, 84, 86, 91
    standard, 73, 74, 77, 78, 82, 84, 87, 94, 96
Extreme values, 66

Family
    t-, 39, 67, 69
    Clayton, 42, 54, 61, 69
        multivariate, 43
    Farlie-Gumbel-Morgenstern, 126
    Fréchet, 69
    Frank, 41, 42, 54, 69, 133, 137
        multivariate, 43
    Gauss, 54, 69
    Gaussian, 38, 39, 67
    Mardia, 69
    Plackett's, 44, 54, 69, 133
Function
    n-dimensional, 16
    n-increasing, 16
    2-increasing, 116
    completely monotonic, 43
    grounded, 16
    joint distribution, 19
    joint survival, 33
    multivariate totally positive of order two, 56
    quasi-monotone, 16, 116
    right continuous, 16
    strongly piecewise strictly monotone, 49
    survival, 33

Generalized hairpin, 50
Generator
    Archimedean, 40
        multivariate, 43
        strict, 40
    Clayton, 42
    Frank, 41
    of an elliptical distribution, 37
    of a spherical distribution, 36
Gini's gamma, 64

Hoeffding's Lemma, 67

Inequality
    Fréchet-Hoeffding, 21

Kendall's tau, 60, 61, 91
    attainable in copula models, 112, 113
    discrete version, 92, 103, 112
    sample version, 103

Lebesgue-Stieltjes measure
    absolutely continuous component, 150
    decomposition of, 150
    derivative of, 149
        lower, 149
        upper, 149
    regular derivative of, 150
        lower, 150
        upper, 150
    singular component, 150
    strong derivative of, 150
        lower, 150
        upper, 150
Left tail decreasing, 54, 78
Linear correlation coefficient, 67, 68, 82, 110, 117, 134
    attainable in copula models, 111, 112
    minimum, 134–136

Margins
    k-dimensional, 16
    univariate, 16
Markov kernel, 151
Mass Transportation, 117
Measure
    concordance, 83
    dependence, 83

    of association, 56, 82
        multivariate, 65
    of concordance, 59, 60, 62, 83
        distance-based, 62, 85, 96
        multivariate, 65
    of dependence, 56, 83
        L∞-distance based, 59
        Lp-distance based, 58
        distance-based, 57, 85, 96
        multivariate, 65
Model
    copula, 107
        with maximum negative dependence, 108
        with maximum positive dependence, 108
Monge condition, 116
Monge sequence, 117
Multivariate distributions
    construction, 28, 107

Negative lower orthant dependence, 53
Negative orthant dependence, 53
Negative upper orthant dependence, 53
North-West Corner Rule, 116–118, 122, 128, 134

Ordering
    concordance, 21
    on R̄n, 15
Order statistics, 99

Pearson's linear correlation coefficient, see linear correlation coefficient
Perfect dependence, see Dependence
Positive likelihood ratio dependence, 56
Positive lower orthant dependence, 53
Positive orthant dependence, 53
Positive tail dependence, 55
Positive upper orthant dependence, 53
Probability
    conditional, 32, 151
Pseudo-inverse, 40

Radially symmetric copula, see Copula
Rank, 101
    average, 101
    statistics, 103
Ratio
    odds, 43
Regular sequence, 149
Right tail decreasing, 55
Right tail increasing, 55, 78

Scarsini's Measures of Concordance, 64
Section
    k-th, 31
    diagonal, 31
    horizontal, 31
    vertical, 31
Set
    nondecreasing, 48
        n-dimensional, 50
    nonincreasing, 48
    of copulas, see Copula: set of
Shuffle of M, 44, 45, 49, 72, 75
    flipped, 45
    regular, 45
Spearman's rank correlation coefficient, see Spearman's rho
Spearman's rho, 61, 62, 64, 93
    attainable in copula models, 112, 113
    discrete version, 95, 103, 106, 112
    sample version, 104
Standard extension, see extension
Stochastic monotonicity, 55
Subcopula, 19
Symmetry
    joint, 34
    marginal, 34
    radial, 34, 35

Tail dependence index, 66
    lower, 66
    upper, 66
Tests of independence, 106
Transformation
    decreasing, 30, 84, 86
    increasing, 29, 84, 86
Transportation problem, 116
    Monge-Kantorovich, 116

Vertex, 16
Volume
    H-, 16
    of an n-box, 16

Bibliography

Bauer, H. (1992). Maß- und Integrationstheorie. Walter de Gruyter, Berlin, New York, 2nd edition.

Behnen, K. and Neuhaus, G. (1984). Grundkurs Stochastik. Teubner, Stuttgart.

Billingsley, P. (1995). Probability and Measure. Wiley, New York, 3rd edition.

Blum, J. R., Kiefer, J., and Rosenblatt, M. (1961). Distribution-free tests of independence based on the sample distribution function. Ann. Math. Statist., 32:485–498.

Cambanis, S., Simons, G., and Stout, W. (1976). Inequalities for E(k(X, Y)) when the marginals are fixed. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 36:285–294.

Campbell, J. (1934). The Poisson correlation function. In Proceedings Edinb. Math. Soc., II. Ser., volume 4, pages 18–26.

Carley, H. (2002). Maximum and minimum extensions of finite subcopulas. Communications in Statistics, Theory and Methods, 31(12):2151–2166.

Clayton, D. G. (1978). A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65:141–151.

Dabrowska, D. M. (1996). Weak convergence of a product integral dependence measure. Scand. J. Stat., 23(4):551–580.

Dall'Aglio, G. (1959). Sulla compatibilità delle funzioni di ripartizione doppia. Rendiconti Sem. Mat. Roma, 18:385–413.

Darsow, W. F., Nguyen, B., and Olsen, E. T. (1992). Copulas and Markov processes. Illinois Journal of Mathematics, 36(4):600–642.

Deheuvels, P. (1979). Non parametric tests of independence. In Raoult, J. P., editor, Statistique Non Paramétrique Asymptotique, Proceedings, pages 95–107. Springer Verlag, Berlin, Heidelberg, New York. Lecture Notes in Mathematics 821.

Deheuvels, P. (1981). An asymptotic decomposition for multivariate distribution-free tests of independence. J. Multivariate Anal., 11:102–113.

Dhaene, J. and Goovaerts, M. J. (1996). Dependency of risks and stop-loss orders. ASTIN Bulletin, 26(2):201–212.

Elstrodt, J. (1980). Maß- und Integrationstheorie. Vorlesungsskript, Universität Hamburg.

Embrechts, P., Lindskog, F., and McNeil, A. (2001). Modelling dependence with copulas and applications to risk management. Technical report, Department of Mathematics, ETHZ, Zürich, Switzerland. www.math.ethz.ch/finance.

Embrechts, P., McNeil, A., and Straumann, D. (2002). Correlation and dependence in risk management: Properties and pitfalls. In Dempster, M., editor, Risk Management: Value at Risk and Beyond, pages 176–223. Cambridge Univ. Press, Cambridge.

Fang, K.-T., Kotz, S., and Ng, K.-W. (1990). Symmetric Multivariate and Related Distributions. Chapman & Hall, New York.

Fang, K.-T. and Zhang, Y.-T. (1990). Generalized Multivariate Analysis. Springer Verlag & Science Press, Beijing.

Ferguson, T. S. (1967). Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York and London.

Fichtenholz, G. M. (1964). Differential- und Integralrechnung, volume I. Deutscher Verlag der Wissenschaften, Berlin.

Frank, M. J. (1979). On the simultaneous associativity of F(x, y) and x + y − F(x, y). Aequationes Math., 19:194–226.

Frank, M. J. (1991). Convolutions for dependent random variables. In Dall'Aglio, G., Kotz, S., and Salinetti, G., editors, Advances in Probability Distributions with Given Marginals, pages 75–93, Dordrecht, Boston, London. Kluwer Academic Publishers.

Gass, S. I. (1969). Linear Programming: Methods and Applications. McGraw-Hill Book Company and Kogakusha Company, Ltd., Tokyo, 3rd edition.

Griffiths, R., Milne, R. K., and Wood, R. (1979). Aspects of correlation in bivariate Poisson distributions and processes. Australian Journal of Statistics, 21(3):238–255.

Hahn, H. and Rosenthal, A. (1948). Set Functions. The University of New Mexico Press, Albuquerque.

Hewitt, E. and Stromberg, K. (1975). Real and Abstract Analysis. Springer-Verlag, New York, Heidelberg, Berlin, 3rd edition.

Hoeffding, W. (1940a). Maßstabinvariante Korrelationstheorie. Schriften des Mathematischen Seminars und des Instituts für Angewandte Mathematik der Universität Berlin, 5(3):181–233.

Hoeffding, W. (1940b). Maßstabinvariante Korrelationstheorie für diskontinuierliche Verteilungen. Archiv für mathematische Wirtschafts- und Sozialforschung, VII(2):4–70.

Hoeffding, W. (1942). Stochastische Abhängigkeit und funktionaler Zusammenhang. Skandinavisk Aktuarietidskrift, 25:200–227.

Hoffman, A. (1963). On simple linear programming problems. In Klee, V., editor, Convexity, Proc. Symp. Pure Math. VII, pages 317–327, Providence, RI.

Hult, H. and Lindskog, F. (2001). Multivariate extremes, aggregation and dependence in elliptical distributions. Preprint, www.risklab.ch/Papers.

Joe, H. (1997). Multivariate Models and Dependence Concepts. Chapman & Hall, London.

Johnson, N., Kotz, S., and Balakrishnan, N. (1997). Discrete Multivariate Distributions. Wiley, New York.

Johnson, N. L. and Kotz, S. (1972). Distributions in Statistics: Continuous Multivariate Distributions. John Wiley & Sons, Inc., New York, London, Sydney, Toronto.

Kamke, E. (1956). Das Lebesgue-Stieltjes-Integral. B. G. Teubner Verlagsgesellschaft, Leipzig.

Kocherlakota, S. and Kocherlakota, K. (1992). Bivariate Discrete Distributions. M. Dekker, New York.

Kotz, S. and Mari, D. D. (2001). Correlation and Dependence. Imperial College Press, London.

Kraemer, H. C. (1998). Rank correlation. In Armitage, P. and Colton, T., editors, Encyclopedia of Biostatistics, pages 3715–3717. John Wiley & Sons, Chichester, New York, Weinheim, Brisbane, Singapore, Toronto.

Kulpa, T. (1999). On approximation of copulas. International Journal of Mathematics and Mathematical Sciences, 22(2):259–269.

Lakshminarayana, J., Pandit, S., and Rao, K. (1999). On a bivariate Poisson distribution. Communications in Statistics - Theory and Methods, 28(2):267–276.

Lee, M.-L. T. (1996). Properties and applications of the Sarmanov family of bivariate distributions. Communications in Statistics - Theory and Methods, 25(6):1207–1222.

Lehmann, E. L. (1966). Some concepts of dependence. Ann. Math. Stat., 37:1137–1153.

Li, X., Mikusinski, P., and Taylor, M. D. (1998). Strong approximation of copulas. Journal of Mathematical Analysis and Applications, 225:608–623.

Lindskog, F., McNeil, A. J., and Schmock, U. (2001). A note on Kendall's tau for elliptical distributions. Preprint, www.risklab.ch/Papers.

Ling, C.-H. (1965). Representation of associative functions. Publ. Math. Debrecen, 12:189–212.

Marshall, A. W. (1996). Copulas, marginals and joint distributions. In Rüschendorf, L., Schweizer, B., and Taylor, M. D., editors, Distributions with Fixed Marginals and Related Topics, pages 213–222. Institute of Mathematical Statistics, Hayward, CA.

Menger, K. (1942). Statistical metrics. Proc. Nat. Acad. Sci. U.S.A., 28:535–537.

Mikusinski, P., Sherwood, H., and Taylor, M. D. (1991). Probabilistic interpretations of copulas and their convex sums. In Dall'Aglio, G., Kotz, S., and Salinetti, G., editors, Advances in Probability Distributions with Given Marginals, pages 95–112, Dordrecht, Boston, London. Kluwer Academic Publishers.

Monge, G. (1781). Déblai et remblai. In Mémoires de l'Académie des Sciences.

Munroe, M. E. (1968). Measure and Integration. Addison-Wesley Publishing Company, 2nd edition.

Nelsen, R. B. (1987). Discrete bivariate distributions with given marginals and correlation. Communications in Statistics - Simulation, 16(1):199–208.

Nelsen, R. B. (1991). Copulas and association. In Dall'Aglio, G., Kotz, S., and Salinetti, G., editors, Advances in Probability Distributions with Given Marginals, pages 51–74, Dordrecht, Boston, London. Kluwer Academic Publishers.

Nelsen, R. B. (1999). An Introduction to Copulas. Springer, New York.

Nešlehová, J. and Pfeifer, D. (2003). Modeling and generating dependent risk processes for IRM and DFA. Conference Volume of the XXXIV International ASTIN Colloquium 2003; submitted to the ASTIN Bulletin.

Plackett, R. L. (1965). A class of bivariate distributions. J. Amer. Statist. Assoc., 60:516–522.

Rachev, S. and Rüschendorf, L. (1998a). Mass Transportation Problems. Volume I: Theory. Springer, New York.

Rachev, S. and Rüschendorf, L. (1998b). Mass Transportation Problems. Volume II: Applications. Springer, New York.

Randles, R. H. and Wolfe, D. A. (1979). Introduction to the Theory of Nonparametric Statistics. John Wiley & Sons, New York.

Rényi, A. (1959). On measures of dependence. Acta Math. Acad. Sci. Hungar., 10:441–451.

Royden, H. L. (1988). Real Analysis. Prentice-Hall, Englewood Cliffs, NJ, 3rd edition.

Rudin, W. (1973). Functional Analysis. McGraw-Hill Series in Higher Mathematics. McGraw-Hill.

Saks, S. (1937). Theory of the Integral. Monografie Matematyczne, Tom VII, Warsaw.

Scarsini, M. (1984). On measures of concordance. Stochastica, 8:201–218.

Schmitz, V. (2003). Copulas and Stochastic Processes. PhD thesis, RWTH Aachen.

Schweizer, B. (1991). Thirty years of copulas. In Dall'Aglio, G., Kotz, S., and Salinetti, G., editors, Advances in Probability Distributions with Given Marginals, pages 13–50, Dordrecht, Boston, London. Kluwer Academic Publishers.

Schweizer, B. and Wolff, E. F. (1981). On nonparametric measures of dependence for random variables. Annals of Statistics, 9:870–885.

Shiryayev, A. N. (1984). Probability. Graduate Texts in Mathematics, Vol. 95. Springer, New York, Berlin, Heidelberg, Tokyo.

Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8:229–231.

Teicher, H. (1954). On the multivariate Poisson distribution. Skandinavisk Aktuarietidskrift, 37:1–9.

Tjøstheim, D. (1996). Measures of dependence and tests of independence. Statistics, 28:249–284.

van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press, Cambridge.

van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Verlag, New York.

Witting, H. (1985). Mathematische Statistik I: Parametrische Verfahren bei festem Stichprobenumfang. Teubner, Stuttgart.

Wolff, E. F. (1981). n-dimensional measures of dependence. Stochastica, 4:175–188.

Yaari, M. (1987). The dual theory of choice under risk. Econometrica, 55:95–115.

Curriculum Vitae

Name: Johana Nešlehová

Date of birth: July 26, 1977, in Prague

Nationality: Czech

Address: Harzensweg 11, 22305 Hamburg, Germany
Tel: +49 40 21904955

E-Mail: [email protected]

Education

1983–1991 Primary school education in Prague, with special focus on foreign languages

1991–1995 Jan-Neruda-Gymnasium Prague (equiv. to High School/Junior College)

29.5.1995 Czech graduation exam at the Jan-Neruda-Gymnasium Prague in Czech, German, Mathematics and History (equiv. to German Abitur and a High School/Junior College exam); grade: excellent in all four subjects

1995 Admission to the Charles University of Prague (university entrance exam waived because of "excellent" rating on graduation exam)

9/1995–9/1997 Study of Mathematics at the Charles University Prague

9/1997–10/2000 Study of Mathematics and Computer Science at the University of Hamburg

11.2.1999 Vordiplom in Mathematics and Computer Science (equiv. to B.Sc. degree)

5.10.2000 Diploma in Mathematics at the University of Hamburg (equiv. to M.Sc. degree); grade: very good ("sehr gut")
Thesis on "Asymptotically optimal permutation two sample tests for bivariate survival times under univariate censorship"
supervisor: Prof. Dr. Georg Neuhaus
referees: Prof. Dr. G. Neuhaus and Prof. Dr. D. Pfeifer

9/2002–present PhD studies in Mathematics at the Carl von Ossietzky University of Oldenburg
supervisor: Prof. Dr. D. Pfeifer


Erklärung (Declaration)

I hereby declare that I have written this thesis independently and using only the aids indicated. Chapter 8 (Negatively Correlated Bivariate Poisson Distributions) has in part already been published, namely in the conference volume of the XXXIV International ASTIN Colloquium 2003. In addition, the article has been submitted to the ASTIN Bulletin (Ed. Paul Embrechts) but has not yet been accepted.

I further declare that this dissertation, neither in its entirety nor in part, is or has been under review in a doctoral examination procedure at any other academic institution.

Hamburg, May 11, 2004

Johana Nešlehová