Copulas in Probability Theory

Success breeds a disregard of the possibility of failure.

The term “Copula” (from the Latin for “link”) was coined by Abe Sklar in his 1959 article, which was written in French1. Copulas have been widely used, though not always applied properly, in financial and econometric modeling, especially in pricing securities that depend on many underlying securities. Formally put, for an $n$-variate function $F$, the copula associated with $F$ is a distribution function \(C: [0, 1]^{n} \rightarrow [0, 1]\) that satisfies:

\[F(y_{1}, \dots , y_{n}) = C(F_{1}(y_{1}), \dots , F_{n}(y_{n}) ; \theta),\]

where \(\theta\) is a parameter of the copula called the dependence parameter, which means dependence between the marginals. This is a frequent starting point of empirical applications.

Table of Contents

Basic Concepts

Properties of copulas are analogous to properties of joint distributions. The joint distribution of a set of random variables \((Y_{1}, \dots , Y_{m})\) is defined as:

\[F(y_{1}, \dots , y_{m}) = \mathbb{P}\{Y_{i} \leq y_{i}; i = 1, \dots , m\},\]

and the survival function corresponding to \(F(y_{1}, \dots , y_{m})\) is given by:

\[\overline{F}(y_{1}, \dots , y_{m}) = \mathbb{P}\{Y_{i} > y_{i}; i = 1, \dots , m\}.\]

Definition 1. An \(n\)-dimensional copula (briefly, an n-copula) is a function \(C\) from the unit \(n\)-cube \([0, 1]^{n}\) to the unit interval \([0, 1]\) which satisfies the following conditions:

  1. \(C(1, \dots , 1, a_{m}, 1, \dots , 1) = a_{m}\) for each \(m \leq n\) and all \(a_{m} \in [0, 1]\).

  2. \(C(a_{1}, \dots , a_{n}) = 0\) if \(a_{m} = 0\) for any \(m \leq n\).

  3. \(C\) is \(n\)-increasing in the sense that the \(C\)-volume of any \(n\)-dimensional interval is non-negative. In particular, if \(C\) is a \(2\)-dimensional copula, then

  4. \(C(a_{2}, b_{2}) - C(a_{1}, b_{2}) - C(a_{2}, b_{1}) + C(a_{1}, b_{1}) \geq 0\), whenever \(a_{1} \leq a_{2}, b_{1} \leq b_{2}\).

Property 1 says that if the realizations of $n - 1$ variables are known each with marginal probability one, then the joint probability of the $n$ outcomes is the same as the probability of the remaining uncertain outcome. Property 2 (sometimes referred to as the grounded property of a copula) says that the joint probability of all outcomes is zero if the marginal probability of any outcome is zero. An $n$-copula may be viewed, or equivalently defined as an $n$-dimensional cumulative probability distribution function whose support is contained in \([0, 1]^{n}\) and whose one-dimensional margins are uniform on \([0, 1]\)2. In other words, an $n$-copula is an \(n\)-dimensional distribution function with all \(n\) univariate margins being \(U(0, 1)\)3. Every distributon function has at least one copula function, uniquely defined on the image set \((F_{1}(y_{1}), \dots , F_{n}(y_{n}))\) of the continuity points \((x_{1}, \dots , y_{n})\). Denote

\[F_{i}(y_{i}) = \mathbb{P}\{Y_{i} \leq y_{i}\}, \quad 1 \leq i \leq n\]

as the marginal function. If all the marginal functions are continuous, then the copula function is unique.

Standard distributions, such as the multivariate normal or Student \(t\) distribution, often require that all univariate and multivariate marginal distributions are of the same type and only allow for highly symmetric dependence structures, which are rarely satisfied in real-world applications. For large multivariate data sets, they are too inflexible to adequately describe the multivariate tail behavior. Using Sklar’s theorem stated below, flexible multivariate distributions can be constructed from \(n\)-dimensional copulas.

Sklar’s Theorem

Sklar’s basic result is now the following:

Theorem 1. If \(H\) is an $n$-dimensional probability distribution function with one-dimensional margins \(F_{1}, \dots , F_{n}\), then there exists an \(n\)-dimensional copula \(C\) such that, for all \(x_{1}, \dots , x_{n} \in \mathbb{R}\), \(H(x_{1}, \dots , x_{n}) = C(F_{1}(x_{1}), \dots , F_{n}(x_{n})).\)

Moreover, letting \(u_{i} = F_{i}(x_{i}), i = 1, \dots , n\) yields:

\[C(u_{1}, \dots , u_{n}) = H(F^{-1}(u_{1}), \dots , F_{n}^{-1}(u_{n})),\]

where, for any one-dimensional distribution function \(F\),

\[F^{-1}(t) = \sup \{ x \mid F(x) < t \},\]

or equivalently,

\[F^{-1}(t) = \inf \{ x \mid F(x) \geq t \}\]

is the quasi-inverse, or the “quantile transform”. In practice, econometricians often know a great deal about marginal distributions of individual variables but little about their joint behavior. These results show that much of the study of joint distribution functions can be reduced to the study of copulas!

To prove Sklar’s theorem, we shall introduce the notion of “distributional transform”.

Definition 2. Let \(Y\) be a real random variable with distribution function \(F\) and let \(V\) be a random variable independent of \(Y\), such that \(V \sim U(0, 1)\). The modified distribution function \(F(x, \lambda)\) is defined by

\[F(x, \lambda) := \mathbb{P}\{X < x\} + \lambda \mathbb{P}(Y = x).\]

We call \(U := F(Y, V)\) the (generalized) distributional transform of \(Y\).

Proof of Theorem 1. Let \(X = (X_{1}, \dots , X_{n})\) be a random vector on a probability space with distribution function \(F\) and let \(V \sim U(0, 1)\) be independent of \(X\). Considering the distributional transforms \(U_{i} := F_{i}(X_{i}, V), 1 \leq i \leq n\), we have \(U_{i} \overset{d}{=} U(0, 1)\) and \(X_{i} = F_{i}^{-1}(U_{i})\) a.s., \(1 \leq i \leq n\). Thus, by defining \(C\) to be the distribution function of \(U = (U_{1}, \dots , U_{n})\), we obtain:

\[\begin{align} F(x) &= \mathbb{P}\{X \leq x\} \\ &= \mathbb{P}\{F_{i}^{-1}(U_{i}) \leq x_{i}, 1 \leq i \leq n\} \\ &= \mathbb{P}\{U_{i} \leq F_{i}(x_{i}), 1 \leq i \leq n\} \\ &= C(F_{1}(x_{1}), \dots , F_{n}(x_{n})), \end{align}\]

i.e., \(C\) is a copula of \(F\).

As an application of Sklar’s theorem, copula enables one to decompose the joint PDF \(h\) into the product of the marginal densities and the copula density, denoted as \(c\):

\[h(x_{1}, \dots , x_{n}) = c(F_{1}(x_{1}), \dots , F_{n}(x_{n})) \prod\limits_{i = 1}^{n} f_{i}(x_{i}),\]

where \(f_{i}\) for \(i = 1, \dots , n\) are the marginal PDFs of the joint CDF \(H\). When the variables are dependent, the copula density will differ from unity (in the case of independent random variables) and can be thought of as reweighting the product of the marginal densities to produce a joint density for dependent random variables4. The copula is sometimes called the dependence function, as it completely describes the dependence between two random variables: Any measure of dependence that is scale invariant (i.e., is not affected by strictly increasing transformations of the underlying variables) can be expressed as a function of the copula alone. Such dependence measures include:

For more details, please refer to Mari and Kotz (2001)5. Pearson’s linear correlation coefficient cannot be expressed in terms of the copula alone; it also depends on the marginal distributions. Sklar’s theorem can also be extended to apply to conditional distributions, which is useful for forecasting and time series applications6. The complication that arises when applying Sklar’s theorem to conditional distributions is that the information set used for the margins and the copula must be the same; if different information sets are used, then the resulting function \(H_{t}\) can no longer be interpreted as a multivariate conditional distribution.

Copula Examples

Some common bivariate copulas

From Sklar’s theorem for \(n = 2\), the conditional density \(f_{1 \mid 2}\) and distribution function \(F_{1 \mid 2}\) can be expressed as:

\[\begin{align} f_{1|2}(x_{1}|x_{2}) &= c_{12} \big( F_{1}(x_{1}), F_{2}(x_{2}) \big) \cdot f_{2}(x_{2}), \\ F_{1|2}(x_{1}|x_{2}) &= \frac{\partial}{\partial F_{2}(x_{2})} C_{12} \big( F_{1}(x_{1}), F_{2}(x_{2}) \big) = \frac{\partial}{\partial v} C_{12} \big( F_{1}(x_{1}), v \big) \mid_{v = F_{2}(x_{2})}. \end{align}\]

Archimedean copulas

An important particular case of copulas are Archimedean copulas. If \(\varphi\) (called a generator) is a convex decreasing function with a positive second derivative and $\varphi(1) = 0$, and such that

\[\varphi : (0, 1] \rightarrow [0, +\infty).\]

One can then define an inverse (or quasi-inverse if \(\varphi(0) < \infty\)) by:

\[\phi^{-1}(z) = \begin{cases} \varphi^{-1}(z), & 0 \leq z \leq \varphi(0),\\ 0, & \varphi(0) < z < +\infty. \end{cases}\]

An Archimedean copula is then defined as:

\[C(u_{1}, u_{2}) = \varphi^{-1}[\varphi(u_{1}) + \varphi(u_{2})].\]

Archimedean copulas are symmetric in the sense that \(C(u_{1}, u_{2}) = C(u_{2}, u_{1})\) and associative in the sense that \(C(C(u_{1}, u_{2}), u_{3}) = C(u_{1}, C(u_{2}, u_{3}))\). The density of the bivariate Archimedean copula is

\[c(u_{1}, u_{2}) = \frac{\varphi''(C(u_{1}, u_{2}))\varphi^{\prime}(u_{1})\varphi^{\prime}(u_{2})}{[\varphi^{\prime}(C(u_{1}, u_{2}))]^{3}},\]

where the derivatives do not exist on the boundary \(\varphi(u_{1}) + \varphi(u_{2}) = \varphi(0)\)9. The conditional density of the Archimedean copula is

\[\frac{\partial}{\partial u_{2}}C(u_{1}, u_{2}) = \frac{\varphi^{\prime}(u_{2})}{ \varphi^{\prime}\left( C\left( u_{1}, u_{2} \right) \right) }.\]

Quantifying dependence is relatively straightforward for Archimedean copulas because Kendall’s tau simplifies to a function of the generator function:

\[\tau = 1 + 4\int_{0}^{1}\frac{\varphi(t)}{\varphi^{\prime}(t)}\,\mathrm{d}t.\]

Thus, we can estimate the dependence parameter in Archimedean copulas knowing Kendall’s tau and the generator function.

The Gumbel-Hougaard family (also known as bivariate logistic extreme value distribution) \(\varphi(z) = \left( -\log(z) \right)^{1/\theta}, 0 < \theta < 1\) is obtained by the Laplace transform of a positive stable distribution \(\varphi^{-1}(z) = e^{-z^{\theta}}\) The Gumbel copula is written as:

\[C(u_{1}, u_{2}) = e^{\{-[(-\log(u_{1}))^{1/\theta} + (-\log(u_{2}))^{1/\theta}]^{\theta}\}}.\]

Theorem 2. A copula \(C\) is Archimedean if and only if there exists a mapping \(f : (0, 1) \rightarrow (0, \infty)\) such that:

\[\frac{\frac{\partial C(u, v)}{\partial u}}{\frac{\partial C(u, v)}{\partial v}} = \frac{f(u)}{f(v)} \quad \forall (u, v); 0 < u, v < 1.\]

The function \(\varphi\) is given (up to a constant) by:

\[\varphi(z) = \int_{z}^{1} f(u)\,\mathrm{d}u.\]

Making use of this theorem, we can verify that FGM family is not Archimedean.

Proposition 1.10 Let \((U, V)\) be a random vector from an Archimedean copula, with generator \(\varphi\). Set \(W = \frac{\varphi(U)}{\varphi(U) + \varphi(V)}\) and \(Z = C(U, V)\). Then

  1. \(W\) is distributed uniformly on \((0, 1)\);

  2. \(Z\) is distributed as \(K(z) = z - \lambda(z)\), where \(\lambda(z) = \frac{\varphi(z)}{\varphi^{\prime}(z)}\);

  3. \(Z\) and \(W\) are independent.

Proof of Proposition 1. Assume that \(C\) is absolutely continuous, and let \(g(w, z)\) be the density of \((U, V)\) and

\[G(z, w) = \mathbb{P}\{Z \leq z, W \leq w\},\]

then:

\[G(z, w) = \int_{0}^{z} \int_{0}^{w} g(w, z)\,\mathrm{d}w\,\mathrm{d}z = \int\int c(u, v) |\frac{\partial(u, v)}{\partial(z, w)}|\,\mathrm{d}u\,\mathrm{d}v\]

where \(c(u, v) = \frac{\partial^{2}z}{\partial u \partial v}\) is the density of the copula and \(\frac{\partial(u, v)}{\partial(z, w)}\) is the Jacobian of the transformation \((u, v) \rightarrow (z, w)\). Here we have \(\varphi(u) = w \cdot \varphi(z)\), \(\varphi(v) = (1 - w) \cdot \varphi(z)\), and the Jacobian equal to \(-\frac{\varphi(z)\varphi^{\prime}(z)}{\varphi^{\prime}(u)\varphi^{\prime}(v)}\), which gives us

\[\frac{\partial^{2}z}{\partial u \partial v} = \frac{-\varphi'(u)\varphi'(v)\varphi''(z)}{\varphi'(z)^{3}}.\]

Hence:

\[G(z, w) = \int_{0}^{z}\int_{0}^{w} \frac{-\varphi^{\prime}(u)\varphi^{\prime}(v)\varphi''(z)}{\varphi^{\prime}(z)^{3}} \cdot \biggl( -\frac{\varphi(z)\varphi^{\prime}(z)}{\varphi^{\prime}(u)\varphi^{\prime}(v)} \biggr)\,\mathrm{d}z\,\mathrm{d}w = w \biggl[ z - \frac{\varphi(z)}{\varphi^{\prime}(z)} \biggr]_{0}^{z} = wK(z).\]


Family \(\varphi\) Range of \(\theta\) \(-\lambda\) Kendall’s tau
Calyton \((z^{-\theta} - 1) / \theta\) \((0, \infty)\) \(z(1 - z^{\theta}) / \theta\) \(\theta / (\theta + 2)\)
Frank \(\log\biggl( \frac{1 - e^{-\theta}}{1 - e^{-\theta z}} \biggr)\) \((-\infty, \infty)\) \(\frac{1 - e^{-\theta z}}{\theta e^{-\theta z}}\log\biggl( \frac{1 - e^{-\theta}}{1 - e^{-\theta z}} \biggr)\) \(1 + 4(\int_{0}^{\theta} \frac{t}{\theta (e^{t} - 1)}\,\mathrm{d}t - 1) / \theta\)11
Gumbel \(\biggl( -\log(z) \biggr)^{\theta + 1}\) \([0, \infty)\) \(-z \log(z) / (\theta + 1)\) \(\theta / (\theta + 1)\)

Table 1: Examples of Families of Archimedean Copulas

The Frechet-Hoeffding Inequality and Correlation Bounds for a Copula

Given an arbitrary \(n\)-variate joint CDF \(F(y_{1}, \dots , y_{n})\) with univariate marginal CDFs \(F_{1}, \dots , F_{n}\), the joint CDF is bounded below and above by the Frechet-Hoeffding lower and upper bounds, \(F_{L}\) and \(F_{U}\), defined as:

\[\begin{align} F_{L}(y_{1}, \dots , y_{n}) &= \max \{ \sum\limits_{j = 1}^{n} F_{j} - n + 1, 0 \}, \\ F_{U}(y_{1}, \dots , y_{n}) &= \min \{ F_{1}, \dots , F_{n} \}, \end{align}\]

which satisfy:

\[\max \{ \sum\limits_{j = 1}^{n} F_{j} - n + 1, 0 \} \leq F(y_{1}, \dots , y_{n}) \leq \min \{ F_{1}, \dots , F_{n} \}.\]

The upper bound is always a CDF while the lower bound is a CDF for \(n = 2\).

The Frechet-Hoeffding inequality also applies to copulas:

\[\max \{ \sum\limits_{j = 1}^{n} F_{j} - n + 1, 0 \} \leq C(y_{1}, \dots , y_{n}) \leq \min \{ F_{1}, \dots , F_{n} \}\]

The Frechet-Hoeffding lower bound of a bivariate copula is Archimedean. Specifically \(C_{L}(u, v)\) is generated by \(\varphi(z) = 1 - z, 0 \leq z \leq 1\). Knowledge of Frechet-Hoeffding bounds is important in selecting an appropriate copula. The desirable feature of a copula is that it should cover the sample space between the lower and the upper bounds and that as \(\theta\) approaches the lower (upper) bound of its permissible range, the copula approaches the Frechet-Hoeffding lower (upper) bound.

Oftentimes, on may be interested in parameters that are functionals of the bivariate CDF \(H\) with marginal CDFs \(F_{1}\) and \(F_{2}\), such as the correlation coefficient of \(X_{1}\) and \(X_{2}\). Suppose \(X_{1}\) and \(X_{2}\) have finite means \(\mu_{1}\) and \(\mu_{2}\), finite variances \(\sigma_{1}^{2}\) and \(\sigma_{2}^{2}\), and correlation coefficient \(\rho\). The covariance of \(X_{1}\) and \(X_{2}\) has an alternative expression:

\[cov(X_{1}, X_{2}) = \int \int [H(x_{1}, x_{2}) - F_{1}(x_{1})F_{2}(x_{2})]\,\mathrm{d}x_{1}\,\mathrm{d}x_{2}.\]

Suppose the marginal distributions are fixed. Then applying the Frechet-Hoeffding inequality imples sharp bounds on the \(cov(X_{1}, X_{2})\): \(\rho_{L} \leq \rho \leq \rho_{U}\), where

\[\rho_{L} = \frac{\int \int [F_{L} - F_{1}(x_{1})F_{2}(x_{2}]\,\mathrm{d}x_{1}\,\mathrm{d}x_{2}}{\sigma_{1}\sigma_{2}},\] \[\rho_{U} = \frac{\int \int [F_{U} - F_{1}(x_{1})F_{2}(x_{2}]\,\mathrm{d}x_{1}\,\mathrm{d}x_{2}}{\sigma_{1}\sigma_{2}}.\]

Li’s Paper on Default Correlation

The original paper by David X. Li can be retrieved through this link12.

Let’s introduce a random variable \(T\) representing the time-until-default, or simply survival time, for a security \(A\), to denote the length of time before the occurrence of default. Let \(F(t)\) denote the distribution function of \(T\):

\[F(t) = \mathbb{P}\{T \leq t\}, \quad t \geq 0,\]

and set the survival function:

\[S(t) = 1 - F(t) = \mathbb{P}\{T > t\}, \quad t \geq 0.\]

Assume that \(F(0) = 0\), which implies that \(S(0) = 1\). The probability density function \(f(t)\) can thus be written as:

\[f(t) = F^{\prime}(t) = -S^{\prime}(t) = \lim\limits_{\Delta \rightarrow 0^{+}} \frac{\mathbb{P}\{t \leq T < t + \Delta\}}{\Delta}.\]

We also have the marginal default probability \(q_{x} = \mathbb{P}\{ T - x \leq 1 \mid T > x \}\), which represents the probability of default in the next year conditional on the survival until the beginning of the year. A credit curve is then simply defined as the sequence of \(q_{0}, q_{1}, \dots , q_{n}\) in discrete models.

The default correlation of two entities \(A\) and \(B\) can then be defined with respect to their survival times \(T_{A}\) and \(T_{B}\):

\[\rho_{AB} = \frac{cov(T_{A}, T_{B})}{\sqrt{var(T_{A}) \cdot var(T_{B})}}.\]

To introduce a correlation structure into an \(n\) credit portfolio, we must determine how to specify a joint distribution of survival times with given marginal distributions.

Suppose the one-year default probabilities for securities \(A\) and \(B\) are \(q_{A}\) and \(q_{B}\). Obtain \(Z_{A}\) and \(Z_{B}\) such that \(q_{A} = \mathbb{P}\{Z < Z_{A}\}\), \(q_{B} = \mathbb{P}\{Z < Z_{B}\}\), where \(Z\) is a standard normal random variable. If \(\rho\) is the asset correlation, the joint default probability for credit \(A\) and \(B\) is calculated as

\[\mathbb{P}\{Z < Z_{A}, Z < Z_{B}\} = \int_{-\infty}^{Z_{A}} \int_{-\infty}^{Z_{B}} \varphi_{2}(x, y \mid \rho)\,\mathrm{d}x\,\mathrm{d}y = \varPhi_{2}(Z_{A}, Z_{B}, \rho),\]

where \(\varphi_{2}(x, y \mid \rho)\) is the standard bivariate normal density function with a correlation coefficient \(\rho\), and \(\varPhi_{2}\) is the bivariate accumulative normal distribution function.

If we use a bivariate normal copula function with a correlation parameter \(\gamma\), the joint default probability can be calculated as follows:

\[\mathbb{P}\{T_{A} < 1, T_{B} < 1\} = \varPhi_{2} \left( \varPhi^{-1}\left(F_{A}(1)\right), \varPhi^{-1}\left(F_{B}(1)\right), \gamma \right),\]

where \(F_{A}\) and \(F_{B}\) are the distribution functions for the survival times respectively. Note that for \(i = A, B\):

\[q_{i} = \mathbb{P}\{T_{i} < 1\},\]

and

\[Z_{i} = \varPhi^{-1}(q_{i}).\]

If \(\rho = \gamma\), the above two methods give the same joint default probability.

References

  1. Abe Sklar, Random Variables, Distribution Functions, and Copulas, Kybernetika, Vol. 9, No. 6 (1973), pp. 449-460. 

  2. Berthold Schweizer, Thirty Years of Copulas, In: G. Dall’ Aglio, S. Kotz, and G. Salinetti (eds.), Advances in Probability Distributions with Given Marginals: Beyond the Copulas, The Netherlands: Kluwer Academic Publishers. 

  3. Pravin K. Trivedi and David M. Zimmer, Copula Modeling: An Introduction for Practitioners Foundations and Trends in Econometrics, Vol. 1, No. 1 (2005) pp. 1-111. 

  4. Yanqin Fan and Andrew J. Patton, Copulas in Econometrics, Annu. Rev. Econ. 2014. 6:179-200. 

  5. Dominique Drouet Mari and Samuel Kotz, Correlation and Dependence, Imperial College Press, 2001. 

  6. Andrew J. Patton, Modelling Asymmetric Exchange Rate Dependence, International Economic Review, Vol. 47, No. 2 (2006), pp. 527-556. 

  7. James E. Prieger, A Flexible Parametric Selection Model for Non-Normal Data with Application to Health Care Usage, Journal of Applied Econometrics, Vol. 17, No. 4 (2002), pp. 367-392. In this article, FGM copula is chosen for sample selection in that “the marginal distributions in he FGM system may be of any form, as long as their cdfs are absolutely continuous.” 

  8. Lung-Fei Lee, Generalized Econometric Models with Selectivity, Econometrica, Vol. 51, No. 2 (1983), pp. 507-512. 

  9. Christian Genest and Jock MacKay, The Joy of Copulas: Bivariate Distributions with Uniform Marginals, The American Statistician, Vol. 40, No. 4 (1986), pp. 280-283. 

  10. Christian Genest and Louis-Paul Rivest, Statistical Inference Procedures for Bivariate Archimedean Copulas, Journal of the American Statistical Association, Vol. 88, No. 423 (1993), pp. 1034-1043. 

  11. Mohamed N. Jouini and Robert T. Clemen, Copula Models for Aggregating Expert Opinions, Operations Research, Vol. 44, No. 3 (1996), pp. 444-457. 

  12. David X. Li, On Default Correlation: A Copula Function Approach, Social Science Research Network Working Paper Series, December 1999.