Copulas let us model the dependence between variables separately from their individual (marginal) behavior, which is useful in finance, where nonlinear and complex relationships are common. Consider the race times (in minutes) of Alice and Bob over five races:
| Race | Alice’s Time (X1) | Bob’s Time (X2) |
| 1 | 25 | 26 |
| 2 | 23 | 24 |
| 3 | 22 | 22 |
| 4 | 27 | 28 |
| 5 | 26 | 25 |
There is an apparent dependency between their times: when one runs faster, the other usually does too.
To use a copula, we first transform each runner’s race times into approximately uniform variables using their empirical cumulative distribution functions (CDFs), where each time is mapped to its rank divided by the number of races. The rank-based CDFs for Alice (F1) and Bob (F2) are as follows:
| Time xi (mins) | F1(xi) for Alice | F2(xi) for Bob |
| 22 | 0.2 | 0.2 |
| 23 | 0.4 | - |
| 24 | - | 0.4 |
| 25 | 0.6 | 0.6 |
| 26 | 0.8 | 0.8 |
| 27 | 1.0 | - |
| 28 | - | 1.0 |
These CDFs transform the times into uniform variables, U1 and U2. For example, in Race 3, where both Alice and Bob ran in 22 minutes, their corresponding uniform values are U1 = F1(22) = 0.2 and U2 = F2(22) = 0.2.
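The rank transform can be sketched in a few lines of Python. This is a minimal illustration rather than a general implementation: the helper name `empirical_cdf_values` is hypothetical, and it assumes no runner repeats a time (ties would need an explicit ranking rule).

```python
# Race times from the race table (minutes), in race order 1..5.
alice = [25, 23, 22, 27, 26]
bob = [26, 24, 22, 28, 25]

def empirical_cdf_values(times):
    """Map each time to rank / n, where rank 1 is the smallest time.

    Hypothetical helper; assumes all times are distinct (no ties).
    """
    n = len(times)
    ordered = sorted(times)
    # index() finds the position in sorted order; valid only without ties
    return [(ordered.index(t) + 1) / n for t in times]

u1 = empirical_cdf_values(alice)  # Alice's uniform values, per race
u2 = empirical_cdf_values(bob)    # Bob's uniform values, per race

for race, (a, b) in enumerate(zip(u1, u2), start=1):
    print(f"Race {race}: U1 = {a:.1f}, U2 = {b:.1f}")
```

For Race 3 this gives U1 = 0.2 and U2 = 0.2, since 22 minutes is each runner's fastest time.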
The Clayton copula, a type of Archimedean [1] copula, captures asymmetric dependence, particularly in the lower tail (where both variables take extremely low values simultaneously). Its formula is:
C(u1, u2) = (u1^(-θ) + u2^(-θ) - 1)^(-1/θ),
where u1 and u2 are values between 0 and 1, and θ (theta), with θ > 0, controls the strength of dependence: the larger θ is, the stronger the dependence.
For our example, let’s assume θ = 2. For Race 3 (U1 = 0.2, U2 = 0.2), the copula value is calculated as follows:
1. u1^(-θ) = 0.2^(-2) = 25,
2. u2^(-θ) = 0.2^(-2) = 25,
3. u1^(-θ) + u2^(-θ) - 1 = 25 + 25 - 1 = 49.
So, C(0.2, 0.2) = 49^(-1/2) = 1/7 ≈ 0.1429.
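This value is the joint probability, under the copula, that both runners finish at or below their 20th-percentile times, that is, that both run among their fastest races. It is well above the 0.2 × 0.2 = 0.04 we would expect if the two runners were independent, indicating strong dependence when they both perform well.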
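The calculation above can be checked with a short Python sketch. The function name `clayton_copula` and its input checks are illustrative choices, and the sketch only covers the θ > 0 case used in this example.

```python
def clayton_copula(u1, u2, theta):
    """Clayton copula C(u1, u2) = (u1^-theta + u2^-theta - 1)^(-1/theta).

    Illustrative sketch: handles only theta > 0 and u1, u2 in (0, 1].
    """
    if theta <= 0:
        raise ValueError("this sketch only handles theta > 0")
    if not (0 < u1 <= 1 and 0 < u2 <= 1):
        raise ValueError("u1 and u2 must lie in (0, 1]")
    return (u1 ** -theta + u2 ** -theta - 1) ** (-1 / theta)

# Race 3: both runners at their 20th percentile, with theta = 2 assumed as in the text.
joint = clayton_copula(0.2, 0.2, theta=2)
independent = 0.2 * 0.2  # joint probability if the runners were independent

print(f"Clayton copula: {joint:.4f}")
print(f"Independence:   {independent:.4f}")
```

The first value should match the hand calculation (1/7, about 0.1429), and the second is the 0.04 baseline under independence.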
Linear correlation measures linear dependence but does not capture nonlinear relationships or dependencies in distribution tails.
For instance, if Alice and Bob tend to run either very fast or very slow together, this strong dependence in the tails would not be fully captured by linear correlation.
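As a rough illustration of the contrast, the sketch below computes the Pearson correlation of the five race times and, separately, the conditional probability P(U2 ≤ u | U1 ≤ u) = C(u, u) / u implied by a Clayton copula with θ = 2 (the value assumed in the example, not one estimated from the data). The helper names `pearson` and `clayton_tail_ratio` are hypothetical.

```python
import math

alice = [25, 23, 22, 27, 26]
bob = [26, 24, 22, 28, 25]

def pearson(x, y):
    """Sample Pearson (linear) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def clayton_tail_ratio(u, theta):
    """P(U2 <= u | U1 <= u) = C(u, u) / u under a Clayton copula, theta > 0."""
    c = (2 * u ** -theta - 1) ** (-1 / theta)
    return c / u

print(f"Pearson correlation: {pearson(alice, bob):.3f}")
for u in (0.2, 0.05, 0.01):
    ratio = clayton_tail_ratio(u, theta=2)
    print(f"u = {u}: P(U2 <= u | U1 <= u) = {ratio:.3f} (independence: {u})")
```

The correlation is a single summary number, whereas the conditional probability shows how the copula behaves specifically in the lower tail: under independence it would shrink to zero with u, while under the Clayton copula it stays close to 2^(-1/θ), about 0.707 for θ = 2.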
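[1] The term "Archimedean" refers to a family of copulas built from a single generator function φ via C(u1, u2) = φ^(-1)(φ(u1) + φ(u2)), which lets a simple function capture complex dependencies. For the Clayton copula, φ(t) = (t^(-θ) - 1)/θ.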