Joint distribution function

The joint distribution function extends the concept of the CDF to two variables simultaneously. It gives the probability that \(X\) and \(Y\) both fall at or below specified thresholds at the same time, and it is the foundation for computing any probability involving a pair of random variables.

Definition

The joint distribution function (joint CDF) of two random variables \(X\) and \(Y\) is:

\[ F_{X,Y}(x, y) = P(X \leq x,\ Y \leq y) \]

It gives the probability that \(X\) takes a value at most \(x\) and \(Y\) takes a value at most \(y\), simultaneously.

Properties

The joint CDF always satisfies:

  • Non-decreasing: \(F_{X,Y}(x,y)\) is non-decreasing in both \(x\) and \(y\) separately.
  • Boundary limits:
    • \(\lim_{x \to -\infty} F_{X,Y}(x,y) = 0\) for any fixed \(y\).
    • \(\lim_{y \to -\infty} F_{X,Y}(x,y) = 0\) for any fixed \(x\).
    • \(\lim_{x \to \infty,\, y \to \infty} F_{X,Y}(x,y) = 1\).
  • Right-continuous in both arguments.
  • Marginal recovery: letting one argument go to \(+\infty\) gives the marginal CDF of the other variable:

\[F_X(x) = \lim_{y \to \infty} F_{X,Y}(x,y), \qquad F_Y(y) = \lim_{x \to \infty} F_{X,Y}(x,y)\]

⚠️ The joint CDF gives corner probabilities, not rectangle probabilities

\(F_{X,Y}(x,y)\) is the probability of the lower-left quadrant \((-\infty, x] \times (-\infty, y]\). To get the probability of a rectangle, \(P(a < X \leq b,\ c < Y \leq d)\), you need the inclusion-exclusion formula:

\[P(a < X \leq b,\ c < Y \leq d) = F_{X,Y}(b,d) - F_{X,Y}(a,d) - F_{X,Y}(b,c) + F_{X,Y}(a,c)\]

This is the two-dimensional analogue of \(P(a < X \leq b) = F(b) - F(a)\).
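The difference between a corner probability and a rectangle probability can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the helper name `rectangle_prob` and the example CDF (two independent Uniform(0, 1) variables) are assumptions chosen only to keep the arithmetic easy to check by hand.

```python
def rectangle_prob(F, a, b, c, d):
    """P(a < X <= b, c < Y <= d) from a joint CDF F(x, y) by inclusion-exclusion."""
    return F(b, d) - F(a, d) - F(b, c) + F(a, c)


def uniform_joint_cdf(x, y):
    """Joint CDF of two independent Uniform(0, 1) variables (illustrative assumption)."""
    def clip(t):
        return min(max(t, 0.0), 1.0)
    return clip(x) * clip(y)


# Corner probability: the whole lower-left quadrant up to (0.5, 0.4).
corner = uniform_joint_cdf(0.5, 0.4)

# Rectangle probability: only the region (0.2, 0.5] x (0.1, 0.4].
rect = rectangle_prob(uniform_joint_cdf, 0.2, 0.5, 0.1, 0.4)

print(round(corner, 4), round(rect, 4))
```

For this uniform example the rectangle has sides 0.3 and 0.3, so its probability is 0.09, while the corner value \(F_{X,Y}(0.5, 0.4) = 0.2\) covers a much larger region.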

Discrete case

For discrete variables, the joint CDF is obtained by summing the joint PMF over all pairs \((x', y')\) with \(x' \leq x\) and \(y' \leq y\):

\[F_{X,Y}(x,y) = \sum_{x' \leq x} \sum_{y' \leq y} P(X = x',\ Y = y')\]

When \(X\) and \(Y\) take integer values, the joint PMF can be recovered from the joint CDF using the two-dimensional difference formula:

\[p_{X,Y}(x,y) = F_{X,Y}(x,y) - F_{X,Y}(x-1,y) - F_{X,Y}(x,y-1) + F_{X,Y}(x-1,y-1)\]
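As a quick check of the difference formula, the sketch below builds a joint CDF from a small PMF and then recovers every PMF entry from the CDF alone. The PMF values and the function names `joint_cdf` and `pmf_from_cdf` are hypothetical, chosen for illustration, and the support is assumed to be integer-valued so that stepping back by 1 reaches the previous support point.

```python
# Hypothetical joint PMF on the integer grid {0, 1} x {0, 1} (illustrative values).
pmf = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}


def joint_cdf(x, y):
    """F(x, y): sum the PMF over all support points (i, j) with i <= x and j <= y."""
    return sum(p for (i, j), p in pmf.items() if i <= x and j <= y)


def pmf_from_cdf(x, y):
    """Recover p(x, y) with the two-dimensional difference formula (integer support)."""
    return (joint_cdf(x, y) - joint_cdf(x - 1, y)
            - joint_cdf(x, y - 1) + joint_cdf(x - 1, y - 1))


# Compare each recovered value with the original PMF entry.
for point, p in pmf.items():
    print(point, p, round(pmf_from_cdf(*point), 10))
```

Below the smallest support point the CDF is 0, which is why the formula also works on the edge of the grid.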

Step-by-step: joint CDF from the exercise/health table

Using the joint PMF from the previous section:

         \(Y=1\)  \(Y=2\)  \(Y=3\)
\(X=0\)  0.15     0.10     0.05
\(X=1\)  0.05     0.20     0.15
\(X=2\)  0.02     0.08     0.20

Computing \(F_{X,Y}(1, 2) = P(X \leq 1,\ Y \leq 2)\):

\[F_{X,Y}(1,2) = p(0,1) + p(0,2) + p(1,1) + p(1,2) = 0.15 + 0.10 + 0.05 + 0.20 = 0.50\]

Computing the full joint CDF table:

         \(y=1\)   \(y=2\)   \(y=3\)
\(x=0\)  \(0.15\)  \(0.25\)  \(0.30\)
\(x=1\)  \(0.20\)  \(0.50\)  \(0.70\)
\(x=2\)  \(0.22\)  \(0.60\)  \(1.00\)

Each cell accumulates all the joint probabilities in the top-left rectangle up to that point.
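The accumulation can be reproduced with two passes of running sums, first along each row and then down each column. The sketch below uses only the Python standard library; the variable names are illustrative, and the PMF values are copied from the exercise/health table above.

```python
from itertools import accumulate

# Joint PMF from the table above: rows are X = 0, 1, 2; columns are Y = 1, 2, 3.
pmf = [
    [0.15, 0.10, 0.05],
    [0.05, 0.20, 0.15],
    [0.02, 0.08, 0.20],
]

# Step 1: running sums along each row (accumulate over y).
row_sums = [list(accumulate(row)) for row in pmf]

# Step 2: running sums down each column (accumulate over x), then transpose back.
cdf_columns = [list(accumulate(col)) for col in zip(*row_sums)]
cdf = [list(row) for row in zip(*cdf_columns)]

for x, row in enumerate(cdf):
    print(f"x={x}:", [f"{value:.2f}" for value in row])
```

The entry for \(x=1,\ y=2\) comes out as 0.50, agreeing with the hand computation of \(F_{X,Y}(1,2)\), and the last entry is 1.00 because it sums the entire PMF.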

Figure 1: Joint CDF as a heatmap: values accumulate from the top-left corner, reaching 1 at the bottom-right

Continuous case

For continuous variables, the joint CDF is the double integral of the joint PDF:

\[F_{X,Y}(x,y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(x', y')\, dy'\, dx'\]

The joint PDF is recovered by differentiating:

\[f_{X,Y}(x,y) = \frac{\partial^2 F_{X,Y}(x,y)}{\partial x\, \partial y}\]
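The differentiation formula can be checked numerically. The sketch below assumes, purely for illustration, that \(X\) and \(Y\) are independent standard normals, so that the joint CDF is \(\Phi(x)\Phi(y)\) and the joint PDF is \(\varphi(x)\varphi(y)\). It estimates the mixed partial derivative with a central finite difference; the helper names and the step size `h` are arbitrary choices.

```python
import math


def Phi(t):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def phi(t):
    """Standard normal PDF."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def joint_cdf(x, y):
    """Joint CDF of two independent standard normals (illustrative assumption)."""
    return Phi(x) * Phi(y)


def mixed_partial(F, x, y, h=1e-3):
    """Central finite-difference estimate of d^2 F / (dx dy) at (x, y)."""
    return (F(x + h, y + h) - F(x + h, y - h)
            - F(x - h, y + h) + F(x - h, y - h)) / (4.0 * h * h)


x, y = 0.5, -0.3
approx = mixed_partial(joint_cdf, x, y)
exact = phi(x) * phi(y)
print(approx, exact)
```

The numerator of the finite difference is the inclusion-exclusion probability of a small square around \((x, y)\), so the estimate is that probability divided by the square's area, which is what a density measures.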

Figure 2: Joint CDF of two independent standard normal variables: the surface rises from 0 at the bottom-left corner to 1 at the top-right

Rectangle probability for continuous variables

Let \(X\) and \(Y\) be independent standard normal variables. What is \(P(-1 \leq X \leq 1,\ 0 \leq Y \leq 1)\)?

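For continuous variables the boundary of the rectangle has probability zero, so \(\leq\) and \(<\) give the same result and the inclusion-exclusion formula applies. Because \(F_{X,Y}(x,y) = F_X(x) \cdot F_Y(y)\) for independent variables, its four terms factor into a product of two one-dimensional differences: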

\[P(-1 \leq X \leq 1,\ 0 \leq Y \leq 1) = [F_X(1) - F_X(-1)] \times [F_Y(1) - F_Y(0)]\]

\[= [0.8413 - 0.1587] \times [0.8413 - 0.5000] = 0.6827 \times 0.3413 \approx 0.233\]

About 23% of the joint probability lies in that rectangle.
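The same number can be reproduced without a normal table. The sketch below writes \(\Phi\) with the standard library's `math.erf` and computes the probability twice, once as a product of marginal differences and once with the four-term inclusion-exclusion formula. The variable names are illustrative.

```python
import math


def Phi(t):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def joint_cdf(x, y):
    """Joint CDF of two independent standard normals: Phi(x) * Phi(y)."""
    return Phi(x) * Phi(y)


# Product of one-dimensional differences (valid because X and Y are independent).
prob_product = (Phi(1.0) - Phi(-1.0)) * (Phi(1.0) - Phi(0.0))

# Four-term inclusion-exclusion on the joint CDF, with a = -1, b = 1, c = 0, d = 1.
prob_ie = (joint_cdf(1.0, 1.0) - joint_cdf(-1.0, 1.0)
           - joint_cdf(1.0, 0.0) + joint_cdf(-1.0, 0.0))

print(round(prob_product, 3), round(prob_ie, 3))
```

Both routes agree to floating-point precision, since the product form is just the inclusion-exclusion formula with the factorized CDF substituted in.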

Independence via the joint CDF

\(X\) and \(Y\) are independent if and only if their joint CDF factorizes:

\[F_{X,Y}(x,y) = F_X(x) \cdot F_Y(y) \quad \text{for all } x, y\]

This is equivalent to the factorization of the joint PMF or joint PDF, but expressed in terms of CDFs. The continuous example above uses this directly: since \(X\) and \(Y\) are independent normals, \(F_{X,Y}(x,y) = \Phi(x)\cdot\Phi(y)\).
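For a discrete table, the factorization criterion can be tested cell by cell. The sketch below applies it to the exercise/health PMF from the discrete example: it builds the joint CDF, reads the marginal CDFs off the last column and last row (the discrete counterpart of letting the other argument go to \(+\infty\)), and measures the largest gap between \(F_{X,Y}\) and \(F_X \cdot F_Y\). The tolerance `1e-9` and the variable names are arbitrary choices.

```python
from itertools import accumulate

# Joint PMF from the exercise/health table: rows X = 0, 1, 2; columns Y = 1, 2, 3.
pmf = [
    [0.15, 0.10, 0.05],
    [0.05, 0.20, 0.15],
    [0.02, 0.08, 0.20],
]

# Joint CDF: running sums along rows, then down columns.
row_sums = [list(accumulate(row)) for row in pmf]
cdf = [list(r) for r in zip(*[list(accumulate(c)) for c in zip(*row_sums)])]

# Marginal CDFs: push the other variable to its largest value.
F_X = [row[-1] for row in cdf]   # F_X(x) = F(x, 3)
F_Y = cdf[-1]                    # F_Y(y) = F(2, y)

# Largest deviation from the product form over all grid points.
max_gap = max(abs(cdf[i][j] - F_X[i] * F_Y[j])
              for i in range(3) for j in range(3))
independent = max_gap < 1e-9

print(independent, round(max_gap, 3))
```

Here the criterion fails: for instance \(F_{X,Y}(0,1) = 0.15\), while \(F_X(0) \cdot F_Y(1) = 0.30 \times 0.22 = 0.066\), so \(X\) and \(Y\) in that table are not independent.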