Marginalization in Gaussian Distribution

Moklesur Rahman
2 min read · Jul 21, 2023

Marginalization is a fundamental operation in probability and statistics: a variable is integrated out (continuous case) or summed over (discrete case) to obtain the probability distribution of the remaining variables. For Gaussian distributions this is straightforward, because the Gaussian family is closed under both marginalization and conditioning, so the result of either operation is again a Gaussian.

Let’s consider a joint Gaussian distribution of two random variables X and Y:

(X, Y) ~ N(μ, Σ)

where `μ` is the mean vector and `Σ` is the covariance matrix of the joint distribution.

To marginalize out the variable Y and obtain the marginal distribution `p(X)`, you need to integrate over all possible values of Y:

p(X) = ∫ p(X, Y) dY
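To make the integral concrete, here is a minimal sketch for the two-dimensional case (scalar X and scalar Y). It approximates the integral over Y with the trapezoidal rule and compares the result with a one-dimensional Gaussian whose mean and variance are the X entries of μ and Σ. The function names and the numbers in `mu` and `cov` are illustrative assumptions, not part of the original article.

```python
import math

def joint_pdf(x, y, mu, cov):
    """Density of a bivariate Gaussian with mean mu = (mx, my) and 2x2 covariance cov."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mu[0], y - mu[1]
    # Quadratic form (v - mu)^T cov^{-1} (v - mu), using the closed-form 2x2 inverse
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def normal_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def marginal_by_integration(x, mu, cov, half_width=10.0, steps=4000):
    """Approximate p(x) = ∫ p(x, y) dy with the trapezoidal rule on a wide y-interval."""
    sd_y = math.sqrt(cov[1][1])
    lo = mu[1] - half_width * sd_y
    hi = mu[1] + half_width * sd_y
    h = (hi - lo) / steps
    total = 0.5 * (joint_pdf(x, lo, mu, cov) + joint_pdf(x, hi, mu, cov))
    for i in range(1, steps):
        total += joint_pdf(x, lo + i * h, mu, cov)
    return total * h

# Illustrative parameters (assumed for this example)
mu = (1.0, -2.0)
cov = ((2.0, 0.8), (0.8, 1.5))

for x in (-1.0, 1.0, 2.5):
    numeric = marginal_by_integration(x, mu, cov)
    closed_form = normal_pdf(x, mu[0], cov[0][0])
    print(f"x={x:+.1f}  integral={numeric:.6f}  N(mu_X, Sigma_XX)={closed_form:.6f}")
```

The two columns should agree to many decimal places, which is the numerical counterpart of the statement that integrating Y out of a joint Gaussian leaves a Gaussian in X.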

In the case of a joint Gaussian distribution, the marginal distribution `p(X)` will also be a Gaussian distribution. The mean and covariance matrix of the marginal distribution can be obtained by “slicing” the mean vector and covariance matrix of the joint distribution appropriately.

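Let X have dimensionality d and let the joint vector (X, Y) have dimensionality p, so that Y has dimensionality p-d. Let `μ_X` and `μ_Y` be the mean vectors of X and Y, respectively, so that `μ` stacks `μ_X` on top of `μ_Y`. Similarly, let `Σ_XX` and `Σ_YY` be the covariance matrices of X and Y, and let `Σ_XY` be the cross-covariance matrix between X and Y (with `Σ_YX` its transpose). These are the four blocks of Σ: `Σ_XX` top-left, `Σ_XY` top-right, `Σ_YX` bottom-left, and `Σ_YY` bottom-right.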

Then, the mean vector of the marginal distribution `p(X)` is simply the mean vector of X:

μ_X = μ[1:d] # Extract the first d elements of μ, where d is the dimensionality of X

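The covariance matrix of the marginal distribution `p(X)` is simply the block of Σ that belongs to X:

Σ_XX = Σ[1:d, 1:d] # Extract the top-left d x d block of Σ

where p is the dimensionality of the joint vector (X, Y). The other blocks of Σ, namely the top-right d x (p-d) block `Σ[1:d, d+1:]`, the bottom-left (p-d) x d block `Σ[d+1:, 1:d]`, and the bottom-right (p-d) x (p-d) block `Σ[d+1:, d+1:]`, play no role in the marginal. They enter only when conditioning: the covariance of X given Y is the Schur complement `Σ_XX - Σ_XY Σ_YY^{-1} Σ_YX`, which should not be confused with the marginal covariance `Σ_XX`.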
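The slicing of the mean vector and covariance matrix can be sketched in plain Python. Python indices are 0-based with an exclusive end, so the article's `μ[1:d]` corresponds to `mu[:d]`. The helper name `marginalize` and the example numbers are assumptions made for illustration only.

```python
def marginalize(mu, sigma, keep):
    """Return (mean, covariance) of the marginal over the variables in `keep`.

    mu    : list of p means
    sigma : p x p covariance matrix as a list of rows
    keep  : 0-based indices of the variables to retain (the X block)
    """
    keep = list(keep)
    mu_x = [mu[i] for i in keep]
    sigma_xx = [[sigma[i][j] for j in keep] for i in keep]
    return mu_x, sigma_xx

# Illustrative joint over p = 3 variables; X is the first d = 2, Y is the last one.
mu = [1.0, -2.0, 0.5]
sigma = [
    [2.0, 0.8, 0.3],
    [0.8, 1.5, -0.4],
    [0.3, -0.4, 1.0],
]
d = 2

mu_x, sigma_xx = marginalize(mu, sigma, keep=range(d))
print(mu_x)      # [1.0, -2.0]
print(sigma_xx)  # [[2.0, 0.8], [0.8, 1.5]]
```

Because the helper takes an index list rather than a block size, it also covers the case where the variables to keep are not the first d entries. With NumPy arrays, the same block for the leading d variables would be `sigma[:d, :d]`.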

In summary, to marginalize out a variable Y from a joint Gaussian distribution, no integral has to be evaluated explicitly: you read off the sub-vector `μ_X` of μ and the sub-block `Σ_XX` of Σ that correspond to X. The resulting distribution `p(X)` is a Gaussian distribution with mean `μ_X` and covariance matrix `Σ_XX`.
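One way to sanity-check this result empirically is to sample from the joint distribution, discard the Y coordinates, and compare the sample mean and sample covariance of what remains with the X entries of μ and the top-left block of Σ. The sketch below uses only the standard library; the hand-written Cholesky factorization, the helper names, and the parameter values are assumptions made for the example.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def sample_joint(mu, L, rng):
    """Draw one sample mu + L z with z ~ N(0, I)."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mu)]

# Illustrative joint over p = 3 variables; X is the first d = 2 of them.
mu = [1.0, -2.0, 0.5]
sigma = [[2.0, 0.8, 0.3], [0.8, 1.5, -0.4], [0.3, -0.4, 1.0]]
d = 2
n = 50_000

rng = random.Random(0)
L = cholesky(sigma)
# Marginalizing by sampling: draw from the joint and simply drop the Y coordinates.
xs = [sample_joint(mu, L, rng)[:d] for _ in range(n)]

mean_hat = [sum(x[i] for x in xs) / n for i in range(d)]
cov_hat = [
    [sum((x[i] - mean_hat[i]) * (x[j] - mean_hat[j]) for x in xs) / (n - 1)
     for j in range(d)]
    for i in range(d)
]

print("sample mean:", [round(v, 2) for v in mean_hat])
print("sample cov: ", [[round(v, 2) for v in row] for row in cov_hat])
```

Up to sampling noise, the printed values should be close to the first two entries of `mu` and the top-left 2 x 2 block of `sigma`.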

Moklesur Rahman

PhD student | Computer Science | University of Milan | Data science | AI in Cardiology | Writer | Researcher