The Gaussian distribution has the nice property that under a linear transformation the resulting distribution is still Gaussian. The Kalman Filter successfully exploits this property. In a previous post I explored the details of the Extended Kalman Filter.
Today I am exploring the intuitive meaning of marginalization vs. conditioning for Gaussian distributions. It is very easy to confuse these two terms. While reading about Gaussian Processes (a common technique in robotics) I came across this excellent post from distill.pub. Most of the material in this post is borrowed from it (this is essentially a summary of it). I highly recommend reading that article, and especially looking at its interactive animations.
Note that marginalization and conditioning both operate on a joint distribution, but their effects are different.
Multivariate Gaussian Distribution Definition
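The density itself appears to have been dropped here (likely an image in the original). For reference, the standard definition of a $k$-dimensional Gaussian with mean $\mu$ and covariance $\Sigma$ is:

$$
X \sim \mathcal{N}(\mu, \Sigma), \qquad
p(x) = \frac{1}{\sqrt{(2\pi)^k \det \Sigma}} \exp\!\left(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\right)
$$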
In plain English, what this means is that we are interested in the probability distribution of X = x. For this we need to consider all possible values of Y and average over them. Mathematically this is written:
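The equation seems to be missing here; the standard marginalization integral being described is:

$$
p_X(x) = \int_y p_{X,Y}(x, y)\, dy
$$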
For jointly Gaussian X and Y, the way to achieve this is to cherry-pick the means and covariances from the joint distribution corresponding to this partition. Mathematically one would write this as follows. It is easy to confuse or overlook this step. The important point is that the resulting distribution is still a Gaussian distribution (but with reduced dimensionality).
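The formulas appear to be missing here; the partitioned form being described is:

$$
\begin{pmatrix} X \\ Y \end{pmatrix} \sim \mathcal{N}\!\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix} \right)
\quad\Rightarrow\quad
X \sim \mathcal{N}(\mu_X, \Sigma_{XX})
$$

A minimal NumPy sketch, using a made-up 3-D joint distribution where the first two dimensions are X and the last is Y, shows that marginalization is literally just slicing out the relevant blocks:

```python
import numpy as np

# Hypothetical joint Gaussian over (X, Y): X is dims 0-1, Y is dim 2.
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])

# Marginalizing out Y: keep only the rows/columns belonging to X.
mu_x = mu[:2]            # mean of the marginal p(x)
Sigma_x = Sigma[:2, :2]  # covariance of the marginal p(x)
```

No integration is actually performed; the averaging over Y is already baked into the joint parameters, which is why the marginal falls out of simple indexing.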
In plain English, conditioning asks what the distribution of one variable looks like when the other variable takes a particular value. The resulting distribution after conditioning is also Gaussian. The mean and variance of this new distribution are as follows:
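The formulas seem to be missing here; the standard expressions for conditioning X on an observed Y = y are:

$$
\mu_{X \mid Y=y} = \mu_X + \Sigma_{XY}\Sigma_{YY}^{-1}(y - \mu_Y), \qquad
\Sigma_{X \mid Y} = \Sigma_{XX} - \Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX}
$$

A scalar sketch with made-up numbers (the variable names mirror the formula blocks, not any library API):

```python
# Hypothetical 2-D joint Gaussian over (X, Y), written block-wise.
mu_x, mu_y = 0.0, 1.0
S_xx, S_xy, S_yy = 2.0, 0.8, 1.0  # blocks of the joint covariance

y_obs = 2.0  # suppose we observe Y = 2

# mu_{x|y} = mu_x + S_xy * S_yy^{-1} * (y - mu_y)
mu_cond = mu_x + S_xy / S_yy * (y_obs - mu_y)
# S_{x|y} = S_xx - S_xy * S_yy^{-1} * S_yx
var_cond = S_xx - S_xy / S_yy * S_xy
# mu_cond = 0.8, var_cond = 1.36
```

Note that the conditional variance shrinks (1.36 < 2.0): observing Y reduces our uncertainty about X, and unlike the mean, the amount of shrinkage does not depend on the observed value.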
Hopefully this post highlights this subtle difference. It is the foundation stone for a class of methods called Gaussian Processes, which are commonly used in robotics for estimation problems.