# Sampling from a Bivariate Normal Distribution

Some time ago, while while writing a (forthcomming) blog post on personas, I ran into a need generate random points on a plane, that were normally distributed around a mean.

Most math libraries, have a normal random number generator that generates random samples from a univariate normal distribution. For Java, libraries like Apache Commons Math, have such a random number generator. The task of generating random bivariate samples (i.e. 2D points on a plane, as opposed to scalar numbers on a number line), given a univariate normal random number generator, turns out to be a little more complex.

First off you might ask straight away: Why could one not simply generate random normally distributed samples for the x-coordniates, and the y-coordinates and simply combine them to create a bivariate normal distribution? The code for doing this would be quite trivial as shown below:

package m42.exp;

import org.apache.commons.math3.distribution.NormalDistribution;

public class TrivialBinormalDistribution {
public static void generateSample(int n) {
NormalDistribution nd = new NormalDistribution();
for (int i = 0; i < n; ++i) {
double x = nd.sample();
double y = nd.sample();
System.out.println(x+", "+y);
}
}

public static void main(String[] args) {
TrivialBinormalDistribution.generateSample(100);
}
}


Indeed, the code above would produce a set of points on the plane that would be binormally distributed, as shown below.

The sample has a mean x value of 0.0, a mean y value of 0.0, a standard deviation of 1.0 for x values, a standard deviation of 1.0 for y values, and a covariance between x and y values of 0.0.

As you can see, a bivariate distribution is characterized by more than just a mean and a standard deviation. As the Wikipedia article explains, the bivariate normal distribution is fully described by it's mean μ = [${\mu }_{x}$, ${\mu }_{y}$] and it's covariance matrix.

$\mathbit{\Sigma }=\left[\begin{array}{cc}{\sigma }_{x}^{2}& {\mathrm{var}}_{\mathrm{xy}}\\ {\mathrm{var}}_{\mathrm{xy}}& {\sigma }_{y}^{2}\end{array}\right]$

where ${\mu }_{x}$ is the mean of x-coordinate values, ${\mu }_{y}$ is the mean of y-coordinate values, ${\sigma }_{x}$ is the standard deviation of the x-coordinate values, ${\sigma }_{y}$ is the standard deviation of the y-coordinate values, and ${\mathrm{var}}_{\mathrm{xy}}$ is the covariance between x and y values. (Covariance ranges between -1 for a 100% negative correlation to +1 for a 100% positive correlation between x and y values.) The distribution shown above is a special case, with a mean μ = [0, 0], and covariance matrix:

$\mathbit{\Sigma }=\left[\begin{array}{cc}1& 0\\ 0& 1\end{array}\right]$

## Drawing random values from a bivariate normal distribution

Most math libraries have an implementation of either the Box-Muller transform or the Ziggurat algorithm that take a source of uniformly distributed random numbers (like the rand(3) C function,) and generate normally distributed random numbers. The Apache Commons Math library for Java has exactly such a generator: org.apache.math3.distribution.NormalDistribution.sample(). The challenge we address in this article, is that of generating a bivariate normally distributed random sample (a vector of two numbers,) given a normal random number generator such as that described above, and the nature of the bivariate distribution as specified by its mean, μ, and its covariance matrix Σ.

The steps for doing this are as follows:

1. Find a matrix A such that $\mathbf{A}{\mathbf{A}}^{T}=\mathbf{\Sigma }$. This is known as a Cholesky decomposition. It's sort of, although, not quite, like taking the square-root of a square matrix. For a two-dimensional covariance matrix, this is quite simple. The Cholesky decomposition for the covariance matrix of a bivariate normal distribution is shown below:
$\left[\begin{array}{cc}{\sigma }_{x}& 0\\ \frac{{\mathrm{var}}_{\mathrm{xy}}}{{\sigma }_{x}}& \frac{\sqrt{{\sigma }_{x}^{2}{\sigma }_{y}^{2}-{\mathrm{var}}_{\mathrm{xy}}}}{{\sigma }_{x}}\end{array}\right]\left[\begin{array}{cc}{\sigma }_{x}& 0\\ \frac{{\mathrm{var}}_{\mathrm{xy}}}{{\sigma }_{x}}& \frac{\sqrt{{\sigma }_{x}^{2}{\sigma }_{y}^{2}-{\mathrm{var}}_{\mathrm{xy}}}}{{\sigma }_{x}}\end{array}{\right]}^{T}=\left[\begin{array}{cc}{\sigma }_{x}^{2}& {\mathrm{var}}_{\mathrm{xy}}\\ {\mathrm{var}}_{\mathrm{xy}}& {\sigma }_{y}^{2}\end{array}\right]$

i.e.

$\mathbf{A}=\left[\begin{array}{cc}{\sigma }_{x}& 0\\ \frac{{\mathrm{var}}_{\mathrm{xy}}}{{\sigma }_{x}}& \frac{\sqrt{{\sigma }_{x}^{2}{\sigma }_{y}^{2}-{\mathrm{var}}_{\mathrm{xy}}}}{{\sigma }_{x}}\end{array}\right]$

2. Next, draw two numbers from the normal random number generator, (one for the x value, and one for the y value.) Both numbers should be drawn from a distribution that has a mean μ=0.0 and a standard deviation σ=1.0. Let the vector formed by those numbers be z=[x,y].
3. Let v = μ+Az. This point has the characteristics desired for a point sampled from a bivariate normal distribution with mean μ and covariance matrix Σ.

Java code to do this shown below:

import org.apache.commons.math3.distribution.NormalDistribution;

public class BivariateNormalDistribution {

private static NormalDistribution snd = new NormalDistribution();

private double ux;
private double sigmax;
private double uy;
private double sigmay;
private double covxy;

public BivariateNormalDistribution(
double ux, double sigmax,
double uy, double sigmay,
double covxy) {
this.ux = ux;
this.sigmax = sigmax;
this.uy = uy;
this.sigmay = sigmay;
this.covxy = covxy;
}

public double[] sample(double xn, double yn) {
double varx = sigmax*sigmax;
double vary = sigmay*sigmay;

double x = ux + sigmax*xn;
double y = uy + ((covxy *xn) + Math.sqrt(varx*vary - covxy*covxy)*yn)/sigmax;

double[] point = new double[2];
point[0] = x;
point[1] = y;
return point;
}

public double[] sample() {
double xn = snd.sample();
double yn = snd.sample();
return sample(xn, yn);
}
}


The scatter plot below shows 500 points sampled from a bivariate normal distribution with a mean μ=[2.5, 3.4] and a covariance matrix Σ given by:

$\mathbit{\Sigma }=\left[\begin{array}{cc}{4.2}^{2}& \mathrm{0.8}\\ \mathrm{0.8}& {2.7}^{2}\end{array}\right]$