# Estimating the Sample Mean of Normally Distributed Data

I'm taking a reading course with Xin He in which I'm going through a few chapters of the book Bayesian Data Analysis. I've gotten to the chapter on multi-parameter models, and I thought I'd write up a few of the points on working with normally distributed data.

##### The Setup

So let's start with 50 data points, and let's assume that they're normally distributed with $\mu=10$ and $\sigma^2=5$ (we're going to pretend that we don't actually know $\mu$ or $\sigma^2$, but we need them to simulate the data):

```
mu <- 10
v <- 5
samples <- 50
y <- rnorm(n=samples,mean=mu,sd=sqrt(v))
ybar <- mean(y)
svar <- var(y)
```

##### The Prior

Let's start by analyzing the model under a "noninformative" prior distribution. The idea here is that we want a prior belief that reflects our agnosticism about the population mean and variance. The one we're going to choose is uniform on $\mu$ and (for reasons I don't entirely understand) uniform on $\log(\sigma)$, which gives:

$$p(\mu,\sigma^2)\propto (\sigma^2)^{-1}$$
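The jump from "uniform on $\log(\sigma)$" to $(\sigma^2)^{-1}$ is a change of variables. As a short sketch (the symbol $\phi$ is my own shorthand for $\log\sigma$, not notation from the book), a flat density on $\phi$ picks up a Jacobian factor when it is rewritten in terms of $\sigma^2$:

$$ p(\phi)\propto 1,\qquad \phi=\log\sigma=\tfrac{1}{2}\log\sigma^2 \quad\Longrightarrow\quad p(\sigma^2)=p(\phi)\left|\frac{d\phi}{d\sigma^2}\right| \propto \frac{1}{2\sigma^2}\propto(\sigma^2)^{-1} $$

Since the prior on $\mu$ is flat and independent of $\sigma$, multiplying the two pieces leaves the joint prior proportional to $(\sigma^2)^{-1}$.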

##### The Likelihood

Now that we've worked out the prior and the data, it's time to figure out what our likelihood function looks like. As we've stated that our variable is normally distributed, let's start with the probability density function for the normal distribution

$$ f(y,\mu,\sigma)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(y-\mu)^2}{2\sigma^2}}\propto \frac{1}{\sigma} e^{-\frac{(y-\mu)^2}{2\sigma^2}} $$

Our likelihood function is simply the product of the density at each of our $y_i$ points for $i\in\{1,\dots,n\}$. Each of the $n$ factors contributes a $\frac{1}{\sigma}$, so it looks like this:

$$ f(\vec{y},\mu,\sigma) \propto \frac{1}{\sigma^n} e^{-\frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - \mu)^2 } $$

Using some math I still don't entirely comprehend, we get to

$$ f(\vec{y},\mu,\sigma) \propto \frac{1}{\sigma^n} e^{-\frac{1}{2\sigma^2} \left[ \sum_{i=1}^n (y_i-\bar{y})^2+ n(\bar{y}-\mu)^2 \right]} $$

Which in turn equals

$$ f(\vec{y},\mu,\sigma) \propto \frac{1}{\sigma^n} e^{-\frac{1}{2\sigma^2} [(n-1)s^2+ n(\bar{y}-\mu)^2]}$$

where $s^2$ is the sample variance, $$s^2=\frac{1}{n-1} \sum_{i=1}^n (y_i -\bar{y})^2 $$
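The step that rewrites the sum of squares around $\bar{y}$ comes from adding and subtracting $\bar{y}$ inside each square:

$$ \sum_{i=1}^n (y_i-\mu)^2 = \sum_{i=1}^n (y_i-\bar{y})^2 + 2(\bar{y}-\mu)\sum_{i=1}^n (y_i-\bar{y}) + n(\bar{y}-\mu)^2 $$

The middle term is zero because the deviations from the sample mean always sum to zero. Here is a quick numerical sanity check in R, as a sketch only: the seed and the trial value `mu_try` are arbitrary choices of mine and not part of the setup above.

```r
# Check that sum((y - mu)^2) == (n - 1) * s^2 + n * (ybar - mu)^2
set.seed(1)                       # arbitrary seed (assumption), for reproducibility
n <- 50
y <- rnorm(n, mean = 10, sd = sqrt(5))
ybar <- mean(y)
s2 <- var(y)                      # var() divides by n - 1, matching s^2 above
mu_try <- 9.3                     # hypothetical candidate value for mu

lhs <- sum((y - mu_try)^2)
rhs <- (n - 1) * s2 + n * (ybar - mu_try)^2
print(all.equal(lhs, rhs))        # compares the two sides up to floating-point tolerance
```

Any other value of `mu_try` should work just as well, since the identity holds for every $\mu$.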

##### The Joint and Marginal Posterior Distributions

From Bayes' theorem we know that the posterior is proportional to the product of the prior and the likelihood. The same is true in the multiparameter case, except we call it the joint posterior $p(\mu,\sigma^2|y)$:

$$p(\mu,\sigma^2|y) \propto \sigma^{-n-2} e^{-\frac{1}{2\sigma^2} [(n-1)s^2+ n(\bar{y}-\mu)^2]}$$

In our situation, what we really care about is the marginal posterior for the mean $\mu$, or $p(\mu|y)$. To get that, we can simply integrate the joint posterior over the possible values of $\sigma^2$:

$$ p(\mu|y) = \int_0^\infty p(\mu,\sigma^2|y) d\sigma^2$$

As it turns out, you can use integration by substitution to arrive at the Student-t density

$$p(\mu|y) \propto \left[ 1+ \frac{n(\mu-\bar{y})^2}{(n-1)s^2} \right]^{-n/2} $$

(remember that the pdf for the t distribution is $\propto \left(1+ \frac{t^2}{v} \right)^{-\frac{v+1}{2}}$, so if we substitute $t=\frac{(\mu-\bar{y})}{s/\sqrt{n}}$ and $v=n-1$ we get $\frac{t^2}{v}=\frac{n(\mu-\bar{y})^2}{(n-1)s^2}$ and $\frac{v+1}{2}=\frac{n}{2}$, and we see that they're equivalent).
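One way to convince yourself of the equivalence is to evaluate both expressions on a grid of $\mu$ values and check that their ratio is constant. This is a rough sketch: the simulated data, the seed, and the grid width (four standard errors either side of $\bar{y}$) are assumptions of mine, and `dt` is R's built-in Student-t density.

```r
# Compare the marginal posterior kernel with R's Student-t density
set.seed(1)                       # arbitrary seed (assumption)
n <- 50
y <- rnorm(n, mean = 10, sd = sqrt(5))
ybar <- mean(y)
s <- sd(y)                        # sample standard deviation
se <- s / sqrt(n)                 # scale of the t distribution for mu

mu_grid <- seq(ybar - 4 * se, ybar + 4 * se, length.out = 201)
kernel <- (1 + n * (mu_grid - ybar)^2 / ((n - 1) * s^2))^(-n / 2)
t_dens <- dt((mu_grid - ybar) / se, df = n - 1)

ratio <- kernel / t_dens          # constant if the two are proportional
print(range(ratio))
```

If the two ends of the printed range agree, the kernel and the t density differ only by the normalizing constant.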

##### Drawing from the Distribution

Let's return now to our data. We can sample from our marginal posterior distribution thusly

```
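draws <- 500
# the scale is the standard error s/sqrt(n); svar is a variance, so take its square root
mudist <- ybar + (sqrt(svar)/sqrt(samples))*rt(draws, df=samples-1)
print(mean(mudist))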
```

```
[1] 9.576325
```
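Beyond the mean, the draws can also be summarized as a posterior interval and compared against the exact interval from the t quantile function. The sketch below regenerates its own data so it runs on its own. The seed, the number of draws, and the 95% level are my choices for illustration.

```r
# 95% posterior interval for mu: simulated draws vs. exact t quantiles
set.seed(42)                      # arbitrary seed (assumption)
n <- 50
y <- rnorm(n, mean = 10, sd = sqrt(5))
ybar <- mean(y)
se <- sd(y) / sqrt(n)             # s / sqrt(n)

draws <- 100000                   # many draws so the simulated quantiles are stable
mu_draws <- ybar + se * rt(draws, df = n - 1)

interval_sim <- quantile(mu_draws, probs = c(0.025, 0.975))
interval_exact <- ybar + se * qt(c(0.025, 0.975), df = n - 1)
print(rbind(simulated = interval_sim, exact = interval_exact))
```

With this many draws the two rows should agree to a couple of decimal places. With only 500 draws the simulated interval will be noticeably noisier.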