Derivation for a single-variable Gaussian distribution

Consider a generative model: let $P$ denote the generated probability distribution, $\{x_i\}_{i=1}^n$ denote the data set, and $\mu$ and $\sigma$ denote the parameters of the generated Gaussian distribution. We will go through a maximum-likelihood process to find the parameters of the generated distribution.

Independence assumption: $x_i\perp x_j\ \forall i,j \ \mbox{s.t.} \ i\ne j,\ 1\le i,j\le n$

$$\begin{align}
P(\{x_i\}_{i=1}^n\mid \mu, \sigma) & = \prod_{i=1}^n P(x_i \mid \mu, \sigma ) \\
\mbox{[Take Log]}\quad \log P(\{x_i\}_{i=1}^n\mid \mu, \sigma) & = \sum_{i=1}^n\log{P(x_i \mid \mu, \sigma )} \\
\mbox{[Substitute]} & = \sum_{i=1}^n\log{\left(\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}\right)} \\
\mbox{[Decompose]} & = \sum_{i=1}^n\left(-\log{(\sqrt{2\pi}\sigma)}-\frac{(x_i-\mu)^2}{2\sigma^2}\right)
\end{align}$$

$$
\therefore \mathop{\mbox{argmax}}_{\mu,\sigma}{P(\{x_i\}_{i=1}^n\mid \mu, \sigma)} = \mathop{\mbox{argmin}}_{\mu,\sigma}{\sum_{i=1}^n\left(\log{(\sqrt{2\pi}\sigma)}+\frac{(x_i-\mu)^2}{2\sigma^2}\right)}
$$

$$
\begin{align}
&\mbox{Let } L = \sum_{i=1}^n\left(\log{(\sqrt{2\pi}\sigma)}+\frac{(x_i-\mu)^2}{2\sigma^2}\right) \\
\therefore \ & \frac{\partial L}{\partial \mu} = -\sum_{i=1}^n{\frac{1}{\sigma^2}(x_i-\mu)} \\
&\frac{\partial L}{\partial \sigma} = \sum_{i=1}^n{\left(\frac{1}{\sigma}-\frac{(x_i-\mu)^2}{\sigma^3}\right)} \\
\\
& \mbox{Set partial derivatives equal to zero} \\
\therefore \ & \hat{\mu} = \frac{1}{n}\sum_{i=1}^n{x_i} \\
& \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n{(x_i-\hat{\mu})^2}
\end{align}
$$
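The closed-form estimates above can be checked numerically. This is a minimal sketch using numpy (the synthetic data and parameter values are illustrative, not from the source):

```python
import numpy as np

# Maximum-likelihood estimates for a single-variable Gaussian:
# mu_hat is the sample mean, sigma_hat^2 the (1/n) sample variance,
# exactly as obtained by setting the partial derivatives to zero.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)  # synthetic data set {x_i}

mu_hat = x.sum() / len(x)                          # (1/n) * sum(x_i)
sigma2_hat = ((x - mu_hat) ** 2).sum() / len(x)    # (1/n) * sum((x_i - mu_hat)^2)

print(mu_hat, sigma2_hat)  # should be close to the true mu = 2.0, sigma^2 = 2.25
```

With enough samples the estimates approach the true $\mu$ and $\sigma^2$ used to generate the data.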

Result for a multi-variable Gaussian distribution

Multi-variate Gaussian Distribution Probability Function

$$
g(\boldsymbol{x}) = \frac{1}{(2\pi)^{D/2}|\Sigma|^{1/2}}\exp\left\{ -\frac{1}{2} (\boldsymbol{x}- \boldsymbol{\mu})^\text{T}\Sigma^{-1}(\boldsymbol{x}- \boldsymbol{\mu})\right\}
$$

Result of the EM algorithm

Gaussian Mixture

$$
p(\boldsymbol{x}) = \sum_{k=1}^K {w_k g_k(\boldsymbol{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}
$$

Latent Variable

$$
z_k^i = \frac{w_k\, g_k( \boldsymbol{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) }{ \sum_{l=1}^K{w_l\, g_l( \boldsymbol{x}_i \mid \boldsymbol{\mu}_l, \boldsymbol{\Sigma}_l)}}
$$

Result

$$
\begin{align}
\hat{ \boldsymbol{\mu}}_k & = \frac{1}{z_k}\sum_{i=1}^n{z_k^i \boldsymbol{x}_i} \\
\hat{ \boldsymbol{\Sigma}}_k & = \frac{1}{z_k}\sum_{i=1}^n{z_k^i (\boldsymbol{x}_i-\hat{ \boldsymbol{\mu}}_k) (\boldsymbol{x}_i-\hat{ \boldsymbol{\mu}}_k)^ \text{T}} \\
z_k & = \sum_{i=1}^n{z_k^i}
\end{align}
$$
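One full EM iteration (E-step responsibilities, then the M-step updates above) can be sketched as follows. This is a minimal numpy illustration, not the lecture's implementation: the function name `em_step` and the two-cluster example are mine, and the mixture weights are re-estimated as $z_k/n$, a standard step not written out above.

```python
import numpy as np

def em_step(X, w, mus, Sigmas):
    """One EM iteration for a Gaussian mixture with K components on data X (n, D)."""
    n, D = X.shape
    K = len(w)
    # E-step: responsibilities z[i, k] = w_k g_k(x_i) / sum_l w_l g_l(x_i)
    z = np.zeros((n, K))
    for k in range(K):
        diff = X - mus[k]
        quad = np.einsum('ij,ij->i', diff @ np.linalg.inv(Sigmas[k]), diff)
        norm = (2 * np.pi) ** (D / 2) * np.sqrt(np.linalg.det(Sigmas[k]))
        z[:, k] = w[k] * np.exp(-0.5 * quad) / norm
    z /= z.sum(axis=1, keepdims=True)

    # M-step: z_k, mu_k, Sigma_k exactly as in the result above
    zk = z.sum(axis=0)                       # z_k = sum_i z_k^i
    mus = (z.T @ X) / zk[:, None]            # mu_k = (1/z_k) sum_i z_k^i x_i
    new_Sigmas = []
    for k in range(K):
        diff = X - mus[k]
        new_Sigmas.append((z[:, k, None] * diff).T @ diff / zk[k])
    return zk / n, mus, np.array(new_Sigmas)

# Hypothetical usage: two well-separated clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3, 1, (200, 2)), rng.normal(3, 1, (200, 2))])
w = np.array([0.5, 0.5])
mus = np.array([[-1.0, 0.0], [1.0, 0.0]])
Sigmas = np.stack([np.eye(2), np.eye(2)])
for _ in range(20):
    w, mus, Sigmas = em_step(X, w, mus, Sigmas)
print(w, mus)
```

After a few iterations the component means should settle near the two cluster centers, and the weights near one half each.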

Ref

Coursera Robotics
Week 1 Lectures