Variance and Standard Deviation

Spread - Streuung

When comparing different sets of data, one useful measure is how widely the values in each set are spread.

We use the mean $\overline{x}$ as a reference point to investigate the spread.

One possibility is the sum of the differences from the mean.

$\sum (x - \overline{x})$

However, the positive and negative differences always cancel, so this sum is always zero.
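Writing $n$ for the number of values, so that $\sum x = n\overline{x}$, the sum of the differences expands as:

$${\sum (x - \overline{x}) = \sum x - n\overline{x} = n\overline{x} - n\overline{x} = 0}$$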

Instead, one could use the sum of the absolute values of the differences, $\sum \lvert x - \overline{x} \rvert$,

or the sum of the squares of the differences, $\sum (x - \overline{x})^2$.

To make this independent of the number of values $n$, we take the mean of the latter, that is, we divide the sum of squares by $n$. The result is called the variance and is widely used in statistics.
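As a small illustration, the three candidate measures can be computed for a short list of numbers. The data set below is hypothetical and chosen only because its mean is a whole number; this is a sketch, not part of the original notes.

```python
# Hypothetical data, chosen only for illustration (mean is exactly 5).
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# Sum of the differences: positive and negative parts cancel.
sum_dev = sum(x - mean for x in data)

# Sum of the absolute differences.
sum_abs = sum(abs(x - mean) for x in data)

# Sum of the squared differences, and its mean (the variance).
sum_sq = sum((x - mean) ** 2 for x in data)
variance = sum_sq / n

print(sum_dev)   # 0.0
print(sum_abs)   # 12.0
print(sum_sq)    # 32.0
print(variance)  # 4.0
```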

Variance - Varianz

The variance and its square root, the standard deviation (s.d.), are used to measure the spread of data. In German, the standard deviation is called Standardabweichung.

$s = \sqrt{V(x)}$ in a sample

or

$\sigma = \sqrt{V(x)}$ in a population

Formulas

$${V(x) = \frac{\sum (x - \overline{x})^2}{n}}$$

$${\text{standard deviation} = \sqrt{\frac{\sum (x - \overline{x})^2}{n}}}$$
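The two formulas can be sketched in Python as follows. The function names and the data are assumptions made for this example. Both functions divide by $n$, which matches `statistics.pvariance` and `statistics.pstdev` in the Python standard library; `statistics.variance` divides by $n - 1$ instead and gives a different result.

```python
import math
import statistics


def variance(values):
    """Mean of the squared differences from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(data))            # 4.0
print(standard_deviation(data))  # 2.0

# The standard library's population functions use the same divisor n.
print(math.isclose(variance(data), statistics.pvariance(data)))
print(math.isclose(standard_deviation(data), statistics.pstdev(data)))
```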

Alternative formula for variance and s.d.

$${\begin{align} V(x) & = \frac{\sum (x - \overline{x})^2}{n} \newline & = \frac{1}{n} \sum (x - \overline{x})(x - \overline{x}) \newline & = \frac{1}{n} \sum (x^2 - 2x\overline{x} + (\overline{x})^2) \newline & = \frac{\sum x^2}{n} - \frac{\sum 2x\overline{x}}{n} + \frac{\sum (\overline{x})^2}{n} \end{align}}$$

As $\overline{x}$ is a constant, it can be taken outside the sum.

$${\frac{\sum 2x\overline{x}}{n} = 2\overline{x}\frac{\sum x}{n} = 2\overline{x}\overline{x} = 2(\overline{x})^2}$$

Also, in $\frac{\sum (\overline{x})^2}{n}$ the constant term $(\overline{x})^2$ is summed $n$ times and the result is divided by $n$. So,

$${\frac{\sum (\overline{x})^2}{n}= \frac{n(\overline{x})^2}{n}= (\overline{x})^2}$$

Now the equation for the variance becomes

$${\begin{align} V(x) & = \frac{\sum x^2}{n} - 2(\overline{x})^2 + (\overline{x})^2 \newline & = \frac{\sum x^2}{n} - (\overline{x})^2 \end{align}}$$

This form of the equation saves a lot of work when calculating the variance by hand, because the mean does not have to be subtracted from every value first.
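A quick numerical check that the definition and the alternative formula agree, as a sketch with hypothetical function names and data. For this data set $\sum x^2 = 232$, so the alternative formula gives $232/8 - 5^2 = 29 - 25 = 4$.

```python
import math


def variance_definition(values):
    """V(x) = sum((x - mean)^2) / n"""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_shortcut(values):
    """V(x) = sum(x^2) / n - mean^2"""
    n = len(values)
    mean = sum(values) / n
    return sum(x ** 2 for x in values) / n - mean ** 2


# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance_definition(data))  # 4.0
print(variance_shortcut(data))    # 4.0
print(math.isclose(variance_definition(data), variance_shortcut(data)))  # True
```

With floating-point numbers the two results can differ in the last digits, which is why the comparison uses `math.isclose` rather than `==`.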

Variance and standard deviation in a binomial distribution

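Here $n$ is the number of trials and $p$ the probability of success in a single trial (so $n$ is not the number of data values as in the formulas above).

$${\begin{align} V(x) = n \cdot p \cdot (1-p) \end{align}}$$

or, writing $q = 1 - p$,

$${\begin{align} V(x) = n \cdot p \cdot q \end{align}}$$

The standard deviation is therefore $\sigma = \sqrt{n \cdot p \cdot q}$, and the expected value (mean) is $E(x) = n \cdot p$.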
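The binomial formulas can be checked numerically by building the distribution and computing its mean and variance from the probabilities. This is a minimal sketch: the function names and the values $n = 10$ trials with success probability $p = 0.3$ are assumptions for the example, and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def binomial_mean_and_variance(n, p):
    """Mean and variance computed from the probabilities themselves."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(probs))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    return mean, variance


# Hypothetical parameters, chosen only for illustration.
n, p = 10, 0.3
mean, variance = binomial_mean_and_variance(n, p)

print(math.isclose(mean, n * p))                # True
print(math.isclose(variance, n * p * (1 - p)))  # True
```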


(c) 2019 sebastian.williams[at]sebinberlin.de