Category Archives: Statistics

Statistics is the science of collecting, analysing and drawing conclusions from a representative sample of data; it seeks to explain the correlations and dependencies of a physical phenomenon or natural occurrence, whether random or conditional

Uniform Distribution


The Uniform distribution arises when all possible values within a range are equiprobable. The probability that the random variable takes values in a subinterval is therefore proportional to the length of that subinterval

We say that a random variable has a Uniform distribution on a finite interval [a, b], denoted \xi \approx U(a, b), if its density function is:

f(x)=\begin{cases} \frac{1}{b -a},\text{ if }a\leq x\leq b \\ 0,\text{ otherwise} \end{cases}

Its distribution function is:

P\{\xi < x \} = \begin{cases} 0, \text{ if }x < a \\ \frac{x-a}{b-a}, \text{ if }a\leq x\leq b \\ 1, \text{ if } x > b \end{cases}

E(\xi) = \frac{a+b}{2}

\sigma^2(\xi) = \frac{(b-a)^2}{12}

\sigma(\xi) = (b-a)\cdot\sqrt{\frac{1}{12}}

Calculation of a Uniform
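As a minimal sketch of these formulas in plain Python, the following computes the density, the distribution function, the mean and the variance of a Uniform; the interval [2, 10] and the point x = 5 are arbitrary examples:

```python
# Sketch of Uniform(a, b) calculations; [2, 10] and x = 5 are arbitrary examples.

def uniform_pdf(x, a, b):
    """Density f(x): 1/(b - a) on [a, b], 0 otherwise."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """P(xi < x): 0 below a, (x - a)/(b - a) on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2, 10
mean = (a + b) / 2              # E(xi) = (a + b)/2
variance = (b - a) ** 2 / 12    # sigma^2(xi) = (b - a)^2 / 12
print(uniform_pdf(5, a, b), uniform_cdf(5, a, b), mean, variance)
```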



Poisson Distribution


The Poisson distribution is that of a discrete r.v. \xi which counts the number of times an event occurs in a time or space interval, and is denoted as:

\xi \approx P(\lambda)

Its probability function is:

P(\xi = k ) = e^{-\lambda} \cdot \frac{\lambda^k}{k!}, k \in \{0, 1, 2, \cdots\}

E(\xi) = \lambda

\sigma^2(\xi) = \lambda

\sigma(\xi) = +\sqrt{\lambda}

Properties

  1. If \xi_1 \approx P(\lambda_1) and \xi_2 \approx P(\lambda_2) are independent r.v., then \xi = \xi_1 + \xi_2 \approx P(\lambda) with \lambda = \lambda_1 + \lambda_2
  2. If \xi_1, \cdots, \xi_r are independent r.v. with \xi_i \approx P(\lambda_i), then \xi = \xi_1 + \cdots + \xi_r \approx P(\lambda) with \lambda = \lambda_1 + \cdots + \lambda_r
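The additivity property above can be checked empirically with a small Monte Carlo sketch; the lambda values (2 and 3) and the sample count are arbitrary choices:

```python
import math
import random

# Monte Carlo sketch of the additivity property: the sum of independent
# Poisson(2) and Poisson(3) samples should behave like Poisson(5),
# so both its mean and its variance should be close to 5.
random.seed(1)

def poisson_sample(lam):
    """Draw one Poisson(lam) value (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

sums = [poisson_sample(2) + poisson_sample(3) for _ in range(20000)]
mean = sum(sums) / len(sums)
var = sum((s - mean) ** 2 for s in sums) / len(sums)
print(mean, var)   # both should be close to lambda = 5
```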

Approximation of the Binomial to the Poisson

Let \xi \approx B(n, p)

If \exists \lim\limits_{n\to\infty, p\to 0}n\cdot p = \lambda \Rightarrow \lim\limits_{n\to\infty, p\to 0}P(\xi = k ) = e^{-\lambda} \cdot \frac{\lambda^k}{k!}

with k \in \{0, \cdots, n\}

That is, a Binomial in which the number of Bernoulli trials is large (n tends to infinity) and the probability of success in each trial is small (p tends to 0) is approximately a Poisson with parameter \lambda=n\cdot p

It is considered a good approximation when n \geq 50 and p \leq 0.1
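A quick sketch of this approximation, comparing the exact binomial probabilities with the Poisson ones; n = 100 and p = 0.02 are arbitrary values satisfying n \geq 50 and p \leq 0.1:

```python
import math

# Compare B(n, p) with its Poisson(lambda = n*p) approximation.
# n = 100, p = 0.02 are arbitrary values with n >= 50 and p <= 0.1.

def binom_pmf(k, n, p):
    """Exact binomial probability P(B = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability P(xi = k)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 100, 0.02
lam = n * p   # lambda = 2
for k in range(5):
    print(k, round(binom_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```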

Calculation of a Poisson
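A minimal sketch of a Poisson calculation with the probability function above; \lambda = 3 and the query P(\xi \leq 2) are arbitrary examples:

```python
import math

# Sketch of Poisson(lambda) calculations; lambda = 3 and P(xi <= 2) are
# arbitrary examples.

def poisson_pmf(k, lam):
    """P(xi = k) = e^(-lam) * lam^k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3
p_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))   # P(xi <= 2)
print(p_at_most_2)
# E(xi) = sigma^2(xi) = lam = 3, sigma(xi) = sqrt(3)
```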



Normal distribution or Gaussian


The normal distribution, Gauss distribution or Gaussian distribution, is that of a continuous r.v. whose density is the Gaussian bell curve; probabilities correspond to areas under that bell

Its probability function is:

P(\xi > z) = \frac{1}{\sqrt{2 \cdot \pi}\cdot \sigma}\int^{+\infty}_z e^m dx with m =-\frac{(x - \mu)^2}{2 \cdot \sigma^2}

E(\xi) = \mu

\sigma^2(\xi) = \sigma^2

\sigma(\xi) = \sigma

Properties of Normal

  1. It is symmetric with respect to the axis x = \mu, so P(\xi > \mu) = P(\xi < \mu) = \frac{1}{2}
  2. When x \rightarrow \pm\infty we have a horizontal asymptote y = 0
  3. It has points of inflection at x = \mu \pm \sigma
  4. Any v.a. built as a linear combination of normal v.a. also follows a normal distribution

Calculation of a Normal
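A minimal sketch of a Normal calculation in plain Python, using the error function from the standard library to evaluate the distribution function; \mu = 5 and \sigma = 2 are arbitrary example parameters:

```python
import math

# Sketch of N(mu, sigma) calculations; mu = 5, sigma = 2 are arbitrary.

def normal_pdf(x, mu, sigma):
    """Gaussian density."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

def normal_cdf(x, mu, sigma):
    """P(xi <= x) = (1/2) * (1 + erf((x - mu) / (sigma * sqrt(2))))."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 5, 2
print(normal_cdf(mu, mu, sigma))   # 0.5 by symmetry
# probability of falling within one standard deviation of the mean (~0.6827)
print(normal_cdf(mu + sigma, mu, sigma) - normal_cdf(mu - sigma, mu, sigma))
```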



Standardized Normal

Let \xi be a r.v.; we call its standardization the r.v. Z defined by:

Z = \frac{\xi - \mu_\xi}{\sigma_\xi}

If we standardize a normal r.v. we have:

If \xi \approx N(\mu, \sigma) \Rightarrow Z = \frac{\xi - \mu}{\sigma} \approx N(0, 1)

The probability function of the standardized normal is:

P(Z > z) = \frac{1}{\sqrt{2 \cdot \pi}}\int^{+\infty}_z e^m dx with m = -\frac{x^2}{2}, \forall x \in \mathbb{R}

E(Z) = 0

\sigma^2(Z) = 1

\sigma(Z) = 1

Properties of the standardized Normal

  1. It is symmetric with respect to the axis x = 0, so P(Z > 0) = P(Z < 0) = \frac{1}{2}
  2. When x \rightarrow \pm\infty we have a horizontal asymptote y = 0
  3. It has points of inflection at x = \pm 1

Calculation of a standardized Normal
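A sketch of standardization in practice: converting a query about \xi \approx N(\mu, \sigma) into a query about Z \approx N(0, 1). Here \mu = 100, \sigma = 15 and the query P(\xi > 115) are arbitrary examples:

```python
import math

# Standardizing xi ~ N(mu, sigma) to Z ~ N(0, 1).
# mu = 100, sigma = 15 and the query P(xi > 115) are arbitrary examples.

def phi(z):
    """Standard normal distribution function P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 100, 15
x = 115
z = (x - mu) / sigma        # standardization: z = (x - mu)/sigma = 1.0
p = 1 - phi(z)              # P(xi > 115) = P(Z > 1)
print(z, p)
```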



Notes for Normal

Let \xi_1, \cdots, \xi_n be r.v. with parameters (\mu_i, \sigma_i), \forall i \in \{1, \cdots, n\}, and let \xi = a_0 + a_1 \cdot \xi_1 + \cdots + a_n \cdot \xi_n

  1. E[\xi] = a_0 + a_1 \cdot \mu_1 + \cdots + a_n \cdot \mu_n
  2. \sigma^2[\xi] = a_1^2 \cdot \sigma_1^2 + \cdots + a_n^2 \cdot \sigma_n^2 + 2 a_1 a_2 \cdot Cov(\xi_1, \xi_2) + \cdots
  3. If the \xi_i are independent or merely uncorrelated, \sigma^2[\xi] = a_1^2 \cdot \sigma_1^2 + \cdots + a_n^2 \cdot \sigma_n^2
  4. If, in addition, \xi_i \approx N(\mu_i, \sigma_i), \forall i \in \{1, \cdots, n\} then:
    \begin{cases} \xi \approx N(\mu, \sigma)\text{ with }\mu = a_0 + a_1 \cdot \mu_1 + \cdots + a_n \cdot \mu_n \\ \sigma^2 = a_1^2 \cdot \sigma_1^2 + \cdots + a_n^2 \cdot \sigma_n^2 + 2 a_1 a_2 \cdot Cov(\xi_1, \xi_2) + \cdots \end{cases}
  5. If, in addition, \xi_i \approx N(\mu_i, \sigma_i), \forall i \in \{1, \cdots, n\} and independent then:
    \begin{cases}\mu = a_0 + a_1 \cdot \mu_1 + \cdots + a_n \cdot \mu_n \\ \sigma^2 = a_1^2 \cdot \sigma_1^2 + \cdots + a_n \cdot \sigma_n^2 \end{cases}

Let \xi_S = \xi_1 + \cdots + \xi_n

  1. E[\xi_S] = \mu_1 + \cdots + \mu_n
  2. \sigma^2[\xi_S] = \sigma_1^2 + \cdots + \sigma_n^2 + 2 \cdot Cov(\xi_1, \xi_2) + \cdots
  3. If the \xi_i are independent, \sigma^2[\xi_S] = \sigma_1^2 + \cdots + \sigma_n^2
  4. If, in addition, \xi_i \approx N(\mu_i, \sigma_i), \forall i \in \{1, \cdots, n\} then:
    \begin{cases} \xi_S \approx N(\mu, \sigma)\text{ with }\mu = \mu_1 + \cdots + \mu_n \\ \sigma^2 = \sigma_1^2 + \cdots + \sigma_n^2 + 2 \cdot Cov(\xi_1, \xi_2) + \cdots \end{cases}
  5. If, in addition, \xi_i \approx N(\mu_i, \sigma_i), \forall i \in \{1, \cdots, n\} and independent then:
    \begin{cases} \mu = \mu_1 + \cdots + \mu_n \\ \sigma^2 = \sigma_1^2 + \cdots + \sigma_n^2\end{cases}

Let \xi_1, \cdots, \xi_n be i.i.d. r.v. with parameters (\mu, \sigma); then \xi_S = \xi_1 + \cdots + \xi_n has parameters (n \cdot \mu, \sqrt{n} \cdot \sigma)

  1. E[\xi_S] = \overbrace{\mu + \cdots + \mu}^{n\;\rm times} = n \cdot \mu
  2. \sigma^2[\xi_S] = \overbrace{\sigma^2 + \cdots + \sigma^2}^{n\;\rm times} = n \cdot \sigma^2
  3. If the \xi_i are i.i.d. and normal:
    \xi_S \approx N(n \cdot \mu, \sqrt{n} \cdot \sigma)

Approximations

Approximation of the Binomial to the Normal

Let B \approx B(n, p)

With B the number of successes in n equal and independent Bernoulli trials with probability of success p (and q = 1 - p), then:

B \approx N(n\cdot p, \sqrt{n \cdot p \cdot q})

De Moivre-Laplace Theorem

Let B \approx B(n, p); then:

\frac{B - n \cdot p}{\sqrt{n \cdot p \cdot q}}\rightarrow N(0, 1)

So we have:

E(B) = n \cdot p

\sigma^2(B) = n \cdot p \cdot q

\sigma(B) = +\sqrt{n \cdot p \cdot q}

It is considered a good approximation when n \cdot p \geq 5 and n \cdot q \geq 5, and then the De Moivre-Laplace Theorem holds with:

B \approx N(n \cdot p, \sqrt{n \cdot p \cdot q})

However, a continuity correction has to be made to obtain the value sought: the bound is shifted by \pm 0.5 so that the normal interval covers exactly the same integers as the binomial event

Examples:

P\{B < 4\} we will use P\{B < 3.5\}

P\{B \leq 4\} we will use P\{B \leq 4.5\}

P\{B > 4\} we will use P\{B > 4.5\}

P\{B \geq 4\} we will use P\{B \geq 3.5\}
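The correction can be sketched numerically by comparing the exact binomial probability with the corrected normal approximation; n = 40, p = 0.5 and the query P\{B \leq 22\} are arbitrary values with n \cdot p \geq 5 and n \cdot q \geq 5:

```python
import math

# Continuity correction for B(n, p) ~ N(np, sqrt(npq)).
# n = 40, p = 0.5 and the query P(B <= 22) are arbitrary examples.

def phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_cdf(k, n, p):
    """Exact P(B <= k)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

n, p = 40, 0.5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

exact = binom_cdf(22, n, p)             # exact P(B <= 22)
approx = phi((22 + 0.5 - mu) / sigma)   # corrected bound: use 22.5
print(exact, approx)
```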

The central limit theorem

Let \xi_1, \cdots, \xi_n be i.i.d. r.v. with parameters (\mu, \sigma); then:

\xi_T = \xi_1 + \cdots + \xi_n has parameters (n \cdot \mu, \sqrt{n} \cdot \sigma); this always holds, whatever the common distribution

Lévy-Lindeberg Theorem

Let \{\xi_i\}, i \in \mathbb{N}, be a sequence of i.i.d. r.v. with parameters (\mu, \sigma); then:

S_n = \xi_1 + \cdots + \xi_n satisfies \frac{S_n - n \cdot \mu}{\sqrt{n} \cdot \sigma} \rightarrow N(0, 1)

It is considered a good approximation when n \geq 30, and then the Lévy-Lindeberg Theorem lets us approximate by a normal any probability of the type:

\frac{S_n - n \cdot \mu}{\sqrt{n} \cdot \sigma}
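The theorem can be illustrated with a Monte Carlo sketch: standardized sums of i.i.d. Uniform(0, 1) variables should behave like N(0, 1). The choice of n = 50 and the sample count are arbitrary:

```python
import math
import random

# Monte Carlo sketch of the Levy-Lindeberg theorem: standardized sums of
# n = 50 i.i.d. Uniform(0, 1) variables should be close to N(0, 1).
random.seed(0)
n = 50
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and std of Uniform(0, 1)

def standardized_sum():
    s = sum(random.random() for _ in range(n))
    return (s - n * mu) / (math.sqrt(n) * sigma)

samples = [standardized_sum() for _ in range(20000)]
frac_below_0 = sum(z < 0 for z in samples) / len(samples)
print(frac_below_0)   # should be near Phi(0) = 0.5
```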

Polynomial approximation for the Normal distribution

Approximation of Abramowitz and Stegun (1964), known as the "best Hastings approximation"

\tiny P(x) = 1 - \phi(x)(b_1 \cdot t + b_2 \cdot t^2 + b_3 \cdot t^3 + b_4 \cdot t^4 + b_5 \cdot t^5) + \epsilon(x)

\begin{cases} \phi(x) = \frac{1}{\sqrt{2 \cdot \pi}} \cdot e^{-\left(\frac{1}{2}\right) \cdot x^2} \\ t = \frac{1}{1 + (b_0 \cdot x)} \\ b_0 = 0.2316419 \\ b_1 = 0.319381530 \\ b_2 = -0.356563782 \\ b_3 = 1.781477937 \\ b_4 = -1.821255978 \\ b_5 = 1.330274429 \\ |\epsilon(x)| < 7.5 \cdot 10^{-8} \end{cases}

Substituting, we obtain:

\tiny P(x) = 1 - \left(\frac{1}{\sqrt{2 \cdot \pi}} \cdot e^{-\left(\frac{1}{2}\right) \cdot x^2}\right) \cdot \left(0.319381530 \cdot t - 0.356563782 \cdot t^2 + 1.781477937 \cdot t^3 - 1.821255978 \cdot t^4 + 1.330274429 \cdot t^5\right) + \epsilon(x), with t = \frac{1}{1 + 0.2316419 \cdot x}
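The approximation above translates directly into a few lines of Python; the code returns the upper-tail probability P(Z > x) = \phi(x)(b_1 t + \cdots + b_5 t^5) for x \geq 0, and the test value x = 1.96 is an arbitrary example:

```python
import math

# Abramowitz-Stegun (Hastings) polynomial approximation to the upper-tail
# probability P(Z > x) of the standard normal, valid for x >= 0.
B0 = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

def upper_tail(x):
    """P(Z > x) for x >= 0, with absolute error below 7.5e-8."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)   # density phi(x)
    t = 1 / (1 + B0 * x)
    poly = sum(b * t ** (i + 1) for i, b in enumerate(B))   # b1*t + ... + b5*t^5
    return phi * poly

# compare against the exact value from the error function
exact = 1 - 0.5 * (1 + math.erf(1.96 / math.sqrt(2)))
print(upper_tail(1.96), exact)
```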