Random Variable

Random Variable

A random variable is a function that associates with each elementary event a well-defined number:

\xi : \Omega \rightarrow \mathbb{R}

One-dimensional random variable

Let \Omega be a sample space and P its probability measure. We call a one-dimensional random variable (r.v.) a map:

\begin{cases} \xi : \Omega \rightarrow \mathbb{R} \\ \omega \mapsto \xi(\omega) \in \mathbb{R} \end{cases}

Example of a random variable

\Omega \equiv \text{"all 3-bit words"}
\xi \equiv \text{"number of ones in the word"}
\Omega \equiv \{000, 001, 010, 011, 100, 101, 110, 111\}

\xi : \Omega \rightarrow \mathbb{R}
000 \mapsto 0
001 \mapsto 1
010 \mapsto 1
011 \mapsto 2
100 \mapsto 1
101 \mapsto 2
110 \mapsto 2
111 \mapsto 3

P_\xi(0) = P\{\xi = 0\} = P\{000\} = \frac{1}{8} = 0.125
P_\xi(1) = P\{\xi = 1\} = P\{001, 010, 100\} = \frac{3}{8} = 0.375
P_\xi(2) = P\{\xi = 2\} = P\{011, 101, 110\} = \frac{3}{8} = 0.375
P_\xi(3) = P\{\xi = 3\} = P\{111\} = \frac{1}{8} = 0.125
P_\xi(-1) = P\{\xi = -1\} = P\{\emptyset\} = 0
P_\xi(0.75) = P\{\xi = 0.75\} = P\{\emptyset\} = 0
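The values above can be reproduced by brute-force enumeration. The following is a minimal Python sketch (not part of the original notes) that lists the eight equally likely words and accumulates the probability of each count of ones:

```python
from itertools import product

# Sample space: all 3-bit words; xi(w) = number of ones in the word w
omega = ["".join(bits) for bits in product("01", repeat=3)]
xi = {w: w.count("1") for w in omega}

# Probability function: every word is equally likely, P{w} = 1/8
p_xi = {}
for w, value in xi.items():
    p_xi[value] = p_xi.get(value, 0) + 1 / len(omega)

print(p_xi)  # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```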

Distribution function

Let \xi be an r.v. (random variable). We call the distribution function of \xi the function:

\begin{cases} F : \mathbb{R} \rightarrow [0, 1] \\ x \mapsto F(x) \\ F(x) = P(-\infty, x] = P(\xi \leq x) \text{ with } x \in \mathbb{R} \end{cases}

Example of a distribution function

F(0) = P\{\xi \leq 0\} = \frac{1}{8} = 0.125
F(1) = P\{\xi \leq 1\} = \frac{1}{8} + \frac{3}{8} = \frac{1}{2} = 0.5
F(2) = P\{\xi \leq 2\} = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} = \frac{7}{8} = 0.875
F(3) = P\{\xi \leq 3\} = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1
F(-1) = P\{\xi \leq -1\} = P(\emptyset) = 0
F(0.75) = P\{\xi \leq 0.75\} = F(0) = \frac{1}{8} = 0.125
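A minimal Python sketch of the same idea, assuming the probability function p_xi of the 3-bit example above: F(x) is just the accumulated mass of all points less than or equal to x.

```python
# Distribution function of the 3-bit example: F(x) = P(xi <= x).
# p_xi is the probability function computed in the previous sketch.
p_xi = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

def F(x):
    """Sum the mass of every point of the support that is <= x."""
    return sum(p for value, p in p_xi.items() if value <= x)

for x in (0, 1, 2, 3, -1, 0.75):
    print(f"F({x}) = {F(x)}")  # 0.125, 0.5, 0.875, 1.0, 0, 0.125
```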

Discrete-type random variable

Let \xi be a one-dimensional r.v. We say it is of discrete type if the set D_\xi = \{x \in \mathbb{R} \mid P\{\xi = x\} > 0\} is countable (finite or countably infinite)

Where:

\begin{cases} D_\xi \equiv \text{"support of the r.v. }\xi\text{"} \\ x \in D_\xi \equiv \text{"mass points of the r.v. }\xi\text{"} \\ P\{\xi = x\}\text{ with }x \in D_\xi \equiv \text{"probability function of }\xi\text{"} \\ P_i = P\{\xi = x_i\}\text{ with }P_i > 0 \text{ and }\sum\limits_{i=1}^{n} P_i = 1 \end{cases}

Continuous-type random variable

Let \xi be an r.v. We say it is of continuous type if the set of values it can take is an uncountable set

It is defined whenever \exists f : \mathbb{R}\rightarrow\mathbb{R}^+ such that F(x)=\int^{x}_{-\infty} f(t) \cdot dt

The probability of taking any particular value is zero (P(\xi = x) = 0), and accordingly:

P(x_1 < \xi \leq x_2) = P(x_1 < \xi < x_2) = F(x_2) - F(x_1)
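As an illustration (the density below is an assumption for the example, not taken from the notes), this Python sketch uses the exponential density f(t) = e^{-t} for t ≥ 0, approximates F numerically, and checks P(x_1 < \xi \leq x_2) = F(x_2) - F(x_1) against the closed form:

```python
import math

# Assumed density for the example (not from the notes): exponential, f(t) = e^(-t), t >= 0
def f(t):
    return math.exp(-t) if t >= 0 else 0.0

def F(x, steps=100_000):
    """F(x) = integral of f from -infinity to x; f is 0 below 0, so integrate on [0, x]."""
    if x <= 0:
        return 0.0
    h = x / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h  # midpoint rule

x1, x2 = 0.5, 2.0
print(F(x2) - F(x1))                   # P(x1 < xi <= x2), numerically
print(math.exp(-x1) - math.exp(-x2))   # closed-form value, about 0.4712
```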

Density function

We call density function a function f(x) from which probabilities can be calculated as the area enclosed between it and the horizontal axis

Where:

\begin{cases} f(x) \geq 0, \forall x \in \mathbb{R},\text{ }f(x) \text{ integrable} \\ \int^{+\infty}_{-\infty} f(x) \cdot dx = 1 \end{cases}

Measure of central position: the mean \mu\text{ or }E[\xi]

Let \xi be an r.v. We call expectation (or mean) the value denoted E[\xi]=\mu, which for discrete variables is:

E[\xi] = \mu = \sum\limits_{i=1}^{n} x_i \cdot P_i

And for continuous ones:

E[\xi] = \mu = \int^{+\infty}_{-\infty} x \cdot f(x) \cdot dx
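A short Python sketch of both formulas: the discrete sum uses the 3-bit example from above, and the integral uses the exponential density assumed in the earlier continuous sketch (its true mean is 1).

```python
import math

# Discrete case: mean of the 3-bit example
p_xi = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
mu_discrete = sum(x * p for x, p in p_xi.items())
print(mu_discrete)  # 1.5

# Continuous case: mean of the assumed exponential density f(t) = e^(-t), t >= 0
def f(t):
    return math.exp(-t) if t >= 0 else 0.0

h, upper = 0.001, 50.0  # truncate the integral; the tail beyond 50 is negligible
mu_continuous = sum((i + 0.5) * h * f((i + 0.5) * h) for i in range(int(upper / h))) * h
print(mu_continuous)    # close to 1.0
```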

Properties of the mean

  1. E[k] = k\text{; if k is constant}
  2. E[\xi + a] = E[\xi] + a\text{; if a is constant (change of origin)}
  3. E[b\cdot\xi] = b\cdot E[\xi]\text{; if b is constant (change of scale)}
  4. E[a + b\cdot\xi] = a + b\cdot E[\xi]\text{ if a and b are constants (linear transformation)}
  5. E[\xi_1 + \cdots + \xi_n] = E[\xi_1] + \cdots + E[\xi_n]
  6. k_1 \leq \xi \leq k_2 \Rightarrow k_1 \leq E[\xi] \leq k_2
  7. \xi_1 \leq \xi_2 \Rightarrow E[\xi_1] \leq E[\xi_2]

Measure of absolute dispersion: the variance \sigma^2 \text{ or } Var[\xi]

Let \xi be an r.v. We call variance:

\sigma^2 = Var(\xi) = E[(\xi - \mu)^2]\text{ where }\mu = E[\xi]

For discrete variables, the following is calculated:

\sigma^2 = Var(\xi) = \sum\limits_{i=1}^{n} (x_i - \mu)^2 \cdot p_i

For continuous variables, the following is calculated:

\sigma^2 = Var(\xi) = \int^{+\infty}_{-\infty} (x - \mu)^2 \cdot f(x) \cdot dx
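Sketch for the discrete case, again with the 3-bit example; it also computes the variance as E[\xi^2] - E^2[\xi], anticipating property 1 below.

```python
# Variance of the discrete 3-bit example, by definition and as E[xi^2] - E[xi]^2
p_xi = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
mu = sum(x * p for x, p in p_xi.items())

var_def = sum((x - mu) ** 2 * p for x, p in p_xi.items())
var_alt = sum(x ** 2 * p for x, p in p_xi.items()) - mu ** 2

print(mu, var_def, var_alt)  # 1.5 0.75 0.75
```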

Properties of the variance

  1. \sigma^2 = Var(\xi) = E[\xi^2] - E^2[\xi]\text{ in general}
    \sigma^2 = \sum\limits_{i=1}^{n} x^2_i \cdot p_i - \left(\sum\limits_{i=1}^{n} x_i \cdot p_i\right)^2\text{ for discrete variables}
    \sigma^2 = \int^{+\infty}_{-\infty} x^2 \cdot f(x) \cdot dx - \left(\int^{+\infty}_{-\infty} x \cdot f(x) \cdot dx\right)^2\text{ for continuous variables}
  2. Var(\xi) \geq 0
  3. Var(\xi) = 0\text{ if }\xi\text{ is constant}
  4. Var(\xi + a) = Var(\xi)\text{ if a is constant}
  5. Var(b\cdot\xi) = b^2\cdot Var(\xi)\text{ if b is constant}
  6. Var(a + b\cdot\xi) = b^2\cdot Var(\xi)\text{ if a and b are constants}

Standard deviation \sigma

Let \xi be an r.v. We call standard deviation:

\sigma = dt(\xi) = +\sqrt{Var(\xi)}

It is the positive square root of the variance

Chebyshev's inequality

If an r.v. \xi has mean \mu and standard deviation \sigma, then for any k > 0 it holds that:

P\{|\xi - \mu| \leq k\cdot\sigma\} \geq 1 - \frac{1}{k^2}

Or equivalently:

P\{|\xi - \mu| > k\cdot\sigma\} \leq \frac{1}{k^2}
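A quick numerical check of the bound on the 3-bit example (a sketch, not from the notes): the exact tail probability never exceeds 1/k^2.

```python
import math

# Chebyshev on the 3-bit example: P{|xi - mu| > k*sigma} must be <= 1/k^2
p_xi = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
mu = sum(x * p for x, p in p_xi.items())
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in p_xi.items()))

for k in (1.5, 2.0):
    tail = sum(p for x, p in p_xi.items() if abs(x - mu) > k * sigma)
    print(k, tail, 1 / k ** 2)  # the exact tail (0.25, then 0) stays below the bound
```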

Two-dimensional random variable

A discrete two-dimensional r.v. is a map:

\begin{cases} \Omega \rightarrow \mathbb{R}^2 \\ \omega \mapsto (x, y) \in \mathbb{R}^2 \end{cases}

Where the set of points with probability > 0 is countable, its points being of the form (x_i, y_j)

We call mass points the points with probability \not= 0 of a discrete two-dimensional r.v., which we denote (\xi_1, \xi_2), where \xi_1 and \xi_2 are one-dimensional r.v.

We call probability function the probabilities of the mass points, that is, the values:

\begin{cases} P_{i j} = P\{(\xi_1, \xi_2) = (x_i, y_j)\} = P\{\xi_1 = x_i, \xi_2 = y_j\} \\ \sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} P_{i j} = 1\end{cases}

The sum of all the joint probabilities must always be 1

Which can be arranged in the following probability table:

\begin{pmatrix} \xi_1 \backslash \xi_2& y_1& \cdots& y_m& p_{i *} \\ x_1& p_{1 1}& \cdots& p_{1 m}& p_{1 *} \\ \vdots& \vdots& \ddots& \vdots& \vdots \\ x_n& p_{n 1}& \cdots& p_{n m}& p_{n *} \\ p_{* j}& p_{* 1}& \cdots& p_{* m}& 1 \end{pmatrix}

For a discrete two-dimensional random variable (\xi_1, \xi_2), the marginal distributions are the distributions of the one-dimensional r.v. \xi_1\text{ and }\xi_2. For r.v. of discrete type, the marginal probability functions are:

\begin{cases} \xi_1 : p_{i *} = P\{\xi_1 = x_i\} = \sum\limits_{j=1}^{m} p_{i j} = \sum\limits_{j=1}^{m} P\{\xi_1 = x_i, \xi_2 = y_j\} \\ \xi_2 : p_{* j} = P\{\xi_2 = y_j\} = \sum\limits_{i=1}^{n} p_{i j} = \sum\limits_{i=1}^{n} P\{\xi_1 = x_i, \xi_2 = y_j\} \end{cases}
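A small Python sketch with a hypothetical 2×3 joint table (the numbers are made up for illustration, not from the notes): the marginals are obtained by summing the rows and columns.

```python
# Hypothetical joint probability table p_ij for a discrete (xi1, xi2); the numbers
# are made up for illustration and are not taken from the notes.
xs = [0, 1]          # values x_i of xi1
ys = [0, 1, 2]       # values y_j of xi2
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}
assert abs(sum(p.values()) - 1.0) < 1e-12  # the joint probabilities must sum to 1

# Marginal probability functions: sum over the other component
p1 = {x: sum(p[(x, y)] for y in ys) for x in xs}   # P{xi1 = x_i}: 0.40 and 0.60
p2 = {y: sum(p[(x, y)] for x in xs) for y in ys}   # P{xi2 = y_j}: 0.25, 0.45, 0.30
print(p1, p2)
```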

For a discrete two-dimensional random variable (\xi_1, \xi_2), the conditional distributions are the distributions of one of the components of the two-dimensional r.v. (\xi_1\text{ or }\xi_2) given a value of the other component (\xi_2\text{ or }\xi_1, respectively). For r.v. of discrete type, the conditional probability functions are:

\begin{cases} \xi_1 \text{ given } \xi_2 : P(\xi_1 = x_i \mid \xi_2 = y_j) = \frac{P(\xi_1 = x_i, \xi_2 = y_j)}{P(\xi_2 = y_j)} = \frac{p_{i j}}{p_{* j}} \\ \xi_2 \text{ given } \xi_1 : P(\xi_2 = y_j \mid \xi_1 = x_i) = \frac{P(\xi_1 = x_i, \xi_2 = y_j)}{P(\xi_1 = x_i)} = \frac{p_{i j}}{p_{i *}} \end{cases}

The mean is given by the column vector of means:

\begin{pmatrix} E[\xi_1] \\ E[\xi_2] \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}
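Continuing with the same hypothetical joint table, the sketch below computes a conditional probability function and the vector of means:

```python
# Conditional probability functions and the vector of means for the same
# hypothetical joint table used in the previous sketch.
xs, ys = [0, 1], [0, 1, 2]
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}
p1 = {x: sum(p[(x, y)] for y in ys) for x in xs}
p2 = {y: sum(p[(x, y)] for x in xs) for y in ys}

# P(xi1 = x_i | xi2 = y_j) = p_ij / p_*j, shown for y_j = 1
cond_xi1_given_y1 = {x: p[(x, 1)] / p2[1] for x in xs}
print(cond_xi1_given_y1)   # about {0: 0.444, 1: 0.556}

# Column vector of means (E[xi1], E[xi2])
mu1 = sum(x * px for x, px in p1.items())
mu2 = sum(y * py for y, py in p2.items())
print(mu1, mu2)            # about 0.60 and 1.05
```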

Covariance

Let (\xi_1, \xi_2) be a two-dimensional r.v. We call covariance between \xi_1 and \xi_2:

\sigma_{1, 2} = Cov(\xi_1, \xi_2) = E[(\xi_1 - \mu_1) \cdot (\xi_2 - \mu_2)] \text{ with }\mu_1 = E(\xi_1) \text{ and }\mu_2 = E(\xi_2)

For discrete variables, the following is calculated:

\sigma_{1, 2} = Cov(\xi_1, \xi_2) = \sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} \left((x_i - \mu_1) \cdot (y_j - \mu_2)\right) p_{i j}
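Sketch of the discrete covariance formula on the same hypothetical joint table:

```python
# Covariance of the hypothetical joint table:
# Cov(xi1, xi2) = sum_ij (x_i - mu1)(y_j - mu2) * p_ij
xs, ys = [0, 1], [0, 1, 2]
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}
mu1 = sum(x * sum(p[(x, y)] for y in ys) for x in xs)
mu2 = sum(y * sum(p[(x, y)] for x in xs) for y in ys)

cov = sum((x - mu1) * (y - mu2) * p[(x, y)] for x in xs for y in ys)
print(cov)   # about 0.02 for this table: a slight positive covariation
```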

The covariance measures the linear relationship or covariation between two variables

It is useful to use the covariance matrix:

\Sigma = \begin{pmatrix} Var(\xi_1) & Cov(\xi_1, \xi_2) \\ Cov(\xi_1, \xi_2) & Var(\xi_2) \end{pmatrix} = \begin{pmatrix} \sigma^2_1 & \sigma_{1, 2} \\ \sigma_{1, 2} & \sigma^2_2 \end{pmatrix}

Properties of the covariance

  1. Cov(\xi_1, \xi_2) = E[\xi_1 \xi_2] - E[\xi_1] E[\xi_2]\text{ where }E[\xi_1 \xi_2] = \sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} x_i \, y_j \, p_{i j}
  2. Cov(\xi_1 + a, \xi_2 + b) = Cov(\xi_1, \xi_2)\text{ with constants a and b}
  3. Cov(a \cdot \xi_1, b \cdot \xi_2) = a \cdot b \cdot Cov(\xi_1, \xi_2)\text{ with constants a and b}
  4. Cov(\xi_1 + \xi_2, \xi_3) = Cov(\xi_1, \xi_3) + Cov(\xi_2, \xi_3)
  5. Cov(\xi_1 + \xi_2, \xi_3 + \xi_4) = Cov(\xi_1, \xi_3) + Cov(\xi_1, \xi_4) + Cov(\xi_2, \xi_3) + Cov(\xi_2, \xi_4)
  6. Var(\xi_1 + \xi_2) = Var(\xi_1) + Var(\xi_2) + 2 \cdot Cov(\xi_1, \xi_2)
  7. Var(\xi_1 - \xi_2) = Var(\xi_1) + Var(\xi_2) - 2 \cdot Cov(\xi_1, \xi_2)
  8. Var(\xi_1 + \xi_2) = Var(\xi_1) + Var(\xi_2)\text{ if }\xi_1\text{ and }\xi_2\text{ are uncorrelated}
  9. Var(\xi_1 - \xi_2) = Var(\xi_1) + Var(\xi_2)\text{ if }\xi_1\text{ and }\xi_2\text{ are uncorrelated}

Linear correlation coefficient

We call linear correlation coefficient between \xi_1\text{ and }\xi_2:

\rho_{1 2} = Corr(\xi_1, \xi_2) = \frac{Cov(\xi_1, \xi_2)}{\sqrt{Var(\xi_1) \cdot Var(\xi_2)}} = \frac{\sigma_{1 2}}{\sigma_1 \cdot \sigma_2}
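Sketch computing \rho_{1 2} for the same hypothetical joint table; the marginal variances come from the marginal probability functions:

```python
import math

# Correlation coefficient rho_12 = Cov / (sigma_1 * sigma_2) for the hypothetical table
xs, ys = [0, 1], [0, 1, 2]
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}
p1 = {x: sum(p[(x, y)] for y in ys) for x in xs}
p2 = {y: sum(p[(x, y)] for x in xs) for y in ys}
mu1 = sum(x * px for x, px in p1.items())
mu2 = sum(y * py for y, py in p2.items())

var1 = sum((x - mu1) ** 2 * px for x, px in p1.items())
var2 = sum((y - mu2) ** 2 * py for y, py in p2.items())
cov = sum((x - mu1) * (y - mu2) * p[(x, y)] for x in xs for y in ys)

rho = cov / math.sqrt(var1 * var2)
print(rho)   # about 0.055: a weak increasing linear relationship
```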

The linear correlation coefficient measures the degree of linear relationship between two variables

Uncorrelated

Let \xi_1\text{ and }\xi_2 be r.v. We say they are uncorrelated if they have no linear relationship, i.e.:

Cov(\xi_1, \xi_2) = 0

Correlated

Let \xi_1\text{ and }\xi_2 be r.v. We say they are correlated if they have a linear relationship, i.e.:

Cov(\xi_1, \xi_2) \neq 0

Pearson's correlation coefficient

\rho_{1 2} = Corr(\xi_1, \xi_2) = \frac{Cov(\xi_1, \xi_2)}{dt(\xi_1) \cdot dt(\xi_2)} = \frac{\sigma_{1 2}}{\sigma_1 \cdot \sigma_2}

Note:

\begin{cases} \rho_{1 2} = 0 \Leftrightarrow \sigma_{1 2} = 0 \Leftrightarrow \text{ no linear relationship, they are uncorrelated} \\ \rho_{1 2} \neq 0 \Leftrightarrow \sigma_{1 2} \neq 0 \Leftrightarrow \text{ linear relationship, they are correlated} \\ \rho_{1 2} > 0 \Leftrightarrow \sigma_{1 2} > 0 \Leftrightarrow \text{ increasing linear relationship} \\ \rho_{1 2} < 0 \Leftrightarrow \sigma_{1 2} < 0 \Leftrightarrow \text{ decreasing linear relationship} \end{cases}

\text{Given that }-1 \leq \rho_{1 2} \leq 1:

\begin{cases} \rho_{1 2} = 1 \Leftrightarrow \text{ perfect increasing linear relationship} \\ \rho_{1 2} = -1 \Leftrightarrow \text{ perfect decreasing linear relationship} \\ \rho_{1 2} \approx 0 \Leftrightarrow \text{ weak linear relationship} \\ \rho_{1 2} \approx \pm 1 \Leftrightarrow \text{ strong linear relationship} \end{cases}

Independent

Let \xi_1\text{ and }\xi_2 be r.v. We say they are independent if they have no relationship of any kind, that is, if they satisfy any of the following equivalent conditions (checked numerically in the sketch after the list):

  1. p(\xi_1 = x_i|\xi_2 = y_j) = p(\xi_1 = x_i); \forall(x_i, y_j)
  2. p(\xi_2 = y_j|\xi_1 = x_i) = p(\xi_2 = y_j); \forall(x_i, y_j)
  3. p(\xi_1 = x_i, \xi_2 = y_j) = p(\xi_1 = x_i) \cdot p(\xi_2 = y_j); \forall(x_i, y_j)
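A numerical check of condition 3 on the hypothetical joint table used earlier (a sketch; the table is assumed, not from the notes):

```python
# Independence check (condition 3) on the hypothetical joint table: xi1 and xi2 are
# independent only if p_ij = p_i* * p_*j at every mass point.
xs, ys = [0, 1], [0, 1, 2]
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}
p1 = {x: sum(p[(x, y)] for y in ys) for x in xs}
p2 = {y: sum(p[(x, y)] for x in xs) for y in ys}

independent = all(abs(p[(x, y)] - p1[x] * p2[y]) < 1e-12 for x in xs for y in ys)
print(independent)   # False: p_01 = 0.20 but p_0* * p_*1 = 0.18, so they are dependent
```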

Dependent

Let \xi_1\text{ and }\xi_2 be r.v. We say they are dependent if they have some kind of relationship