Category Archives: Probability

Probability is a measure of the certainty associated with an event or future event and is usually expressed as a number between 0 and 1 (or between 0% and 100%)

Probability

Probability

Given a sample space \Omega, we will call a probability any mapping that satisfies:

\begin{cases} \Omega \rightarrow \mathbb{R} \\ \omega \rightarrow p(\omega) \in \left[0, 1\right] \end{cases}


where the value between 0 and 1 quantifies the possibility of that event occurring. It is also often expressed as a percentage, so a probability of 1 equals 100% and a probability of 0 equals 0%. It must satisfy the following axioms:

  1. P(A) \ge 0, \forall A \text{ event}
  2. P( \Omega) = 1
  3. P(A \cup B) = P(A) + P(B)\text{ if }A \cap B = \emptyset

Properties

  1. P(A) \le 1
  2. P(\emptyset) = 0
  3. P(A^c) = 1 - P(A)
  4. If B \subset A \Rightarrow P(A - B) = P(A) - P(B)
  5. P(A - B) = P(A) - P(A \cap B)
  6. P(A \cup B) = P(A) + P(B) - P(A \cap B)
  7. P(A_1 \cup \cdots \cup A_n) = P(A_1) + \cdots + P(A_n)\text{ if } A_i \cap A_j = \emptyset, \forall i \not= j
  8. P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C)- P(B \cap C) + P(A \cap B \cap C)
  9. P(A \cup B \cup C \cup D) = P(A) + P(B) + P(C) + P(D) - P(A \cap B) - P(A \cap C) - P(A \cap D) - P(B \cap C) - P(B \cap D) - P(C \cap D) + P(A \cap B \cap C) + P(A \cap B \cap D) + P(A \cap C \cap D) + P(B \cap C \cap D) - P(A \cap B \cap C \cap D)
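
Properties 6, 8 and 9 are instances of the inclusion-exclusion principle. As a quick numerical check, here is a minimal Python sketch assuming a uniform probability on a small finite sample space (the sets and values are illustrative assumptions, not taken from the text):

```python
# Uniform probability on a small finite sample space: P(E) = |E| / |Omega|
omega = set(range(1, 11))
P = lambda event: len(event) / len(omega)

# Two arbitrary events
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}

# Property 6: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert abs(P(A | B) - (P(A) + P(B) - P(A & B))) < 1e-12

# Property 8: three-event inclusion-exclusion
C = {2, 5, 8, 9}
lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))
assert abs(lhs - rhs) < 1e-12
print("inclusion-exclusion checks passed")
```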

Rule of addition

The addition rule or sum rule states that the probability that either of two events occurs is equal to the sum of their individual probabilities if the events are mutually exclusive, i.e. the two cannot occur at the same time

P(A\cup B) = P(A) + P(B) if A and B are mutually exclusive

P(A\cup B) = P(A) + P(B) - P(A\cap B) if A and B are not mutually exclusive

Being:

\scriptsize\begin{cases}\text{P(A) = probability of occurrence of the event A}\\ \text{P(B) = probability of occurrence of the event B}\\ P(A \cap B)\text{ = probability of simultaneous occurrence of events A and B}\end{cases}

Rule of multiplication

The multiplication rule states that the probability of the joint occurrence of two or more statistically independent events is equal to the product of their individual probabilities

P(A \cap B) = P(A\cdot B) = P(A) \cdot P(B) if A and B are independent

P(A \cap B) = P (A \cdot B) = P(A)\cdot P(B|A) if A and B are dependent

Where P(B|A) is the probability that B occurs given that event A has occurred
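
As a quick illustration, here is a minimal Python sketch (the two-dice experiment and the events chosen are assumptions for the example) that verifies the product rule for two independent events by enumerating the sample space:

```python
from itertools import product

# Sample space for rolling two fair dice; each ordered pair is equally likely
omega = list(product(range(1, 7), repeat=2))
P = lambda event: sum(1 for w in omega if event(w)) / len(omega)

A = lambda w: w[0] == 6          # first die shows a 6
B = lambda w: w[1] % 2 == 0      # second die shows an even number

# Independent events: P(A ∩ B) = P(A) · P(B)
p_joint = P(lambda w: A(w) and B(w))
print(p_joint, P(A) * P(B))      # both are 1/12 ≈ 0.0833
```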

Rule of Laplace

Let \Omega be a sample space in which all sample points have the same possibility of occurring, and let A be an event; then:

P(A) = \frac{n^{\underline{0}}\text{ of favorable cases}}{n^{\underline{0}}\text{ of possible cases}}

Frequency Probability (Von Mises)

Let \Omega be the sample space associated with a random phenomenon and let A be an event. The frequency probability of A is the relative frequency of the number of times A occurs when we repeat the random phenomenon \infty times

\lim\limits_{n\to\infty} \frac{n^{\underline{0}}\text{ of times it happens}}{n}
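
This limit can be illustrated by simulation. A minimal Python sketch (the fair coin and the sample sizes are assumptions for the illustration) showing the relative frequency of heads approaching the theoretical value 0.5:

```python
import random

# Simulate a fair coin and watch the relative frequency of heads
# approach the theoretical probability 0.5 as n grows.
random.seed(0)
for n in (10, 100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)   # relative frequency -> 0.5 as n grows
```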

Event

Event

An event is each of the possible results, or sets of results, of a random experiment

Random experiment

It is an experiment that, under similar conditions, can give us different results

Examples of random experiments

  • Toss a coin and count the number of heads or tails
  • Draw a card from a deck
  • Calculate the lifetime of a light bulb
  • Measure the temperature of a processor after an hour of work
  • Calculate the number of calls sent or received by a phone line after an hour

Sample space

It is the set composed of all the possible outcomes associated with the random experiment

It is the entire set

It is represented with \Omega

Example of sample space

In the experiment of tossing a coin 3 times and counting the number of heads

The sample space will be \Omega=\{0,1,2,3\} for the number of heads obtained

Sample point

A single result obtained from the sample space

It is represented with \omega

Let A be a set

And we define p(\Omega)=\{A|A\subseteq\Omega\}

Example of a sample point

In the experiment of tossing a coin 3 times and counting the number of heads

If after tossing the coin 3 times we have counted 2 heads, then the sample point is p(3)=2

Random event

It is a set of sample points

It is represented with A

It is denoted with capital letters, or as a (finite or infinite) family (A_i)_{i\in I}

And is defined (A_i)_{i\in I} \in p(\Omega)

Example of a random event

In the experiment of tossing a coin 3 times and counting the number of heads

Let's repeat the experiment 5 times to get a random event. If after tossing the coin 3 times each time we have counted:

  • 2 heads, then sample point 1 is p(3_1)=2
  • 0 heads, then sample point 2 is p(3_2)=0
  • 2 heads, then sample point 3 is p(3_3)=2
  • 2 heads, then sample point 4 is p(3_4)=2
  • 1 head, then sample point 5 is p(3_5)=1

The random event is A=\{2,0,2,2,1\}

Occurrence of an event

We say that an event A has occurred if, in a particular realization of the random experiment, we obtain a sample point that belongs to A

Example of the occurrence of an event

In the experiment of tossing a coin 3 times and counting the number of heads

We are going to repeat the experiment 5 times to obtain a random event. If after flipping the coin 3 times each time we have obtained:

  • 2 heads, then the occurrence of the event is P(3_1)=2
  • 0 heads, then the occurrence of the event is P(3_2)=0
  • 2 heads, then the occurrence of the event is P(3_3)=2
  • 2 heads, then the occurrence of the event is P(3_4)=2
  • 1 head, then the occurrence of the event is P(3_5)=1

Sure event

It is the event that always occurs

It is represented with \Omega

Let A be a set

It is denoted
p(\omega)=\{A|A\subseteq\Omega\}=\Omega
\Omega=\{x, x\in\Omega\}\not =\{\{x\},x\in\Omega\}\subseteq p(\Omega)

Example of a sure event

In the experiment of tossing a coin 3 times and counting the number of heads

Getting some number of heads (including 0) is a sure event because we can always count the number of heads (even if none comes up, because we have included 0)

The sure event is then
\omega=\{"get a number of heads"\}
p(\omega)=\Omega

Impossible event

It is the event that never occurs

It is represented with \emptyset

Let A be a set

It is denoted
p(\omega)=\{A|A\subseteq\Omega\}=\emptyset
P(\emptyset)=0

Example of an impossible event

In the experiment of tossing a coin 3 times and counting the number of heads

Getting the color red is an impossible event because in the experiment we only take into account the number of heads obtained; we do not take into account the color of the coin

The impossible event is then
\omega=\{"get the color red"\}
p(\omega)=\emptyset

Complementary event

We will call the complementary event of A the event that occurs when A does not occur

It is represented with A^c

It is denoted A^c=\Omega\backslash A

Example of a complementary event

In the experiment of tossing a coin and counting the number of heads

Getting tails instead of heads is the complementary event because we are counting the number of heads, not tails

If A=\{"number of heads obtained"\} then the complementary event is A^c=\{"number of tails obtained"\}

Union of events

We will call the union of events A and B the event that occurs when A occurs, or B occurs, or both

It is represented with A\cup B

Let (A_i)_{i\in I} be a family of sets

It is denoted \underset{i\in I}{\bigcup} A_i\in p(\Omega)

Example of a union of events

In the experiment of tossing a coin and counting the number of heads or tails

With
A=\{"number of heads obtained"\}=\{3,4\}
B=\{"number of tails obtained"\}=\{2,4,6\}
A\cup B=\{"number of heads or tails obtained"\}=\{2,3,4,6\}

Intersection of events

We will call the intersection of events A and B the event that occurs when both A and B occur

It is represented with A\cap B

Let (A_i)_{i\in I} be a family of sets

It is denoted \underset{i\in I}{\bigcap} A_i\in p(\Omega)

Example of intersection of events

In the experiment of tossing a coin and counting the number of heads or tails

With
A=\{"number of heads obtained"\}=\{3,4\}
B=\{"number of tails obtained"\}=\{2,4,6\}
A\cap B=\{"number of both heads and tails obtained"\}=\{4\}

Difference of events

We will call the difference of events A and B the event that occurs when A occurs but B does not

It is represented with A \backslash B = A - B

It is denoted A - B = A - A \cap B = A \cap B^c

Example of difference of events

In the experiment of tossing a coin and counting the number of heads or tails

With
A=\{"number of heads obtained"\}=\{3,4\}
B=\{"number of tails obtained"\}=\{2,4,6\}
A-B=\{"number of heads obtained but not of tails"\}=A - A\cap B=\{3,4\}-\{4\}=\{3\}

Symmetric difference of events

We will call the symmetric difference of events A and B the event that occurs when A\cup B occurs but A\cap B does not

It is represented with A \triangle B

It is denoted A \triangle B = (A \cup B) - (A \cap B)

Example of symmetric difference of events

In the experiment of tossing a coin and counting the number of heads or tails

With
A=\{"number of heads obtained"\}=\{3,4\}
B=\{"number of tails obtained"\}=\{2,4,6\}
A\triangle B=\{"number of heads or tails obtained but not both"\}=(A \cup B) - (A \cap B)=\{2,3,4,6\}-\{4\}=\{2,3,6\}

De Morgan's laws

Laws proposed by Augustus De Morgan (1806-1871), a British mathematician and logician born in India, which set out the following fundamental principles of the algebra of logic:

  • The negation of the conjunction is equivalent to the disjunction of negations

  • The negation of the disjunction is equivalent to the conjunction of the negations

The following set identities, based on De Morgan's laws and the distributive laws, can be used within statistics:

Let A, B and C be sets

  1. \left(A\cup B\right)^c = A^c\cap B^c
    whose generalized form is
    \left(\underset{i\in I}{\bigcup} A_i\right)^c = \underset{i\in I}{\bigcap} \left(A_i\right)^c
  2. \left(A\cap B\right)^c = A^c\cup B^c
    whose generalized form is
    \left(\underset{i\in I}{\bigcap} A_i\right)^c = \underset{i\in I}{\bigcup} \left(A_i\right)^c
  3. A\cap\left(B\cup C\right) = \left(A\cap B\right)\cup\left(A\cap C\right)
    whose generalized form is
    A\cap\left(\underset{i\in I}{\bigcup} B_i\right) = \underset{i\in I}{\bigcup}\left(A\cap B_i\right)
  4. A\cup\left(B\cap C\right) = \left(A\cup B\right)\cap\left(A\cup C\right)
    whose generalized form is
    A\cup\left(\underset{i\in I}{\bigcap} B_i\right) = \underset{i\in I}{\bigcap}\left(A\cup B_i\right)

Proof 1

We want to show that \left(A\cup B\right)^c = A^c\cap B^c

\omega\in\left(A\cup B\right)^c \Rightarrow \omega \not \in A\cup B \Rightarrow \begin{cases} \omega \not \in A \\ \omega \not \in B \end{cases} \Rightarrow \begin{cases} \omega \in A^c \\ \omega \in B^c \end{cases} \Rightarrow \omega \in A^c\cap B^c

Thus we arrive at what we wanted; since each implication can also be reversed, the equality is proved

Proof 2

We want to show that \left(A\cap B\right)^c = A^c\cup B^c

\omega\in\left(A\cap B\right)^c \Rightarrow \omega \not \in A\cap B \Rightarrow \omega \not \in A \text{ or } \omega \not \in B \Rightarrow \omega \in A^c \text{ or } \omega \in B^c \Rightarrow \omega \in A^c\cup B^c

Thus we arrive at what we wanted; since each implication can also be reversed, the equality is proved

Incompatible events

We will say that A and B are incompatible events if they can never occur at the same time

It is denoted
A \cap B = \emptyset
A \cap A^c = \emptyset

A family \left(A_i\right)_{i\in I} of sets is pairwise disjoint (or mutually exclusive) if A_i\cap A_j = \emptyset when i\not = j

If a family \left(A_i\right)_{i\in I} is mutually exclusive, we'll denote it \underset{i\in I}{\sqcup}A_i := \underset{i\in I}{\cup}A_i

We say that a family \left(A_i\right)_{i\in I} is exhaustive if \underset{i\in I}{\bigcup} A_i = \Omega

Denumerable set

A set is said to be denumerable if it is in bijection with \mathbb{N}

Countable set

A set is said to be countable if it is denumerable or finite

Combinatorics

Combinatorics

Combinatorics is a branch of mathematics, belonging to the area of discrete mathematics, that studies the enumeration, construction and existence of configurations that satisfy certain established conditions

In addition, it studies the orderings or groupings of a certain number of elements. It is used in statistics to perform probabilistic calculations

Variations

Suppose we want to count the total number of injective mappings that can be built from a set X of k elements into another set Y of n elements (which requires k \le n)

A mapping f| X \rightarrow Y with f injective is completely determined if we know the image of each of the k elements of X

If we consider the mapping f as a word of k letters from the alphabet Y, it will have no repeated letters. The mapping f will be f(x_1)f(x_2)\cdots f(x_k), then:

f(x_1) \in Y
f(x_2) \in Y \backslash \{f(x_1)\} = \{y \in Y | y \not= f(x_1)\}
f(x_3) \in Y \backslash \{f(x_1), f(x_2)\} = \{y \in Y | y \not= f(x_1), y \not= f(x_2)\}
\vdots
f(x_k) \in Y \backslash \{f(x_1), \cdots, f(x_{k - 1})\} = \{y \in Y | y \not= f(x_1), \cdots, y \not= f(x_{k - 1})\}

If we denote by V(n, k) the total number of injective mappings from X into Y and call them variations of n elements taken k at a time, then by the product rule we have that:

V(n, k) = n \cdot (n - 1) \cdot (n -2)\cdots (n - k + 1) = \frac{n!}{(n-k)!}

Where n! = n \cdot (n - 1) \cdot (n -2) \cdot \cdots \cdot 2 \cdot 1 is the product of all natural numbers from 1 to n (this quantity is called the factorial of n)

Example of variations

What is the probability that in a group of n people there will be at least 2 who celebrate their birthday on the same day?

Calculating this probability directly is very tedious, since we would have to account for every way in which the birthdays could coincide

That is why it is best to calculate the probability of the opposite event, namely that the n people celebrate their birthdays on different days; this amounts to giving an ordered list of n distinct days out of the 365 days of the year. Therefore we have that:

\text{Favorable cases = }V(365, n) = \frac{365!}{(365 - n)!}

The possible cases are all ordered lists of n days, where repetitions are allowed (they are variations with repetition). Therefore we have that:

\text{Possible cases = }VR(365, n) = 365^n

So the solution to our problem will be given by:

p = 1 - \frac{\text{favorable cases}}{\text{possible cases}} = 1 - \frac{V(365, n)}{VR(365, n)} = 1 - \frac{365!}{365^n \cdot (365 - n)!}
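
A minimal Python sketch of this formula (using a product form to avoid evaluating the huge factorials; the chosen values of n are illustrative) reproduces some entries of the table that follows:

```python
from math import prod

def p_shared_birthday(n: int) -> float:
    """Probability that at least two of n people share a birthday
    (365 equally likely days, as in the text)."""
    # 1 - V(365, n) / VR(365, n), written as a running product
    return 1 - prod((365 - i) / 365 for i in range(n))

for n in (5, 23, 50, 70):
    print(n, round(p_shared_birthday(n), 6))
# 23 -> 0.507297 and 50 -> 0.970374, matching the table below
```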

The following table shows the probability p that in a group of n people there are at least two who celebrate their birthday on the same day:

n p n p
5 0.027136 35 0.814383
10 0.116948 40 0.891223
15 0.252901 45 0.940976
20 0.411438 50 0.970374
21 0.443688 55 0.986262
22 0.475695 60 0.994123
23 0.507297 65 0.997683
24 0.538344 70 0.999160
25 0.568700 75 0.999720
26 0.598241 80 0.999914
27 0.626859 85 0.999976
28 0.654461 90 0.999994
29 0.680969 95 0.99999856
30 0.706316 100 0.99999969

Variations with repetition

Suppose we want to count the total number of mappings that can be built from a set X of k elements into another set Y of n elements

A mapping f| X \rightarrow Y is completely determined if we know the image of each of the k elements of X

That is, we need to know f(x_i) with 1 \le i \le k. This is equivalent to giving a k-tuple (f(x_1), f(x_2), \cdots, f(x_k)) of the set Y^k = \overbrace{Y \times \cdots \times Y}^{k\;\rm times}

It is also equivalent to giving a word of k letters from the alphabet Y, (f(x_1), f(x_2), \cdots, f(x_k)), or an ordered selection of k elements of Y (elements of Y can be repeated, i.e. it may happen that f(x_i) = f(x_j)\text{ with }i \not= j)

The only condition is that f(x_i) \in Y. Therefore, the total number of mappings from X into Y, or the total number of variations with repetition of n elements taken k at a time, is equal to the cardinality of Y^k which, by the product rule, is n^k. If we denote this number by VR(n, k), then:

VR(n, k) = n^k

Example of variations with repetition

What is the probability of getting all fifteen results right (the "pleno al quince") in a quiniela (Spanish football pool)?

Filling in a quiniela is equivalent to giving a list of 15 symbols chosen from 1, X and 2, that is, a word of length 15 built from the alphabet 1, X and 2

So we have that the number of possible quinielas will be:

VR(3, 15) = 3^{15}

However, this is not the solution to our problem, which will be given by:

p = \frac{n^{\underline{0}}\text{ of favorable cases}}{n^{\underline{0}}\text{ of possible cases}} = \frac{1}{3^{15}} = 6.9691719376256323913730850719152 \cdot 10^{-8}

Permutations

We will call permutations of m elements, the number of variations without repetition of m elements that can be formed

P_m = m!

Example of Permutations

We have a bookshelf that fits three books and we want to see in how many ways we can order them, without repeating any. Each book has a cover of a different color: red, blue and green. To distinguish them we will use the set L of books, whose elements are the first letter of the (Spanish) color of their cover:

L=\{R, A, V\}
Arrangement Permutation number
L=\{R, A, V\} 1
L=\{R, V, A\} 2
L=\{V, R, A\} 3
L=\{V, A, R\} 4
L=\{A, V, R\} 5
L=\{A, R, V\} 6

To calculate the number of permutations, notice how the arrangements are built in passes: in the first pass any of the 3 elements can be placed; in the second, 1 has been discarded and only 2 can be used; in the third and final pass, another is discarded and only the remaining element can be used

Therefore, to calculate the permutations we have to multiply the number of available elements in each of the 3 passes:

3\cdot 2 \cdot 1 = 6

Or equivalently:

P_3 = 3! = 3\cdot 2 \cdot 1 = 6
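
A minimal Python sketch (using itertools with the same set of labels) that enumerates the arrangements and checks the count:

```python
from itertools import permutations
from math import factorial

# The three books, labelled by the first letter of their (Spanish) colour
L = ["R", "A", "V"]

arrangements = list(permutations(L))
for i, arrangement in enumerate(arrangements, start=1):
    print(i, arrangement)

# P_3 = 3! = 6
assert len(arrangements) == factorial(3) == 6
```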

Permutations with repetition

We will call permutations with repetition of m elements the number of arrangements of m elements that can be formed when some elements are repeated a finite number of times

PR_{m}^{n_{1}, n_{2}, \cdots, n_k} = \frac{m!}{n_{1}! \times n_{2}! \times \cdots \times n_k!}

Example of permutations with repetition

The result of a football match was 5-4

How many different ways could this result be achieved?

We denote any goal scored by the home team by L and any goal scored by the visiting team by V

The total number of L's and V's has to be 5 + 4 = 9, so we look for all ordered lists containing 5 L's and 4 V's in any order, each representing a possible order of the goals in the match

So we have that the number of different ways to reach that result is:

PR_{9}^{5, 4} = \frac{9!}{5! \cdot 4!} = 126
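
A minimal Python sketch confirming the count (the brute-force check simply collects the distinct words with five L's and four V's):

```python
from itertools import permutations
from math import factorial

# PR(9; 5, 4) = 9! / (5! * 4!): distinct orderings of the goals in a 5-4 match
pr = factorial(9) // (factorial(5) * factorial(4))

# Brute-force check: distinct words with five L's and four V's
words = set(permutations("LLLLLVVVV"))
print(pr, len(words))   # both are 126
```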

Combinations

We call combinations of m elements taken n at a time the number of subsets that can be formed with n of these m elements, without repeating any

C_{m, n} = {m \choose n} = \frac{m!}{n! \cdot (m - n)!}

Example of combinations

What is the probability of winning the primitiva lottery?

In the primitiva lottery there are 49 possible numbers and 6 of them are chosen, regardless of the order in which they appear

To know how many possible combinations there are in this game, we just calculate the number of subsets that can be formed with 6 of those 49 elements

So we have that the number of possible lottery tickets will be:

C_{49, 6} = {49 \choose 6} = \frac{49!}{6! \cdot (49 - 6)!}=\frac{49!}{6! \cdot 43!}

However, this is not the solution to our problem, which will be given by:

p = \frac{n^{\underline{0}}\text{ of favorable cases}}{n^{\underline{0}}\text{ of possible cases}} = \frac{1}{\frac{49!}{6! \cdot 43!}} = \frac{6! \cdot 43!}{49!} = 7.1511238420185162619416617 \cdot 10^{-8}
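
A minimal Python sketch computing the same quantity (math.comb is the standard-library binomial coefficient):

```python
from math import comb

# Number of possible tickets in the primitiva: C(49, 6)
tickets = comb(49, 6)          # 13_983_816
p = 1 / tickets
print(tickets, p)              # p ≈ 7.1511e-08
```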

Combinations with repetition

We will call combinations with repetition of m elements taken n at a time the number of selections of n of these m elements that can be formed when elements may be repeated

CR_{m, n} = {m + n - 1\choose n} = \frac{(m + n - 1)!}{n! \cdot (m - 1)!}

Example of combinations with repetition

How many tiles does a domino set have?

A domino tile is a rectangle divided into two equal parts, each of which contains a number of points chosen from the set \{0, 1, 2, 3, 4, 5, 6\}, where 0 is represented by the absence of points

The total number of domino tiles matches the number of unordered selections of two elements, repeated or not, chosen from the set \{0, 1, 2, 3, 4, 5, 6\}

So we have that the total number of domino tiles will be:

CR_{7, 2} = {7 + 2 - 1\choose 2} = \frac{(7 + 2 - 1)!}{2! \cdot (7 + 2 - 1 - 2)!} = \frac{8!}{2! \cdot 6!} =28
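
A minimal Python sketch confirming the count (itertools.combinations_with_replacement enumerates exactly these unordered selections):

```python
from itertools import combinations_with_replacement
from math import comb

# Unordered selections of two values from {0, ..., 6}, repetition allowed
tiles = list(combinations_with_replacement(range(7), 2))
print(len(tiles), comb(7 + 2 - 1, 2))   # both are 28
```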

Conditional probability

Conditional probability

The conditional probability of A given B, where \Omega is the sample space and A and B are events with P(B)\not=0, is the probability that A occurs knowing that event B has occurred:

P(A | B) = \frac{P(A \cap B)}{P(B)}

Properties

  1. P(\emptyset | A) = 0
  2. P(\Omega | A) = 1
  3. 0 \leq P(B | A) \leq 1
  4. P(B^c | A) = 1 - P(B | A)
  5. P(A \cup B | C) = P(A | C) + P(B | C) - P(A \cap B | C)
  6. P(A_1 \cap A_2) = P(A_1) \cdot P(A_2 | A_1)
  7. P(A_1 \cap A_2 \cap A_3) = P(A_1) \cdot P(A_2 | A_1) \cdot P(A_3 | A_1 \cap A_2)
  8. P(A_1 \cap \cdots \cap A_n) = P(A_1) \cdot P(A_2 | A_1) \cdots P(A_n | A_1 \cap A_2 \cap \cdots \cap A_{n-1})

Independent events

Let \Omega be the sample space of events A and B; we will say that they are independent if any of the following equivalent properties hold:

  • P(A | B) = P(A)
  • P(B | A) = P(B)
  • P(A \cap B) = P(A) \cdot P(B)

More generally, if \Omega is the sample space of events A_1, \cdots, A_n, we will say that they are independent if and only if:

\text{1) }P(A_i \cap A_j) = P(A_i) P(A_j), \forall i \not= j
\text{2) }P(A_i \cap A_j \cap A_k) = P(A_i) P(A_j) P(A_k), \forall i \not= j, i \not= k, j \not= k
\vdots
\text{n-1) }P(A_1 \cap \cdots \cap A_n) = P(A_1) \cdots P(A_n)

Dependent events

We will say that they are dependent if they are not independent:

  • P(A | B) \not= P(A)
  • P(A | B) > P(A)
  • P(A | B) < P(A)

Dependency and incompatibility

If A and B have nonzero probabilities and are incompatible, then they are dependent

Incompatible: A \cap B = \emptyset \Rightarrow P(A \cap B) = 0

Independent: P(A \cap B) = P(A) \cdot P(B)

Complete system of events

Let \Omega be the sample space of events A_1, \cdots, A_n; we will say that they form a complete system of events (CSE) if and only if they fulfill:

  1. A_i \not= \emptyset, \forall i
  2. A_i \cap A_j = \emptyset, \forall i \not= j
  3. A_1 \cup \cdots \cup A_n = \Omega

Total probability theorem

Let \Omega be a sample space, A_1, \cdots, A_n a complete system of events and B another event; then:

P(B) = P(B | A_1) \cdot P(A_1) + \cdots + P(B | A_n) \cdot P(A_n)

Proof

Since A_1, \cdots, A_n form a complete system of events, B = (B \cap A_1) \cup \cdots \cup (B \cap A_n) with the pieces pairwise disjoint, so P(B) = P(B \cap A_1) + \cdots + P(B \cap A_n). Since P(B | A) = \frac{P(B \cap A)}{P(A)}, we have P(B \cap A) = P(B | A) \cdot P(A), and therefore P(B) = P(B | A_1) \cdot P(A_1) + \cdots + P(B | A_n) \cdot P(A_n)

Bayes' theorem

Let \Omega be a sample space, A_1, \cdots, A_n a complete system of events and B another event; then:

P(A_i | B) = \frac{P(B | A_i) P(A_i)}{P(B)}, \forall i \in \{1, \cdots, n\}

Example of Bayes' theorem

The entire production of a company is carried out by 3 machines working independently. The first does half the work, the second a fifth and the third the rest. So far these machines have produced 2%, 4% and 3% defective units, respectively. We want to calculate:

  1. The percentage of defective parts that the company produces
  2. If we pick a part at random and it turns out to be defective, what is the most likely machine to produce it?

Before making any calculations, we will organize the information given in the problem

Probability that a part is produced on a given machine:

Probability of the machine Result
P(M_1) \frac{1}{2} = 0.5
P(M_2) \frac{1}{5} = 0.2
P(M_3) 1 - \frac{1}{2} - \frac{1}{5} = \frac{10-5-2}{10}=\frac{3}{10}=0.3

Probability that a part is defective, depending on whether it is produced on a specific machine:

Probability of being defective given the machine Result
P(D | M_1) 2\cdot \frac{1}{100} = 0.02
P(D | M_2) 4\cdot \frac{1}{100} = 0.04
P(D | M_3) 3\cdot \frac{1}{100} = 0.03

Now we move on to answering the questions

  1. We apply the total probability theorem

    P(D) = P(D | M_1) \cdot P(M_1) + P(D | M_2) \cdot P(M_2) + P(D | M_3) \cdot P(M_3)
    = 0.02 \cdot 0.5 + 0.04 \cdot 0.2 + 0.03 \cdot 0.3 = 0.027

    Therefore, the company produces a 0.027 \cdot 100 = 2.7\% of defective parts
  2. Before we can answer the question we need to calculate the probability of each machine given that the part is defective, and then choose the largest one. To do this, we will use Bayes' theorem

    P(M_1 | D) = \frac{P(D | M_1) \cdot P(M_1)}{P(D)} = \frac{0.02 \cdot 0.5}{0.027} = 0.3704

    P(M_2 | D) = \frac{P(D | M_2) \cdot P(M_2)}{P(D)} = \frac{0.04 \cdot 0.2}{0.027} = 0.2963

    P(M_3 | D) = \frac{P(D | M_3) \cdot P(M_3)}{P(D)} = \frac{0.03 \cdot 0.3}{0.027} = 0.3333

    Therefore, the machine most likely to have produced the defective part is M_1 (the sketch below reproduces this calculation)
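
A minimal Python sketch reproducing both calculations (the probabilities are the ones given in the statement above):

```python
# Total probability and Bayes' theorem for the three-machine example above
P_M = {"M1": 0.5, "M2": 0.2, "M3": 0.3}             # P(machine)
P_D_given_M = {"M1": 0.02, "M2": 0.04, "M3": 0.03}  # P(defective | machine)

# 1) Total probability theorem: P(D)
P_D = sum(P_D_given_M[m] * P_M[m] for m in P_M)
print("P(D) =", P_D)                                # 0.027

# 2) Bayes' theorem: P(machine | defective)
posterior = {m: P_D_given_M[m] * P_M[m] / P_D for m in P_M}
print(posterior)                                    # M1 ≈ 0.3704 is the largest
print("most likely machine:", max(posterior, key=posterior.get))
```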

Random Variable

Random Variable

A random variable is a function that associates to each elementary event a perfectly defined number:

\xi | \Omega \rightarrow \mathbb{R}

One-dimensional random variable

Let \Omega be a sample space and P its probability; we will call a one-dimensional random variable (v.a.) a mapping:

\begin{cases} \xi | \Omega \rightarrow \mathbb{R} \\ \omega \rightarrow \xi(\omega) \in \mathbb{R} \end{cases}

Example of random variable

\Omega \equiv \text{"all 3-bit words"}
\xi \equiv \text{"}n^{\underline{0}}\text{ of ones in those words"}
\Omega \equiv \{000, 001, 010, 011, 100, 101, 110, 111\}

\xi | \Omega \rightarrow \mathbb{R}
000 \rightarrow 0
001 \rightarrow 1
010 \rightarrow 1
011 \rightarrow 2
100 \rightarrow 1
101 \rightarrow 2
110 \rightarrow 2
111 \rightarrow 3

P_\xi(0) = P\{\xi = 0\} = P\{000\} = \frac{1}{8} = 0.125
P_\xi(1) = P\{\xi = 1\} = P\{001, 010, 100\} = \frac{3}{8} = 0.375
P_\xi(2) = P\{\xi = 2\} = P\{011, 101, 110\} = \frac{3}{8} = 0.375
P_\xi(3) = P\{\xi = 3\} = P\{111\} = \frac{1}{8} = 0.125
P_\xi(-1) = P\{\xi = -1\} = P\{\emptyset\} = 0
P_\xi(0.75) = P\{\xi = 0.75\} = P\{\emptyset\} = 0
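
A minimal Python sketch that enumerates this sample space and computes the probability function of \xi:

```python
from itertools import product

# Sample space: all 3-bit words, each with probability 1/8
omega = ["".join(bits) for bits in product("01", repeat=3)]

# The random variable ξ: number of ones in the word
xi = {w: w.count("1") for w in omega}

# Probability function P(ξ = x)
pmf = {}
for w, x in xi.items():
    pmf[x] = pmf.get(x, 0) + 1 / len(omega)
print(pmf)   # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```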

Distribution function

Let \xi be a v.a. (random variable); we will call the distribution function of \xi the function:

\begin{cases}F|\mathbb{R} \rightarrow [0, 1] \\ x \rightarrow F(x) = P(\xi \in (-\infty, x]) = P(\xi \leq x) \text{ with }x \in \mathbb{R} \end{cases}

Example of distribution function

F(0) = P\{\xi \leq 0\} = \frac{1}{8} = 0.125
F(1) = P\{\xi \leq 1\} = \frac{1}{8} + \frac{3}{8} = \frac{1}{2} = 0.5
F(2) = P\{\xi \leq 2\} = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} = \frac{7}{8} = 0.875
F(3) = P\{\xi \leq 3\} = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1
F(-1) = P\{\xi \leq -1\} = P\{\emptyset\} = 0
F(0.75) = P\{\xi \leq 0.75\} = F(0) = \frac{1}{8} = 0.125

V.a. of discrete type

Let \xi be a one-dimensional v.a.; we will say it is discrete if the set D_\xi = \{x \in \mathbb{R} | P\{\xi = x\} > 0\} is a countable set (finite or countably infinite)

Being:

\begin{cases} D_\xi \equiv \text{"support of the v.a. }\xi\text{"} \\ x \in D_\xi \equiv \text{"mass points of the v.a. }\xi\text{"} \\ P\{\xi = x\}\text{ with }x \in D_\xi \equiv \text{"probability function of }\xi\text{"} \\ P_i = P\{\xi = x_i\}\text{ with }P_i > 0 \text{ and }\sum\limits_{i=1}^{n} P_i = 1 \end{cases}

V.a. of continuous type

We will say that a v.a. \xi is of continuous type if the set of values it can take is uncountable

It is defined through a density function: \exists f|\mathbb{R}\rightarrow\mathbb{R}^+ \text{ such that } F(x)=\int^{x}_{-\infty} f(t) \cdot dt

The probability of taking any particular value is zero (P(\xi = x) = 0), and accordingly:

p(x_1 < \xi \leqslant x_2) = p(x_1 < \xi < x_2) = F(x_2) - F(x_1)

Density function

We call density function a function f(x) from which we can calculate probabilities as the area enclosed between it and the horizontal axis

Being:

\begin{cases} f(x) \geq 0, \forall x \in \mathbb{R},\text{ }f(x) \text{ integrable} \\ \int^{+\infty}_{-\infty} f(x) \cdot dx = 1 \end{cases}

Central position measure: the mean \mu\text{ or }E[\xi]

Let \xi be a v.a.; we will call expectation (or mean) the value denoted E[\xi]=\mu which, in the case of discrete variables, is:

E[\xi] = \mu = \sum\limits_{i=1}^{n} x_i \cdot P_i

And in the continuous ones:

E[\xi] = \mu = \int^{+\infty}_{-\infty} x \cdot f(x) \cdot dx

Properties of the mean

  1. E[k] = k\text{; if k is constant}
  2. E[\xi + a] = E[\xi] + a\text{; if a is constant (change of origin)}
  3. E[b\cdot\xi] = b\cdot E[\xi]\text{; if b is constant (change of scale)}
  4. E[a + b\cdot\xi] = a + b\cdot E[\xi]\text{ if a and b are constants (linear transformation)}
  5. E[\xi_1 + \cdots + \xi_n] = E[\xi_1] + \cdots + E[\xi_n]
  6. k_1 \leq \xi \leq k_2 \Rightarrow k_1 \leq E[\xi] \leq k_2
  7. \xi_1 \leq \xi_2 \Rightarrow E[\xi_1] \leq E[\xi_2]

Absolute dispersion measure: the variance \sigma^2 \text{ or } Var[\xi]

Let \xi be a v.a.; we will call variance:

\sigma^2 = Var(\xi) = E[(\xi - \mu)^2]\text{ where }\mu = E[\xi]

For discrete variables, the following is calculated:

\sigma^2 = Var(\xi) = \sum\limits_{i=1}^{n} (x_i - \mu)^2 p_i

For continuous variables, the following is calculated:

\sigma^2 = Var(\xi) = \int^{+\infty}_{-\infty} (x - E(x))^2 \cdot f(x) \cdot dx
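
A minimal Python sketch computing the mean and variance of the discrete v.a. from the 3-bit example above (and checking the identity \sigma^2 = E[\xi^2] - E^2[\xi] listed in the properties below):

```python
# Mean and variance of the discrete v.a. ξ = "number of ones in a 3-bit word"
pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

mean = sum(x * p for x, p in pmf.items())                 # E[ξ] = 1.5
var = sum((x - mean) ** 2 * p for x, p in pmf.items())    # Var(ξ) = 0.75

# Identity: Var(ξ) = E[ξ²] - E[ξ]²
var_alt = sum(x**2 * p for x, p in pmf.items()) - mean**2
print(mean, var, var_alt)
```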

Properties of the variance

  1. \sigma^2 = Var(\xi) = E[\xi^2] - E^2[\xi]\text{ in general}
    \sigma^2 = \sum\limits_{i=1}^{n} x^2_i \cdot p_i - \left(\sum\limits_{i=1}^{n} x_i \cdot p_i\right)^2\text{ in the discrete variables}
    \sigma^2 = \int^{+\infty}_{-\infty} x^2 \cdot f(x) \cdot dx - \left(\int^{+\infty}_{-\infty} x \cdot f(x) \cdot dx\right)^2\text{ in the continuous variables}
  2. Var(\xi) \geq 0
  3. Var(\xi) = 0\text{ if }\xi\text{ is constant}
  4. Var(\xi + a) = Var(\xi)\text{ if a is constant}
  5. Var(b\cdot\xi) = b^2\cdot Var(\xi)\text{ if b is constant}
  6. Var(a + b\cdot\xi) = b^2\cdot Var(\xi)\text{ if a and b are constants}

Standard deviation \sigma

Let \xi be a v.a.; we will call standard deviation:

\sigma = dt(\xi) = +\sqrt{Var(\xi)}

It is the positive square root of the variance

Tchebycheff's inequality

If a v.a. \xi has mean \mu and standard deviation \sigma, then for any k > 0 it holds that:

P\{|\xi - \mu| \leq k\cdot\sigma\} \geq 1 - \frac{1}{k^2}

Or what's the same:

P\{|\xi - \mu| > k\cdot\sigma\} \leq \frac{1}{k^2}
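
A minimal Python sketch (the fair-die distribution and the values of k are assumptions for the illustration) checking the second form of the inequality empirically:

```python
import random

# Empirical check of Tchebycheff's inequality for a fair die
# (mean μ = 3.5, variance 35/12, σ ≈ 1.708)
random.seed(0)
mu, sigma = 3.5, (35 / 12) ** 0.5
n = 100_000
sample = [random.randint(1, 6) for _ in range(n)]

for k in (1, 1.2, 1.5, 2):
    freq = sum(abs(x - mu) > k * sigma for x in sample) / n
    print(k, freq, "<=", 1 / k**2)   # observed tail frequency stays below the bound
```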

Two-dimensional random variable

A discrete two-dimensional v.a. is a mapping:

\begin{cases} \Omega \rightarrow \mathbb{R}^2 \\ \omega \rightarrow (x, y) \in \mathbb{R}^2 \end{cases}

Where the set of points with probability > 0 is countable, its elements being (x_i, y_j)

We will call mass points the points with probability \not= 0 of a discrete two-dimensional v.a., and we will denote the variable (\xi_1, \xi_2), where \xi_1 and \xi_2 are one-dimensional v.a.

We will call probability function the probabilities of the mass points, that is, the values:

\begin{cases} P_{i j} = P\{(\xi_1, \xi_2) = (x_i, y_j)\} = P\{\xi_1 = x_i, \xi_2 = y_j\} \\ \sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} P_{i j} = 1\end{cases}

The sum of all the values of the probability function must always be 1

From it we can obtain the following probability matrix:

\begin{pmatrix} \xi_1, \xi_2& y_1& \cdots& y_m& p_{i *} \\ x_1& p_{1 1}& \cdots& p_{1 m}& p_{1 *} \\ \cdots& \cdots& \cdots& \cdots& \cdots \\ x_n& p_{n 1}& \cdots& p_{n m}& p_{n *} \\ p_{* j}& p_{* 1}& \cdots& p_{* m}& 1 \end{pmatrix}

For a discrete two-dimensional random variable (\xi_1, \xi_2), the marginal distributions are the distributions of the one-dimensional v.a. \xi_1\text{ and }\xi_2. In the case of discrete type v.a., the marginal probability functions are:

\begin{cases} \xi_1 | p_i = p\{\xi_1 = x_i\} = \sum\limits_{j=1}^{m} p_{i, j} = \sum\limits_{j=1}^{m} p\{\xi_1 = x_i, \xi_2 = y_j\} \\ \xi_2 | p_j = p\{\xi_2 = y_j\} = \sum\limits_{i=1}^{n} p_{i, j} = \sum\limits_{i=1}^{n} p\{\xi_1 = x_i, \xi_2 = y_j\} \end{cases}

For a discrete two-dimensional random variable (\xi_1, \xi_2), the conditional distributions are the distributions of one of the components of the two-dimensional v.a. (\xi_1\text{ or }\xi_2) given a value of the other component (\xi_2\text{ or }\xi_1 respectively). In the case of discrete type v.a., the conditional probability functions are:

\begin{cases} \xi_1 \text{ given } \xi_2 | p(\xi_1 = x_i | \xi_2 = y_j) = \frac{p(\xi_1 = x_i, \xi_2 = y_j)}{p(\xi_2 = y_j)} = \frac{p_{i, j}}{p_{., j}} \\ \xi_2 \text{ given } \xi_1 | p(\xi_2 = y_j | \xi_1 = x_i) = \frac{p(\xi_1 = x_i, \xi_2 = y_j)}{p(\xi_1 = x_i)} = \frac{p_{i, j}}{p_{i, .}} \end{cases}

To obtain the mean, a column vector of means is used:

\begin{pmatrix} E[\xi_1] \\ E[\xi_2] \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}

Covariance

Let (\xi_1, \xi_2) be a two-dimensional v.a.; we will call covariance between \xi_1 and \xi_2:

\sigma_{1, 2} = Cov(\xi_1, \xi_2) = E[(\xi_1 - \mu_1) \cdot (\xi_2 - \mu_2)] \text{ with }\mu_1 = E(\xi_1) \text{ and }\mu_2 = E(\xi_2)

For discrete variables, the following is calculated:

\sigma_{1, 2} = Cov(\xi_1, \xi_2) = \sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} \left((x_i - \mu_1) \cdot (y_j - \mu_2)\right) p_{i, j}

The covariance measures the linear relationship or covariation between two variables

It is useful to use the covariance matrix:

\Sigma = \begin{pmatrix} Var(\xi_1) & Cov(\xi_1, \xi_2) \\ Cov(\xi_1, \xi_2) & Var(\xi_2) \end{pmatrix} = \begin{pmatrix} \sigma^2_1 & \sigma_{1, 2} \\ \sigma_{1, 2} & \sigma^2_2 \end{pmatrix}
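
A minimal Python sketch computing the covariance from a small joint probability table (the table values are an illustrative assumption, not taken from the text), also checking the identity Cov(\xi_1, \xi_2) = E[\xi_1 \xi_2] - E[\xi_1] E[\xi_2] from property 1 below:

```python
# Covariance of a discrete two-dimensional v.a. from a (hypothetical) joint
# probability table p[(x, y)] = P(ξ1 = x, ξ2 = y)
p = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
assert abs(sum(p.values()) - 1) < 1e-12

mu1 = sum(x * pij for (x, y), pij in p.items())   # E[ξ1]
mu2 = sum(y * pij for (x, y), pij in p.items())   # E[ξ2]

cov = sum((x - mu1) * (y - mu2) * pij for (x, y), pij in p.items())

# Identity: Cov = E[ξ1 ξ2] - E[ξ1] E[ξ2]
cov_alt = sum(x * y * pij for (x, y), pij in p.items()) - mu1 * mu2
print(mu1, mu2, cov, cov_alt)
```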

Properties of the covariance

  1. Cov(\xi_1, \xi_2) = E[\xi_1 \xi_2] - E[\xi_1] E[\xi_2]\text{ where }E[\xi_1 \xi_2] = \sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} \left((x_i y_j) \cdot (p_{i, j})\right)
  2. Cov(\xi_1 + a, \xi_2 + b) = Cov(\xi_1, \xi_2)\text{ with constants a and b}
  3. Cov(a \cdot \xi_1, b \cdot \xi_2) = a \cdot b \cdot Cov(\xi_1, \xi_2)\text{ with constants a and b}
  4. Cov(\xi_1 + \xi_2, \xi_3) = Cov(\xi_1, \xi_3) + Cov(\xi_2, \xi_3)
  5. Cov(\xi_1 + \xi_2, \xi_3 + \xi_4) = Cov(\xi_1, \xi_3) + Cov(\xi_1, \xi_4) + Cov(\xi_2, \xi_3) + Cov(\xi_2, \xi_4)
  6. Var(\xi_1 + \xi_2) = Var(\xi_1) + Var(\xi_2) + 2 \cdot Cov(\xi_1, \xi_2)
  7. Var(\xi_1 - \xi_2) = Var(\xi_1) + Var(\xi_2) - 2 \cdot Cov(\xi_1, \xi_2)
  8. Var(\xi_1 + \xi_2) = Var(\xi_1) + Var(\xi_2)\text{ if }\xi_1\text{ and }\xi_2\text{ they are uncorrelated}
  9. Var(\xi_1 - \xi_2) = Var(\xi_1) + Var(\xi_2)\text{ if }\xi_1\text{ and }\xi_2\text{ they are uncorrelated}

Coefficient of linear correlation

We will call coefficient of linear correlation between \xi_1\text{ and }\xi_2:

p_{1 2} = Corr(\xi_1, \xi_2) = \frac{Cov(\xi_1, \xi_2)}{\sqrt{Var(\xi_1) \cdot Var(\xi_2)}} = \frac{\sigma_{1 2}}{\sigma_1 \cdot \sigma_2}

The linear correlation coefficient measures the degree of linear relationship between two variables

Uncorrelated

Let \xi_1\text{ and }\xi_2 be v.a.; we will say that they are uncorrelated if they have no linear relationship, i.e.:

Cov(\xi_1, \xi_2) = 0

Correlated

Let \xi_1\text{ and }\xi_2 be v.a.; we will say that they are correlated if they have a linear relationship, i.e.:

Cov(\xi_1, \xi_2) \neq 0

Pearson's correlation coefficient

p_{1 2} = Corr(\xi_1, \xi_2) = \frac{Cov(\xi_1, \xi_2)}{dt(\xi_1) \cdot dt(\xi_2)} = \frac{\sigma_{1 2}}{\sigma_1 \cdot \sigma_2}

Note:

\tiny\begin{cases} p_{1 2} = 0 \Leftrightarrow \sigma_{1 2} = 0 \Leftrightarrow \text{ without linear relationship, they are uncorrelated} \\ p_{1 2} \neq 0 \Leftrightarrow \sigma_{1 2} \neq 0 \Leftrightarrow \text{ with linear relationship, they are correlated} \\ p_{1 2} > 0 \Leftrightarrow \sigma_{1 2} > 0 \Leftrightarrow \text{ with increasing linear relationship} \\ p_{1 2} < 0 \Leftrightarrow \sigma_{1 2} < 0 \Leftrightarrow \text{ with decreasing linear relationship} \end{cases}

\text{Given }-1 \leq p_{1 2} \leq 1:

\tiny\begin{cases} p_{1 2} = 1 \Leftrightarrow \text{ with perfect increasing linear relationship} \\ p_{1 2} = -1 \Leftrightarrow \text{ with perfect decreasing linear relationship} \\ p_{1 2} \approx 0 \Leftrightarrow \text{ with weak linear relationship} \\ p_{1 2} \approx \pm 1 \Leftrightarrow \text{ with strong linear relationship} \end{cases}

Independent

Let \xi_1\text{ and }\xi_2 be v.a.; we will say that they are independent if they do not have any kind of relationship, that is, if they meet any of the following equivalent conditions:

  1. p(\xi_1 = x_i|\xi_2 = y_j) = p(\xi_1 = x_i); \forall(x_i, y_j)
  2. p(\xi_2 = y_j|\xi_1 = x_i) = p(\xi_2 = y_j); \forall(x_i, y_j)
  3. p(\xi_1 = x_i, \xi_2 = y_j) = p(\xi_1 = x_i) p(\xi_2 = y_j); \forall(x_i, y_j)

Dependent

Let \xi_1\text{ and }\xi_2 be v.a.; we will say that they are dependent if they have some kind of relationship