
Probability


What is probability?

If you are familiar with probability, you may skip this section.

Probability is a numerical measure of how likely an event is to occur. Imagine you have a fair coin and want to find the probability (likelihood) that the coin lands on heads or tails, estimated from \(N\) flips as \(\frac{Heads}{Flips}\) and \(\frac{Tails}{Flips}\). You proceed to run five trials of an experiment, flipping the coin ten times as many times in each successive trial and recording how often it lands on heads or tails.

| Flips | Heads | Tails | \(\frac{Heads}{Flips}\) | \(\frac{Tails}{Flips}\) |
| --- | --- | --- | --- | --- |
| \(1 \times 10^0\) | 1 | 0 | \(1\) | \(0\) |
| \(1 \times 10^1\) | 3 | 7 | \(\frac{3}{10}\) | \(\frac{7}{10}\) |
| \(1 \times 10^2\) | 52 | 48 | \(\frac{52}{100}\) | \(\frac{48}{100}\) |
| \(1 \times 10^3\) | 487 | 513 | \(\frac{487}{1000}\) | \(\frac{513}{1000}\) |
| \(1 \times 10^4\) | 4983 | 5017 | \(\frac{4983}{10000}\) | \(\frac{5017}{10000}\) |

With this data, you can see a pattern starting to emerge. The more times we flip the coin, the closer \(\frac{Heads}{Flips}\) and \(\frac{Tails}{Flips}\) approach \(\frac{1}{2}\). The probability of heads or tails is what these ratios converge to as the number of flips approaches infinity, which gives \(\frac{1}{2}\) for both heads and tails. This convergence of observed frequencies to the true probability is known as the Law of Large Numbers.
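
If you'd like to see the Law of Large Numbers in action, you can rerun the experiment in code. Below is a minimal Python sketch (the function name and flip counts are our own choices, not from the table above):

```python
import random

def heads_ratio(n_flips: int) -> float:
    """Simulate n_flips of a fair coin and return the fraction of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The ratio drifts toward 1/2 as the number of flips grows.
for n in [10, 100, 1_000, 10_000, 100_000]:
    print(f"{n:>7} flips: Heads/Flips = {heads_ratio(n):.4f}")
```

Each run produces different counts, but the final ratios land ever closer to \(\frac{1}{2}\).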

However, rather than using fractional notation, mathematicians formally define the probability of an event \(E\) occurring as \(P(E)\). With \(H = Heads\) and \(T = Tails\) as the two outcomes (events) of the coin, we can then say that \(P(H) = \frac{Heads}{Flips} = \frac{1}{2}\) and \(P(T) = \frac{Tails}{Flips} = \frac{1}{2}\).

Figure: visualization of events in a sample space.
Assume there is an unfair coin with outcomes heads and tails such that \(P(H) = 0.75\) and \(P(T) = 0.25\). If we flip this coin 1000 times, which of the following outcomes is most likely?
Answer explanation
\(Heads = \frac{Heads}{Flips} \times Flips = P(H) \times Flips\)

For an unfair coin with \(P(H) = 0.75\) and \(P(T) = 0.25\), we would expect approximately \(75\%\) of flips to be heads. Using the fractional definition, we get that \(Heads = \frac{Heads}{Flips} \times Flips = P(H) \times Flips\). With \(1000\) flips, the expected number of heads would be \(1000 \times 0.75 = 750\). While it's possible to get exactly 750 heads, the actual outcome is likely to fall in a range around this value. Therefore, getting between 700 and 800 heads (option C) is most likely, as this range contains the expected value and accounts for some variation.
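
To convince yourself that 700 to 800 heads is the likely range, here is a small simulation sketch (the helper name fraction_in_range and the session count are our own choices):

```python
import random

# Run many sessions of 1000 flips of a coin with P(H) = 0.75 and count
# how often the number of heads lands between 700 and 800.
def fraction_in_range(p_heads=0.75, flips=1000, sessions=1_000):
    in_range = 0
    for _ in range(sessions):
        heads = sum(random.random() < p_heads for _ in range(flips))
        if 700 <= heads <= 800:
            in_range += 1
    return in_range / sessions

print(fraction_in_range())  # prints a value very close to 1.0
```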

Axioms of Probability

Probability theory is built on three fundamental axioms:

  1. Non-negativity: The probability of an event is always greater than or equal to \(0\).

    • \(P(E) \geq 0\) for any event \(E\).
    • Example: The probability of rolling a \(6\) on a die cannot be negative.
  2. Normalization: The probability of the entire sample space (all possible outcomes) is \(1\).

    • \(P(S) = 1\), where \(S\) is the sample space.
    • Example: When flipping a coin, \(P(\text{Heads}) + P(\text{Tails}) = 1\).
  3. Additivity: For mutually exclusive (disjoint) events, the probability of either event occurring is the sum of their individual probabilities.

    • If \(E_1\) and \(E_2\) are mutually exclusive, then \(P(E_1 \cup E_2) = P(E_1) + P(E_2)\). Furthermore, \(P(E_1 \cap E_2) = P(\emptyset) = 0\).
    • Example: The probability of rolling either a \(1\) or a \(2\) on a fair die is \(P(1) + P(2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}\).

These axioms form the mathematical foundation for probability and are essential for understanding how we update our beliefs in Probabilistic Occupancy Grid Mapping.
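
As a quick sanity check, the axioms can be verified numerically for a simple distribution. Here is a minimal Python sketch using a fair six-sided die (the dictionary representation is our own choice):

```python
# A discrete distribution for a fair six-sided die: face -> probability.
die = {face: 1 / 6 for face in range(1, 7)}

# Axiom 1 (non-negativity): every probability is >= 0.
assert all(p >= 0 for p in die.values())

# Axiom 2 (normalization): the whole sample space has probability 1.
assert abs(sum(die.values()) - 1) < 1e-9

# Axiom 3 (additivity): {1} and {2} are disjoint, so probabilities add.
print(die[1] + die[2])  # 0.333... = 1/3, matching the example above
```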

Probabilistic OGM utilizes probability theory for a more adaptive approach, incorporating uncertainty through probabilities and updating the map based on likelihoods rather than binary labels.

Multiplication Rule

There are two cases of the multiplication rule: independent and dependent. The independent version states that the probability that two independent events \(A\) and \(B\) both occur is \(P(A) \cdot P(B)\). The dependent version states that the probability that two dependent events \(A\) and \(B\) both occur is \(P(A|B)\), the probability that \(A\) occurs given that \(B\) occurs, times \(P(B)\). We will go over what \(P(A|B)\) means in the next section. Also, mathematicians don't like using words, so they replace "and" with \(\cap\). Thus, we write these rules formally as:

\(\text{Independent: } P(A \cap B) = P(A) \cdot P(B)\)
\(\text{Dependent: } P(A \cap B) = P(A|B) \cdot P(B)\)
Consider two fair coins: what is the probability that both coins land on heads?
Answer explanation
\(P(A \cap B) = P(A) \cdot P(B)\)

We define \(H_1 = \text{coin 1 lands on heads}\) and \(H_2 = \text{coin 2 lands on heads}\). Since both coins are fair, we know that \(P(H_1) = P(H_2) = \frac{1}{2}\). Then, \(P(H_1 \cap H_2) = P(H_1) \cdot P(H_2) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} = 0.25\). Thus, the answer is A.
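
You can also check this result by simulation. The sketch below (trial count chosen arbitrarily) estimates \(P(H_1 \cap H_2)\) by flipping two independent fair coins many times:

```python
import random

trials = 100_000
both_heads = 0
for _ in range(trials):
    coin1_heads = random.random() < 0.5  # flip coin 1
    coin2_heads = random.random() < 0.5  # flip coin 2 (independent of coin 1)
    if coin1_heads and coin2_heads:
        both_heads += 1

print(both_heads / trials)  # ~0.25 = P(H1) * P(H2)
```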

Conditional Probability

Conditional probability describes how the knowledge that one event occurs influences the probability of another. Formally, it is written as:

\(P(A|B) = \frac{P(A \cap B)}{P(B)}\)

the probability that \(A\) occurs given \(B\), where \(A\) and \(B\) are events in the same sample space \(\Omega\).

Another way to understand this is to treat \(B\) as your new probability space and calculate the proportion of \(B\) that \(A\) takes up. For example, \(P(E|\Omega)\), the probability that an event \(E\) occurs given that any outcome occurs, is \(P(E|\Omega) = \frac{P(E \cap \Omega)}{P(\Omega)} = \frac{P(E)}{1} = P(E)\). However, with just the probabilities \(P(A)\) and \(P(B)\), there is no way to calculate \(P(A|B)\). Therefore, we introduce Bayes' Theorem, which gives us more ways to solve for \(P(A|B)\).

Figure: conditional probability visualization.
Eddy is a superstitious gambler. He visits the casino every night except on rainy days, which occur on average once every four days. However, he also believes that on every fifth rainy day, his luck will increase dramatically, prompting him to visit the casino that night. What is the probability that, on a given day, Eddy visits the casino and it is raining?
Answer explanation
\(P(A|B) = \frac{P(A \cap B)}{P(B)}\)

In this problem, we are attempting to solve for \(P(C \cap R)\). To do this, we are given three essential pieces of information: Eddy visits the casino every night it does not rain, it rains once every four days, and on every fifth rainy day Eddy goes to the casino. Thus, we define the events \(C = \text{Eddy visits the casino}\) and \(R = \text{it is raining}\), with \(P(R) = 0.25\) and \(P(C|R) = 0.2\), the probability that Eddy visits the casino given that it is raining. Using the formula: \(P(C|R) = \frac{P(C \cap R)}{P(R)} \rightarrow P(C \cap R) = P(C|R) \cdot P(R) = 0.2 \cdot 0.25 = 0.05\), which is answer D.
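
A short Monte Carlo check of this answer (day count chosen arbitrarily) looks like:

```python
import random

days = 1_000_000
casino_and_rain = 0
for _ in range(days):
    raining = random.random() < 0.25    # it rains 1 day in 4
    if raining:
        visits = random.random() < 0.2  # every fifth rainy day, Eddy goes
    else:
        visits = True                   # Eddy always goes on dry nights
    if visits and raining:
        casino_and_rain += 1

print(casino_and_rain / days)  # ~0.05 = P(C|R) * P(R)
```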

Bayes' Theorem

Bayes' Theorem is a formula that gives another way to compute \(P(A|B)\) from known information. Written out, the formula is:

\(P(A|B) = \frac{P(B|A) P(A)}{P(B)}\)
  • \(P(A|B) = \text{Posterior}\): what you believe after seeing the data.

  • \(P(B|A) = \text{Likelihood}\): how well each hypothesis predicts the new data.

  • \(P(A) = \text{Prior}\): what you believed before new data.

  • \(P(B) = \text{Marginal}\): normalizing factor to ensure all posteriors sum to 1.

Proof: feel free to skip

Let \(P(A|B) := \frac{P(A \cap B)}{P(B)}\) and \(P(B|A) := \frac{P(B \cap A)}{P(A)}\). We want to show that \(P(A|B) = \frac{P(B|A) P(A)}{P(B)}\). Since \(A \cap B = B \cap A\), we have $$ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{\frac{P(A \cap B)}{P(A)}P(A)}{P(B)} = \frac{P(B|A) P(A)}{P(B)} $$ Thus, we have proved that \(P(A|B) = \frac{P(B|A) P(A)}{P(B)}\).

So why is Bayes' Theorem useful? The key property of the theorem is that we can invert \(P(B|A)\) to \(P(A|B)\), allowing us to re-evaluate our beliefs as new data arrives. One way to view Bayes' Theorem is through the lens of a cookie-jar analogy. Let \(A = \text{Hypothesis}\) and \(B = \text{Data}\). Then Bayes' Theorem becomes:

\(P(\text{Hypothesis}|\text{Data}) = \frac{P(\text{Data}|\text{Hypothesis})P(\text{Hypothesis})}{P(\text{Data})}\)

| Term | In the formula | Cookie-jar analogy |
| --- | --- | --- |
| Prior | \(P(\text{Hypothesis})\) | Before you look, your guess at "What flavor is most likely in my hand?" based on what you know about the jar. |
| Likelihood | \(P(\text{Data} \mid \text{Hypothesis})\) | "If the cookie is chocolate-chip, how likely am I to see these chocolate crumbs on my sleeve?" |
| Marginal | \(P(\text{Data})\) | "Overall chance I'd get crumbs on my sleeve if I randomly grabbed any cookie", averaging over all flavors. |
| Posterior | \(P(\text{Hypothesis} \mid \text{Data})\) | After seeing crumbs, your updated guess about the flavor in your hand. |
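
To make the analogy concrete, here is a minimal Bayes update in Python. All the numbers below are made-up illustrative values, not taken from the text:

```python
# P(Hypothesis): prior belief over cookie flavors.
prior = {"chocolate-chip": 0.6, "oatmeal": 0.4}
# P(Data | Hypothesis): how well each flavor explains the crumbs.
likelihood = {"chocolate-chip": 0.9, "oatmeal": 0.1}

# P(Data): the marginal, averaged over all hypotheses.
marginal = sum(likelihood[h] * prior[h] for h in prior)

# P(Hypothesis | Data): the posterior for each flavor.
posterior = {h: likelihood[h] * prior[h] / marginal for h in prior}
print(posterior)  # chocolate-chip: ~0.93, oatmeal: ~0.07
```

Note how the marginal is itself a sum over hypotheses; this is exactly the law of total probability covered in the next section.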

Bayes' Theorem is used as a foundational concept in Probabilistic Occupancy Grid Mapping, and is how we compute the probability of each cell.

Law of Total Probability

The law of total probability states that the probability of any event \(A\) can be written as a sum over disjoint events \(B_1, B_2, \dots, B_k\) that partition the sample space. Formally, it is written as:

\(P(A) = P(A \cap B_1) + P(A \cap B_2) + \cdots + P(A \cap B_k) = \sum_{i=1}^{k}P(A \cap B_i)\)

From our previous knowledge of conditional probability, we also know that \(P(A|B) = \frac{P(A \cap B)}{P(B)}\). Therefore, we can also write this equation as:

\(P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + \cdots + P(A|B_k)P(B_k) = \sum_{i=1}^{k}P(A|B_i)P(B_i)\)

This fact forms the basis as to why we can use Bayes' Theorem to compute probability in Probabilistic OGM.
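
As a small illustration under assumed numbers (the bags, their weights, and the conditional probabilities below are hypothetical):

```python
# Partition B_1..B_3: which of three bags a ball is drawn from.
p_bag = [0.5, 0.3, 0.2]            # P(B_i); must sum to 1
p_red_given_bag = [0.1, 0.5, 0.9]  # P(A | B_i): red ball given bag i

# Law of total probability: P(A) = sum_i P(A|B_i) * P(B_i).
p_red = sum(pa * pb for pa, pb in zip(p_red_given_bag, p_bag))
print(p_red)  # 0.05 + 0.15 + 0.18 = 0.38
```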

Figure: total probability; adding up the four squares, we get \(P(A)\).
After a particularly devastating night, Eddy has toned down his gambling. Now he visits the casino only every other night. He still doesn't go on rainy days, which still occur once every four days, and remains in firm belief that on every fifth rainy day, his luck will increase dramatically. What is the probability that it is raining given that Eddy is at the casino?
Answer explanation
\(P(A|B) = \frac{P(B|A) P(A)}{P(B)}\)

This problem is a lot more complex than the first one. However, using Bayes' Theorem and the Law of Total Probability, we can still solve it. We begin by defining the events \(C = \text{Eddy goes to the casino}\) and \(R = \text{it is raining}\). We know that \(P(C|R) = \frac{1}{5}\) and that \(P(R) = \frac{1}{4}\). However, we cannot say that \(P(C) = \frac{1}{2}\), as \(\frac{1}{2}\) is the probability he goes to the casino when it is not raining.

Thus, we define one more event: \(\neg R = \text{it is not raining}\). Then, we know that \(P(C|\neg R) = 0.50\). Since on any given day it is either raining or not raining, \(P(R) + P(\neg R) = 1\), so \(P(\neg R) = \frac{3}{4}\). We can now use the law of total probability to solve for \(P(C)\): $$P(C) = P(C|R)P(R) + P(C|\neg R)P(\neg R) = \frac{1}{5} \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{3}{4} = \frac{2}{40} + \frac{15}{40} = \frac{17}{40}$$ Then, using Bayes' Theorem, we solve for \(P(R|C)\): $$P(R|C) = \frac{P(C|R)P(R)}{P(C)} = \frac{\frac{1}{5} \cdot \frac{1}{4}}{\frac{17}{40}} = \frac{\frac{1}{20}}{\frac{17}{40}} = \frac{1}{20} \cdot \frac{40}{17} = \frac{2}{17}$$ Thus, the answer is B.
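
The arithmetic above can be double-checked with exact fractions; here is a short sketch doing so:

```python
from fractions import Fraction

p_rain = Fraction(1, 4)               # P(R)
p_casino_given_rain = Fraction(1, 5)  # P(C|R)
p_casino_given_dry = Fraction(1, 2)   # P(C|not R)

# Law of total probability: P(C).
p_casino = p_casino_given_rain * p_rain + p_casino_given_dry * (1 - p_rain)

# Bayes' Theorem: P(R|C).
p_rain_given_casino = p_casino_given_rain * p_rain / p_casino
print(p_casino, p_rain_given_casino)  # 17/40 and 2/17
```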