Bernoulli Trials
To continue our exploration of discrete distributions, we will look at situations that have two disjoint possibilities.
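For math symbols to represent a Bernoulli trial, the events are a success, with probability $p$, and a failure, with probability $1 - p$.
For example, for one flip of a fair coin, $P(H) = 0.5$ and $P(T) = 0.5$.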
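As a minimal sketch, one Bernoulli trial can be simulated in R with `sample()`. The fair-coin probability, the seed, and the variable names are assumptions chosen for illustration.

```r
# One Bernoulli trial: two disjoint outcomes with probabilities p and 1 - p
set.seed(32)   # assumed seed, only so the run is reproducible
p <- 0.5       # assumed: a fair coin

# a single flip
flip <- sample(c("H", "T"), size = 1, prob = c(p, 1 - p))

# ten independent flips of the same coin
flips <- sample(c("H", "T"), size = 10, replace = TRUE, prob = c(p, 1 - p))
flips
```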
Arrangements
Permutations (and the number of permutations) are the arrangements when order matters
Combinations (and the number of combinations) are the arrangements when order does not matter
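To make the difference concrete, here is a small sketch that counts both kinds of arrangements when picking 2 items out of 5. The values of `n` and `k` are assumptions for illustration only.

```r
n <- 5  # assumed: number of items available
k <- 2  # assumed: number of items picked

# order matters: number of permutations
n_perm <- factorial(n) / factorial(n - k)

# order does not matter: number of combinations
n_comb <- choose(n, k)

n_perm           # 20
n_comb           # 10
n_perm / n_comb  # 2, the factorial(k) orderings of each combination
```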
Flipping 3 fair coins, what is the probability that heads will be observed exactly twice?
Possibility Spaces
coin <- c("H", "T")
df <- data.frame(expand.grid(coin, coin, coin)) |>
  tidyr::unite("obs", c("Var1", "Var2", "Var3"),
               sep = "", remove = FALSE)
# print
dput(df$obs)
c("HHH", "THH", "HTH", "TTH", "HHT", "THT", "HTT", "TTT")
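Because the coins are fair, all 8 observations are equally likely, so the 3-coin question can be answered by counting. The sketch below uses only base R; the names `grid` and `n_heads` are assumptions, not part of the original code.

```r
coin <- c("H", "T")

# all 2^3 = 8 equally likely observations, one flip per column
grid <- expand.grid(coin, coin, coin, stringsAsFactors = FALSE)

# number of heads in each observation
n_heads <- rowSums(grid == "H")

# favorable outcomes divided by total outcomes
sum(n_heads == 2)                # 3 (HHT, HTH, THH)
sum(n_heads == 2) / nrow(grid)   # 0.375
```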
coin <- c("H", "T")
df <- data.frame(expand.grid(coin, coin, coin, coin)) |>
  tidyr::unite("obs", c("Var1", "Var2", "Var3", "Var4"),
               sep = "", remove = FALSE)
# print
dput(df$obs)
c("HHHH", "THHH", "HTHH", "TTHH", "HHTH", "THTH", "HTTH", "TTTH",
"HHHT", "THHT", "HTHT", "TTHT", "HHTT", "THTT", "HTTT", "TTTT"
)
coin <- c("H", "T")
df <- data.frame(expand.grid(coin, coin, coin, coin, coin)) |>
  tidyr::unite("obs", c("Var1", "Var2", "Var3", "Var4", "Var5"),
               sep = "", remove = FALSE)
# print
dput(df$obs)
c("HHHHH", "THHHH", "HTHHH", "TTHHH", "HHTHH", "THTHH", "HTTHH",
"TTTHH", "HHHTH", "THHTH", "HTHTH", "TTHTH", "HHTTH", "THTTH",
"HTTTH", "TTTTH", "HHHHT", "THHHT", "HTHHT", "TTHHT", "HHTHT",
"THTHT", "HTTHT", "TTTHT", "HHHTT", "THHTT", "HTHTT", "TTHTT",
"HHTTT", "THTTT", "HTTTT", "TTTTT")
Choose
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
- said "n choose k"
- This choose operator keeps track of the number of permutations in a certain combination
- note $0! = 1$ (to avoid dividing by zero)
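As a hedged sketch, the standard factorial formula for "n choose k" can be written by hand and compared with R's built-in `choose()`. The helper name `n_choose_k` is hypothetical.

```r
# hypothetical helper: n! / (k! (n - k)!)
n_choose_k <- function(n, k) {
  factorial(n) / (factorial(k) * factorial(n - k))
}

n_choose_k(3, 2)   # 3
choose(3, 2)       # 3, the built-in version

factorial(0)       # 1, so the denominator is never zero
n_choose_k(5, 0)   # 1

choose(5, 0:5)     # 1 5 10 10 5 1
```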
Binomial Distribution
$P(k) = \binom{n}{k} p^{k} (1-p)^{n-k}$, where $n$ and $k$ are whole numbers with $0 \le k \le n$
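A minimal sketch of the binomial probability, written out by hand and checked against R's `dbinom()`. The function name `binom_pmf` is hypothetical, and the values `n = 5`, `p = 0.32` are borrowed from the Squirtle example.

```r
# hypothetical helper: choose(n, k) * p^k * (1 - p)^(n - k)
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

k <- 0:5
by_hand  <- binom_pmf(k, n = 5, p = 0.32)
built_in <- dbinom(k, size = 5, prob = 0.32)

all.equal(by_hand, built_in)  # TRUE
sum(by_hand)                  # the probabilities over k = 0, ..., 5 add up to 1
```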
Example: Squirtle
$P(k) = \binom{n}{k} p^{k} (1-p)^{n-k}$, where $n$ and $k$ are whole numbers with $0 \le k \le n$
Historically, Squirtle defeats Charizard 32% of the time. If there are 5 battles, what is the probability that Squirtle wins exactly 2 times?
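Worked out with the binomial probability, taking $n = 5$ battles, $k = 2$ Squirtle wins, and $p = 0.32$ (each battle is assumed to be independent of the others):

$$P(k = 2) = \binom{5}{2} (0.32)^{2} (0.68)^{3} = 10 \times 0.1024 \times 0.314432 \approx 0.3220$$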
Battle Arena
battle <- c("S", "C")
df <- data.frame(expand.grid(battle, battle, battle, battle, battle)) |>
  tidyr::unite("obs", c("Var1", "Var2", "Var3", "Var4", "Var5"),
               sep = "", remove = FALSE)
# print
dput(df$obs)
c("SSSSS", "CSSSS", "SCSSS", "CCSSS", "SSCSS", "CSCSS", "SCCSS",
"CCCSS", "SSSCS", "CSSCS", "SCSCS", "CCSCS", "SSCCS", "CSCCS",
"SCCCS", "CCCCS", "SSSSC", "CSSSC", "SCSSC", "CCSSC", "SSCSC",
"CSCSC", "SCCSC", "CCCSC", "SSSCC", "CSSCC", "SCSCC", "CCSCC",
"SSCCC", "CSCCC", "SCCCC", "CCCCC")
But these observations have different weights!
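Unlike fair coin flips, the 32 battle observations are not equally likely, so counting alone is not enough. The sketch below gives each observation its weight and adds up the ones with exactly 2 Squirtle wins. It assumes the battles are independent, and the names `grid`, `s_wins`, and `weight` are hypothetical.

```r
battle <- c("S", "C")
p <- 0.32  # probability that Squirtle wins one battle

# all 2^5 = 32 observations, one battle per column
grid <- expand.grid(battle, battle, battle, battle, battle,
                    stringsAsFactors = FALSE)

# number of Squirtle wins in each observation
s_wins <- rowSums(grid == "S")

# weight of each observation: p for every S, (1 - p) for every C
weight <- p^s_wins * (1 - p)^(5 - s_wins)

sum(weight)                            # all weights add up to 1
sum(s_wins == 2)                       # 10 observations have exactly 2 wins
prob_2_wins <- sum(weight[s_wins == 2])
round(prob_2_wins, 4)                  # 0.322
```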
Example: Charizard
Historically, Charizard defeats Squirtle 68% of the time. If there are 5 battles, what is the probability that Charizard wins exactly 3 times?
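Worked out with the binomial probability, now counting Charizard wins, so $n = 5$, $k = 3$, and $p = 0.68$ (again assuming independent battles):

$$P(k = 3) = \binom{5}{3} (0.68)^{3} (0.32)^{2} = 10 \times 0.314432 \times 0.1024 \approx 0.3220$$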
Symmetry
The previous two examples had the same answer, which is true due to a symmetry property in the choose operator: $\binom{n}{k} = \binom{n}{n-k}$
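A quick check of this symmetry in R, as a sketch using the numbers from the two examples. Swapping which outcome counts as a win turns $k$ into $n - k$ and $p$ into $1 - p$.

```r
# same number of arrangements either way
choose(5, 2)   # 10
choose(5, 3)   # 10

# 2 Squirtle wins (p = 0.32) is the same event as 3 Charizard wins (p = 0.68)
squirtle  <- dbinom(2, size = 5, prob = 0.32)
charizard <- dbinom(3, size = 5, prob = 0.68)

all.equal(squirtle, charizard)  # TRUE
```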
library(ggplot2)

k <- 0:5
pk <- dbinom(k, 5, 0.32)  # P(k) for each possible number of wins
k_bool <- k == 2          # flag the bar of interest
df <- data.frame(k, pk, k_bool)

df |>
  ggplot(aes(x = k, y = pk,
             color = k_bool, fill = k_bool)) +
  geom_bar(stat = "identity") +
  geom_label(aes(x = k, y = pk,
                 label = round(pk, 4)),
             color = "black", fill = "white") +
  labs(title = "2 Squirtle Wins",
       subtitle = "n = 5, k = 2, p = 0.32, P(k = 2) = 0.3220",
       caption = "Math 32",
       x = "wins",
       y = "probability") +
  scale_color_manual(values = c("black", "#ca7721")) +
  scale_fill_manual(values = c("gray70", "#297383")) +
  theme(
    legend.position = "none",
    panel.background = element_blank()
  )
# plotly::ggplotly(ex2_plot)
library(ggplot2)

k <- 0:5
pk <- dbinom(k, 5, 0.68)  # P(k) for each possible number of wins
k_bool <- k == 3          # flag the bar of interest
df <- data.frame(k, pk, k_bool)

df |>
  ggplot(aes(x = k, y = pk,
             color = k_bool, fill = k_bool)) +
  geom_bar(stat = "identity") +
  geom_label(aes(x = k, y = pk,
                 label = round(pk, 4)),
             color = "black", fill = "white") +
  labs(title = "3 Charizard Wins",
       subtitle = "n = 5, k = 3, p = 0.68, P(k = 3) = 0.3220",
       caption = "Math 32",
       x = "wins",
       y = "probability") +
  scale_color_manual(values = c("black", "#de5138")) +
  scale_fill_manual(values = c("gray70", "#e53800")) +
  theme(
    legend.position = "none",
    panel.background = element_blank()
  )
# plotly::ggplotly(ex2_plot)
At first, it does not matter how you define the binomial setting for what corresponds to a success, as long as $k$ and $p$ refer to the same outcome.
Parameters
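The notation $X \sim \text{Bin}(n, p)$ says that $X$ follows a binomial distribution with parameters $n$ (the number of trials) and $p$ (the probability of success on each trial).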
We are assuming that the $n$ trials are independent and that each has the same probability of success $p$.
In other words, we are sampling the Bernoulli trial $n$ times.
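The idea of repeating a Bernoulli trial can be sketched as a simulation: run 5 independent battles many times and see how often Squirtle wins exactly 2. The seed, the number of simulations, and the variable names are assumptions for illustration; the simulated frequency should land close to the `dbinom()` value rather than match it exactly.

```r
set.seed(32)      # assumed seed, only so the run is reproducible
n <- 5            # battles per simulated series
p <- 0.32         # probability that Squirtle wins one battle
n_sims <- 100000  # assumed number of simulated series

# each row is one series of n independent Bernoulli trials (1 = Squirtle win)
trials <- matrix(rbinom(n_sims * n, size = 1, prob = p), ncol = n)

# number of Squirtle wins in each series
wins <- rowSums(trials)

simulated <- mean(wins == 2)
exact <- dbinom(2, size = n, prob = p)

c(simulated = simulated, exact = exact)
```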
Looking Ahead
- due Fri., Feb. 10:
- WHW4
- JHW2
- Demographics Survey Part 1
- Be mindful of before-lecture quizzes
No lecture session for Math 32:
- Feb 20, Mar 10, Mar 24