Exam 1 - Review#
The following is written by Gabriella Shalumov. I will add some remarks based on what we discussed in class, change a few things here and there and add some exercises.
Events and Bayes Theorem#
Tip
What to know
Know how to compute probabilities if an experiment is described, and the outcomes are equally likely to occur:
1. \(\displaystyle P(E)=\frac{n(\textrm{outcomes you like})}{n(\textrm{outcomes you can choose from})} = \frac{n(E)}{n(\Omega)}\)
2. \(\displaystyle P(E_1 \cup E_2)=\frac{n(\textrm{outcomes you like})}{n(\textrm{outcomes you can choose from})}=\frac{n(\textrm{outcomes in one event})+n(\textrm{outcomes in the other event}) -n(\textrm{outcomes that were counted twice})}{n(\textrm{all outcomes you can choose from})}=\frac{n(E_1)+n(E_2)-n(E_1E_2)}{n(\Omega)}\)
2'. If the outcomes are not equally likely to occur: \(P(E_1 \cup E_2) =P(E_1)+P(E_2)-P(E_1E_2)\)
3. \(\displaystyle P(E|F) = \frac{n(\textrm{outcomes you like})}{n(\textrm{outcomes you can choose from})} = \frac{n(EF)}{n(F)}\)
3'. If the outcomes are not equally likely to occur: \(\displaystyle P(E|F)=\frac{P(EF)}{P(F)}\)
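A minimal sketch in Python of these counting formulas, using two dice; the events \(E\) (the sum is 8) and \(F\) (the first die shows 3) are hypothetical choices made only for this illustration:

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two dice.
omega = list(product(range(1, 7), repeat=2))

# Hypothetical events, chosen only for illustration.
E = {w for w in omega if w[0] + w[1] == 8}   # the sum is 8
F = {w for w in omega if w[0] == 3}          # the first die shows 3

def prob(event, choices):
    """P = n(outcomes you like) / n(outcomes you can choose from)."""
    return Fraction(len(event), len(choices))

print(prob(E, omega))        # P(E) = 5/36
print(prob(E | F, omega))    # P(E U F) = (5 + 6 - 1)/36 = 5/18
print(prob(E & F, F))        # P(E | F) = n(EF)/n(F) = 1/6
```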
Know the axioms of probability in general:
\(P(E)\geq 0\)
\(P(\Omega)=1\)
\(P(E\cup F)=P(E)+P(F)\), if \(EF=\emptyset\)
\(P(E|F)P(F)=P(EF)\) (which is equivalent to moving along a branch of a probability tree)
Know Bayes’ theorem:
\(\displaystyle P(E|F)=\frac{P(F|E)P(E)}{P(F)}\), or in words: \(\displaystyle \textrm{posterior}=\frac{\textrm{likelihood}\cdot \textrm{prior}}{\textrm{marginal}}\)
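A short numerical sketch of Bayes' theorem in Python; the prevalence, sensitivity, and false-positive rate below are made-up numbers used only for illustration:

```python
# Hypothetical disease-testing numbers (all three values are assumptions).
p_sick = 0.01               # prior P(E): prevalence of the disease
p_pos_given_sick = 0.95     # likelihood P(F|E): test sensitivity
p_pos_given_healthy = 0.05  # P(F|E^c): false-positive rate

# Marginal P(F) from the probability tree (law of total probability):
# P(F) = P(F|E)P(E) + P(F|E^c)P(E^c)
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * (1 - p_sick)

# Bayes' theorem: posterior = likelihood * prior / marginal
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(round(p_sick_given_pos, 3))  # ~0.161: most positives are false positives
```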
Know how to compute probabilities on contingency tables, know how to work with probability trees.
Know what independence of (multiple) events means.
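A sketch of how these checks look on a contingency table; the counts below are invented for illustration:

```python
from fractions import Fraction

# Hypothetical contingency table of counts (rows: exposure, columns: disease status).
counts = {
    ("smoker", "disease"): 20,     ("smoker", "no disease"): 80,
    ("non-smoker", "disease"): 30, ("non-smoker", "no disease"): 270,
}
total = sum(counts.values())  # 400 people in total

p_smoker  = Fraction(20 + 80, total)                        # row marginal P(A)
p_disease = Fraction(20 + 30, total)                        # column marginal P(B)
p_both    = Fraction(counts[("smoker", "disease")], total)  # joint P(A and B)

# A and B are independent exactly when P(A and B) = P(A) * P(B).
print(p_both, p_smoker * p_disease, p_both == p_smoker * p_disease)  # 1/20 1/32 False
```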
Discrete Random Variables#
There are countably many (possibly finitely many) outcomes. Each outcome is labeled by a number (by the random variable \(X\)), so the labels are discrete. For each label (i.e., each possible value of \(X\)), you can create a row in a probability distribution table, or its own bar in a histogram.
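For example, a minimal sketch (assuming Python with matplotlib) that writes out the distribution table of a fair die and draws one bar per possible value:

```python
import matplotlib.pyplot as plt

# Probability distribution table of a fair six-sided die: one row per value of X.
pmf = {k: 1 / 6 for k in range(1, 7)}

for k, p in pmf.items():
    print(f"P(X = {k}) = {p:.3f}")  # one row of the table

# One bar per possible value of X.
plt.bar(list(pmf.keys()), list(pmf.values()))
plt.xlabel("k")
plt.ylabel("P(X = k)")
plt.show()
```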
\(\displaystyle E[X]=\sum_k k\cdot P(X=k)\)
\(\displaystyle E[f(X)]=\sum_k f(k)\cdot P(X=k)\)
\(\displaystyle E[X+Y]=E[X]+E[Y]\)
\(\displaystyle E[XY]=E[X]\cdot E[Y]\), if \(X\) and \(Y\) are independent, where \(X\) and \(Y\) are independent if \(P(X=k,Y=l)=P(X=k)\cdot P(Y=l)\) for any \(k,l\).
(The converse is not true: \(E[XY]=E[X]\cdot E[Y]\) does not imply that \(X\) and \(Y\) are independent.)
\(Cov(X,Y)=E[XY]-E[X]E[Y]=E\big[(X-E[X])(Y-E[Y])\big]\) measures the degree to which the two variables change together. \(Cov(X,Y)=0\) suggests there is no linear relationship between the two variables, but it does not necessarily mean the two variables are independent (they may have a nonlinear relationship).
\(\displaystyle Var(X)=\sum_k (k-E[X])^2\cdot P(X=k)=E[X^2]-(E[X])^2\)
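A small sketch that implements the definitions above directly from a distribution table; the fair die and the indicator \(Y\) of an even roll are illustrative choices:

```python
# Distribution table of a fair die (illustrative example).
pmf = {k: 1 / 6 for k in range(1, 7)}

def expect(f, pmf):
    """E[f(X)] = sum over k of f(k) * P(X = k)."""
    return sum(f(k) * p for k, p in pmf.items())

mean = expect(lambda k: k, pmf)               # E[X] = 3.5
var = expect(lambda k: k**2, pmf) - mean**2   # Var(X) = E[X^2] - (E[X])^2 ~ 2.917

# Covariance needs the joint distribution; here Y = 1 if the roll is even, 0 otherwise.
joint = {(k, int(k % 2 == 0)): 1 / 6 for k in range(1, 7)}
e_x  = sum(x * p for (x, y), p in joint.items())
e_y  = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y                        # Cov(X, Y) = E[XY] - E[X]E[Y] = 0.25
print(mean, var, cov)
```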
Popular discrete distributions:#
| Name | Experiment | Random variable | \(P(X=k)\) | Expected value | Variance | Example |
|---|---|---|---|---|---|---|
| Discrete uniform | Pick a random number from among \(\{1,2,3,\dots,n\}\) | \(\textrm{Unif}(n)=\) the number you picked | \(P(\textrm{Unif}(n)=k)=\frac{1}{n}\) for \(k=1,2,\dots,n\) | \(E[\textrm{Unif}(n)]=\frac{1+n}{2}\) | \(Var(\textrm{Unif}(n))=\frac{n^2-1}{12}\) | Roll a die and look at the number on top. |
| Bernoulli | Toss a coin. | \(\textrm{Bern}(p)=1\) for heads and \(\textrm{Bern}(p)=0\) for tails. | \(P(\textrm{Bern}(p)=1)=p\), \(P(\textrm{Bern}(p)=0)=1-p\) | \(E[\textrm{Bern}(p)]=p\) | \(Var(\textrm{Bern}(p))=p(1-p)\) | Is a single patient sick or not? |
| Binomial | Repeat a Bernoulli trial \(n\) times. | \(\textrm{Binom}(n,p)=\) the number of heads you observe | \(P(\textrm{Binom}(n,p)=k)={n \choose k}p^k(1-p)^{n-k}\) for \(k=0,1,\dots,n\) | \(E[\textrm{Binom}(n,p)]=np\) | \(Var(\textrm{Binom}(n,p))=np(1-p)\) | Out of 100 patients, how many are sick? |
| Geometric | Repeat Bernoulli trials until you get a heads. | \(\textrm{Geom}(p)=\) the number of tosses you observe. | \(P(\textrm{Geom}(p)=k)=(1-p)^{k-1}p\) for \(k=1,2,3,\dots\) | \(E[\textrm{Geom}(p)]=\frac{1}{p}\) | \(Var(\textrm{Geom}(p))=\frac{1-p}{p^2}\) | How many patients do you need to see until you find the first sick patient? |
| Poisson | Consider a large number \(n\) of Bernoulli trials, where the probability of success \(p\) is small enough that the product \(np=\lambda\) is still a moderate number. | \(\textrm{Pois}(\lambda)=\) the number of successes in those trials. | \(\displaystyle P(\textrm{Pois}(\lambda)=k)=\frac{e^{-\lambda}\lambda^k}{k!}\) for \(k=0,1,2,\dots\) | \(E[\textrm{Pois}(\lambda)]=\lambda\) | \(Var(\textrm{Pois}(\lambda))=\lambda\) | A large city hospital, on average, admits 10 patients with a specific rare disease every month. The number of admissions can vary from month to month, but over a long period the average remains around 10 patients. How many patients will be admitted this month? |
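
If scipy is available, the moments in this table can be checked directly, and the Poisson approximation to the Binomial can be seen numerically (the parameters below are just examples):

```python
from scipy import stats

# Compare the table's expected values and variances with scipy.stats.
print(stats.randint(1, 7).mean(), stats.randint(1, 7).var())      # Unif(6):        3.5, 35/12
print(stats.bernoulli(0.3).mean(), stats.bernoulli(0.3).var())    # Bern(0.3):      0.3, 0.21
print(stats.binom(100, 0.1).mean(), stats.binom(100, 0.1).var())  # Binom(100,0.1): 10, 9
print(stats.geom(0.1).mean(), stats.geom(0.1).var())              # Geom(0.1):      10, 90

# Poisson approximation: n large, p small, lambda = n*p moderate.
n, p = 100, 0.1
print(stats.binom(n, p).pmf(8))     # ~0.1148
print(stats.poisson(n * p).pmf(8))  # ~0.1126, close to the binomial value
```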
Popular continuous distributions:#
| Name | Experiment | Random variable | \(f(X=x)\) | Expected value | Variance | Example |
|---|---|---|---|---|---|---|
| Continuous uniform | Pick a random number from an interval \([a,b]\) | \(\textrm{Unif}(a,b)=\) the number you picked | \(f(\textrm{Unif}(a,b)=x)=\frac{1}{b-a}\) for \(x\in [a,b]\), 0 otherwise. | \(E[\textrm{Unif}(a,b)]=\frac{a+b}{2}\) | \(Var(\textrm{Unif}(a,b))=\frac{(b-a)^2}{12}\) | Drop a dart on a line segment and measure how far along the segment it landed. |
| Exponential | Record the time between events in a process in which events occur continuously and independently at a constant average rate. | \(\textrm{Exp}(\lambda)=\) the time between two consecutive events. | \(f(\textrm{Exp}(\lambda)=x)=\lambda e^{-\lambda x}\) for \(x\geq 0\), 0 otherwise. | \(E[\textrm{Exp}(\lambda)]=\frac{1}{\lambda}\) | \(Var(\textrm{Exp}(\lambda))=\frac{1}{\lambda^2}\) | Measuring the time between consecutive phone calls at a customer service center. |
| Normal | Symmetric processes in nature (thanks to evolution), the continuous limit of the Binomial, and, most importantly, the Central Limit Theorem. | \(\textrm{Normal}(\mu,\sigma)\) | \(\displaystyle f(\textrm{Normal}(\mu,\sigma)=x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) | \(E[\textrm{Normal}(\mu,\sigma)]=\mu\) | \(Var(\textrm{Normal}(\mu,\sigma))=\sigma^2\) | Biology. Measurement errors. Means of samples of fixed size. |
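
The same check for the continuous table, again assuming scipy and with illustrative parameters (note that scipy parameterizes the exponential by scale \(=1/\lambda\)):

```python
from scipy import stats

a, b = 0.0, 2.0
unif = stats.uniform(loc=a, scale=b - a)       # Unif(a, b)
print(unif.pdf(1.0), unif.mean(), unif.var())  # 0.5, 1.0, (b-a)^2/12 ~ 0.333

lam = 2.0
expo = stats.expon(scale=1 / lam)              # Exp(lambda), scale = 1/lambda
print(expo.pdf(0.0), expo.mean(), expo.var())  # 2.0, 0.5, 0.25

mu, sigma = 0.0, 1.0
norm = stats.norm(loc=mu, scale=sigma)         # Normal(mu, sigma)
print(norm.pdf(0.0), norm.mean(), norm.var())  # 1/sqrt(2*pi) ~ 0.399, 0.0, 1.0
```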