Tools: Matrix Algebra Tool | Markov Process Simulation | Markov Process Utility
|
Sample Space and Events
An experiment is an occurrence whose result, or outcome is uncertain. The set of all possible outcomes is called the sample space for the experiment. Given a sample space S, an event E is a subset of S. The outcomes in E are called the favorable outcomes. We say that E occurs in a particular experiment if the outcome of that experiment is one of the elements of E, that is, if the outcome of the experiment is favorable. On-Line Tutorial Beginning With This Topic |
Example
1. Experiment: Cast a die and observe the number facing up.
2. Here is an experiment that simulates tossing three fair distinguishable coins. To run the experiment, press the "Toss Coins" button and record the occurrences of heads and tails. The sample space is the set of eight possible outcomes:
Let E be the event that heads comes up at least twice.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Combining Events
If E and F are events in an experiment, then: E' is the event that E does not occur. E E E and F are said to be disjoint or mutually exclusive if (E |
Example
Let S be the sample space for the coin tossing experiment in the box above, so that S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. Let E be the event that heads comes up at least twice; E = {HHH, HHT, HTH, THH}, and let F be the event that tails comes up at least once;F = {HHT, HTH, HTT, THH, THT, TTH, TTT}. Then: E' = {HTT, THT, TTH, TTT}
E and F are not mutually exclusive, since E |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Estimated Probability
If an experiment is performed N times, and the event E occurs fr(E) times, then the ratio
The number fr(E) is called the frequency of E. N, the number of times that the experiment is performed, is called the number of trials or the sample size. If E consists of a single outcome s, then we refer to P(E) as the estimated probability of the outcome s, and write P(s). On-Line Tutorial on Estimated Probability |
Example
In the above experiment (toss three coins) take E to be the event that heads comes up at least twice. To compute its estimated probability, let us use the simulated experiment below. Every time you press "Toss Coins" the web page will compute both fr(E) and P(E).
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Some Properties of Estimated Probability
Let S = {s1, s2, ... , sn} be a sample space and let P(si) be the estimated probability of the event {si}. Then
(a) 0 ≤ P(si) ≤ 1
In words: (a) The estimated probability of each outcome is a number between 0 and 1.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Theoretical Probability
The theoretical probability, P(E), of an event E is its probability determined from the nature of the experiment rather than through actual experimentation. The estimated probability approaches the theoretical probability as the number of trials gets larger and larger. Notes 1. We write P(E) for both estimated and theoretical probability. Which one we are referring to should always be clear from the context.
|
Example
In the above experiment, there are eight outcomes in S, and half of them are in E. Therefore, we expect E to occur half the time. In other words, the theoretical probability of E is 0.5. If you "toss the coins" in the simulated experiment arriba a large number of times, the estimated probability should approach the theoretical probabilty of 0.5. In the following simulation, you can toss the coins 50 times with each click on the button.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Computing Theoretical Probability for Equally Likely Outcomes
In an experiment in which all outcomes are equally likely, the probability of an event E is given by
|
Example
In the above experiment (toss three coins) there are eight equally likely outcomes in S, and half of them are in E (the event that heads comes up at least twice). Therefore,
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Probability Distributions
Since the different kinds of probability share similar properties, we refer to estimated and theoretical probability collectively as just probability. What they have in common is the idea of a probability distribution: A finite sample space is just a finite set S. A probability distribution is an assignment of a number P(si) to each outcome si in a sample space S ={s1, s2, ... , sn} so that
(a) 0 ≤ P(si) ≤ 1
P(si) is called the probability of si. Given a probability distribution, we obtain the probability of an event E by adding up the probabilities of the outcomes in E. If P(E) = 0, we call E an impossible event. The event Notes 1. All the above properties apply to both estimated and theoretical probability. Thus, when we speak only of "probability," we could mean either estimated or theoretical probability, depending on the context. On-Line Tutorial on Probability Distributions |
Example
1. All the examples of estimated and theoretical probability above give examples of probability distributions. 2. Let us take S = {H, T} and make the assignments P(H) = 0.2, P(T) = 0.8. Since these numbers are between 0 and 1, and add to 1, they specify a probability distribution. 3. With S = (H, T} again, we could also take P(H) = 1, P(T) = 0, so that T is an impossible event. 4. The following table gives a probability distribution for the sample space S = {1, 2, 3, 4, 5, 6}.
It follows that
5. Suppose we toss three coins as above, but this time, we only look at the number of heads that come up. In other words, S = {0, 1, 2, 3}. The (theoretical) probability distribution is given by counting the number of combinations that give 0, 1, 2, and 3 heads:
The following simulation computes the estimated probability distribution of the above experiment. You will find that, after many coin tosses, these probabilities converge to the theoretical ones shown above.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Addition Principle
Mutually Exclusive Events
General Addition Principle
|
Example
Let S be the sample space for the coin-tossing experiment above; S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} Let E be the event that heads comes up at exactly once;
Then E and F are mutually exclusive, and
Now let E be as above, and let F be the event that heads comes up at most once:
F is the event that heads comes up exactly once (= E). Therefore,
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Further Properties of Probability
The following are true for any sample space S and any event E.
|
Example
Continuing with S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} Let E be the event that heads comes up at exactly once;
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Conditional Probability
If E and F are two events, then the conditional probability, P(E | F), is the probability that E occurs, given that F occurs, and is defined by
We can rewrite this formula in a form known as the multiplication principle:
F) = P(F)P(E | F). Conditional Estimated Probability
Conditional Probability for Equally Likely Outcomes
On-Line Tutorial Beginning With This Topic |
Example
Let S be the original sample space for the experiment above; S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} Let E be the event that heads comes up at exactly once;
Then the probability that heads comes up exactly once, given that the first comes up heads is
Since the outcomes in this experiment are equally likely, we could also use the formula
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Independent Events
The events E and F are independent if
If two events E and F are not independent, then they are dependent. Given any number of mutually independent events (that is, each one of them is independent of the intersection of any combination of the others), the probability of their intersection is the product of the probabilities of the individual events. |
Example
As in the example immediately above, let S be the original sample space for the experiment above; S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} Let E be the event that heads comes up at exactly once;
To test these two events for independence, we check the formula
E F = {HTT}, so that P(E F) = 1/8.
1/8, |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Bayes' Theorem
The short form of Bayes' Theorem states that if E and F are events, then
We can often compute P(F | E) by instead constructing a probability tree. (To see how, go to the tutorial by following the link below.) An expanded form of Bayes' Theorem states that if E is an event, and if F1, F2, and F3 are a partition of the sample space S, then
A similar formula works for a partition of S into four or more events. On-Line Tutorial Beginning With This Topic |
Example
If P(E | F) = 0.95 P(E | F') = 0.15 P(F) = 0.1 P(F') = 0.9, then
This example comes from a scenario discussed in the tutorial (link on the adjacent box). |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Markov Systems, State Transition Diagram, Transition Matrix
A Markov system (or Markov process or Markov chain) is a system that can be in one of several (numbered) states, and can pass from one state to another each time step according to fixed probabilities. If a Markov system is in state i, there is a fixed probability, pij, of it going into state j the next time step, and pij is called a transition probability. A Markov system can be illustrated by means of a state transition diagram, which is a diagram showing all the states and transition probabilities. (See example opposite.)
The matrix P whose ijth entry is pij is called the transition matrix associated with the system. The entries in each row add up to 1. Thus, for instance, a 2 ![]() |
Example
Transition Diagram:
Matrix:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Distribution Vectors and Powers of the Transition Matrix
A distribution vector is a row vector with one nonnegative entry for each state in the system. The entries can represent the number of individuals in each state of the system. A probability vector is a row vector in which the entries are nonnegative and add up to 1. The entries in a probability vector can represent the probabilities of finding a system in each of the states. If v is an initial distribution vector and P is the transition matrix for a Markov system, then the distribution vector after 1 step is the matrix product, vP.
P2 is the two-step transition matrix for the system. Similarly, P3 is the three-step transition matrix, and Pn is the n-step transition matrix. This means that the ijth entry in Pn is the probability that the system will pass from state i to state j in n steps. Try our on-line utility which computes powers of the transition matrix and its action on the distribution vector. Even better (and far more flexible) is our Matrix Algebra Tool, (updated in November 2003) where you can compute several things simultaneously, have the answers shown in fraction or decimal form, and also compute inverses and even complicated expressions involving several matrices. |
Examples
Let
and let v = [ 100 200 300 ] be an initial distribution. Then the distribution after one step is given by
The distribution one step later is given by
To obtain the two-step transition matrix, we calculate
Thus, for example, the probability of going from State 3 to State 1 in two steps is given by the 3,1-entry in P2, namely 0.3. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Long Term Behavior of Markov Systems
If P is a transition matrix for a Markov system, and if v is a distribution vector with the property that vP = v, then we refer to v as a steady state (distribution) vector. To find a steady state distribution for a Markov System with transition matrix P, we solve the system of equations given by
Long Term Behavior If the higher and higher powers of P approach a fixed matrix P
Using technology, it is often possible to approximate P |
Example
The transition matrix
that we looked at above is regular, since P3 contains only non-zero entries. (Check this on your graphing calculator or on our on-line utility.) The steady state probability vector is given by
The long-term transition matrix is therefore
(To check this, calculate P256 on the on-line utility.) Using the Matrix Algebra Tool Open the Matrix Algebra Tool and enter the matrix P using the following format (note the commas between each pair of entries in each row.): (If you like, you can just copy the blue text below and paste it into the window.)
0.4, 0, 0.6 0.5, 0.5, 0] |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||