Summary of Chapter 8 in

Random Variable
A random variable X is a rule that assigns a numerical value to each outcome in the sample space of an experiment. A discrete random variable can take on specific, isolated numerical values, like the outcome of a roll of a die, or the number of dollars in a randomly chosen bank account. A continuous random variable can take on any values within a continuum or an interval, like the temperature in Central Park, or the height of an athlete in centimeters. Discrete random variables that can take on only finitely many values (like the outcome of a roll of a die) are called finite random variables. 
Example
1. Finite Random Variable In an experiment to simulate tossing three coins, let X be the number of heads showing after each toss. X is a finite random variable that can assume the four values 0, 1, 2, and 3.
2. Infinite Discrete Random Variable Roll a die until you get a 6; X = the number of times you roll the die. The possible values for X are 1, 2, 3, 4, ... (If you are extremely unlucky, it might take you a million rolls before you get a 6!)
3. Continuous Random Variable Measure the length of an object; X = its length in cm.


Probability Distribution
The probability P(X = x) is the probability of the event that X = x. Similarly, P(a < X < b) is the probability of the event that X lies between a and b. These probabilities may be estimated, empirical, or abstract (see Chapter 7 in Finite Mathematics or the Probability Summary for a discussion of estimated, empirical, and abstract probability). For a finite random variable, the collection of numbers P(X = x) as x varies is called the probability distribution of X, and it is useful to graph the probability distribution as a histogram. Press here for an online utility that will generate any probability distribution and also show you the histogram. 
Example
Estimated Probability Distribution Let X be the number of heads showing after each toss of three coins (see above). The following simulation shows the estimated probability distribution (relative frequency distribution) of X.
Empirical Probability Distribution For the experiment above, the empirical probability distribution is given by the following histogram.
The empirical probability distribution is given by counting the number of combinations that give 0, 1, 2, or 3 heads. 
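The counting described above can be sketched in a few lines of Python. This is only an illustration, assuming three fair coins (so that all eight outcomes are equally likely); the variable names are invented for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every equally likely outcome of tossing three fair coins, e.g. ("H", "T", "H").
outcomes = list(product("HT", repeat=3))

# X = number of heads in an outcome; count how many outcomes give each value of X.
counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes).
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

The result is P(X = 0) = 1/8, P(X = 1) = 3/8, P(X = 2) = 3/8, and P(X = 3) = 1/8, which sum to 1.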

Bernoulli Trials and the Binomial Distribution
A Bernoulli trial is an experiment with two possible outcomes, called success and failure. Each outcome has a specified probability: p for success and q for failure (so that p + q = 1). If we perform a sequence of n independent Bernoulli trials, then some of them result in success and the rest of them in failure. The probability of exactly x successes in such a sequence is given by P(X = x) = C(n, x) p^{x} q^{n−x}, where C(n, x) = n!/(x!(n − x)!) is the number of ways of choosing which x of the n trials are successes.
For an online utility which allows you to compute and graph the probability distribution for Bernoulli trials, press here. If X is the number of successes in a sequence of n independent Bernoulli trials, with probability p for success and q for failure, then X is said to have a binomial distribution. This distribution is given by the formula above.
Examples
Suppose we toss an unfair coin, with p = P(heads) = 0.8 and q = P(tails) = 0.2, three times. Take X = number of heads.
Then the distribution is given by
x          0      1      2      3
P(X = x)   0.008  0.096  0.384  0.512
The histogram given earlier for the number of heads results from tossing a fair coin three times, and is also a binomial distribution (with p = q = 0.5). Estimated Binomial Probability Distribution Here is a simulation of the above coin-tossing experiment.
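As a rough sketch of such a simulation, the following Python compares the exact binomial probabilities for n = 3, p = 0.8 with relative frequencies from simulated tosses. The function names, the number of trials, and the fixed seed are all choices made for this illustration, not part of the original utility.

```python
import random
from math import comb


def binomial_pmf(n, p, x):
    """P(X = x): probability of exactly x successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


def simulate(n, p, trials, seed=0):
    """Estimate the distribution by repeating the n-toss experiment `trials` times."""
    rng = random.Random(seed)  # local generator so the run is repeatable
    counts = [0] * (n + 1)
    for _ in range(trials):
        heads = sum(rng.random() < p for _ in range(n))  # each toss is heads with probability p
        counts[heads] += 1
    return [c / trials for c in counts]  # relative frequencies


exact = [binomial_pmf(3, 0.8, x) for x in range(4)]
estimated = simulate(3, 0.8, trials=10_000)
for x in range(4):
    print(f"x = {x}: exact {exact[x]:.3f}, estimated {estimated[x]:.3f}")
```

The exact column is 0.008, 0.096, 0.384, 0.512; the estimated column varies with the seed and should approach the exact values as the number of trials grows.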


Measures of Central Tendency:
Mean, Median, and Mode of a Set of Data A collection of specific values, or "scores", x_{1}, x_{2}, . . ., x_{n} of a random variable X is called a sample. If {x_{1}, x_{2}, . . ., x_{n}} is a sample, then the sample mean of the collection is x̄ = (x_{1} + x_{2} + . . . + x_{n})/n = (∑ x_{i})/n.
The sample median m is the middle score (in the case of an odd-size sample), or the average of the two middle scores (in the case of an even-size sample), when the scores in a sample are arranged in ascending order. A sample mode is a score that appears most often in the collection. (There may be more than one mode in a sample.) If the sample x_{1}, x_{2}, . . ., x_{n} we are using consists of all the values of X from an entire population (for instance, the SAT score of every graduating high school student who took the test), we refer to the mean, median, and mode above as the population mean, median, and mode. We write the population mean as μ instead of x̄. 
Example
Consider the following collection of scores:
The sum is ∑ x_{i} = 40, and n = 8, so that the sample mean is x̄ = 40/8 = 5.
To get the sample median, arrange the scores in increasing order, and select the middle scores (two of them since n is even):
The sample median is the average, 3.5, of these middle scores. Since the score 3 appears most often, the sample mode is 3. 
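The same three statistics can be computed with Python's standard `statistics` module. The scores below are hypothetical values made up for this sketch; they are not the data set used in the example above.

```python
import statistics

# Hypothetical scores chosen for illustration; NOT the data set from the text.
scores = [7, 3, 1, 6, 3, 4]

mean = statistics.fmean(scores)       # (7 + 3 + 1 + 6 + 3 + 4) / 6
median = statistics.median(scores)    # sorted: 1, 3, 3, 4, 6, 7 -> average of 3 and 4
modes = statistics.multimode(scores)  # list of every score that appears most often

print(mean, median, modes)  # 4.0 3.5 [3]
```

`multimode` returns a list because, as noted above, a sample may have more than one mode.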

Mean, Median, and Mode of a Random Variable
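If X is a finite random variable taking on values x_{1}, x_{2}, . . ., x_{n}, the mean or expected value of X, written μ, or E(X), is μ = E(X) = x_{1}·P(X = x_{1}) + x_{2}·P(X = x_{2}) + . . . + x_{n}·P(X = x_{n}) = ∑ x_{i}·P(X = x_{i}).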
A median of X is a number m such that P(X ≤ m) ≥ 1/2 and P(X ≥ m) ≥ 1/2. A mode of X is a number m such that P(X = m) is largest. This is the most likely value of X, or one of the most likely values if X has several values with the same largest probability. For a continuous random variable, a mode is a number m such that the probability density function is highest at x = m. The expected value, median, and mode of a random variable are the average, median, and mode we expect to get if we have a large number of X-scores. Conversely, if all we know about X is a collection of X-scores, then the average, median, and mode of those scores are our best estimates of the expected value, median, and mode of X. 
Example
Suppose we toss an unfair coin, with p = P(heads) = 0.8 and q = P(tails) = 0.2, three times. Take X = number of heads. Then the distribution (see above) is given by
x          0      1      2      3
P(X = x)   0.008  0.096  0.384  0.512
The expected value of X is given by E(X) = ∑ x_{i}·P(X = x_{i}) = 0(0.008) + 1(0.096) + 2(0.384) + 3(0.512) = 2.4.
The median is 3, since P(X ≤ 3) = 1 ≥ 1/2 and P(X ≥ 3) = 0.512 ≥ 1/2. Further, 3 is the least value of X with this property. The mode is also 3, since its probability is the greatest. 
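The three calculations in this example can be checked with a short Python sketch. The probabilities are those of three tosses with p = 0.8, computed from the binomial formula; the helper names are invented here, and the median is taken to be the least value m with P(X ≤ m) ≥ 1/2 and P(X ≥ m) ≥ 1/2, as in the example.

```python
# Distribution of X = number of heads in three tosses with P(heads) = 0.8:
# keys are the values x, entries are P(X = x) from the binomial formula.
dist = {0: 0.008, 1: 0.096, 2: 0.384, 3: 0.512}


def expected_value(dist):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())


def median(dist):
    """Least value m with P(X <= m) >= 1/2 and P(X >= m) >= 1/2."""
    for m in sorted(dist):
        at_most = sum(p for x, p in dist.items() if x <= m)
        at_least = sum(p for x, p in dist.items() if x >= m)
        if at_most >= 0.5 and at_least >= 0.5:
            return m


def modes(dist):
    """All values whose probability is the largest."""
    top = max(dist.values())
    return [x for x, p in dist.items() if p == top]


print(round(expected_value(dist), 4), median(dist), modes(dist))  # 2.4 3 [3]
```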

Measures of Dispersion
Sample Variance and Sample Standard Deviation Given a set of numbers x_{1}, x_{2}, . . . , x_{n} the sample variance is s^{2} = ∑ (x_{i} − x̄)^{2} / (n − 1), where x̄ is the sample mean.
The sample standard deviation is the square root, s, of the sample variance. Population Variance and Population Standard Deviation The population variance and standard deviation have slightly different formulas from those of the corresponding statistics for samples. Given a set of numbers x_{1}, x_{2}, . . . , x_{n} the population variance, σ^{2}, is σ^{2} = ∑ (x_{i} − μ)^{2} / n, where μ is the population mean.
The population standard deviation, σ, is the square root of the population variance. To read more about the difference between the sample and population variance and standard deviation, go to our online text: Sampling Distributions. 
Example
Consider the following collection of scores we looked at above.
We saw above that the sample mean is 5 (see the example "Mean, Median, and Mode of a Set of Data" above). The following table shows the squares of the differences from the mean, which we use to compute the sample variance and standard deviation.
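The sum of the entries in the bottom row is ∑ (x_{i} − x̄)^{2} = 103. Therefore, the sample variance is s^{2} = 103/7 ≈ 14.71, and the sample standard deviation is s ≈ 3.84.
For the population variance, we divide 103 by n = 8 instead of 7, getting σ^{2} = 103/8 ≈ 12.88, so that
σ ≈ 3.588 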
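The only difference between the two formulas is the divisor, n − 1 versus n. A minimal Python sketch follows; the function names and the first data set are hypothetical (the scores are made up, not the ones from the example), while the totals 103 and n = 8 in the second part are the ones quoted above.

```python
from math import sqrt


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def population_variance(xs):
    """Sum of squared deviations from the mean, divided by n."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


# Hypothetical scores for illustration; NOT the data set from the text.
scores = [7, 3, 1, 6, 3, 4]
print(sample_variance(scores), population_variance(scores))  # 4.8 4.0

# Totals quoted in the example: sum of squared deviations 103, n = 8.
sum_sq, n = 103, 8
s = sqrt(sum_sq / (n - 1))  # sample standard deviation
sigma = sqrt(sum_sq / n)    # population standard deviation
print(round(s, 3), round(sigma, 3))  # 3.836 3.588
```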

Variance and Standard Deviation of a Random Variable
If X is a random variable, its variance is defined to be σ^{2} = E([X − μ]^{2}), where μ = E(X). Its standard deviation is the square root, σ, of the variance. An equivalent formula that is often easier to compute is σ^{2} = E(X^{2}) − μ^{2}.
The variance and standard deviation of a random variable are the sample variance and sample standard deviation we expect to get if we have a large number of X-scores. Conversely, if all we know about X is a collection of X-scores, then the sample variance and sample standard deviation of those scores are our best estimates of the variance and standard deviation of X. 
Example
Let us look again at the experiment in which we toss an unfair coin, with p = P(heads) = 0.8 and q = P(tails) = 0.2, three times. (X = number of heads.) Here is the distribution with the x^{2} scores added.
x          0      1      2      3
x^{2}      0      1      4      9
P(X = x)   0.008  0.096  0.384  0.512
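We saw above that μ = 2.4. Further, E(X^{2}) = ∑ x_{i}^{2}·P(X = x_{i}) = 0(0.008) + 1(0.096) + 4(0.384) + 9(0.512) = 6.24.
Therefore, σ^{2} = E(X^{2}) − μ^{2} = 6.24 − 2.4^{2} = 6.24 − 5.76 = 0.48, and σ = 0.48^{1/2} ≈ 0.69.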
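The arithmetic for this example can be checked in Python. The sketch below computes the variance both from the definition E([X − μ]^{2}) and from the shortcut E(X^{2}) − μ^{2}; the variable names are invented for the illustration, and the probabilities are the binomial values for n = 3, p = 0.8.

```python
from math import sqrt

# P(X = x) for X = number of heads in three tosses with P(heads) = 0.8.
dist = {0: 0.008, 1: 0.096, 2: 0.384, 3: 0.512}

mu = sum(x * p for x, p in dist.items())        # E(X)
e_x2 = sum(x**2 * p for x, p in dist.items())   # E(X^2)

variance = e_x2 - mu**2                         # shortcut formula
variance_by_definition = sum((x - mu) ** 2 * p for x, p in dist.items())
sigma = sqrt(variance)

print(round(mu, 4), round(e_x2, 4), round(variance, 4), round(sigma, 2))
# 2.4 6.24 0.48 0.69
```

Both routes give the same variance, up to floating-point rounding.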


Interpreting Standard Deviation
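Chebyshev's Rule For any set of data, the following is true.
At least 3/4 of the scores fall within 2 standard deviations of the mean.
At least 8/9 of the scores fall within 3 standard deviations of the mean.
At least 15/16 of the scores fall within 4 standard deviations of the mean.
In general, at least 1 − 1/k^{2} of the scores fall within k standard deviations of the mean.
Empirical Rule For a set of data whose frequency distribution is "bell-shaped" and symmetric (as in the figure), the following is true.
Approximately 68% of the scores fall within 1 standard deviation of the mean.
Approximately 95% of the scores fall within 2 standard deviations of the mean.
Approximately 99.7% of the scores fall within 3 standard deviations of the mean.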

Example
Looking at the binomial distribution immediately above, we have μ = 2.4 and
σ = 0.48^{1/2} ≈ 0.69. Chebyshev's Rule now says: at least 3/4 of the time, X lies within 2 standard deviations of the mean, that is, in the interval [2.4 − 2(0.69), 2.4 + 2(0.69)] = [1.02, 3.78].
However, we cannot apply the Empirical Rule to this distribution (look at the probability distribution in the box above and notice that it is not symmetric). Example of Empirical Rule If the mean of a sample with a bell-shaped symmetric distribution is 20 with standard deviation s = 2, then approximately 95% of the scores lie in the interval [16, 24]. 
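As a check on the Chebyshev part of this example, the sketch below compares the bound 1 − 1/k^{2} with the actual probability that X lies within k standard deviations of its mean, for the unfair-coin distribution (n = 3, p = 0.8). The helper name `prob_within` is made up for the illustration.

```python
from math import sqrt

# P(X = x) for X = number of heads in three tosses with P(heads) = 0.8.
dist = {0: 0.008, 1: 0.096, 2: 0.384, 3: 0.512}

mu = sum(x * p for x, p in dist.items())
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))


def prob_within(k):
    """P(|X - mu| <= k * sigma), found by adding up the qualifying probabilities."""
    return sum(p for x, p in dist.items() if abs(x - mu) <= k * sigma)


for k in (2, 3, 4):
    bound = 1 - 1 / k**2  # Chebyshev's guaranteed minimum
    print(f"k = {k}: at least {bound:.4f}, actual {prob_within(k):.4f}")
```

For k = 2 the interval contains only the values 2 and 3, so the actual probability is 0.384 + 0.512 = 0.896, comfortably above the guaranteed 0.75. Chebyshev's Rule gives a lower bound, not an estimate.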

Statistics of a Binomial Distribution
If X is the number of successes in a sequence of n independent Bernoulli trials, with probability p of success in each trial and probability q = 1 − p of failure, then μ = E(X) = np, σ^{2} = npq, and σ = (npq)^{1/2}.
If n is large and p is not too close to 0 or 1, the median is approximately equal to the mean, np (which will also be the mode in this case). 
Example
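Looking at the unfair coin experiment immediately above, with n = 3, p = P(heads) = 0.8 and q = P(tails) = 0.2, we find μ = np = 3(0.8) = 2.4, σ^{2} = npq = 3(0.8)(0.2) = 0.48, and σ = 0.48^{1/2} ≈ 0.69, in agreement with the values computed directly from the distribution.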
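The formulas np and npq can be compared with a direct summation over the distribution for any n and p. The following is a sketch with invented function names, not a definitive implementation.

```python
from math import comb, sqrt


def binomial_stats(n, p):
    """Mean, variance, and standard deviation from the formulas np and npq."""
    q = 1 - p
    return n * p, n * p * q, sqrt(n * p * q)


def stats_by_summation(n, p):
    """The same three statistics computed directly from the distribution."""
    q = 1 - p
    pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]
    mu = sum(x * px for x, px in enumerate(pmf))
    var = sum((x - mu) ** 2 * px for x, px in enumerate(pmf))
    return mu, var, sqrt(var)


formula = binomial_stats(3, 0.8)
direct = stats_by_summation(3, 0.8)
print([round(v, 4) for v in formula])  # [2.4, 0.48, 0.6928]
print([round(v, 4) for v in direct])   # [2.4, 0.48, 0.6928]
```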


Continuous Random Variable
A continuous random variable X may take on any real value whatsoever. The probabilities P(a ≤ X ≤ b) are specified by means of a probability density curve, a curve lying above the x-axis with the total area between the curve and the x-axis being 1. The probability P(a ≤ X ≤ b) is given by the area enclosed by the curve, the x-axis, and the lines x = a and x = b.

Examples
For a detailed discussion of several examples (the uniform, exponential, normal, and beta distributions) go to the online section on probability density functions.

Uniform Distribution
A finite uniform distribution is one in which all values of X are equally likely. A continuous uniform distribution is one whose probability density function is a horizontal line. 
Example
The experiment: Cast a die and record the number uppermost. Here X has a finite uniform distribution, with P(X = x) = 1/6 for each of x = 1, 2, 3, 4, 5, 6.


Normal Random Variable
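The most important kind of continuous random variable is the normal random variable. It is one with a bell-shaped probability density curve given by the following equation: f(x) = (1/(σ(2π)^{1/2})) e^{−(x − μ)^{2}/(2σ^{2})}, where μ is the mean and σ is the standard deviation.
Standard Normal Variable The standard normal variable Z is the normal random variable with mean μ = 0 and standard deviation σ = 1. If X is a normal random variable with mean μ and standard deviation σ, then Z = (X − μ)/σ is a standard normal variable, so that P(a ≤ X ≤ b) = P((a − μ)/σ ≤ Z ≤ (b − μ)/σ).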
To compute areas under normal curves without having to use a table, try our Normal Distribution Utility. 
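Areas under normal curves can also be computed with Python's standard library, as in the sketch below. It uses `statistics.NormalDist` (available in Python 3.8 and later); the helper `prob_between` and the choice of mean 20 and standard deviation 2 for X are assumptions made for the illustration.

```python
from statistics import NormalDist

# The standard normal variable Z has mean 0 and standard deviation 1.
Z = NormalDist(mu=0, sigma=1)


def prob_between(dist, a, b):
    """P(a <= X <= b): area under the density curve between a and b."""
    return dist.cdf(b) - dist.cdf(a)


print(round(prob_between(Z, -1.0, 1.0), 4))  # 0.6827

# A normal variable with an illustrative mean of 20 and standard deviation of 2.
X = NormalDist(mu=20, sigma=2)
print(round(prob_between(X, 16, 24), 4))     # 0.9545
```

The second result is the area within two standard deviations of the mean, which is where the "approximately 95%" of the Empirical Rule comes from.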
Example
If Z is the standard normal variable, then
For online interactive text on the role of the uniform distribution in measurements of sample means, go to our online text: Sampling Distributions. For a calculus-based discussion of this and other distributions (the uniform, exponential, and beta distributions) go to the online section on probability density functions. (To activate the links there, press the dots and not the words...) 

More on Normal Distributions
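Probability of a Normal Distribution Being within k Standard Deviations of its Mean If X is a normal random variable with mean μ and standard deviation σ, then P(μ − kσ ≤ X ≤ μ + kσ) = P(−k ≤ Z ≤ k), where Z is the standard normal variable. In particular, P(μ − σ ≤ X ≤ μ + σ) ≈ 0.6827, P(μ − 2σ ≤ X ≤ μ + 2σ) ≈ 0.9545, and P(μ − 3σ ≤ X ≤ μ + 3σ) ≈ 0.9973.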
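The probability of being within k standard deviations of the mean does not depend on the particular mean or standard deviation. One way to compute it, sketched below, uses the identity P(−k ≤ Z ≤ k) = erf(k/2^{1/2}) for the standard normal variable Z; the function name is invented for the illustration.

```python
from math import erf, sqrt


def prob_within_k_sd(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal random variable X."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(k, round(prob_within_k_sd(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```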
Normal Approximation to a Binomial Distribution
