![]() | 2. Probability Density Functions: Uniform, Exponential, Normal, and Beta | ![]() | Section 3 Exercises | ![]() | 4. You're the Expert ![]() | ![]() | Calculus and Probability Main Page | ![]() | "Real World" Page |
Mean
In the last section we saw that if savings and loan institutions are continuously failing at a rate of $5\%$ per year, then the associated probability density function is
$f(x) = 0.05e^{-0.05x}.$
Mean or Expected Value
If $X$ is a continuous random variable with probability density function $f$ defined on an interval with (possibly infinite) endpoints $a$ and $b,$ then the mean or expected value of $X$ is
$E(X) = \int_a^b x f(x)\,dx.$
Solution
With $f(x) = 3x^2$ on $[0, 1],$ we have
$E(X) = \int_a^b x f(x)\,dx$
$= \int_0^1 (x)(3x^2)\,dx$
$= \int_0^1 3x^3\,dx$
$= [3x^4/4]_0^1 = 3/4.$
Thus, the expected value of $X$ is $3/4.$
Before We Go On ... The expected value $3/4$ lies to the right of the midpoint $1/2$ of the interval. This reflects the fact that $X$ is more likely to take on values in the right part of the interval $[0, 1]$ than the left part. The figure shows this probability density function.
Given that troubled S&Ls are failing continuously at a rate of $5%$ per year, how long will the average troubled S&L last?
Solution
If $X$ is the number of years that a given S&L will last, we know that its probability density function is $f(x) = 0.05e^{-0.05x}.$ To answer the question we compute $E(X).$
$E(X) = \int_a^b x f(x)\,dx$
$= \int_0^{+\infty} 0.05xe^{-0.05x}\,dx$
$= \lim_{M \to +\infty} \int_0^M 0.05xe^{-0.05x}\,dx.$
Using integration by parts, we get
$E(X) = \lim_{M \to +\infty} -0.05[e^{-0.05x}(20x + 400)]_0^M = (0.05)(400) = 20.$
Thus, the expected lifespan of a troubled S&L is 20 years.
Before We Go On ... Notice that the answer, 20, is the reciprocal of the failure rate $0.05.$ This is true in general: if $f(x) = ae^{- ax},$ then $E(X) = 1/a.$
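The limit in the S&L computation can also be watched numerically. The following Python sketch approximates the truncated integral $\int_0^M 0.05xe^{-0.05x}\,dx$ for increasing $M$; the function name `truncated_mean` and the use of the midpoint rule are our own illustrative choices, not part of the original solution.

```python
import math

def truncated_mean(a, M, n=100_000):
    """Midpoint-rule estimate of the integral of x * a*exp(-a*x) over [0, M]."""
    dx = M / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dx          # midpoint of the k-th subinterval
        total += x * a * math.exp(-a * x) * dx
    return total

a = 0.05  # failure rate from the S&L example
for M in (50, 100, 200, 400):
    print(M, round(truncated_mean(a, M), 4))
print("1/a =", 1 / a)
```

As $M$ grows, the printed values should climb toward $1/a = 20,$ in line with the limit computed by hand.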
Question
Why is $E(X)$ given by that integral formula?
Answer
Suppose for simplicity that the domain of $f$ is a finite interval $[a, b].$ Break up the interval into $n$ subintervals $[x_{k-1}, x_k],$ each of length $Δx,$ as we did for Riemann sums. Now, the probability of seeing a value of $X$ in $[x_{k-1}, x_k]$ is approximately $f(x_k) Δx$ (the approximate area under the graph of $f$ over $[x_{k-1}, x_k]$). Think of this as the fraction of times we expect to see values of $X$ in this range. These values, all close to $x_k,$ then contribute approximately $x_kf(x_k) Δx$ to the average, if we average together many observations of $X.$ Adding together all of these contributions, we get
$E(X) ≈ \sum_{k=1}^{n} x_k f(x_k) Δx.$
Now these approximations get better as $n \to \infty,$ and we notice that the sum above is a Riemann sum converging to
$E(X) = \int_a^b x f(x)\,dx,$
which is the formula we have been using.
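To see this convergence concretely, here is a small Python sketch of the Riemann sum of the terms $x_k f(x_k) Δx,$ using right endpoints and the density $f(x) = 3x^2$ on $[0, 1]$ from the first example. The helper name `riemann_mean` is hypothetical and chosen only for this illustration.

```python
def riemann_mean(f, a, b, n):
    """Right-endpoint Riemann sum of x_k * f(x_k) * dx over [a, b]."""
    dx = (b - a) / n
    return sum((a + k * dx) * f(a + k * dx) * dx for k in range(1, n + 1))

def density(x):
    return 3 * x**2  # density from the first example, on [0, 1]

for n in (10, 100, 1000, 10000):
    print(n, riemann_mean(density, 0.0, 1.0, n))
```

The sums should settle toward $3/4,$ the value obtained from the integral.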
Question
What are the expected values of the standard distributions we discussed in the previous section?
Answer
Let's compute them one by one.
Mean of a Uniform Distribution
If $X$ is uniformly distributed on $[a, b],$ then
$E(X) = (a + b)/2.$
This is not surprising, if you think about it for a minute. We'll leave the actual computation as one of the exercises.
Mean of an Exponential Distribution
If $X$ has the exponential distribution function $f(x) = ae^{-ax},$ then
$E(X) = 1/a.$
We saw how to compute this in Example 2.
Mean of a Normal Distribution
If $X$ is normally distributed with parameters $µ$ and $σ,$ then
$E(X) = µ.$
Mean of a Beta Distribution
If $X$ has the beta distribution function $f(x) = (β+1)(β+2)x^β(1-x),$ then
$E(X) = (β + 1)/(β + 3).$
Again, we shall leave this as an exercise.
A utilities industry consultant predicts a cutback in the Canadian Utilities industry during 2000-2005 by a percentage specified by a beta distribution with β = 0.25. What is the expected size of the cutback by Ontario Hydro?
Solution
Since $β = 0.25,$
$E(X) = (β + 1)/(β + 3) = 1.25/3.25 ≈ 0.38.$
Therefore, we can expect about a 38% cutback by Ontario Hydro.
Before We Go On ... What $E(X)$ really tells us is that the average downsizing of many utilities will be $38%.$ Some will cut back more, and some will cut back less.
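The "average of many utilities" interpretation can be illustrated with a simulation. The sketch below is our own illustration, not part of the consultant's prediction: it uses the fact that the density $(β+1)(β+2)x^β(1-x)$ is the standard Beta$(β+1, 2)$ density, so Python's `random.betavariate` can draw from it. The function name `sample_cutbacks`, the sample size, and the seed are arbitrary choices.

```python
import random

def sample_cutbacks(beta, n, seed=0):
    """Draw n values from the density (beta+1)(beta+2) x^beta (1-x) on [0, 1]."""
    rng = random.Random(seed)
    # That density is the standard Beta(beta + 1, 2) density.
    return [rng.betavariate(beta + 1, 2) for _ in range(n)]

beta = 0.25
cutbacks = sample_cutbacks(beta, 100_000)
average = sum(cutbacks) / len(cutbacks)
print("simulated average cutback:", round(average, 3))
print("formula (beta+1)/(beta+3):", round((beta + 1) / (beta + 3), 3))
print("smallest and largest simulated cutbacks:", min(cutbacks), max(cutbacks))
```

Individual simulated cutbacks range widely over $[0, 1],$ while their average should sit close to the value given by the formula.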
Variance and Standard Deviation
Statisticians use the variance and standard deviation of a continuous random variable $X$ as a way of measuring its dispersion, or the degree to which it is "scattered." The definitions are as follows.
Variance and Standard Deviation
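Let $X$ be a continuous random variable with density function $f$ defined on the interval $(a, b),$ and let $µ = E(X)$ be the mean of $X.$ Then the variance of $X$ is given by
$Var(X) = E((X - µ)^2) = \int_a^b (x - µ)^2 f(x)\,dx,$
and the standard deviation of $X$ is the square root of the variance,
$σ(X) = \sqrt{Var(X)}.$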
Notes
(1) In order to calculate the variance and standard deviation, we need first to calculate the mean.
(2) $Var(X)$ is the expected value of the function $(x-µ)^2,$ which measures the square of the distance of $X$ from its mean. It is for this reason that $Var(X)$ is sometimes called the mean square deviation, and $σ(X)$ is called the root mean square deviation. $Var(X)$ will be larger if $X$ tends to wander far away from its mean, and smaller if the values of $X$ tend to cluster near its mean.
(3) The reason we take the square root in the definition of $σ(X)$ is that $Var(X)$ is the expected value of the square of the deviation from the mean, and thus is measured in square units. Its square root $σ(X)$ therefore gives us a measure in ordinary units.
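As a concrete illustration of these notes, the following Python sketch computes the mean first, then the variance as the integral of $(x-µ)^2 f(x),$ then its square root, for the density $f(x) = 3x^2$ on $[0, 1].$ The helper `integrate` is a simple midpoint rule written for this example; it is an assumption of the sketch, not a method from the text.

```python
import math

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / n
    return sum(g(a + (k + 0.5) * dx) for k in range(n)) * dx

def density(x):
    return 3 * x**2  # example density on [0, 1]

mu = integrate(lambda x: x * density(x), 0.0, 1.0)              # mean comes first
var = integrate(lambda x: (x - mu) ** 2 * density(x), 0.0, 1.0)  # mean square deviation
sigma = math.sqrt(var)                                           # back to ordinary units
print("mean:", mu)
print("variance:", var)
print("standard deviation:", sigma)
```

For this density the exact values are $µ = 3/4,$ $Var(X) = 3/80 = 0.0375,$ and $σ(X) ≈ 0.194,$ so the printed numbers should agree with these to several decimal places.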
Question
What are the variances and standard deviations of the standard distributions we discussed in the previous section?
Answer
Let's compute them one by one. We'll leave the actual computations (or special cases) for the exercises.
Variance and Standard Deviation of a Uniform Distribution
If $X$ is uniformly distributed on $[a, b],$ then
$Var(X) = (b - a)^2/12$
and
$σ(X) = (b - a)/\sqrt{12}.$
Variance and Standard Deviation of an Exponential Distribution
If $X$ has the exponential distribution function $f(x) = ae^{-ax},$ then
$Var(X) = 1/a^2$
and
$σ(X) = 1/a.$
Variance and Standard Deviation of a Normal Distribution
If $X$ is normally distributed with parameters $µ$ and $σ,$ then
$Var(X) = σ^2$
and
$σ(X) = σ.$
(This is what you might have expected!)
Variance and Standard Deviation of a Beta Distribution
If $X$ has the beta distribution function $f(x) = (β+1)(β+2)x^β(1-x),$ then
$Var(X) = 2(β + 1)/((β + 4)(β + 3)^2)$
and
$σ(X) = \sqrt{2(β + 1)/((β + 4)(β + 3)^2)}.$
Median
The median income in the U.S. is the income $M$ such that half the population earn incomes $≤ M$ (so the other half earn incomes $≥ M$). In terms of probability, we can think of income as a random variable $X.$ Then the probability that $X ≤ M$ is $1/2,$ and the probability that $X ≥ M$ is also $1/2.$
Median
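Let $X$ be a continuous random variable. The median of $X$ is the number $M$ such that
$P(X ≤ M) = 1/2.$
If $f$ is the probability density function of $X$ and $f$ is defined on $(a, b),$ we find the median by solving the equation
$\int_a^M f(x)\,dx = 1/2$
for $M.$ Graphically, the vertical line $x = M$ divides the total area under the graph of $f$ into two equal parts. (See the figure.)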
Question
What is the difference between the median and the mean?
Answer
Roughly speaking, the median divides the area under the distribution curve into two equal parts, while the mean is the value of $X$ at which the graph would balance. If a probability curve has as much area to the left of the mean as to the right, then the mean is equal to the median. This is true of uniform and normal distributions, which are symmetric about their means. On the other hand, the medians and means are different for the exponential distributions and most of the beta distributions, because their areas are not distributed symmetrically.
The time in minutes between individuals joining the line at an Ottawa Post Office is a random variable with the exponential distribution
$f(x) = 2e^{-2x}.$
Find the mean and median time between individuals joining the line and interpret the answers.
Solution
The expected value for an exponential distribution $f(x) = ae^{-ax}$ is $1/a.$ Here, $a = 2,$ so $E(X) = 1/2.$ We interpret this to mean that, on average, a new person will join the line every half a minute, or $30$ seconds. For the median, we must solve
$\int_0^M f(x)\,dx = 1/2.$
That is,
$\int_0^M 2e^{-2x}\,dx = 1/2.$
Evaluating the integral gives
$-[e^{-2x}]_0^M = 1/2,$
or
$1 - e^{-2M} = 1/2$
so
$e^{-2M} = 1/2,$
or
$-2M = \ln(1/2) = -\ln 2.$
Thus,
$M = (\ln 2)/2 ≈ 0.3466$ minutes.
This means that half the people get in line less than $0.3466$ minutes (about 21 seconds) after the previous person, while half arrive more than $0.3466$ minutes later. The mean time for a new person to arrive in line is larger than this because there are some occasional long waits between people, and these pull the average up.
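A quick simulation can make the gap between the mean and the median tangible. The Python sketch below is an illustration under our own assumptions (sample size, seed, and variable names are arbitrary): it draws many interarrival times from the exponential density $2e^{-2x}$ and compares the sample mean and sample median with $1/2$ and $(\ln 2)/2.$

```python
import math
import random
import statistics

rng = random.Random(1)
a = 2.0  # rate parameter of f(x) = 2e^{-2x}
gaps = [rng.expovariate(a) for _ in range(200_000)]  # simulated minutes between arrivals

median_formula = math.log(2) / a
sample_mean = sum(gaps) / len(gaps)
sample_median = statistics.median(gaps)
share_below = sum(g < median_formula for g in gaps) / len(gaps)

print("sample mean:  ", round(sample_mean, 4))
print("sample median:", round(sample_median, 4))
print("share of gaps shorter than (ln 2)/2:", round(share_below, 3))
```

Roughly half of the simulated gaps should fall below $(\ln 2)/2,$ while the occasional long gaps pull the sample mean up toward $1/2.$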
Solution
Here, with $β = 4,$
$f(x) = (β+1)(β+2)x^β(1-x)$
$= 30x^4(1-x).$
Thus we must solve
$\int_0^M 30x^4(1-x)\,dx = 1/2.$
That is,
$30\int_0^M (x^4 - x^5)\,dx = 1/2.$
So
$30[M^5/5 - M^6/6] = 1/2$
or, multiplying through and clearing denominators,
$12M^5 - 10M^6 - 1 = 0.$
This is a degree six polynomial equation that has no easy factorization. Since there is no general analytical method for obtaining the solution, the only method we can use is numerical. The figure shows three successive views of a graphing calculator plot of $Y = 12X^5 - 10X^6 - 1,$ obtained by zooming in towards one of the zeros.
We are interested only in the zero that occurs between $0$ and $1$ (why?), and find that $M ≈ 0.735$ to within $± 0.001.$
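The zooming procedure can be mimicked in code. The sketch below uses bisection on $[0, 1],$ which is our own substitute for the graphing calculator and not the method used in the text; it relies on the polynomial being negative at $0$ and positive at $1.$ The names `p` and `bisect` are chosen for this illustration.

```python
def p(m):
    return 12 * m**5 - 10 * m**6 - 1

def bisect(func, lo, hi, tol=1e-9):
    """Find a zero of func in [lo, hi], assuming func(lo) and func(hi) differ in sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid   # same sign as the left end: zero is in the right half
        else:
            hi = mid                # sign change: zero is in the left half
    return (lo + hi) / 2

median = bisect(p, 0.0, 1.0)
print(round(median, 4))
```

Each halving of the interval plays the role of one zoom step, and the result should agree with $M ≈ 0.735$ to within the stated tolerance.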
Before We Go On ...
Question
Answer
On a graphing calculator we can enter
$Y_1 = fnInt(30T^4(1-T),T,0,X)-0.5$
which corresponds to
$y = \int_0^x 30t^4(1 - t)\,dt - 1/2,$
a function of $x.$ Since the median $M$ is the solution obtained by setting $y = 0,$ we can obtain the answer by plotting $Y_1$ and finding its $x$-intercept. The plot should be identical to the one we obtained above (why?).
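The same idea carries over to a program: integrate numerically, subtract $1/2,$ and look for the zero. In the Python sketch below, `simpson` stands in for the calculator's `fnInt` (an assumption; the calculator's internal method is not stated here), and the zero is located with Newton's method rather than by reading a plot, using the fact that the derivative of $Y_1$ is the density itself. The starting guess $0.5$ is an arbitrary choice.

```python
def density(t):
    return 30 * t**4 * (1 - t)

def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def y1(x):
    return simpson(density, 0.0, x) - 0.5   # numerical analogue of Y_1

x = 0.5  # starting guess inside (0, 1)
for _ in range(50):
    step = y1(x) / density(x)   # Newton step: the derivative of y1 is the density
    x -= step
    if abs(step) < 1e-12:
        break
print(round(x, 4))
```

The value printed should match the median found from the polynomial equation, since both approaches solve the same equation $y = 0.$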
Mail us at:
Stefan Waner (matszw@hofstra.edu) | Steven R. Costenoble (matsrc@hofstra.edu)