
MAST20006/MAST90057 – Module 2. Discrete Distributions

Module 2. Discrete Distributions

Chapter 2 in the textbook

Sophie Hautphenne and Feng Liu

The University of Melbourne

2023


Overview

1 Discrete random variables

2 Mathematical expectation

3 Mean, variance and standard deviation

4 Bernoulli trials and the binomial distribution

5 The moment-generating function

6 The Poisson distribution


1. Discrete random variables

Recall that a fundamental objective of probability theory is to find the probability of a given event B in the sample (outcome) space S.

It can be difficult to describe and analyse S, and accordingly B, if the elements of S are not numerical.

However, one often deals with situations where one can associate with each sample point (outcome) s in S a numerical measurement x; that makes life easier.

The numeric measurement x, when regarded as a function of the sample point s, is called a random variable, and is denoted as X or X(s).


Definition 1

Given a random experiment with an outcome space S, a function X that assigns to each element s in S a real number X(s) = x is called a random variable (abbr. r.v.).

The range (or space) of X is the set of real numbers {x : X(s) = x, s ∈ S}, where ‘s ∈ S’ means the element s belongs to the set S.

Remarks: The range of X is often denoted as X(S) or SX.

Now each event (subset) B in S can be described by the subset A := X(B) of real numbers assumed by some function (r.v.) X on B.

Note that A is a subset of SX but not of S, and that X(B) does not specify B for a general X.


So X : S → SX ⊆ R, such that s ↦ X(s) = x, and for every A ⊆ SX there exists B ⊆ S such that

A = X(B) = {x : x = X(s), s ∈ B},

and therefore B = {s ∈ S : X(s) ∈ A}.

Namely, for A ⊆ SX,

PX(A) = P(X ∈ A) = P({s ∈ S : X(s) ∈ A}) = P(B).

In particular,

PX(SX) = P(X ∈ SX) = P({s ∈ S : X(s) ∈ SX}) = P(S) = 1,

i.e. the probability of the range of X equals 1.


Assigning probability to A = X(B) ⊆ SX can be easier than assigning probability to B ⊆ S, as A is of a numerical nature, while B is not necessarily numerical.

Difficulties still remain:

1 How to assign a probability to a subset A = X(B) ⊆ SX?

2 How to define a r.v. X as a function of s ∈ S?

The answer to 2) is determined by the problem under consideration, and is not unique.

To answer 1) we will focus on the discrete sample space at this stage.

If S is discrete, SX is also discrete. So we would be able to calculate PX(A) for any subset A in SX if we have assigned a probability to each element in SX.

(Remember there exists a B ⊆ S such that A = X(B).)


Specifically,

PX(A) = PX(X(B)) = P(B) = Σ_{s∈B} P({s}).

Also note

PX(A) = Σ_{x∈A} PX(x) = Σ_{x∈A} P(X = x).

Example 1. A marble is selected at random from a box containing 3 red, 4 yellow and 5 white marbles. The colour of the selected marble is recorded.

The sample space is S = {R, Y, W}, and P({R}) = 3/12, P({Y}) = 4/12, P({W}) = 5/12.

Define a random variable

X = X(s) = 1 if s = R, 2 if s = Y, 3 if s = W.


Then the space of X is SX = {1, 2, 3}.

For A = {1, 2}, which is an event in SX, there exists an event B in S, namely B = {R, Y}, such that

X(B) = X({R, Y}) = {X(R), X(Y)} = {1, 2} = A.

Note that both A and B represent the event that the selected marble is not white.

Now,

PX(A) = PX({1, 2}) = PX(1) + PX(2) = P(X = 1) + P(X = 2)
= P(s = R) + P(s = Y) = P({R, Y}) = P(B)
= 3/12 + 4/12 = 7/12.

Carefully read the above equation to make sure you understand every step there.

The preceding discussion tells us that the probabilities {P(X = x), x ∈ SX} are fundamental in that they determine the probability of any event in SX.


We often write

f(x) := PX({x}) = P(X = x) for any x ∈ SX;

we call f(x) the probability mass function (pmf) of X.

Definition 2

The pmf f(x) of a discrete random variable X is a function that satisfies the following properties:

1 f(x) > 0 for any x ∈ SX;

2 Σ_{x∈SX} f(x) = 1;

3 PX(A) = P(X ∈ A) = Σ_{x∈A} f(x), for any A ⊆ SX.


Remarks:

1 Provided that no confusion will be created, SX can simply be rewritten as S (the sample space for X), and PX as P (or even just Pr).

2 Note that P(X = x) = 0 if x ∉ SX. Therefore we define f(x) = 0 for any x ∉ SX.

3 If f(x) is constant on SX, we say X has a uniform distribution, or f(x) is a uniform pmf. For example, “f(x) = 1/6, x = 1, 2, . . . , 6” is a uniform pmf.

4 The pmf f(x) can be expressed in different ways: as a mathematical formula, a table, a bar graph or a probability histogram. You can use any one of these four forms (usually the simplest one for the given situation) to express the pmf.


Example 2. Roll a four-sided die twice.

Let a random variable X equal the larger of the two face numbers that appear if they differ, and the common value if they are the same.

Thus the sample space is

S = {(d1, d2) : d1 = 1, 2, 3, 4; d2 = 1, 2, 3, 4}.

We have X = X(d1, d2) = max(d1, d2), and the space of X is SX = {1, 2, 3, 4}.

It is not difficult to see that

P(X = 1) = P({(1, 1)}) = 1/16
P(X = 2) = P({(1, 2), (2, 1), (2, 2)}) = 3/16
P(X = 3) = P({(1, 3), (2, 3), (3, 3), (3, 1), (3, 2)}) = 5/16
P(X = 4) = P({(1, 4), (2, 4), (3, 4), (4, 4), (4, 1), (4, 2), (4, 3)}) = 7/16


Therefore, the pmf of X can either be given by the following table

x                 1     2     3     4
f(x) = P(X = x)  1/16  3/16  5/16  7/16

or by the following mathematical formula

f(x) = P(X = x) = (2x − 1)/16, x = 1, 2, 3, 4,

or by the following bar graph or probability histogram:


R commands used for creating the above graphs:
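The original R code did not survive in this copy. As a stand-in, here is a minimal Python sketch that recomputes the pmf plotted above by enumerating the 16 equally likely outcomes of Example 2 and printing a crude text bar graph:

```python
from fractions import Fraction
from itertools import product

# Tally X = max(d1, d2) over the 16 equally likely outcomes of two rolls
# of a four-sided die, as in Example 2.
counts = {x: 0 for x in range(1, 5)}
for d1, d2 in product(range(1, 5), repeat=2):
    counts[max(d1, d2)] += 1

pmf = {x: Fraction(c, 16) for x, c in counts.items()}
for x, p in sorted(pmf.items()):
    print(f"x = {x}: f(x) = {p}  " + "#" * counts[x])
```

The exact fractions reproduce the table above: 1/16, 3/16, 5/16, 7/16.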


Example 3: The hypergeometric distribution.

Let X be the number of “defective items” (“D”) in a sample of n items randomly drawn without replacement from a population consisting of N1 D’s and N2 G’s (“good items”). The population has in total N1 + N2 = N items.

Assume that each item in the population has the same chance to be drawn.

Then the possible values that the discrete r.v. X can take, i.e. the space of X, are SX = {x : x ≥ 0, x ≤ n, x ≤ N1 and n − x ≤ N2}.

We say X has a hypergeometric distribution Hyper(N1, N2, n), with the pmf

f(x) = P(X = x) = C(N1, x) C(N2, n − x) / C(N, n), x ∈ SX,

where C(a, b) denotes the binomial coefficient ‘a choose b’.


Example 4: Capture-recapture experiment. Ten animals of a certain species have been captured, tagged, and released to mix into their population. Suppose the population consists of 80 such animals. A new sample of 15 animals is to be selected.

What is the probability that 3 in the new sample will be tagged ones?

Let X be the number of tagged animals in the new sample.

Then X has a hypergeometric distribution

Hyper(N1 = 10, N2 = 70, n = 15).

Therefore f(3) = P(X = 3) = C(10, 3) C(70, 12) / C(80, 15).

In R, use dhyper(x, N1, N2, n) to compute the hypergeometric pmf:
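For readers without R, a small Python sketch of the same computation, with a `dhyper`-style helper built from `math.comb` (the parameter values N1 = 10, N2 = 70, n = 15 are those of Example 4):

```python
from math import comb

# Example 4: 10 tagged animals (N1), 70 untagged (N2), sample of 15 (n).
N1, N2, n = 10, 70, 15

def dhyper(x, N1, N2, n):
    # Counterpart of R's dhyper(x, N1, N2, n): the Hyper(N1, N2, n) pmf.
    return comb(N1, x) * comb(N2, n - x) / comb(N1 + N2, n)

f3 = dhyper(3, N1, N2, n)
print(round(f3, 4))
```

Summing `dhyper` over the whole space of X recovers 1, as any pmf must.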


2. Mathematical expectation

The pmf f(x), x ∈ SX, provides all the information about the probability distribution of a random variable X.

Here we are interested in some numeric characteristics of X, which are also numeric characteristics of f(x).

An important numeric characteristic is the mathematical expectation of X.


Example 5. A young man devises a game. The game is to let the participant cast a fair die and then receive a payment according to the outcome:

He pays 1¢ if the event A = {1, 2, 3} occurs; 5¢ if B = {4, 5} occurs; and 35¢ if C = {6} occurs.

It is easy to see that P(A) = 3/6, P(B) = 2/6 and P(C) = 1/6.

The average payment per cast is 1 × 3/6 + 5 × 2/6 + 35 × 1/6 = 8¢.

In the long run, this is how much is paid in one play (use the “long term relative frequency” interpretation of probability!).

The charge per cast should be more than 8¢ if the young man wants to make a profit from this game over the long term.


The above discussion can be formulated more formally:

Let X be the outcome of a cast.

The pmf of X is the uniform one given by f(x) = 1/6, x = 1, 2, . . . , 6.

In terms of the observed value x, the payment per cast is given by the function

u(x) = 1 for x = 1, 2, 3;  5 for x = 4, 5;  35 for x = 6.

The mathematical expectation of the payment per cast is then equal to

Σ_{x=1}^{6} u(x)f(x) = 1 × 1/6 + 1 × 1/6 + 1 × 1/6 + 5 × 1/6 + 5 × 1/6 + 35 × 1/6
= 1 × 3/6 + 5 × 2/6 + 35 × 1/6 = 8.
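The sum above can be sketched in a few lines of Python (a stand-in for the R used elsewhere in these slides), using exact fractions:

```python
from fractions import Fraction

# Example 5: uniform pmf f(x) = 1/6 on {1,...,6}; payment function u(x).
f = Fraction(1, 6)
u = {1: 1, 2: 1, 3: 1, 4: 5, 5: 5, 6: 35}
expected_payment = sum(u[x] * f for x in range(1, 7))
print(expected_payment)  # 8
```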


Definition 3

Suppose f(x) is the pmf of a discrete random variable X with range SX, and u(X) is a function of X (note that u(X) is also a r.v.).

If the summation Σ_{x∈SX} u(x)f(x), which is sometimes written as Σ_{SX} u(x)f(x), exists, then the sum is called the mathematical expectation or the expected value of the function u(X), and it is denoted by E[u(X)].

That is,

E[u(X)] = Σ_{x∈SX} u(x)f(x).


Remarks:

1 It is possible that E[u(X)] is different from u(x) for every x ∈ SX.

2 To be mathematically rigorous, the definition of E[u(X)] requires that Σ_{x∈SX} |u(x)|f(x) converges and is finite (if SX is infinite, this is a series).

3 There is another way to calculate E[u(X)]:

(a) Define Y = u(X); Y is also a random variable.
(b) Then find the pmf of Y, i.e. g(y) := P(Y = y) = P(u(X) = y) = Σ_{x : u(x)=y} f(x).
(c) Then E[u(X)] = E[Y] = Σ_{y∈SY} y g(y).
(d) So Σ_{x∈SX} u(x)f(x) = Σ_{y∈SY} y g(y).


Example 6. Let a r.v. X have the pmf f(x) = 1/3, x ∈ S = {−1, 0, 1}. Let u(X) = X².

Then

E[u(X)] = E[X²] = Σ_{x∈S} x²f(x) = (−1)² × 1/3 + 0² × 1/3 + 1² × 1/3 = 2/3.

On the other hand, we can define Y = X².

Then P(Y = 0) = P(X = 0) = 1/3, and P(Y = 1) = P(X = −1) + P(X = 1) = 2/3.

So the pmf of Y is

g(y) = 1/3 for y = 0;  2/3 for y = 1,

and the space of Y is SY = {0, 1}.

Hence E[Y] = Σ_{y∈SY} y g(y) = 0 × 1/3 + 1 × 2/3 = 2/3.


In conclusion, we saw that there are two ways to compute E[u(X)], and that in Example 6, E[u(X)] = E[Y] = 2/3.
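Both routes of Example 6 can be checked with a short Python sketch (exact fractions, no rounding):

```python
from fractions import Fraction

# Example 6: X uniform on {-1, 0, 1} and u(X) = X^2, computed both ways.
f = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

# Route 1: E[u(X)] = sum of u(x) f(x) over x.
e1 = sum(x ** 2 * p for x, p in f.items())

# Route 2: first derive the pmf g of Y = X^2, then E[Y] = sum of y g(y).
g = {}
for x, p in f.items():
    g[x ** 2] = g.get(x ** 2, 0) + p
e2 = sum(y * p for y, p in g.items())

print(e1, e2)  # both equal 2/3
```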

Some useful properties of the mathematical expectation:

Theorem 1

When it exists, the mathematical expectation E satisfies the following properties:

(a) If c is a constant, E(c) = c.

(b) If c is a constant and u is a function, E[c u(X)] = cE[u(X)].

(c) If c1 and c2 are constants and u1 and u2 are functions, then E[c1 u1(X) + c2 u2(X)] = c1E[u1(X)] + c2E[u2(X)].

(d) Generalising part (c) above: E[Σ_{i=1}^{k} ci ui(X)] = Σ_{i=1}^{k} ci E[ui(X)].


Example 7. Let X have the pmf f(x) = x/10, x = 1, 2, 3, 4. Then

E(X) = 1 × 1/10 + 2 × 2/10 + 3 × 3/10 + 4 × 4/10 = 3,

E(X²) = 1 × 1/10 + 4 × 2/10 + 9 × 3/10 + 16 × 4/10 = 10,

E[X(5 − X)] = 5E(X) − E(X²) = 15 − 10 = 5.
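A quick Python sketch confirming the three expectations of Example 7:

```python
from fractions import Fraction

# Example 7: f(x) = x/10, x = 1, 2, 3, 4.
f = {x: Fraction(x, 10) for x in range(1, 5)}
EX = sum(x * p for x, p in f.items())
EX2 = sum(x ** 2 * p for x, p in f.items())
EX5mX = sum(x * (5 - x) * p for x, p in f.items())
print(EX, EX2, EX5mX)  # 3 10 5
```

Note that EX5mX also equals 5·EX − EX2, illustrating Theorem 1(c).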


Example 8. Let u(x) = (x − b)², where b is an unknown constant. Suppose E[(X − b)²] exists. Find the value of b for which E[(X − b)²] is minimal.

First write g(b) = E[(X − b)²] = E(X²) − 2bE(X) + b².

Then g′(b) = −2E(X) + 2b.

Set g′(b) = 0 and solve for b. It follows that b = E(X).

Since g″(b) = 2 > 0, E[X] is the value of b that minimizes E[(X − b)²].

That is, E[(X − E(X))²] ≤ E[(X − b)²] for any b.


Example 9: The expectation of a hypergeometric random variable.

Let X have a hypergeometric distribution Hyper(N1, N2, n), with the pmf given by

f(x) = P(X = x) = C(N1, x) C(N2, n − x) / C(N, n),

where x ≥ 0, x ≤ n, x ≤ N1, n − x ≤ N2.

Then we can show that

E(X) = Σ_{x∈S} x · C(N1, x) C(N2, n − x) / C(N, n) = nN1/N.

This agrees with the intuition: the number of ‘defective’ items in the sample is expected to be equal to the sample size n multiplied by N1/N, the proportion of ‘defective’ items in the population.
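The identity E(X) = nN1/N can be verified exactly (in Python, as a stand-in for the slides' R) by summing the pmf over the space of X for a few parameter sets, including the Hyper(10, 70, 15) of Example 4:

```python
from fractions import Fraction
from math import comb

# Verify E(X) = n*N1/N exactly for several Hyper(N1, N2, n) parameter sets.
def hyper_mean(N1, N2, n):
    N = N1 + N2
    return sum(Fraction(x * comb(N1, x) * comb(N2, n - x), comb(N, n))
               for x in range(max(0, n - N2), min(n, N1) + 1))

for N1, N2, n in [(3, 9, 4), (10, 70, 15), (20, 180, 30)]:
    assert hyper_mean(N1, N2, n) == Fraction(n * N1, N1 + N2)
print("E(X) = n N1 / N confirmed for all three cases")
```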


3. Mean, variance and standard deviation

For a discrete r.v. X with pmf f(x) and space SX = {u1, u2, . . . , uk}, the expectation is

E(X) = Σ_{x∈SX} x f(x) = u1f(u1) + u2f(u2) + . . . + ukf(uk).

The expectation can be regarded as a weighted mean of u1, u2, . . . , uk, where the weights are f(u1), f(u2), . . . , f(uk).

For this reason, we also call E(X) the mean of the random variable X, and denote E(X) by the Greek letter μ.

In summary,

μ := E(X) = Σ_{x∈SX} x f(x) = u1f(u1) + u2f(u2) + . . . + ukf(uk).

A third name for E(X) is the first moment of X, as the expression for E(X) has an interpretation as a moment in mechanics.


Similarly we call E(X²) the second moment of X.

Generally, for k ≥ 1, we call E(X^k) the k-th moment of X (about the origin).

E[(X − μ)^k] is called the k-th moment of X about the mean μ (central moment).

Statisticians find it valuable to compute E[(X − μ)²] (the second moment about the mean), because

E[(X − μ)²] = Σ_{x∈SX} (x − μ)² f(x)
= (u1 − μ)² f(u1) + (u2 − μ)² f(u2) + . . . + (uk − μ)² f(uk)

is the weighted mean of the squares of the differences u1 − μ, u2 − μ, . . . , uk − μ, which measures the variability of X about its mean.


For this reason, we call E[(X − μ)²] the variance of X (or of the pmf of X).

We also use σ² or Var(X) to denote the variance, i.e.

σ² := Var(X) = E[(X − μ)²].

We call σ := √(E[(X − μ)²]) the standard deviation of X (or of the pmf of X).

The following property is useful:

σ² = Var(X) = E[(X − μ)²] = E[X²] − μ² = E[X²] − (E[X])².


Example 10. Let the pmf of X be defined as f(x) = x/6, x = 1, 2, 3. Then

The mean of X is μ = E(X) = 1 × 1/6 + 2 × 2/6 + 3 × 3/6 = 7/3.

The second moment of X is E(X²) = 1² × 1/6 + 2² × 2/6 + 3² × 3/6 = 6.

The variance of X is σ² = Var(X) = E(X²) − μ² = 6 − (7/3)² = 5/9.

The standard deviation of X is σ = √Var(X) = √(5/9) ≈ 0.745.


Example 11. Suppose the pmf of X is given by

x        −1    0    1
fX(x)   1/3  1/3  1/3

It is easy to find that the mean of X is μX = 0, and the variance of X is σX² = 2/3.

Suppose the pmf of Y is given by

y        −2    0    2
fY(y)   1/3  1/3  1/3

It is easy to find that the mean of Y is μY = 0, and the variance of Y is σY² = 8/3.

We see that Y = 2X, μY = 2μX, σY² = 2²σX² and σY = 2σX.


In general, if Y = aX + b, where a and b are constants and Y and X are two random variables, then we have the following:

a) μY = aμX + b;

b) σY² = a²σX² and σY = |a|σX.

Example 12. If X has a discrete uniform distribution on the first m positive integers, i.e. f(x) = 1/m, x = 1, 2, . . . , m, then its mean and variance are μ = (m + 1)/2 and σ² = (m² − 1)/12.
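Reading Example 12 as the discrete uniform on {1, . . . , m} (the slide is truncated at this point), a derivation sketch using the standard power sums:

```latex
\mu = \frac{1}{m}\sum_{x=1}^{m} x = \frac{1}{m}\cdot\frac{m(m+1)}{2} = \frac{m+1}{2},
\qquad
E(X^2) = \frac{1}{m}\sum_{x=1}^{m} x^2 = \frac{1}{m}\cdot\frac{m(m+1)(2m+1)}{6} = \frac{(m+1)(2m+1)}{6},

\sigma^2 = E(X^2) - \mu^2
         = \frac{(m+1)(2m+1)}{6} - \frac{(m+1)^2}{4}
         = \frac{(m+1)\bigl(2(2m+1) - 3(m+1)\bigr)}{12}
         = \frac{m^2 - 1}{12}.
```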

Example 13: Empirical distribution, sample mean and sample variance.

Consider performing a random experiment n times, which gives n observations of a r.v. X: x1, x2, . . . , xn; this is referred to as a sample from the distribution of X.

It is possible that some values in the sample are the same, but we do not worry about it at this time.

Often we don’t know the probability distribution of X. But we can (artificially) assign a probability 1/n to each of x1, x2, . . . , xn. The distribution determined by these equal probabilities is called the empirical distribution, since it is determined by a particular sample x1, x2, . . . , xn acquired in an experiment.

That is, the pmf for the empirical distribution is

femp(x) = 1/n, x = x1, x2, . . . , xn.


The mean of femp(x) is

Σ_{i=1}^{n} xi femp(xi) = (1/n) Σ_{i=1}^{n} xi = x̄,

which is just the sample mean of the data x1, x2, . . . , xn.

Likewise, the variance of the empirical distribution is (n − 1)/n times the sample variance of the data, defined as

s² := 1/(n − 1) Σ_{i=1}^{n} (xi − x̄)².

This example shows us the relationship between the mean and variance of the empirical distribution and the sample mean and sample variance of the data.
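The relationship can be checked numerically in Python; the sample below is an arbitrary illustrative one, not from the slides:

```python
# Example 13: the empirical distribution puts mass 1/n on each observation.
# Its mean is the sample mean; its variance is (n-1)/n times s^2.
xs = [2.1, 3.5, 3.5, 4.0, 5.9]         # arbitrary illustrative sample
n = len(xs)
xbar = sum(xs) / n                      # mean of f_emp = sample mean
emp_var = sum((x - xbar) ** 2 for x in xs) / n        # variance of f_emp
s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)       # sample variance
print(xbar, emp_var, s2)
```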


Example 14: The mean and variance of a hypergeometric distribution.

Let X have a hypergeometric distribution Hyper(N1, N2, n), with the pmf

f(x) = C(N1, x) C(N2, n − x) / C(N, n),

where x ≥ 0, x ≤ n, x ≤ N1, n − x ≤ N2. It can be shown that

E(X) = nN1/N and Var(X) = n (N1/N)(N2/N)(N − n)/(N − 1).



4. Bernoulli trials and the binomial distribution

A random experiment with the following properties is called a binomial experiment:

1 Each such experiment consists of n trials, with n being fixed in advance.

2 Each of the n trials has only two possible outcomes, denoted ‘success’ (S) and ‘failure’ (F). A trial of this type is called a Bernoulli trial.

3 The n trials are independent of each other. That is, the outcome of one trial does not affect the probability of occurrence of the outcomes of other trials.

4 The probability of ‘success’ (denoted by p) is the same for all the n trials.

Let Xi be a random variable associated with the i-th Bernoulli trial, defined as Xi(success) = 1 and Xi(failure) = 0.

Xi is called a Bernoulli random variable.

The pmf of Xi is given by

f(xi) = p^{xi} (1 − p)^{1−xi}, xi = 0, 1,

and

μi = E(Xi) = p,
σi² = Var(Xi) = p(1 − p).

Example 15. A coin is flipped independently five (n = 5) times. Call outcome H (heads) a “success” and T (tails) a “failure”. Then this is a binomial experiment.

Example 16. Suppose that among 20 goblets in a box, 2 have cosmetic flaws. Now randomly take 10 goblets from the box without replacement. For a selected goblet we are interested in whether it has any cosmetic flaws.

Then this is not a binomial experiment, because the outcomes of the 10 trials are not independent of each other (it is a hypergeometric experiment).

If the 10 goblets are taken with replacement, then the experiment is a binomial one.


Example 17. Suppose 10% of a stock of 10,000 goblets have defects, and we randomly take 10 goblets without replacement for inspection. Then the outcomes of the 10 trials are not independent of each other, but the dependence is so weak that it can be ignored.

Therefore, Properties 1–4 of a binomial experiment are approximately satisfied, and the experiment can be approximately modelled by a binomial experiment.

In general, if an experiment involves ‘without replacement’ sampling but the sample size (number of trials) is < 5% of the population size, then the experiment can be analysed as though it were a binomial experiment.

Example 18. A company that produces fine crystal knows from experience that 10% of its goblets have cosmetic flaws and must be classified as “seconds”. Now a sample of 10 goblets is randomly taken from the production line for inspection. Since the objective is just to see whether any of them has any cosmetic flaws, this experiment is approximately a binomial one.


In a binomial experiment, we are often interested in the total number of ‘successes’, denoted by X, in the n Bernoulli trials.

We then call X a binomial random variable, and say that X has a binomial distribution, denoted as X d= b(n, p), where n and p are parameters indicating the number of Bernoulli trials and the probability of ‘success’ in each trial, respectively.

Note that we are not interested in the order of occurrences of the ‘successes’ for a binomial distribution.

The possible values of X are 0, 1, 2, . . . , n.

X = X1 + X2 + . . . + Xn, i.e. the sum of the n Bernoulli r.v.’s.

Each Bernoulli r.v. Xi has a special binomial distribution Xi d= b(1, p).


Next we proceed to find the pmf and other characteristics of a binomial r.v. X.

When n = 3, the probability for each possible outcome of X is given below:

X   Outcome   Probability
3   SSS       p³
2   SSF       p²(1 − p)
    SFS       p²(1 − p)
    FSS       p²(1 − p)
1   SFF       p(1 − p)²
    FSF       p(1 − p)²
    FFS       p(1 − p)²
0   FFF       (1 − p)³


From the above table we see that the pmf of b(3, p) is

P(X = 0) = (1 − p)³,
P(X = 1) = 3p(1 − p)²,
P(X = 2) = 3p²(1 − p),
P(X = 3) = p³,

which can be equivalently expressed as

P(X = x) = C(3, x) p^x (1 − p)^{3−x}, x = 0, 1, 2, 3,

where C(3, x) gives the number of ways of selecting x positions for the x ‘successes’ in the n = 3 trials.


In general, the pmf for a binomial distribution b(n, p) is

f(x) = P(X = x) = C(n, x) p^x (1 − p)^{n−x}, for x = 0, 1, 2, . . . , n.

Sometimes, it is of interest to find P(X ≤ x), the probability that x or fewer ‘successes’ are obtained from the n Bernoulli trials in a binomial experiment.

We call the function defined by F(x) := P(X ≤ x) the cumulative distribution function (or simply the distribution function) of X, abbreviated as cdf of X.


For a r.v. X having a binomial distribution b(n, p), the cdf is

F(x) = P(X ≤ x) = Σ_{k≤x} C(n, k) p^k (1 − p)^{n−k},

and the mean and variance are μ = np and σ² = np(1 − p).

Remark: One can use the relation between binomial and Bernoulli r.v.’s to find that

μX = E(X) = E(X1) + E(X2) + . . . + E(Xn) = p + . . . + p = np.

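The mean and variance of b(n, p) can be checked exactly from the pmf; a Python sketch with n = 10 and p = 4/5 (the germination setting used in Example 19 below):

```python
from fractions import Fraction
from math import comb

# Exact check of the b(n, p) mean and variance for n = 10, p = 4/5.
n, p = 10, Fraction(4, 5)
pmf = {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}
mean = sum(x * f for x, f in pmf.items())
var = sum((x - mean) ** 2 * f for x, f in pmf.items())
print(mean, var)  # 8 and 8/5
```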

Example 18. Probability bar graphs for several binomial distributions with different n and p values are shown below:


R commands for creating the above plots:

Example 19. Suppose the probability of germination of a beet seed is 0.8, and 10 seeds are planted. Let X be the number of seeds that germinate. Assume independence of germination of one seed from that of another. Then

P(X = 8) = C(10, 8)(0.8)^8(0.2)^2 ≈ 0.302.

That is, the probability that 8 seeds germinate is 0.302.


P(X ≤ 8) = Σ_{k=0}^{8} C(10, k)(0.8)^k(0.2)^{10−k}, or

P(X ≤ 8) = 1 − P(X ≥ 9) = 1 − 10(0.8)^9(0.2) − (0.8)^{10} ≈ 0.624.

That is, the probability of no more than 8 germinations is 0.624.

μ = E(X) = np = 10 × 0.8 = 8.

That is, on average 8 seeds are expected to germinate.

σ² = Var(X) = np(1 − p) = 10 × 0.8 × 0.2 = 1.6.

P(6 ≤ X < 9) = P(X < 9) − P(X ≤ 5) = P(X ≤ 8) − P(X ≤ 5) ≈ 0.624 − 0.033 = 0.591.

That is, the probability that at least 6 but fewer than 9 seeds germinate is 0.591.
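The probabilities above can be reproduced in Python with a small `dbinom`-style helper (a stand-in for the R calls the slides use):

```python
from math import comb

def dbinom_py(x, n, p):
    # Counterpart of R's dbinom(x, n, p): the b(n, p) pmf.
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p8 = dbinom_py(8, 10, 0.8)
cdf8 = sum(dbinom_py(k, 10, 0.8) for k in range(9))           # P(X <= 8)
p6to8 = cdf8 - sum(dbinom_py(k, 10, 0.8) for k in range(6))   # P(6 <= X < 9)
print(round(p8, 3), round(cdf8, 3), round(p6to8, 3))  # 0.302 0.624 0.591
```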


What is the probability that 3 out of the 10 seeds do not germinate?

P(3 do not germinate) = P(X = 7) = C(10, 7)(0.8)^7(0.2)^3 ≈ 0.201.

Alternatively, let Y be the number of non-germinations. Then Y d= b(10, 0.2). So

P(3 do not germinate) = P(Y = 3) = C(10, 3)(0.2)^3(0.8)^7 ≈ 0.201.

Suppose there are 1000 pots and 10 beet seeds are planted in each pot, with the probability of germination of each seed still being 0.8. The number of germinations in each pot is to be recorded.

What will the 1000 records look like?


These 1000 records would be like 1000 observations from a b(10, 0.8) random variable.

We can use R to simulate 1000 observations, plot their histogram and compare the histogram with the pmf of b(10, 0.8).

The heights of the dots give the pmf of X.
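A plain-Python stand-in for the R simulation described above (the seed and tolerance are illustrative choices, not from the slides):

```python
import random

# Simulate 1000 pots: each record is the number of germinations among
# 10 seeds with germination probability 0.8, i.e. one draw from b(10, 0.8).
random.seed(1)
records = [sum(random.random() < 0.8 for _ in range(10)) for _ in range(1000)]
sample_mean = sum(records) / 1000
print(sample_mean)  # close to np = 8
```

A histogram of `records` should closely track the b(10, 0.8) pmf.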


R commands for creating the above plot:

R commands for the pmf, cdf and random number generation of the binomial distribution:

dbinom(x, size, prob)
pbinom(q, size, prob)
rbinom(n, size, prob)

Type ‘help(dbinom)’ in R for more information.


Example 20: Comparison of binomial and hypergeometric distributions.

Suppose that among 200 goblets in a box, 20 have defects.

1 Randomly take 30 goblets from the box with replacement. Let X be the number of defective goblets selected. It is easy to see that X d= b(n = 30, p = 20/200 = 0.1).

2 Randomly take 30 goblets from the box without replacement. Let Y be the number of defective goblets selected. Then Y d= Hyper(N1 = 20, N2 = 180, n = 30).


Example 20. (cont.)

We have learned that

E(X) = np = 3 and Var(X) = np(1 − p) = 2.7, while
E(Y) = nN1/N = 3 and Var(Y) = n (N1/N)(N2/N)(N − n)/(N − 1) ≈ 2.31.

A comparison of the pmf’s of X and Y is given below:

It can be shown that when n and p = N1/N are fixed but N becomes very large, the hypergeometric distribution will be very close to the corresponding binomial distribution.
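The closeness can be quantified in Python by computing the largest pointwise gap between the two pmf's for the parameters of Example 20:

```python
from math import comb

# Example 20 parameters: Hyper(20, 180, 30) versus b(30, 0.1).
N1, N2, n = 20, 180, 30
N, p = N1 + N2, N1 / (N1 + N2)

def dhyper(x):
    return comb(N1, x) * comb(N2, n - x) / comb(N, n)

def dbinom_py(x):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

max_gap = max(abs(dhyper(x) - dbinom_py(x)) for x in range(0, N1 + 1))
print(max_gap)  # already small for these parameter values
```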


5. The moment-generating function

Mean, variance and standard deviation are important characteristics of a distribution.

But it can be difficult to calculate E(X) and Var(X) directly, e.g. when X is binomial.

Here we introduce a function of t, called the moment-generating function, which helps to generate the moments, including the mean and variance, of a distribution.


Definition 4

Let X be a discrete random variable with pmf f(x) and range (or space) S. If there is a positive number h such that

E(e^{tX}) = Σ_{x∈S} e^{tx} f(x) is finite for t = ±h

(and hence for −h < t < h), then the function of t defined by

M(t) := E(e^{tX})  (or MX(t) := E(e^{tX}))

is called the moment-generating function (mgf) of X (or of the distribution of X).


Example 21. Consider a random variable X with the following pmf:

x                  b1      b2      b3     . . .
f(x) = P(X = x)   f(b1)   f(b2)   f(b3)   . . .

The mgf of X is M(t) = e^{tb1}f(b1) + e^{tb2}f(b2) + e^{tb3}f(b3) + . . .

When t = 0, M(0) = f(b1) + f(b2) + f(b3) + . . . = 1.

Example 22. If X has the mgf M(t) =



Examples 21 and 22 show that the mgf can be derived from the pmf, and vice versa.

The pmf uniquely determines the mgf, and it has been proved that the mgf also uniquely determines the pmf.

That is, the same pmf fX(x) = fY(x) ⇔ the same mgf MX(t) = MY(t).

We see that the mgf, like the pmf, provides another tool for describing the distribution of a r.v.

However, note that fX(x) = fY(x) or MX(t) = MY(t) does not imply X = Y.

Another issue is that the mgf may not exist for some r.v.’s, while the pmf always exists (for discrete r.v.’s, of course).


Example 23. Suppose the mgf of X is M(t) = (e^t/2) / (1 − e^t/2), t < ln(2).

We show how Taylor’s expansion can help to find the pmf of X.

This mgf does not have the form given in Examples 21 and 22 which allowed us to find the pmf easily.

Note that the Maclaurin (or Taylor) series expansion of (1 − z)^{−1} is

(1 − z)^{−1} = 1 + z + z² + z³ + . . . , −1 < z < 1.

Therefore,

M(t) = (e^t/2)(1 − e^t/2)^{−1} = e^t/2 + (e^t/2)² + (e^t/2)³ + . . . = Σ_{x=1}^{∞} (1/2)^x e^{tx},

when e^t/2 < 1 and thus t < ln(2).

From the above expansion, P(X = x) = (1/2)^x.

So the pmf of X is f(x) = (1/2)^x, x = 1, 2, 3, . . .
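A numerical cross-check in Python that the pmf f(x) = (1/2)^x, x ≥ 1, reproduces the stated mgf (the value t = 0.3 is an arbitrary choice below ln 2):

```python
import math

# Example 23: M(t) = (e^t / 2) / (1 - e^t / 2) for t < ln 2.
def M(t):
    return (math.exp(t) / 2) / (1 - math.exp(t) / 2)

t = 0.3  # any t < ln 2 ≈ 0.693 works
series = sum((0.5 ** x) * math.exp(t * x) for x in range(1, 200))
print(series, M(t))
```

The truncated series (200 terms) agrees with the closed form to machine precision, since the terms decay geometrically.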


Now we proceed to see how the mgf and moments are related.

Differentiating M(t) r times with respect to t gives

M^(r)(t) = Σ_{x∈S} x^r e^{tx} f(x), so that M^(r)(0) = Σ_{x∈S} x^r f(x) = E(X^r).


In particular,

μ = M′(0) and σ² = M″(0) − [M′(0)]².

In order to make use of the above technique to find the moments of X, the mgf M(t) needs to have a closed form instead of the expansion form.


Example 24. The pmf of the binomial distribution is known to be

f(x) = P(X = x) = C(n, x) p^x (1 − p)^{n−x} = [n!/(x!(n − x)!)] p^x (1 − p)^{n−x},

for x = 0, 1, 2, . . . , n.

Thus the corresponding mgf is

M(t) = E(e^{tX}) = Σ_{x=0}^{n} C(n, x) (pe^t)^x (1 − p)^{n−x} = [(1 − p) + pe^t]^n,

from the binomial expansion of (a + b)^n with a = 1 − p and b = pe^t.


Example 24 (cont.).

The first two derivatives of M(t) are

M′(t) = n[(1 − p) + pe^t]^{n−1}(pe^t),
M″(t) = n(n − 1)[(1 − p) + pe^t]^{n−2}(pe^t)² + n[(1 − p) + pe^t]^{n−1}(pe^t).

So

μ = E(X) = M′(0) = np,
E(X²) = M″(0) = n(n − 1)p² + np, and
σ² = Var(X) = E(X²) − [E(X)]² = n(n − 1)p² + np − (np)² = np(1 − p).


Bernoulli distribution

When n = 1, the binomial distribution becomes the Bernoulli distribution, with mgf M(t) = (1 − p) + pe^t.

It is easy to see M′(t) = M″(t) = M^(3)(t) = . . . = pe^t.

So E(X) = E(X²) = E(X^k) = p for any k = 1, 2, 3, . . . for the Bernoulli distribution.


Negative binomial distribution:

Consider observing a sequence of i.i.d. Bernoulli trials until exactly r successes occur.

Let the r.v. X be the number of trials needed to obtain r successes, i.e. X is the trial number on which the r-th success is observed.

Writing q = 1 − p, it can be seen that

P(X = x) = P(r − 1 successes in the first x − 1 trials) × P(success in the x-th trial)
= C(x − 1, r − 1) p^{r−1} q^{(x−1)−(r−1)} × p.


Thus the pmf of X is

f(x) = (x − 1 choose r − 1) p^r (1 − p)^(x−r) = (x − 1 choose r − 1) p^r q^(x−r), x = r, r + 1, r + 2, . . .

We say that X has a negative binomial distribution, i.e. X =d NB(r, p).

The reason it is called the negative binomial is that the pmf is similar to each term in the Maclaurin's series expansion of the binomial function 1 − w raised to the negative exponent −r, that is,

(1 − w)^(−r) = Σ_{k=0}^∞ (r + k − 1 choose r − 1) w^k, for −1 < w < 1.
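As a sanity check, the NB(r, p) pmf can be summed numerically to confirm it is a valid pmf, and its mean compared with the standard formula E(X) = r/p (stated here as background, not derived on these slides). A sketch with arbitrary values r = 3, p = 0.4:

```python
import math

# Check numerically that the NB(r, p) pmf sums to 1 and that its mean
# matches r/p. The infinite support is truncated where the tail is negligible.
r, p = 3, 0.4
q = 1 - p

def nb_pmf(x):
    # f(x) = C(x-1, r-1) p^r q^(x-r), x = r, r+1, r+2, ...
    return math.comb(x - 1, r - 1) * p**r * q**(x - r)

xs = range(r, 400)
total = sum(nb_pmf(x) for x in xs)
mean = sum(x * nb_pmf(x) for x in xs)
print(round(total, 6), round(mean, 6))  # ~ 1.0 and ~ r/p = 7.5
```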


Example 27. Probability bargraphs for several negative binomial distributions with different r and p values are shown below :


Finding the mgf using all moments

We have seen how to obtain the moments from the mgf. We can also find the mgf from all the moments, based on the following result :

If the Maclaurin's series expansion of a mgf M(t) exists, then

M(t) = M(0) + M′(0)t + M″(0) t^2/2! + . . . + M^(k)(0) t^k/k! + . . . = 1 + Σ_{k=1}^∞ E(X^k) t^k/k!,

because M(0) = 1, M′(0) = E(X), and M^(k)(0) = E(X^k), k = 1, 2, 3, . . ..


Example 28. Suppose the moments of X are given by E(X^k) = 0.8, k = 1, 2, 3, . . .. Find the mgf of X and hence the pmf of X.

M(t) = 1 + Σ_{k=1}^∞ 0.8 t^k/k! = 1 + 0.8(e^t − 1) = 0.2 + 0.8e^t = 0.2e^(0t) + 0.8e^(1t)

Therefore P(X = 0) = 0.2 and P(X = 1) = 0.8. This means that X has a Bernoulli distribution with p = 0.8.
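The series manipulation above can be verified numerically: truncated partial sums of 1 + Σ E(X^k) t^k/k! with E(X^k) = 0.8 converge to the closed form 0.2 + 0.8e^t. A minimal sketch:

```python
import math

# Rebuild M(t) = 1 + sum_{k>=1} E(X^k) t^k / k! from the moments
# E(X^k) = 0.8 and compare with the closed form 0.2 + 0.8 e^t.
def mgf_from_moments(t, terms=30):
    return 1 + sum(0.8 * t**k / math.factorial(k) for k in range(1, terms))

for t in (-1.0, 0.0, 0.5, 2.0):
    closed_form = 0.2 + 0.8 * math.exp(t)
    assert abs(mgf_from_moments(t) - closed_form) < 1e-9
print("partial sums match 0.2 + 0.8 e^t")
```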


6. The Poisson distribution

The Poisson (pronounced 'pwa sohn', a French surname that is also the word for 'fish') distribution is used to model the number of occurrences of particular events for a variety of phenomena.

Examples include :

the number of fish captured in a catch ;
the number of phone calls arriving at a switchboard between 9 and 10 am ;
the number of insurance claims during a year ;
the number of car accidents in a city during a day ;
the number and pattern of bomb droppings over London in World War II ;
the number of customers entering a specific shop during one day ;
etc.

The count numbers here are random variables (exactly or approximately) possessing the three properties in the following definition.


Definition 5

Let the number of changes (or events) that occur in a given 'continuous interval' be counted. We have a Poisson process with parameter λ > 0 if the following are satisfied :

(a) The numbers of changes occurring in non-overlapping intervals are independent.
(b) The probability of exactly one change in a short interval of length h is approximately λh.
(c) The probability of two or more changes in a short interval of length h is much smaller than h.

Suppose we have a Poisson process, and let X denote the number of changes in an interval of unit length. We proceed to find P(X = x), where x is a nonnegative integer, by calculating it as the limit of binomial probabilities.


We first partition the unit interval into n subintervals of equal length 1/n.

If n is sufficiently large, the probability that x changes occur in the unit interval is essentially the probability that x of the n subintervals each contain exactly one change while the other subintervals contain no changes.

By (b) and (c), each subinterval contains either one change or no change ; the probability of one change in a subinterval is approximately λ(1/n), and the probability of no change is 1 − λ(1/n).

Thus, observing occurrence or nonoccurrence of a change in a subinterval is a Bernoulli trial. By (a), we have n independent Bernoulli trials, each with probability λ/n of 'a change'.


Hence,

P(X = x) ≈ (n choose x) (λ/n)^x (1 − λ/n)^(n−x) → λ^x e^(−λ)/x! as n → ∞.

A r.v. X with pmf

f(x) = λ^x e^(−λ)/x!, x = 0, 1, 2, . . .,

is said to have a Poisson distribution with parameter λ, written X =d Poi(λ). The corresponding mgf is

M(t) = E(e^(tX)) = Σ_{x=0}^∞ e^(tx) λ^x e^(−λ)/x! = e^(−λ) Σ_{x=0}^∞ (λe^t)^x/x! = e^(λ(e^t − 1)), −∞ < t < ∞.

The first and second derivatives of the mgf, and accordingly the mean and variance, are

M′(t) = λe^t e^(λ(e^t − 1)) ; so the mean of X is μ = E(X) = M′(0) = λ.
M″(t) = (λe^t)^2 e^(λ(e^t − 1)) + λe^t e^(λ(e^t − 1)) ; so E(X^2) = M″(0) = λ^2 + λ.

So the variance of X is σ^2 = Var(X) = E(X^2) − [E(X)]^2 = λ.
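The result E(X) = Var(X) = λ is easy to confirm numerically from the pmf. A minimal sketch (λ = 2.5 is an arbitrary choice; the infinite support is truncated where the tail is negligible):

```python
import math

# Check E(X) = Var(X) = lambda for a Poisson pmf numerically.
lam = 2.5
pmf = [lam**x * math.exp(-lam) / math.factorial(x) for x in range(60)]

mean = sum(x * f for x, f in enumerate(pmf))
var = sum(x**2 * f for x, f in enumerate(pmf)) - mean**2
print(round(mean, 6), round(var, 6))  # both ~ 2.5
```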


This shows that the parameter λ of a Poisson process can be interpreted as the mean number of occurrences in an interval of unit length. We often call λ the rate of occurrence (or intensity) parameter.

More generally, if Y is the number of occurrences in an interval of length T with rate of occurrence λ, then Y =d Poi(λT), with E(Y) = Var(Y) = λT.

Both the mean and the variance of a Poi(λ) distribution equal λ. Determining a value for λ is the key step in calculating Poisson probabilities.


Example 29. Let X be the number of requests for assistance received by a towing service during the peak-hour period (7am to 9am). Suppose the average number of calls is 50 per hour, so that E(X) = λ = 2 × 50 = 100 calls.

1. What is the probability that 120 calls will be received during the peak-hour period ?
2. Find the probability that at most 108 calls will be received during the 7am to 9am period.
3. What is the probability that no calls will be received during a 5-min. break in this peak-hour period ?

For part 3, let Y denote the number of calls received during the break. Then Y =d Poi(50 × 5/60) = Poi(25/6), so P(Y = 0) = e^(−25/6) ≈ 0.0155.
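The three probabilities can be computed with the Python standard library (a sketch; the course itself uses R's dpois/ppois for this). Logarithms are used in the pmf to stay numerically stable for large x, such as λ = 100 and x = 120:

```python
import math

# Poisson pmf via logs: exp(x log(lam) - lam - log(x!)) avoids overflow.
def poisson_pmf(x, lam):
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

lam = 100  # expected calls in the 2-hour peak period

p1 = poisson_pmf(120, lam)                          # P(X = 120)
p2 = sum(poisson_pmf(x, lam) for x in range(109))   # P(X <= 108)
p3 = math.exp(-50 * 5 / 60)                         # P(Y = 0), Y ~ Poi(25/6)

print(round(p1, 4), round(p2, 4), round(p3, 4))
```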


R can be used to compute Poisson probabilities. Given below are the R commands for the pmf, the cdf and random number generation for the Poisson distribution :

dpois(x, lambda)   # pmf P(X = x)
ppois(q, lambda)   # cdf P(X <= q)
rpois(n, lambda)   # generate n random values

Type 'help(dpois)' for more information.


Example 30. Suppose there are 300 misprints in a 500-page book. A misprint is equally likely to occur on any page, and each page can contain zero, one or more than one misprint.

What is the probability that 3 misprints are found on a specified page ?

The rate of occurrence parameter is λ = 300/500 = 0.6, i.e. 0.6 misprints per page.

Let X be the number of misprints on a specified page. Then X =d Poi(0.6).

Therefore, P(X = 3) = 0.6^3 e^(−0.6)/3! = 0.0198.
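A one-line check of this value (in Python; R's dpois(3, 0.6) gives the same number):

```python
import math

# P(X = 3) for X ~ Poi(0.6): lambda^3 e^(-lambda) / 3!
lam = 0.6
p3 = lam**3 * math.exp(-lam) / math.factorial(3)
print(round(p3, 4))  # 0.0198
```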


Example 31. To see the effect of λ on the pmf of a Poisson distribution,

we plot here a number of probability bargraphs of the pmf f(x) for 6

different values of λ.


Approximating binomial by Poisson

From the derivation at the beginning of this section, we see that a Poisson distribution can be used to approximate probabilities for a binomial distribution :

if X =d b(n, p) with n large and p small, then P(X = x) ≈ (np)^x e^(−np)/x!.

Namely, the binomial probability from b(n, p) can be approximated by the respective Poisson probability from Poi(np).
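The quality of the approximation is easy to inspect numerically. A sketch comparing b(n, p) probabilities with Poi(np) for n = 100 and p = 0.02 (so np = 2, matching the example below):

```python
import math

# Compare exact binomial b(n, p) probabilities with the Poi(np) approximation.
n, p = 100, 0.02
lam = n * p
for x in range(5):
    binom = math.comb(n, x) * p**x * (1 - p)**(n - x)
    pois = lam**x * math.exp(-lam) / math.factorial(x)
    print(x, round(binom, 4), round(pois, 4))
```

For these values the two columns agree to about two decimal places.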


Example 32. A manufacturer of Christmas tree light bulbs knows that 2% of its bulbs are defective. Let X be the number of defective bulbs in a box of 100 of these bulbs. Assuming independence among defectives, X =d b(100, 0.02), and the probability that the box contains at most 3 defective bulbs is

P(X ≤ 3) = Σ_{x=0}^3 (100 choose x) (0.02)^x (0.98)^(100−x) = 0.859.

The Poisson approximation with λ = np = 2 gives P(X ≤ 3) ≈ Σ_{x=0}^3 2^x e^(−2)/x! = 0.857, close to the exact value.
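A short numerical check of both values:

```python
import math

# Exact binomial P(X <= 3) for X ~ b(100, 0.02) versus the Poi(2) approximation.
n, p = 100, 0.02
binom_cdf3 = sum(math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(4))
pois_cdf3 = sum(2**x * math.exp(-2) / math.factorial(x) for x in range(4))
print(round(binom_cdf3, 3), round(pois_cdf3, 3))  # 0.859 0.857
```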


Plots of the pmf's of the binomial b(100, 0.02) and Poisson Poi(2) distributions are given below :

