Exponential Random Variable

The exponential random variable is defined by the density function (see Fig. 1-2b)

(1.4-5)  P(x) = \begin{cases} a\,e^{-ax}, & x \ge 0 \\ 0, & x < 0 \end{cases}

where a is any positive real number.

From: Markov Processes , 1992

Random Variables, Distributions, and Density Functions

Scott L. Miller , Donald Childers , in Probability and Random Processes, 2004

3.4.2 Exponential Random Variable

The exponential random variable has a probability density function and cumulative distribution function given (for any b > 0) by

(3.19a)  f_X(x) = \frac{1}{b}\exp\!\left(-\frac{x}{b}\right)u(x),

(3.19b)  F_X(x) = \left[1 - \exp\!\left(-\frac{x}{b}\right)\right]u(x).

A plot of the PDF and the CDF of an exponential random variable is shown in Figure 3.9. The parameter b is related to the width of the PDF and the PDF has a peak value of 1/b which occurs at x = 0. The PDF and CDF are nonzero over the semi-infinite interval (0, ∞), which may be either open or closed on the left endpoint.

Figure 3.9. Probability density function (a) and cumulative distribution function (b) of an exponential random variable, b = 2.

Exponential random variables are commonly encountered in the study of queueing systems. The time between arrivals of customers at a bank, for example, is commonly modeled as an exponential random variable, as is the duration of voice conversations in a telephone network.
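As a quick numerical check of (3.19a) and (3.19b), the following sketch (assuming NumPy and SciPy are available; the value b = 2 matches Figure 3.9) compares the two formulas against scipy.stats.expon and against the sample mean of simulated interarrival times.

```python
import numpy as np
from scipy import stats

b = 2.0                                   # width parameter used in Figure 3.9
x = np.linspace(0.0, 10.0, 6)

# PDF and CDF from Equations (3.19a) and (3.19b)
pdf = (1.0 / b) * np.exp(-x / b)
cdf = 1.0 - np.exp(-x / b)

# SciPy parameterizes the exponential distribution by scale = b
assert np.allclose(pdf, stats.expon(scale=b).pdf(x))
assert np.allclose(cdf, stats.expon(scale=b).cdf(x))

# Simulated "interarrival times" have a sample mean close to b
rng = np.random.default_rng(0)
print(rng.exponential(scale=b, size=200_000).mean())   # approximately 2.0
```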

URL: https://www.sciencedirect.com/science/article/pii/B9780121726515500038

The Exponential Distribution and the Poisson Process

Sheldon M. Ross , in Introduction to Probability Models (Twelfth Edition), 2019

5.2.1 Definition

A continuous random variable X is said to have an exponential distribution with parameter λ, λ > 0 , if its probability density function is given by

f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}

or, equivalently, if its cdf is given by

F(x) = \int_{-\infty}^{x} f(y)\,dy = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}

The mean of the exponential distribution, E [ X ] , is given by

E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{\infty} \lambda x e^{-\lambda x}\,dx

Integrating by parts (u = x, dv = \lambda e^{-\lambda x}\,dx) yields

E[X] = -x e^{-\lambda x}\Big|_{0}^{\infty} + \int_{0}^{\infty} e^{-\lambda x}\,dx = \frac{1}{\lambda}

The moment generating function ϕ ( t ) of the exponential distribution is given by

(5.1)  \phi(t) = E[e^{tX}] = \int_{0}^{\infty} e^{tx}\,\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t} \quad \text{for } t < \lambda

All the moments of X can now be obtained by differentiating Eq. (5.1). For example,

E[X^2] = \frac{d^2}{dt^2}\phi(t)\Big|_{t=0} = \frac{2\lambda}{(\lambda - t)^3}\Big|_{t=0} = \frac{2}{\lambda^2}

Consequently,

\operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
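These moment calculations can be reproduced symbolically; the sketch below (assuming SymPy is installed) keeps the rate λ generic.

```python
import sympy as sp

x, t, lam = sp.symbols('x t lam', positive=True)
f = lam * sp.exp(-lam * x)                       # exponential density on [0, oo)

mean = sp.integrate(x * f, (x, 0, sp.oo))        # E[X]
second = sp.integrate(x**2 * f, (x, 0, sp.oo))   # E[X^2]
mgf = sp.integrate(sp.exp(t * x) * f, (x, 0, sp.oo), conds='none')  # phi(t), valid for t < lam

print(mean)                                # 1/lam
print(second)                              # 2/lam**2
print(sp.simplify(second - mean**2))       # Var(X) = lam**(-2)
print(sp.simplify(mgf))                    # simplifies to lam/(lam - t)
```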

Example 5.1 Exponential Random Variables and Expected Discounted Returns

Suppose that you are receiving rewards at randomly changing rates continuously throughout time. Let R(x) denote the random rate at which you are receiving rewards at time x. For a value α > 0, called the discount rate, the quantity

R = \int_{0}^{\infty} e^{-\alpha x} R(x)\,dx

represents the total discounted reward. (In certain applications, α is a continuously compounded interest rate, and R is the present value of the infinite flow of rewards.) Whereas

E[R] = E\left[\int_{0}^{\infty} e^{-\alpha x} R(x)\,dx\right] = \int_{0}^{\infty} e^{-\alpha x} E[R(x)]\,dx

is the expected total discounted reward, we will show that it is also equal to the expected total reward earned up to an exponentially distributed random time with rate α.

Let T be an exponential random variable with rate α that is independent of all the random variables R ( x ) . We want to argue that

\int_{0}^{\infty} e^{-\alpha x} E[R(x)]\,dx = E\left[\int_{0}^{T} R(x)\,dx\right]

To show this, define for each x ≥ 0 a random variable I(x) by

I(x) = \begin{cases} 1, & \text{if } x \le T \\ 0, & \text{if } x > T \end{cases}

and note that

\int_{0}^{T} R(x)\,dx = \int_{0}^{\infty} R(x) I(x)\,dx

Thus,

E\left[\int_{0}^{T} R(x)\,dx\right] = E\left[\int_{0}^{\infty} R(x) I(x)\,dx\right]
= \int_{0}^{\infty} E[R(x) I(x)]\,dx
= \int_{0}^{\infty} E[R(x)]\,E[I(x)]\,dx \quad \text{(by independence)}
= \int_{0}^{\infty} E[R(x)]\,P\{T \ge x\}\,dx
= \int_{0}^{\infty} e^{-\alpha x} E[R(x)]\,dx

Therefore, the expected total discounted reward is equal to the expected total (undiscounted) reward earned by a random time that is exponentially distributed with a rate equal to the discount factor.  
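The identity can be illustrated numerically. In the sketch below (NumPy and SciPy assumed), R(x) = 2 + cos(x) is a hypothetical deterministic reward rate chosen only for illustration, with α = 0.5; the discounted integral and the average reward collected up to an independent exponential time T with rate α agree.

```python
import numpy as np
from scipy import integrate

alpha = 0.5
R = lambda x: 2.0 + np.cos(x)              # illustrative (deterministic) reward rate

# Expected total discounted reward: integral of exp(-alpha*x) * R(x) over [0, inf)
lhs, _ = integrate.quad(lambda x: np.exp(-alpha * x) * R(x), 0, np.inf)

# Expected undiscounted reward up to T ~ exponential(rate alpha):
# integral_0^T (2 + cos x) dx = 2*T + sin(T), averaged over many draws of T
rng = np.random.default_rng(1)
T = rng.exponential(scale=1.0 / alpha, size=1_000_000)
rhs = np.mean(2.0 * T + np.sin(T))

print(lhs, rhs)                            # both are close to 4.4
```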

URL: https://www.sciencedirect.com/science/article/pii/B978012814346900010X

Operations on a Single Random Variable

Scott L. Miller , Donald Childers , in Probability and Random Processes (Second Edition), 2012

4.7 Characteristic Functions

In this section, we introduce the concept of a characteristic function. The characteristic function of a random variable is closely related to the Fourier transform of the PDF of that random variable. Thus, the characteristic function provides a sort of "frequency domain" representation of a random variable, although in this context there is no connection between our frequency variable ω and any physical frequency. In studies of deterministic signals, it was found that the use of Fourier transforms greatly simplified many problems, especially those involving convolutions. We will see in future chapters the need for performing convolution operations on PDFs of random variables, and hence frequency domain tools will become quite useful. Furthermore, we will find that characteristic functions have many other uses. For example, the characteristic function is quite useful for finding moments of a random variable. In addition to the characteristic function, two other related functions, namely, the moment-generating function (analogous to the Laplace transform) and the probability-generating function (analogous to the z-transform), will also be studied in the following sections.

Definition 4.7: The characteristic function of a random variable, X, is given by

(4.36)  \Phi_X(\omega) = E[e^{j\omega X}] = \int_{-\infty}^{\infty} e^{j\omega x} f_X(x)\,dx.

Note the similarity between this integral and the Fourier transform. In most of the electrical engineering literature, the Fourier transform of the function fX (x) would be Φ(−ω). Given this relationship between the PDF and the characteristic function, it should be clear that one can get the PDF of a random variable from its characteristic function through an inverse Fourier transform operation:

(4.37)  f_X(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-j\omega x}\,\Phi_X(\omega)\,d\omega.

The characteristic functions associated with various random variables can be easily found using tables of commonly used Fourier transforms, but one must be careful since the Fourier integral used in Equation (4.36) may be different from the definition used to generate common tables of Fourier transforms. In addition, various properties of Fourier transforms can also be used to help calculate characteristic functions as shown in the following example.

Example 4.18

An exponential random variable has a PDF given by fX (x) = exp(−x)u(x). Its characteristic function is found to be

\Phi_X(\omega) = \int_{-\infty}^{\infty} e^{j\omega x} f_X(x)\,dx = \int_{0}^{\infty} e^{j\omega x} e^{-x}\,dx = \left.\frac{-e^{-(1 - j\omega)x}}{1 - j\omega}\right|_{0}^{\infty} = \frac{1}{1 - j\omega}.

This result assumes that ω is a real quantity. Now suppose another random variable Y has a PDF given by fY (y) = a exp(−ay)u(y). Note that fY (y) = afX (ay), thus using the scaling property of Fourier transforms, the characteristic function associated with the random variable Y is given by

\Phi_Y(\omega) = a\,\frac{1}{|a|}\,\Phi_X\!\left(\frac{\omega}{a}\right) = \frac{1}{1 - j\omega/a} = \frac{a}{a - j\omega},

assuming a is a positive constant (which it must be for Y to have a valid PDF). Finally, suppose that Z has a PDF given by fZ(z) = a exp(−a(z − b))u(z − b). Since fZ(z) = fY(z − b), the shifting property of Fourier transforms can be used to help find the characteristic function associated with the random variable Z:

\Phi_Z(\omega) = \Phi_Y(\omega)\,e^{j\omega b} = \frac{a\,e^{j\omega b}}{a - j\omega}.
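These closed forms are easy to spot-check numerically. The sketch below (NumPy and SciPy assumed; a = 2, b = 1, and ω = 1.5 are arbitrary illustrative values) integrates the PDFs directly and compares the results with the expressions above.

```python
import numpy as np
from scipy import integrate

a, b, omega = 2.0, 1.0, 1.5

def cf_numeric(pdf, omega, lo):
    # E[exp(j*omega*X)] by direct numerical integration of the PDF
    re, _ = integrate.quad(lambda x: np.cos(omega * x) * pdf(x), lo, np.inf)
    im, _ = integrate.quad(lambda x: np.sin(omega * x) * pdf(x), lo, np.inf)
    return re + 1j * im

f_Y = lambda y: a * np.exp(-a * y)             # f_Y(y) = a*exp(-a*y), y >= 0
f_Z = lambda z: a * np.exp(-a * (z - b))       # shifted version, z >= b

print(cf_numeric(f_Y, omega, 0.0), a / (a - 1j * omega))
print(cf_numeric(f_Z, omega, b), a * np.exp(1j * omega * b) / (a - 1j * omega))
```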

The next example demonstrates that the characteristic function can also be computed for discrete random variables. In Section 4.8, the probability-generating function will be introduced which is preferred by some when dealing with discrete random variables.

Example 4.19

A binomial random variable has a PDF which can be expressed as

f_X(x) = \sum_{k=0}^{n}\binom{n}{k}\,p^k (1-p)^{n-k}\,\delta(x - k).

Its characteristic function is computed as follows:

\Phi_X(\omega) = \int_{-\infty}^{\infty} e^{j\omega x}\left(\sum_{k=0}^{n}\binom{n}{k} p^k (1-p)^{n-k}\,\delta(x-k)\right)dx
= \sum_{k=0}^{n}\binom{n}{k} p^k (1-p)^{n-k}\int_{-\infty}^{\infty}\delta(x-k)\,e^{j\omega x}\,dx
= \sum_{k=0}^{n}\binom{n}{k} p^k (1-p)^{n-k}\,e^{j\omega k}
= \sum_{k=0}^{n}\binom{n}{k}\left(p\,e^{j\omega}\right)^k (1-p)^{n-k}
= \left(1 - p + p\,e^{j\omega}\right)^n.
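The closed form can be checked against the defining sum; a short sketch (n = 8, p = 0.3, and ω = 0.7 are illustrative values, NumPy assumed):

```python
import numpy as np
from math import comb

n, p, omega = 8, 0.3, 0.7

# Direct evaluation of E[exp(j*omega*X)] as a finite sum over the binomial PMF
direct = sum(comb(n, k) * p**k * (1 - p)**(n - k) * np.exp(1j * omega * k)
             for k in range(n + 1))

closed_form = (1 - p + p * np.exp(1j * omega))**n
print(direct, closed_form)                 # the two values agree
```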

Since the Gaussian random variable plays such an important role in so many studies, we derive its characteristic function in Example 4.20. We recommend that the student commit the result of this example to memory. The techniques used to arrive at this result are also important and should be carefully understood.

Example 4.20

For a standard normal random variable, the characteristic function can be found as follows:

\Phi_X(\omega) = \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,e^{j\omega x}\,dx = \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2 - 2j\omega x}{2}\right)dx.

To evaluate this integral, we complete the square in the exponent.

\Phi_X(\omega) = \exp\!\left(-\frac{\omega^2}{2}\right)\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2 - 2j\omega x - \omega^2}{2}\right)dx = \exp\!\left(-\frac{\omega^2}{2}\right)\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{(x - j\omega)^2}{2}\right)dx.

The integrand in the above expression looks like the properly normalized PDF of a Gaussian random variable, and since the integral is over all values of x, the integral must be unity. However, close examination of the integrand reveals that the "mean" of this Gaussian integrand is complex. It is left to the student to rigorously verify that this integral still evaluates to unity even though the integrand is not truly a Gaussian PDF (since it is a complex function and hence not a PDF at all). The resulting characteristic function is then

\Phi_X(\omega) = \exp\!\left(-\frac{\omega^2}{2}\right).

For a Gaussian random variable whose mean is not zero or whose standard deviation is not unity (or both), the shifting and scaling properties of Fourier transforms can be used to show that

f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/(2\sigma^2)} \quad \Longleftrightarrow \quad \Phi_X(\omega) = \exp\!\left(j\mu\omega - \frac{\omega^2\sigma^2}{2}\right).
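This transform pair can also be verified numerically; the sketch below (μ = 1 and σ = 2 chosen only for illustration, NumPy and SciPy assumed) integrates the Gaussian PDF directly.

```python
import numpy as np
from scipy import integrate

mu, sigma, omega = 1.0, 2.0, 0.8

pdf = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

re, _ = integrate.quad(lambda x: np.cos(omega * x) * pdf(x), -np.inf, np.inf)
im, _ = integrate.quad(lambda x: np.sin(omega * x) * pdf(x), -np.inf, np.inf)

numeric = re + 1j * im
closed_form = np.exp(1j * mu * omega - omega**2 * sigma**2 / 2)
print(numeric, closed_form)                # the two values agree closely
```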

Theorem 4.3: For any random variable whose characteristic function is differentiable at ω = 0,

(4.38)  E[X] = -j\,\frac{d}{d\omega}\Phi_X(\omega)\Big|_{\omega=0}.

Proof: The proof follows directly from the fact that the expectation and differentiation operations are both linear and consequently the order of these operations can be exchanged.

\frac{d}{d\omega}\Phi_X(\omega) = \frac{d}{d\omega}\left(E\!\left[e^{j\omega X}\right]\right) = E\!\left[\frac{d}{d\omega}\left(e^{j\omega X}\right)\right] = E\!\left[jX e^{j\omega X}\right] = jE\!\left[X e^{j\omega X}\right].

Multiplying both sides by −j and evaluating at ω = 0 produces the desired result.

Theorem 4.3 demonstrates a very powerful use of the characteristic function. Once the characteristic function of a random variable has been found, it is generally a very straightforward thing to produce the mean of the random variable. Furthermore, by taking the kth derivative of the characteristic function and evaluating at ω = 0, an expression proportional to the kth moment of the random variable is produced. In particular,

(4.39)  E[X^k] = (-j)^k\,\frac{d^k}{d\omega^k}\Phi_X(\omega)\Big|_{\omega=0}.

Hence, the characteristic function represents a convenient tool to easily determine the moments of a random variable.

Example 4.21

Consider the exponential random variable of Example 4.18 where fY (y) = a exp(−ay)u(y). The characteristic function was found to be

\Phi_Y(\omega) = \frac{a}{a - j\omega}.

The derivative of the characteristic function is

\frac{d}{d\omega}\Phi_Y(\omega) = \frac{ja}{(a - j\omega)^2},

and thus the first moment of Y is

E[Y] = -j\,\frac{d}{d\omega}\Phi_Y(\omega)\Big|_{\omega=0} = \frac{a}{(a - j\omega)^2}\bigg|_{\omega=0} = \frac{1}{a}.

For this example, it is not difficult to show that the kth derivative of the characteristic function is

\frac{d^k}{d\omega^k}\Phi_Y(\omega) = \frac{j^k\,k!\,a}{(a - j\omega)^{k+1}},

and from this, the kth moment of the random variable is found to be

E[Y^k] = (-j)^k\,\frac{d^k}{d\omega^k}\Phi_Y(\omega)\Big|_{\omega=0} = \frac{k!\,a}{(a - j\omega)^{k+1}}\bigg|_{\omega=0} = \frac{k!}{a^k}.
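The same derivative manipulations can be reproduced symbolically; a sketch with SymPy (the rate a is kept symbolic, and only the first three moments are printed):

```python
import sympy as sp

a, w = sp.symbols('a omega', positive=True)
Phi = a / (a - sp.I * w)                   # characteristic function of Y ~ exponential(a)

for k in (1, 2, 3):
    moment = (-sp.I)**k * sp.diff(Phi, w, k).subs(w, 0)
    print(k, sp.simplify(moment))          # 1/a, 2/a**2, 6/a**3, i.e., k!/a**k
```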

For random variables that have a more complicated characteristic function, evaluating the kth derivative in general may not be an easy task. However, Equation (4.39) only calls for the kth derivative evaluated at a single point (ω = 0), which can be extracted from the Taylor series expansion of the characteristic function. To see this, note that from Taylor's theorem, the characteristic function can be expanded in a power series as

(4.40)  \Phi_X(\omega) = \sum_{k=0}^{\infty}\frac{1}{k!}\left(\frac{d^k}{d\omega^k}\Phi_X(\omega)\Big|_{\omega=0}\right)\omega^k.

If one can obtain a power series expansion of the characteristic function, then the required derivatives are proportional to the coefficients of the power series. Specifically, suppose an expansion of the form

(4.41)  \Phi_X(\omega) = \sum_{k=0}^{\infty}\varphi_k\,\omega^k

is obtained. Then the derivatives of the characteristic function are given by

(4.42)  \frac{d^k}{d\omega^k}\Phi_X(\omega)\Big|_{\omega=0} = k!\,\varphi_k.

The moments of the random variable are then given by

(4.43)  E[X^k] = (-j)^k\,k!\,\varphi_k.

This procedure is illustrated using a Gaussian random variable in the next example.

Example 4.22

Consider a Gaussian random variable with a mean of μ = 0 and variance σ2. Using the result of Example 4.20, the characteristic function is ΦX(ω) = exp(−ω2σ2/2). Using the well-known Taylor series expansion of the exponential function, the characteristic function is expressed as

\Phi_X(\omega) = \sum_{n=0}^{\infty}\frac{\left(-\omega^2\sigma^2/2\right)^n}{n!} = \sum_{n=0}^{\infty}\frac{(-1)^n\,\sigma^{2n}}{2^n\,n!}\,\omega^{2n}.

The coefficients of the general power series as expressed in Equation (4.41) are given by

\varphi_k = \begin{cases} \dfrac{j^k\,(\sigma/\sqrt{2})^k}{(k/2)!}, & k \text{ even}, \\ 0, & k \text{ odd}. \end{cases}

Hence, the moments of the zero-mean Gaussian random variable are

E[X^k] = \begin{cases} \dfrac{k!}{(k/2)!}\left(\dfrac{\sigma}{\sqrt{2}}\right)^{k}, & k \text{ even}, \\ 0, & k \text{ odd}. \end{cases}

As expected, E[X 0] = 1, E[X] = 0 (since it was specified that μ = 0), and E[X 2] = σ2 (since in the case of zero-mean variables, the second moment and variance are one and the same). Now, we also see that E[X 3] = 0 (as are all odd moments), E[X 4] = 3σ4, E[X 6] = 15σ6, and so on. We can also conclude from this that for Gaussian random variables, the coefficient of skewness is c s = 0 while the coefficient of kurtosis is c k = 3.
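The even-moment formula is easy to sanity-check by simulation (σ = 1.5 is an arbitrary illustrative choice; NumPy assumed):

```python
import numpy as np
from math import factorial

sigma = 1.5
rng = np.random.default_rng(2)
x = rng.normal(0.0, sigma, size=2_000_000)

for k in (2, 4, 6):
    empirical = np.mean(x**k)
    theory = factorial(k) / factorial(k // 2) * (sigma / np.sqrt(2))**k
    print(k, empirical, theory)            # e.g., k = 4 gives values near 3*sigma**4
```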

In many cases of interest, the characteristic function has an exponential form. The Gaussian random variable is a typical example. In such cases, it is convenient to deal with the natural logarithm of the characteristic function.

Definition 4.8: In general, we can write a series expansion of ln[Φ X (ω)] as

(4.44)  \ln[\Phi_X(\omega)] = \sum_{n=1}^{\infty}\lambda_n\,\frac{(j\omega)^n}{n!},

where the coefficients, λ n , are called the cumulants and are given as

(4.45)  \lambda_n = \frac{d^n}{d(j\omega)^n}\Big\{\ln[\Phi_X(\omega)]\Big\}\Big|_{\omega=0}, \quad n = 1, 2, 3, \ldots

The cumulants are related to the moments of the random variable. By taking the derivatives specified in Equation (4.45) we obtain

(4.46)  \lambda_1 = \mu_X,

(4.47)  \lambda_2 = E[X^2] - \mu_X^2 = \sigma_X^2,

(4.48)  \lambda_3 = E[X^3] - 3\mu_X E[X^2] + 2\mu_X^3 = E\!\left[(X - \mu_X)^3\right].

Thus, λ1 is the mean, λ2 is the second central moment (or the variance), and λ3 is the third central moment. However, higher-order cumulants are not as simply related to the central moments.

URL: https://www.sciencedirect.com/science/article/pii/B9780123869814500072


Random Variables, Distributions, and Density Functions

Scott L. Miller , Donald Childers , in Probability and Random Processes (Second Edition), 2012

3.4.7 Rayleigh Random Variable

A Rayleigh random variable, like the exponential random variable, has a one-sided PDF. The functional form of the PDF and CDF is given (for any σ > 0) by

(3.28a)  f_X(x) = \frac{x}{\sigma^2}\exp\!\left(-\frac{x^2}{2\sigma^2}\right)u(x),

(3.28b)  F_X(x) = \left(1 - \exp\!\left(-\frac{x^2}{2\sigma^2}\right)\right)u(x).

Plots of these functions are shown in Figure 3.11. The Rayleigh distribution is described by a single parameter, σ 2, which is related to the width of the Rayleigh PDF. In this case, the parameter σ 2 is not to be interpreted as the variance of the Rayleigh random variable. It will be shown later that the Rayleigh distribution arises when studying the magnitude of a complex number whose real and imaginary parts both follow a zero-mean Gaussian distribution. The Rayleigh distribution arises often in the study of noncoherent communication systems and also in the study of wireless communication channels, where the phenomenon known as fading is often modeled using Rayleigh random variables.

Figure 3.11. (a) Probability density function and (b) cumulative distribution function of a Rayleigh random variable, σ 2 = ½.
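The Gaussian-magnitude interpretation mentioned above is easy to illustrate directly; a sketch with σ² = 1/2, the value used in Figure 3.11 (NumPy assumed):

```python
import numpy as np

sigma2 = 0.5                               # parameter value used in Figure 3.11
sigma = np.sqrt(sigma2)
rng = np.random.default_rng(3)

re = rng.normal(0.0, sigma, size=500_000)  # real part ~ N(0, sigma^2)
im = rng.normal(0.0, sigma, size=500_000)  # imaginary part ~ N(0, sigma^2)
r = np.abs(re + 1j * im)                   # the magnitude is Rayleigh distributed

# Compare the empirical CDF at a few points with Equation (3.28b)
x = np.array([0.5, 1.0, 2.0])
empirical = np.array([(r <= xi).mean() for xi in x])
theory = 1.0 - np.exp(-x**2 / (2 * sigma2))
print(empirical)
print(theory)                              # close agreement
```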

URL: https://www.sciencedirect.com/science/article/pii/B9780123869814500060

Basic Concepts in Probability

Oliver C. Ibe , in Markov Processes for Stochastic Modeling (Second Edition), 2013

1.8.6 The Exponential Distribution

A continuous random variable X is defined to be an exponential random variable (or X has an exponential distribution) if for some parameter λ>0 its PDF is given by

f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}

The CDF, mean, and variance of X, and the s-transform of its PDF are given by

F_X(x) = P[X \le x] = 1 - e^{-\lambda x}
E[X] = \frac{1}{\lambda}
E[X^2] = \frac{2}{\lambda^2}
\sigma_X^2 = E[X^2] - (E[X])^2 = \frac{1}{\lambda^2}
M_X(s) = \frac{\lambda}{s + \lambda}

Like the geometric distribution, the exponential distribution possesses the forgetfulness property. Thus, if we consider the occurrence of an event governed by the exponential distribution as an arrival, then given that no arrival has occurred up to time t, the time until the next arrival is exponentially distributed with mean 1/λ. In particular, it can be shown, as in Ibe (2005), that

f_{X \mid X>t}(x \mid X>t) = \lambda e^{-\lambda(x - t)}, \quad x > t
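The forgetfulness property shows up clearly in simulation; the sketch below (λ = 0.2 and t = 3 are illustrative values, NumPy assumed) conditions on no arrival by time t and checks that the residual time X − t is again exponential with mean 1/λ.

```python
import numpy as np

lam, t = 0.2, 3.0
rng = np.random.default_rng(4)
x = rng.exponential(scale=1.0 / lam, size=2_000_000)

excess = x[x > t] - t                      # residual life given X > t
print(excess.mean(), 1.0 / lam)            # both approximately 5.0
print((excess > 2.0).mean(), np.exp(-lam * 2.0))   # tail probabilities also match
```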

URL: https://www.sciencedirect.com/science/article/pii/B9780124077959000013

Functions of Random Variables

Oliver C. Ibe , in Fundamentals of Applied Probability and Random Processes (Second Edition), 2014

6.4.5 The Spare Parts Problem

From Figure 6.2 we can see that finding the sum of independent random variables is equivalent to finding the lifetime of a system that achieves continuous operation by permitting instantaneous replacement of a component with a spare part at the time of the component's failure. One interesting issue is to find the probability that the life of the system exceeds a given value. For the case where only one spare part is available, we are basically dealing with the sum of two random variables. For the case where we have n − 1 spare parts, we are dealing with the sum of n random variables. For the case of n = 2, we have that if the lifetime of the primary component is X and the lifetime of the spare component is Y, where X and Y are independent, then the lifetime of the system W and its PDF are given by

W = X + Y
f_W(w) = \int_{0}^{w} f_X(w - y)\,f_Y(y)\,dy

Thus, if we define the reliability function of the system by R W (w), the probability that the lifetime of the system exceeds the value w 0 is given by

P[W > w_0] = \int_{w_0}^{\infty} f_W(w)\,dw = 1 - F_W(w_0) = R_W(w_0)

If it is desired that P[W > w_0] ≥ φ, where 0 ≤ φ ≤ 1, then we may be required to find the parameters of X and Y that are necessary to achieve this goal. For example, if X and Y are independent and identically distributed exponential random variables with mean 1/λ, we can find the smallest mean value of the random variables that can achieve this goal.

For the case of n − 1 spare parts, the lifetime of the system U is given by

U = X_1 + X_2 + \cdots + X_n

where X_k is the lifetime of the kth component. If we assume that the X_k are independent, the PDF of U is given by the following n-fold convolution integral:

f_U(u) = \int_{0}^{u}\int_{0}^{x_1}\cdots\int_{0}^{x_{n-1}} f_{X_1}(u - x_1)\,f_{X_2}(x_1 - x_2)\cdots f_{X_{n-1}}(x_{n-1} - x_n)\,f_{X_n}(x_n)\,dx_n\,dx_{n-1}\cdots dx_1

For the special case when the X k are identically distributed exponential random variables with mean 1/λ, U becomes an nth order Erlang random variable with the PDF, CDF, and reliability function given by

(6.12)
f_U(u) = \begin{cases} \dfrac{\lambda^n u^{n-1} e^{-\lambda u}}{(n-1)!}, & n = 1, 2, \ldots;\ u \ge 0 \\ 0, & u < 0 \end{cases}
F_U(u) = 1 - \sum_{k=0}^{n-1}\frac{(\lambda u)^k e^{-\lambda u}}{k!}
R_U(u) = 1 - F_U(u) = \sum_{k=0}^{n-1}\frac{(\lambda u)^k e^{-\lambda u}}{k!}
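Equation (6.12) translates directly into code; a small sketch of the reliability function R_U(u) (NumPy assumed), checked against Example 6.12 below:

```python
import numpy as np
from math import factorial

def erlang_reliability(u, n, lam):
    # P[U > u] for U, the sum of n i.i.d. exponential(lam) lifetimes
    return sum((lam * u)**k * np.exp(-lam * u) / factorial(k) for k in range(n))

# Example 6.12: one component plus one spare (n = 2), mean lifetime 50 hours each
print(erlang_reliability(100, n=2, lam=1 / 50))    # 3*exp(-2), approximately 0.406
```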

Example 6.12

A system consists of one component whose lifetime is exponentially distributed with a mean of 50 hours. When the component fails, it is immediately replaced by a spare component whose lifetime is independent and identically distributed as that of the original component without the system suffering a downtime.

a.

What is the probability that the system has not failed after 100 hours of operation?

b.

If the mean lifetime of the component and its spare is increased by 10%, how does that affect the probability that the system exceeds a lifetime of 100 hours?

Solution:

Let X be a random variable that denotes the lifetime of the component and let U be the random variable that denotes the lifetime of the system. Then, U is an Erlang-2 random variable whose PDF, CDF, and reliability function are given by

f_U(u) = \begin{cases} \lambda^2 u\,e^{-\lambda u}, & u \ge 0 \\ 0, & u < 0 \end{cases}
F_U(u) = 1 - \sum_{k=0}^{1}\frac{(\lambda u)^k e^{-\lambda u}}{k!} = 1 - e^{-\lambda u}\{1 + \lambda u\}
R_U(u) = 1 - F_U(u) = \sum_{k=0}^{1}\frac{(\lambda u)^k e^{-\lambda u}}{k!} = e^{-\lambda u}\{1 + \lambda u\}

a.

Since 1/λ  =   50, we have that

P[U > 100] = R_U(100) = e^{-100/50}\left(1 + \frac{100}{50}\right) = 3e^{-2} = 0.4060

b.

When we increase the mean lifetime of the component by 10%, we obtain 1/λ  =   50(1   +   0.1)   =   55. Thus, with the new λu  =   100/55, the corresponding value of R U (100) is

R_U(100) = e^{-100/55}\left(1 + \frac{100}{55}\right) = 2.8182\,e^{-1.82} = 0.4574

That is, the probability that the system lifetime exceeds 100 hours increases by approximately 13%.

Example 6.13

The time to failure of a component of a system is exponentially distributed with a mean of 100 hours. If the component fails, it is immediately replaced by an identical spare component whose time to failure is independent of that of the previous one and the system experiences no downtime in the process of component replacement. What is the smallest number of spare parts that must be used to guarantee continuous operation of the system for at least 300 hours with a probability of at least 0.95?

Solution:

Let X be the random variable that denotes the lifetime of a component, and let the number of spare parts be n − 1. Let U be the random variable that denotes the lifetime of the system. Then U = X_1 + X_2 + ⋯ + X_n, which is an Erlang-n random variable whose reliability function is given by

R_U(u) = 1 - F_U(u) = \sum_{k=0}^{n-1}\frac{(\lambda u)^k e^{-\lambda u}}{k!} = e^{-\lambda u}\left[1 + \lambda u + \frac{(\lambda u)^2}{2!} + \cdots + \frac{(\lambda u)^{n-1}}{(n-1)!}\right] \ge 0.95

Since 1/λ  =   100, we have that

R_U(300) = e^{-3}\left[1 + 3 + \frac{3^2}{2!} + \frac{3^3}{3!} + \frac{3^4}{4!} + \cdots + \frac{3^{n-1}}{(n-1)!}\right] \ge 0.95

The following table shows the values of R U (300) for different values of n:

n − 1    R_U(300)
0 0.0498
1 0.1991
2 0.4232
3 0.6472
4 0.8153
5 0.9161
6 0.9665

Thus, we see that with n − 1 = 5 we cannot provide the required probability of operation, while with n − 1 = 6 we can. This means that we need 6 spare components to achieve the goal.
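The table can be reproduced in a few lines (NumPy assumed); the loop below evaluates the same reliability sum for each candidate number of spares.

```python
import numpy as np
from math import factorial

lam, u = 1 / 100, 300                      # mean lifetime 100 h, horizon 300 h (lam*u = 3)
for spares in range(7):                    # spares = n - 1
    n = spares + 1
    r = sum((lam * u)**k * np.exp(-lam * u) / factorial(k) for k in range(n))
    print(spares, round(r, 4))             # matches the table; first value >= 0.95 at n - 1 = 6
```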

URL: https://www.sciencedirect.com/science/article/pii/B9780128008522000067

Simulation Techniques

Scott L. Miller , Donald Childers , in Probability and Random Processes (Second Edition), 2012

12.1.3 Generation of Random Numbers from a Specified Distribution

Quite often, we are interested in generating random variables that obey some distribution other than a uniform distribution. In this case, it is generally a fairly simple task to transform a uniform random number generator into one that follows some other distribution. Consider forming a monotonic increasing transformation g( ) on a random variable X to form a new random variable Y. From the results of Chapter 4, the PDFs of the random variables involved are related by

f_Y(y) = \frac{f_X(x)}{dg/dx}\bigg|_{x = g^{-1}(y)}.

Given an arbitrary PDF, f_X(x), the transformation Y = g(X) will produce a uniform random variable Y if dg/dx = f_X(x) or, equivalently, g(x) = F_X(x). Viewing this result in reverse, if X is uniformly distributed over (0, 1) and we want to create a new random variable Y with a specified distribution, F_Y(y), the transformation Y = F_Y^{-1}(X) will do the job.

Example 12.3

Suppose we want to transform a uniform random variable into an exponential random variable with a PDF of the form

f_Y(y) = a\exp(-ay)\,u(y).

The corresponding CDF is

F_Y(y) = \left[1 - \exp(-ay)\right]u(y).

Therefore, to transform a uniform random variable into an exponential random variable, we can use the transformation

Y = F_Y^{-1}(X) = -\frac{\ln(1 - X)}{a}.

Note that if X is uniformly distributed over (0, 1), then 1 − X will be uniformly distributed as well so that the slightly simpler transformation

Y = -\frac{\ln(X)}{a}

will also work.
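Example 12.3 in code form; a brief sketch (a = 2 is an illustrative rate, NumPy assumed) that applies the inverse-CDF transformation to uniform samples and checks the resulting mean and a tail probability.

```python
import numpy as np

a = 2.0
rng = np.random.default_rng(5)
x = rng.uniform(size=1_000_000)            # X ~ uniform on (0, 1)

y = -np.log(x) / a                         # Y = -ln(X)/a is exponential with parameter a
print(y.mean(), 1.0 / a)                   # both approximately 0.5
print((y > 1.0).mean(), np.exp(-a))        # P[Y > 1] = exp(-a)
```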

This approach for generation of random variables works well provided that the CDF of the desired distribution is invertible. One notable exception where this approach will be difficult is the Gaussian random variable. Suppose, for example, we wanted to transform a uniform random variable, X, into a standard normal random variable, Y. The CDF in this case is the complement of a Q-function, F_Y(y) = 1 − Q(y). The inverse of this function would then provide the appropriate transformation, y = Q^{-1}(1 − x), or, as with the previous example, we could simplify this to y = Q^{-1}(x). The problem here lies with the inverse Q-function, which cannot be expressed in closed form. One could devise efficient numerical routines to compute the inverse Q-function, but fortunately there is an easier approach.

An efficient method to generate Gaussian random variables from uniform random variables is based on the following 2 × 2 transformation. Let X 1 and X 2 be two independent uniform random variables (over the interval (0, 1)). Then if two new random variables, Y1 and Y2 are created according to

(12.5a)  Y_1 = \sqrt{-2\ln(X_1)}\,\cos(2\pi X_2),

(12.5b)  Y_2 = \sqrt{-2\ln(X_1)}\,\sin(2\pi X_2),

then Y_1 and Y_2 will be independent standard normal random variables (see Example 5.24). This famous result is known as the Box–Muller transformation and is commonly used to generate Gaussian random variables. If a pair of Gaussian random variables is not needed, one of the two can be discarded. This method is particularly convenient for generating complex Gaussian random variables since it naturally generates pairs of independent Gaussian random variables. Note that if Gaussian random variables are needed with different means or variances, this can easily be accomplished through an appropriate linear transformation. That is, if Y ∼ N(0, 1), then Z = σY + μ will produce Z ∼ N(μ, σ^2).
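A minimal Box–Muller sketch following Equations (12.5a) and (12.5b), including the linear transformation to a general mean and variance (NumPy assumed; the function name and defaults are illustrative):

```python
import numpy as np

def box_muller(n, mu=0.0, sigma=1.0, seed=None):
    # Generate n Gaussian samples from pairs of uniform random numbers
    rng = np.random.default_rng(seed)
    m = (n + 1) // 2
    x1 = rng.uniform(size=m)
    x2 = rng.uniform(size=m)
    y1 = np.sqrt(-2.0 * np.log(x1)) * np.cos(2.0 * np.pi * x2)   # Equation (12.5a)
    y2 = np.sqrt(-2.0 * np.log(x1)) * np.sin(2.0 * np.pi * x2)   # Equation (12.5b)
    y = np.concatenate([y1, y2])[:n]       # standard normal samples
    return mu + sigma * y                  # Z = sigma*Y + mu gives N(mu, sigma^2)

z = box_muller(1_000_000, mu=3.0, sigma=2.0, seed=6)
print(z.mean(), z.std())                   # approximately 3.0 and 2.0
```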

URL: https://www.sciencedirect.com/science/article/pii/B9780123869814500151

Simulation

Sheldon M. Ross , in Introduction to Probability Models (Tenth Edition), 2010

Method 1. Sampling a Poisson Process

To simulate the first T time units of a nonhomogeneous Poisson process with intensity function λ(t), let λ be such that

\lambda(t) \le \lambda \quad \text{for all } t \le T

Now, as shown in Chapter 5, such a nonhomogeneous Poisson process can be generated by a random selection of the event times of a Poisson process having rate λ. That is, if an event of a Poisson process with rate λ that occurs at time t is counted (independently of what has transpired previously) with probability λ(t)/λ, then the process of counted events is a nonhomogeneous Poisson process with intensity function λ(t), 0 ≤ t ≤ T. Hence, by simulating a Poisson process and then randomly counting its events, we can generate the desired nonhomogeneous Poisson process. We thus have the following procedure:

Generate independent random variables X 1, U 1, X 2, U 2, … where the Xi are exponential with rate λ and the Ui are random numbers, stopping at

N = \min\left\{ n : \sum_{i=1}^{n} X_i > T \right\}

Now let, for j = 1,…,N − 1,

I_j = \begin{cases} 1, & \text{if } U_j \le \lambda\!\left(\sum_{i=1}^{j} X_i\right)\!\big/\lambda \\ 0, & \text{otherwise} \end{cases}

and set

J = { j : I j = 1 }

Thus, the counting process having events at the set of times \left\{\sum_{i=1}^{j} X_i : j \in J\right\} constitutes the desired process.

The foregoing procedure, referred to as the thinning algorithm (because it "thins" the homogeneous Poisson points), will clearly be most efficient, in the sense of having the fewest number of rejected event times, when λ(t) is near λ throughout the interval. Thus, an obvious improvement is to break up the interval into subintervals and then use the procedure over each subinterval. That is, determine appropriate values k, 0 < t_1 < t_2 < ⋯ < t_k < T, λ_1, …, λ_{k+1}, such that

(11.9)  \lambda(s) \le \lambda_i \quad \text{when } t_{i-1} \le s < t_i,\ i = 1, \ldots, k + 1 \quad (\text{where } t_0 = 0 \text{ and } t_{k+1} = T)

Now simulate the nonhomogeneous Poisson process over the interval (t_{i−1}, t_i) by generating exponential random variables with rate λ_i and accepting the generated event occurring at time s, s ∈ (t_{i−1}, t_i), with probability λ(s)/λ_i. Because of the memoryless property of the exponential and the fact that the rate of an exponential can be changed upon multiplication by a constant, it follows that there is no loss of efficiency in going from one subinterval to the next. In other words, if we are at t ∈ [t_{i−1}, t_i) and generate X, an exponential with rate λ_i, which is such that t + X > t_i, then we can use λ_i[X − (t_i − t)]/λ_{i+1} as the next exponential with rate λ_{i+1}. Thus, we have the following algorithm for generating the first T time units of a nonhomogeneous Poisson process with intensity function λ(s) when the relations (11.9) are satisfied; a short code sketch of the single-interval case follows the listed steps. In the algorithm, t will represent the present time and I the present interval (that is, I = i when t_{i−1} ≤ t < t_i).

Step 1:

t = 0, I = 1.

Step 2:

Generate an exponential random variable X having rate λ I .

Step 3:

If t + X < t_I, reset t = t + X, generate a random number U, and accept the event time t if U ≤ λ(t)/λ_I. Return to step 2.

Step 4:

(Step reached if t + X ≥ t_I.) Stop if I = k + 1. Otherwise, reset X = (X − (t_I − t))λ_I/λ_{I+1}. Also reset t = t_I and I = I + 1, and go to step 3.
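A compact sketch of the thinning algorithm for the single-interval case (λ(t) ≤ λ for all t ≤ T, i.e., k = 0); NumPy is assumed and the intensity function is passed in as an ordinary Python callable.

```python
import numpy as np

def thinning(lam_t, lam_bound, T, seed=None):
    # Event times of a nonhomogeneous Poisson process on (0, T] with
    # intensity lam_t(t) <= lam_bound, by thinning a rate-lam_bound process.
    rng = np.random.default_rng(seed)
    t, events = 0.0, []
    while True:
        t += rng.exponential(scale=1.0 / lam_bound)   # next candidate event time
        if t > T:
            return np.array(events)
        if rng.uniform() <= lam_t(t) / lam_bound:     # accept with probability lam(t)/lam
            events.append(t)

# Illustration: lambda(s) = 10 + s on (0, 1), bounded by lambda = 11
times = thinning(lambda s: 10.0 + s, 11.0, 1.0, seed=7)
print(len(times))                          # about 10.5 accepted events on average
```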

Suppose now that over some subinterval (t_{i−1}, t_i) we have that \underline{\lambda}_i > 0, where

(11.10)  \underline{\lambda}_i \equiv \inf\left\{\lambda(s) : t_{i-1} \le s < t_i\right\}

In such a situation, we should not use the thinning algorithm directly but rather should first simulate a Poisson process with rate \underline{\lambda}_i over the desired interval and then simulate a nonhomogeneous Poisson process with the intensity function λ(s) − \underline{\lambda}_i when s ∈ (t_{i−1}, t_i). (The final exponential generated for the Poisson process, which carries one beyond the desired boundary, need not be wasted but can be suitably transformed so as to be reusable.) The superposition (or merging) of the two processes yields the desired process over the interval. The reason for doing it this way is that it saves the need to generate uniform random variables for a Poisson distributed number, with mean \underline{\lambda}_i(t_i − t_{i−1}), of the event times. For instance, consider the case where

λ ( s ) = 10 + s , 0 < s < 1

Using the thinning method with λ = 11 would generate an expected number of 11 events each of which would require a random number to determine whether or not to accept it. On the other hand, to generate a Poisson process with rate 10 and then merge it with a generated nonhomogeneous Poisson process with rate λ(s) = s, 0 < s < 1, would yield an equally distributed number of event times but with the expected number needing to be checked to determine acceptance being equal to 1.

Another way to make the simulation of nonhomogeneous Poisson processes more efficient is to make use of superpositions. For instance, consider the process where

\lambda(t) = \begin{cases} \exp\{t^2\}, & 0 < t < 1.5 \\ \exp\{2.25\}, & 1.5 < t < 2.5 \\ \exp\{(4 - t)^2\}, & 2.5 < t < 4 \end{cases}

A plot of this intensity function is given in Figure 11.3. One way of simulating this process up to time 4 is to first generate a Poisson process with rate 1 over this interval; then generate a Poisson process with rate e − 1 over this interval, accept all events in (1, 3), and only accept an event at time t that is not contained in (1, 3) with probability [λ(t) − 1]/(e − 1); then generate a Poisson process with rate e^{2.25} − e over the interval (1, 3), accepting all event times between 1.5 and 2.5 and any event time t outside this interval with probability [λ(t) − e]/(e^{2.25} − e). The superposition of these processes is the desired nonhomogeneous Poisson process. In other words, what we have done is to break up λ(t) into the following nonnegative parts:

Figure 11.3.

λ ( t ) = λ 1 ( t ) + λ 2 ( t ) + λ 3 ( t ) , 0 < t < 4

where

\lambda_1(t) \equiv 1,
\lambda_2(t) = \begin{cases} \lambda(t) - 1, & 0 < t < 1 \\ e - 1, & 1 < t < 3 \\ \lambda(t) - 1, & 3 < t < 4 \end{cases}
\lambda_3(t) = \begin{cases} \lambda(t) - e, & 1 < t < 1.5 \\ e^{2.25} - e, & 1.5 < t < 2.5 \\ \lambda(t) - e, & 2.5 < t < 3 \\ 0, & 3 < t < 4 \end{cases}

and where the thinning algorithm (with a single interval in each case) was used to simulate the constituent nonhomogeneous processes.

URL: https://www.sciencedirect.com/science/article/pii/B9780123756862000017

Sampling Distributions

Kandethody M. Ramachandran , Chris P. Tsokos , in Mathematical Statistics with Applications in R (Second Edition), 2015

4A A Method to Obtain Random Samples from Different Distributions

Most of the statistical software packages contain a random number generator that produces approximations to random numbers from the uniform distribution U [0, 1]. To simulate the observation of any other continuous random variables, we can start with uniform random numbers and associate these to the distribution we want to simulate. For example, suppose we wish to simulate an observation from the exponential distribution

F(x) = 1 - e^{-0.5x}, \quad 0 < x < \infty.

First produce the value of y from the uniform distribution. Then solve for x from the equation

y = F(x) = 1 - e^{-0.5x}.

So x = [−ln(1 − y)]/0.5 is the corresponding value of the exponential random variable. For instance, if y = 0.67, then x = [−ln(1 − 0.67)]/0.5 = 2.2173. If we wish to simulate a sample from the distribution F, the procedure is repeated for each new observation x, using a different value of y obtained from the uniform distribution.

(a)

Simulate 10 observations of a random variable having exponential distribution with mean and standard deviation both equal to 2.

(b)

Select 1500 random samples of size n  =   10 measurements from a population with an exponential distribution with mean and standard deviation both equal to 2. Calculate sample mean for each of these 1500 samples and draw a relative frequency histogram. Based on Theorems 4.1.1 and 4.4.1, what can you conclude?

It should be noted that, in general, if Y ~ U(0, 1), then X = −ln(Y)/λ will give an exponential random variable with parameter λ. Uniform random variables can also be used to generate random variables from other distributions. For example, let the U_i be i.i.d. U[0, 1] random variables. Then,

X = -2\sum_{i=1}^{v}\ln U_i \sim \chi^2_{2v},

and

Y = -\beta\sum_{i=1}^{\alpha}\ln U_i \sim \mathrm{Gamma}(\alpha, \beta).

Of course, these transformations are useful only when v and α are integers. More efficient methods based on Monte Carlo simulations, such as MCMC methods, are discussed in Chapter 13.
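A sketch of these transformations (ν = 3 and α = 4, β = 2 are illustrative integer choices; NumPy assumed), checking the sample means against the known values 1/λ, 2ν, and αβ:

```python
import numpy as np

rng = np.random.default_rng(8)
size = 500_000

# Exponential with parameter lambda = 0.5 via the inverse-CDF transform
y = rng.uniform(size=size)
x_exp = -np.log(1.0 - y) / 0.5
print(x_exp.mean())                        # approximately 1/0.5 = 2

# Chi-square with 2*v degrees of freedom from v uniforms
v = 3
u = rng.uniform(size=(v, size))
x_chi2 = -2.0 * np.log(u).sum(axis=0)
print(x_chi2.mean())                       # approximately 2*v = 6

# Gamma(alpha, beta) from alpha uniforms
alpha, beta = 4, 2.0
u = rng.uniform(size=(alpha, size))
y_gamma = -beta * np.log(u).sum(axis=0)
print(y_gamma.mean())                      # approximately alpha*beta = 8
```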

URL: https://www.sciencedirect.com/science/article/pii/B9780124171138000047