Many computer applications rely on random number generation. For
example, if you want to write a program to simulate a poker game, you don't want
each player to get the same cards every hand. Since some programs require a
large number of random numbers, we can greatly speed up the program by using a
faster, more efficient random number generator. The method of this random number generation by linear congruential method, works
by computing each successive random number from the previous. Starting with a
seed, Xo, the linear congruential method uses the following formula:
Xi+1 = (A*Xi +
C) mod M
In his book, The Art of Computer Programming, Donald Knuth
presents several rules for maximizing the length of time before the random
number generator comes up with the same value as the seed. This is desirable
because once the random number generator comes up with the initial seed, it will
start to repeat the same sequence of random numbers (which will not be so random
since the second time around since we can predict what they will be). According
to Knuth's rules, if M is prime, we can let C be 0 and he suggests that this
variant of the line
THEOREM : (By Greenberger in 1961 )
a)
m and c are relatively prime
b)
If q is a prime number that
divides m , then q divides a-1
c)
If 4 divides m, then 4 divides a-1
The LCG tend to behave differently for c>0 and
c=0
A linear congruential generator lnc( ) generates
a sequences of integers
U0, U1
, ………….
, Uk
that
are restricted to the range 0 to m. On each call to lnc( ), you
must
give
it as argument the previous number in the sequence. It returns the following by
Uk+1
= ( a
Uk +b
)
mod ( m +
1).
where
a, b, and m can be chosen to be any positive integers. (The
operation mod n p is the integer remainder when the integer n is
divided by the integer p. This remainder must be between 0 and p-1, inclusive.)
Once you’ve chosen these three constants and the starting value 0
U you’ve completely defined the sequence that lnc( ) will
produce. If you set a, b, and m to reasonably large values (there
are rules of thumb to follow in choosing these values so as the maximize the
apparent disorder in the sequence: see Knuth, 1971), the resulting sequence
looks satisfyingly random. Of course, you want numbers between 0 and 1, not
integers between 0 and m. But you need only divide 1 k
U + by m+2 to map the
output of lnc into the desired range.
It’s
easier to see what lnc( ) is doing if we pick small values a=5, b=1,
and m=7. If 0 U is
set to 2, the resulting sequence is
2,
3, 0, 1, 6, 7, 4, 5, 2, 3, 0, 1 ….
and
we see that it repeats after the eighth term in the sequence. Of course it mustrepeat
after the eighth term or earlier since there are only eight possible values in
the series (0,1,2,…,7). Once the series repeats it must continue to cycle
endlessly since the next and later terms in the series are completely determined
by the current term. Note also that the sequence produced is a little too
uniform. There is exactly one occurrence of each possible value in each sequence
of 8 values and so, after 16, 24, 32, etc draws we will discover that the number
of 0’s, 1’s, etc. are all the same. Morever, notice that the sequence 2, 2
will never occur (and the sequence 2, 3 will occur in every cycle of 8). If we
choose poor values of a, b, and m, then the sequence may cycle
much more frequently than every m+1 terms. If we choose, a=2, b=2, m=7,
for example, and start with 2, we get
2,
6, 6, 6, 6, …..
a
most boring outcome.
Linear
congruential RNGs were once considered to be a good method of generating
pseudo-random numbers (1960’s) but are now obsolete. As the example above
suggests, they did tend to produce (after division by m+2) a series of numbers
that took on values in the unit interval that were remarkably
uniformly-distributed. They tended to fail in other important respects, notably
in the distribution of sequences of numbers. Modern pseudo-random number
generators pass tests of sequential randomness that any linear congruential
generator would flunk. They will also cycle if you wait long enough but the
length of a cycle is so great that you need never worry about repeating (given
the expected life span of the universe).
The point of this example is that we can characterize any pseudo-random number
generator in terms of the tests of randomness that it can pass. The linear
congruential generator did produce a very uniform distribution but it flunked a
simple sequential test. Note that in the sequence we computed, that ODD and EVEN
alternate! The tests that a pseudo-random generator can pass determine what it
is good for. If, for example, your application requires only a source of
uniformly-distributed number, a linear-congruential generator wouldn’t be so
bad. But if you were simulating a coin toss sequence by generating random
integers and choosing H (heads) when the integer is odd, T (tails) when it is
even, you might be very surprised to get HTHTHTHTHTH….. The pseudo-random
number generators available in modern computer languages like matlab have
passed very many tests of sequential randomness and it is very likely you’ll
never be able to tell then from a truly random sequence. But beware. Never use
a home-grown random number generator. It’s like doing your own dental work.
From this point on in the course, we drop the ‘pseudo’ in
‘pseudo-random generator and, for example, refer to the output of rand(n,m)
as realizations of a uniform(0,1) random variable.
f(
x) :

This
is just the probability that the realization of X ends up in the interval
( ] x -¥, .
Note the asymmetric brackets here. They’re intended to remind you of Eq. 1.4.
Some basic facts about probability guarantee that F(x) is non-decreasing
and that it goes from 0 to 1.
With a little bit of geometric reasoning we can show that the cdf of a U(0,1)
variable is just,

which
is plotted in Figure 3.

A
little bit of calculus gives us the cdf of the exponential distribution,

It
turns out that knowledge of the cdf completely determines a random variable just
as knowledge of the pdf did. We need to go over some parts of the definitions
carefully before we can be certain about that, but it will turn out to be true.
But,
back to the problem at hand. We want to generate an exponentially-distributed
variable given a U(0,1) variable and the key to that will be the inverse
function of the cdf over the interval (0,1), which I’ll denote as the icdf.
It’s also known as the quantile function for reasons that will become
obvious later on. The icdf for the exponential is,
![]()
Note
the suggestive choice of dummy variable, u. What would happen if we applied
the icdf of the exponential to the realization of the uniform(0,1) random
variable, U?
Let
Y= FE-1(u) = - a-1
ln
(1-u )
Then
we want to compute the cdf of Y:

when
0 x ³ and it is 0 otherwise. But
this is precisely the cdf of the exponentialrandom variable. So we have the
following recipe for generating exponential random variables with rate parameter
a: generate a U(0,1) random variable and transform it by the icdf of the
exponential.
This recipe works for any random variable if we
substitute the icdf of the random variable in question.
So we have a general recipe for computing random variables with any
distribution whose cdf can be written down in a simple closed form
Transformation of random variable distributions :
Since the
computer is capable of generating only uniform distributed random variables ,
other random variables with different distributions have to be generated from
uniform distribution using appropriate transformation functions
a)
Generation of exponential distribution :
Uniform random variables
can be converted into exponentially distributed random variables by using the
transformation
Y = - ( 1/m
) * ln(X)
Proof :
The probability density function ( pdf ) of the uniform variable X is
f (x) = 1 , 0<X<1
From the transformation equation , we get
X = e-mY
Since Y is a monotonically decreasing function of
X , we have
g(y) = f(x) * | dx/dy |
Þ
g(y) = 1 * | -me-my
| =
me-my
, Y>0
which is the pdf of an exponentially distributed random variable with
mean m
.
b)
Generation of Rayleigh distribution from exponential
:
This can be
accomplished using the transformation
Y = Ö
X
where X is an
exponentially distributed random
variable .
Proof :
The pdf of an exponential
random variable is f(x) = me-mx
, X>0
From the transformation equation , we get
X = Y2
Since Y is a monotonically decreasing function of
X , we have
g(y) = f(x) * | dx/dy |
Þ
g(y) = mexp(-my2
) * 2y = ( y / s2
) * exp( - y2 / 2s2
) where s
= 1/2m
which
is the pdf of a Rayleigh distribution with standard deviation s
.
c)
Generation of Normal
distribution from Rayleigh and
Uniform
random variables :
This
transformation equation for this conversion is
Z
= X * cos Y (
or ) Z = X * sin Y
where
X is a Rayleigh distributed random variable and Y is a uniformly distributed
random variable in (0,2p).
Proof
:
The
pdf of a Rayleigh distribution is f(x) = ( x / s2
) * exp ( -x2 / 2s2
) , X>0
The
pdf of the uniform random variable is g(y) = ( 1/2p
) , 0 < Y < 2p
The
random variables are independent of each other .
The
joint pdf is given as f(x,y) = f(x)*g(y)
= ( x/2ps2
)* exp(-x2/2s2)
Let
U = X*cosY and V = X*sinY
Þ
X = Ö
(U2 + V2) and
Y = tan-1(V/U)
Þ
d(x,y)
/ d(u,v)
= Jacobian of (x,y) w.r.t. (u,v) = 1 / Ö(
U2 + V2)
The
joint pdf of (u,v) is given as g(u,v)
= f(x,y) * | d(x,y)
/ d(u,v)
|
Þ
g(u,v) = ( 1 / 2ps2
) exp [ -(u2 + v2) / 2s2
]
The
marginal pdf of U gives f(u) = f(x*cosy) =
ò
g(u,v) dv
f(u) = ( 1 / sÖ
2p
)* exp( -u2/2s2
) since (1/sÖ
2p)
ò
exp( -v2/2s2
)dv = 1
which
is the pdf of a normally distributed random variable .
d)
Generation of Normal distribution from Uniform
distribution:
There
are two methods in which this transformation can be achieved .
i)
Box-Muller method :
In
this method , we use the formula
Y
= Ö(
-2ln U1) * cos(2pU2)
( or ) Y = Ö(
-2ln U1) * sin(2pU2)
where
U1 and U2 are uniform random variables .
This
formula has been derived from the three transformations given above .
ii)
Central Limit Theorem :
According
to the Central limit theorem , when a large number of
independent
identically distributed uniform random variables are
added , the resultant sum tends towards a normal distribution . For all
practical purposes , the number of random variables to be added in order to get
a normal distribution is taken to be 12.
Hence
if X1,X2,… are independent identically distributed
uniform random variables
then ( åXi
- N/2 ) / Ö(
N/12 ) where N is the number of variables added .
e) Generation of Rayleigh distribution from
normal distributions
The formula used for this transformation is
Y = Ö
( X12 + X22 )
where X1 and X2 are normal distributions .